E-commerce Funnel Analysis with StarRocks: 87 Million Records, Apache Hudi, Apache Iceberg, Delta Lake (MinIO, Apache HMS, Apache XTable)
I’m Albert, Head of Community and Developer Relations at CelerData. CelerData is a $60 million VC-funded startup that is building StarRocks, an open-source replacement for Snowflake, BigQuery, Redshift, and Databricks SQL warehouse.
In this tutorial, I’ll guide you through working with an 87 million record e-commerce dataset. The data starts out in Hudi format on MinIO, an S3-compatible object store. We’ll then use StarRocks to run three queries that involve JOINs. The tutorial also covers creating mirrored copies of the dataset in Iceberg and Delta Lake formats using Apache XTable.
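To give a feel for the first step, here is a minimal sketch of how StarRocks can be pointed at Hudi tables on MinIO through a Hive Metastore. It only builds the `CREATE EXTERNAL CATALOG` statement as a string. The catalog name, host names, and credentials are placeholders I made up for illustration, and the helper `hudi_catalog_ddl` is hypothetical; check the linked repository for the values the tutorial actually uses.

```python
def hudi_catalog_ddl(name, metastore_uri, s3_endpoint, access_key, secret_key):
    """Build a StarRocks CREATE EXTERNAL CATALOG statement for Hudi on MinIO.

    Hypothetical helper: every argument value is supplied by the caller,
    nothing here is taken from the tutorial's real configuration.
    """
    props = {
        "type": "hudi",
        # Table metadata is looked up in a Hive Metastore (HMS).
        "hive.metastore.type": "hive",
        "hive.metastore.uris": metastore_uri,
        # MinIO uses static keys and path-style URLs rather than AWS defaults.
        "aws.s3.use_instance_profile": "false",
        "aws.s3.access_key": access_key,
        "aws.s3.secret_key": secret_key,
        "aws.s3.enable_ssl": "false",  # assumes a plain-HTTP MinIO endpoint
        "aws.s3.enable_path_style_access": "true",
        "aws.s3.endpoint": s3_endpoint,
    }
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in props.items())
    return f"CREATE EXTERNAL CATALOG {name}\nPROPERTIES (\n{body}\n);"


# Placeholder values, assumed for illustration only.
ddl = hudi_catalog_ddl(
    name="hudi_catalog_hms",
    metastore_uri="thrift://hive-metastore:9083",
    s3_endpoint="http://minio:9000",
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
)
print(ddl)
```

You would run the printed statement from any MySQL-compatible client connected to StarRocks. Once the catalog exists, the Hudi tables can be queried and joined in place, without loading the data into StarRocks first.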
See more at https://github.com/StarRocks/demo/tree/master/documentation-samples/datalakehouse