As a Data Warehouse user, which open table format should I use, and who is backing who?

Albert Wong
2 min read · Feb 24, 2024

Spicy info. Word on the street. That’s what I deal in.

By the way, I’m Albert, and I’m Head of Community and Developer Relations at CelerData. CelerData is a $60 million VC-funded startup that is building StarRocks, an open-source replacement for Snowflake, BigQuery, Redshift, and Databricks SQL Warehouse.

Links for reference

What are my takeaways?

What’s interesting is who is backing who.

I am not that surprised by Databricks (primary sponsor of Delta Lake) and Azure. As always, follow the money (and investments).

Vendor preferences for Open Table Formats.

I’m a bit surprised. I thought Iceberg would have more contributions outside of Tabular. I also wish that Onetable.dev was part of this matrix. Maybe I should make one :-)

Which reminds me, I’m a big fan of Onetable.dev. If you’re not familiar, Onetable lets you convert one open table format to another without re-ingesting the data into a database. It can also help register Delta Lake, Hudi, and Iceberg tables in Apache Hive Metastore (HMS) so that databases like StarRocks and Trino can query the data.

I’ve got a demo of writing to Hudi and then converting the data to Iceberg and Delta Lake. Check it out at https://github.com/StarRocks/demo/tree/master/documentation-samples/datalakehouse.
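To give a rough feel for what a Hudi-to-Iceberg-and-Delta conversion looks like, here is a minimal sketch that renders the kind of YAML dataset config Onetable reads. This is not taken from the demo repo. The field names (`sourceFormat`, `targetFormats`, `datasets`, `tableBasePath`, `tableName`) are my recollection of the Onetable docs, so treat them as assumptions and check the current docs. The helper `render_sync_config`, the path, and the table name are all hypothetical.

```python
def render_sync_config(source_format, target_formats, table_base_path, table_name):
    """Render a Onetable-style dataset config as YAML text.

    The field names are assumptions based on the Onetable docs;
    verify them against the version you are running.
    """
    lines = [f"sourceFormat: {source_format}", "targetFormats:"]
    # One list entry per format the source table should be exposed as.
    lines += [f"  - {fmt}" for fmt in target_formats]
    lines += [
        "datasets:",
        f"  - tableBasePath: {table_base_path}",
        f"    tableName: {table_name}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical path and table name, for illustration only.
config_text = render_sync_config(
    source_format="HUDI",
    target_formats=["ICEBERG", "DELTA"],
    table_base_path="file:///tmp/hudi-dataset/people",
    table_name="people",
)
print(config_text)
```

The point of the sketch is that the conversion is driven by metadata: you name the source format, the target formats, and where the table lives, and the data files themselves are not rewritten.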

