Origin stories and vendor control for Apache Iceberg, Apache Hudi, and Delta Lake
#Iceberg came from Netflix to support BI and dashboards. Designed for read-heavy workloads (90% reads and 10% writes). Per https://tableformats.sundeck.io/, Tabular employs 36% of the committers and wrote about 60% of the codebase.
#Hudi came from Uber to store receipts. Designed for read-heavy workloads (90% reads and 10% writes via copy-on-write tables) and for balanced workloads (50% reads and 50% writes via merge-on-read tables); see the sketch after this list for how the table type is chosen. Per https://tableformats.sundeck.io/, Onehouse employs 19% of the committers and wrote about 20% of the codebase.
#Delta Lake came from Databricks. Designed for AI/ML and Spark pipelines. Per https://tableformats.sundeck.io/, Databricks employs 100% of the committers and wrote essentially 100% of the codebase.
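To make the copy-on-write vs merge-on-read distinction concrete, here is a minimal PySpark sketch of picking Hudi's table type for a workload. The table name, columns, path, and Spark setup are illustrative assumptions, not any vendor's actual pipeline, and it assumes the Hudi Spark bundle is on the classpath.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hudi-table-type-sketch")
    # Hudi's Spark writer expects the Kryo serializer; the Hudi Spark bundle
    # must also be on the classpath (not shown here).
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

# Hypothetical receipt records; column names are illustrative.
df = spark.createDataFrame(
    [("r1", "2024-01-01", 12.50), ("r2", "2024-01-01", 7.25)],
    ["receipt_id", "ts", "amount"],
)

hudi_options = {
    "hoodie.table.name": "receipts",
    "hoodie.datasource.write.recordkey.field": "receipt_id",
    "hoodie.datasource.write.precombine.field": "ts",
    # COPY_ON_WRITE rewrites whole data files on update: cheap reads, pricier
    # writes (the ~90/10 profile). MERGE_ON_READ appends row-level delta logs
    # and merges them at read or compaction time: a more balanced profile.
    "hoodie.datasource.write.table.type": "MERGE_ON_READ",
}

# Write the table; the local path is illustrative.
(
    df.write.format("hudi")
    .options(**hudi_options)
    .mode("overwrite")
    .save("/tmp/lake/receipts")
)
```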
Will Iceberg win? I think the real answer is interoperability. You need a data lakehouse that supports all 3 formats because different workloads favor different formats. Onehouse.ai and others can provide this by storing and serving the data in all 3 formats with no duplication: the underlying Parquet data files are shared, and only the lightweight table metadata is maintained per format. A sketch of what that looks like to a query engine follows below.
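Here is a hedged PySpark sketch of the "one copy, three formats" idea: reading the same table through each format's Spark connector. It assumes all three connectors are on the classpath and that metadata for all three formats has already been generated over the same Parquet files (for example by a metadata-translation tool such as Apache XTable); the path is hypothetical.

```python
from pyspark.sql import SparkSession

# Assumes the Iceberg, Hudi, and Delta Spark connectors are all on the
# classpath and configured; none of that setup is shown here.
spark = SparkSession.builder.appName("one-copy-three-formats-sketch").getOrCreate()

# Hypothetical base path holding one copy of the Parquet data files plus the
# metadata layer for each format (metadata/ for Iceberg, .hoodie/ for Hudi,
# _delta_log/ for Delta), e.g. as produced by a metadata-translation tool.
base_path = "s3://my-lake/receipts"

# Each reader interprets only its own metadata over the same shared data files,
# so query engines see three "tables" without any duplicated data.
iceberg_df = spark.read.format("iceberg").load(base_path)
hudi_df = spark.read.format("hudi").load(base_path)
delta_df = spark.read.format("delta").load(base_path)

print(iceberg_df.count(), hudi_df.count(), delta_df.count())
```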