Why not KISS for OLAP databases? Why have one database for low-latency real-time analytics, another for user-facing analytics, and another for hybrid table views?

Albert Wong
2 min read · Nov 17, 2023

A database vendor commented that you need a different type of database for each type of use case. Read the comment at https://www.reddit.com/r/dataengineering/comments/164x1qp/comment/jyl7w0n/?utm_source=share&utm_medium=web2x&context=3.

Maybe.

Yes, pick the right tool for the job, but very few teams have enough data engineers to run every single type of database out there. There is a reason why people just run product X for everything: it’s KISS (https://en.wikipedia.org/wiki/KISS_principle). It’s the same reason why you don’t build an application in 5 different programming languages just to optimize each piece (there are exceptions, like microservices).

So what would I pick for:

  • When a user wants low-latency real-time analytics
  • When a user needs high-concurrency, “user-facing analytics”
  • When a user wishes to combine real-time streaming data with historical (batch or “offline”) data in a hybrid table view

Just use StarRocks. It handles all of those use cases and handles them well. Why? It was originally designed to serve the advertising business before it was open sourced. As you can imagine, in that business you want the latest, freshest data (on ad costs and clicks), all of it is user-facing (100s if not 1000s of users all running ad-hoc queries on various ad terms), and you want to see historical data on the cost and clicks for an ad term.

It can also query data on top of the lake, support performant JOINs at scale, and support 1000s of users doing ad-hoc queries:

  • Run StarRocks on top of raw data and then create views or materialized views as needed.
  • Airbnb with StarRocks: 4 JOINs with billions of rows in under 4 seconds
  • Tencent Games with StarRocks: 400+ users doing ad-hoc queries on xx+ petabytes of data on Apache Iceberg files
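
The "raw data first, views as needed" pattern can be sketched roughly like this. To be clear, this is not StarRocks: it uses Python's built-in sqlite3 module as a stand-in so it runs anywhere, and the table name `ad_events_raw`, the view name `ad_term_totals`, and the sample rows are all made up for illustration. SQLite only has plain views, so the materialized view half of the idea is not shown.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a raw table and a view on top of it.

    Everything here (names, columns, rows) is hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Raw ad events, loaded as-is with no pre-aggregation.
        CREATE TABLE ad_events_raw (
            ad_term   TEXT,
            event_day TEXT,
            clicks    INTEGER,
            cost      REAL
        );
        INSERT INTO ad_events_raw VALUES
            ('olap',      '2023-11-16', 10, 2.5),
            ('olap',      '2023-11-17',  5, 1.0),
            ('lakehouse', '2023-11-17',  7, 3.0);

        -- A view added later, once someone needs per-term totals.
        CREATE VIEW ad_term_totals AS
            SELECT ad_term, SUM(clicks) AS clicks, SUM(cost) AS cost
            FROM ad_events_raw
            GROUP BY ad_term;
        """
    )
    return conn


def totals(conn):
    """Return (ad_term, clicks, cost) rows from the view, sorted by term."""
    query = "SELECT ad_term, clicks, cost FROM ad_term_totals ORDER BY ad_term"
    return conn.execute(query).fetchall()


conn = build_demo()
print(totals(conn))

# New raw rows show up in the view without any reload step.
conn.execute("INSERT INTO ad_events_raw VALUES ('olap', '2023-11-18', 1, 0.5)")
print(totals(conn))
```

The point of the sketch is only the shape of the workflow: land the raw data once, then layer views on top as questions come up, instead of building a separate pipeline per use case.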

Read more about StarRocks at http://starrocks.io

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. StarRocks received InfoWorld’s 2023 BOSSIE Award for best open source software.
