Database cubes are dead; what is their replacement?

Albert Wong
3 min read · Nov 11, 2023


Database cubes are dead (Change my mind)

A database cube, also known as a data cube, is a multidimensional data structure that is used for data analysis and reporting. It organizes data into dimensions and measures, which allows users to slice and dice the data to gain insights into their business.
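To make the idea concrete, here is a minimal sketch of what a cube does conceptually: it pre-aggregates a measure for every combination of dimensions so that later lookups are cheap. The rows, the dimension names (`region`, `product`), and the `build_cube` helper are all hypothetical and only for illustration; real OLAP cube products work differently under the hood.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: two dimensions (region, product), one measure (revenue).
ROWS = [
    {"region": "EU", "product": "book", "revenue": 10},
    {"region": "EU", "product": "game", "revenue": 20},
    {"region": "US", "product": "book", "revenue": 30},
    {"region": "US", "product": "game", "revenue": 40},
]
DIMENSIONS = ("region", "product")


def build_cube(rows, dimensions, measure):
    """Pre-aggregate the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


cube = build_cube(ROWS, DIMENSIONS, "revenue")

# "Slicing" is now a dictionary lookup instead of a scan over the raw rows.
grand_total = cube[()][()]
eu_total = cube[("region",)][("EU",)]
book_total = cube[("product",)][("book",)]
us_game = cube[("region", "product")][("US", "game")]
```

With two dimensions the sketch stores four aggregate tables (one per subset of dimensions). Every added dimension doubles that count, which is one way to see why pre-built cubes become costly as the model grows.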

Database cubes were popular in the early days of data warehousing, but they have largely been replaced by other technologies, such as columnar databases (Snowflake, StarRocks) and distributed computing frameworks (Apache Spark).

There are a few reasons for the decline of database cubes:

  • They are complex and expensive to set up and maintain. Database cubes often need dedicated servers and specialized software. Because aggregates are precomputed across combinations of dimensions, build time and storage grow quickly as dimensions are added, which makes cubes difficult to scale to large datasets.
  • They are not well-suited for real-time analytics. Database cubes are typically rebuilt in batch, so new data is not visible until the next refresh.
  • They are not as flexible as other technologies. Database cubes are typically designed for specific use cases, such as sales analysis or financial reporting. They can be difficult to adapt to new requirements or to support user-facing analytics (self-service analytics or ad-hoc queries).

Other technologies, such as columnar databases and distributed computing frameworks, offer a number of advantages over database cubes:

  • They are more scalable and cost-effective. Columnar databases and distributed computing frameworks can be used to process large datasets efficiently and at a lower cost.
  • They are more flexible. Columnar databases and distributed computing frameworks can be used for a variety of use cases, including real-time analytics and machine learning.
  • They are easier to use. Columnar databases and distributed computing frameworks are more user-friendly than database cubes, and they can be used with a variety of programming languages.

Overall, database cubes are a legacy technology that has largely been replaced by alternatives that are more scalable, more flexible, more cost-effective, and easier to use.

Here are some examples of technologies that can be used to replace database cubes:

  • Columnar storage formats, such as Apache Parquet, and open source columnar OLAP databases, such as StarRocks
  • Distributed computing frameworks, such as Apache Spark and Hadoop
  • In-memory data stores, such as Redis and Memcached (typically used to cache precomputed results rather than as an analytical engine)
  • Cloud-based data warehouses, such as Google BigQuery and Amazon Redshift

If you are considering using a database cube, I recommend that you evaluate the alternatives carefully. Other technologies may offer a better fit for your needs.

What a replacement should do: query data directly on top of the data lake, support performant JOINs at scale, and support thousands of users running ad-hoc queries.
One approach: run StarRocks on top of the raw data, then create views or materialized views as needed.
Airbnb with StarRocks: 4 JOINs over billions of rows in under 4 seconds.
Tencent Games with StarRocks: 400+ users doing ad-hoc queries on xx+ petabytes of data on Apache Iceberg files.
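The "query the raw data, add views as needed" pattern can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in here so that the example runs anywhere: it is not StarRocks, it has no materialized views, and it says nothing about StarRocks syntax or performance. The `orders` and `customers` tables and the `revenue_by_region` view are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for an analytical engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical raw tables, loaded as-is with no pre-aggregation.
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 20.0), (3, 2, 30.0);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');

    -- A view defined on top of the raw data: the JOIN and GROUP BY
    -- run at query time instead of being baked into a prebuilt cube.
    CREATE VIEW revenue_by_region AS
    SELECT c.region AS region, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region;
    """
)

QUERY = "SELECT region, revenue FROM revenue_by_region ORDER BY region"
rows_before = conn.execute(QUERY).fetchall()

# New raw data is visible to the view immediately, with no rebuild step.
conn.execute("INSERT INTO orders VALUES (4, 2, 5.0)")
rows_after = conn.execute(QUERY).fetchall()
```

The design point is that the aggregation is defined once as a query and evaluated against current data. An engine with materialized views can additionally store and maintain the result for speed, which this sketch does not attempt to show.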

Read more about StarRocks at http://starrocks.io

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. StarRocks received InfoWorld’s 2023 BOSSIE Award for best open source software.
