Kylin Cube Optimization Guide

2 years ago

Olivia Parker

In Kylin, you can optimize Cube design using the following methods:

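Simplify dimensions and metrics: include only the dimensions and metrics (measures, in Kylin's terminology) that queries actually use. Each additional dimension can double the number of dimension combinations (cuboids) the cube has to precompute, so dropping unused ones reduces both the size of the cube and the cost of building it.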
Utilize a dimension dictionary: use dictionary encoding to map each distinct dimension value to a compact integer ID, so that the cube stores the short ID instead of repeating the full value. This reduces the size of the cube.
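Opt for efficient storage formats: columnar formats such as Parquet and ORC compress well and let a query read only the columns it needs, which reduces storage space and increases query speed. Which format holds the cube itself depends on the Kylin version: Kylin 4 stores cube data as Parquet files, while earlier versions store it in HBase.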
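Utilize hierarchical aggregation: when dimensions form a natural hierarchy (for example country, province, city), aggregate the cube level by level so that results computed at one level are stored and reused instead of being recomputed at query time. In Kylin, such dimensions can be declared as a hierarchy in an aggregation group, which also lets Kylin skip combinations that break the hierarchy.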
Utilize precomputed metrics: define the aggregations that frequent queries rely on (such as SUM, COUNT, or COUNT DISTINCT) as measures of the cube, so that they are calculated when the cube is built and only looked up at query time.
Use appropriate data partitioning: partition the data on a column that matches how the data arrives and how it is queried, typically a date column, so that the cube can be built incrementally and queries scan only the partitions they need.
Maintain the cube regularly: routine tasks such as cleaning up obsolete data, compacting storage (for example by merging small segments), and rebuilding indexes keep the cube’s performance stable and reliable.
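
Two of the points above are easy to check with quick arithmetic. The sketch below is only an illustration in Python and does not use any Kylin API. All function names are hypothetical. It assumes that a cube with N independent dimensions materializes every dimension combination (2^N cuboids, counting the empty combination), and that a hierarchy of k levels contributes only its k + 1 prefixes (none, level 1, levels 1 to 2, and so on) instead of 2^k combinations. The dictionary function is a simplified stand-in for the idea of dictionary encoding, not Kylin's actual dictionary implementation.

```python
def full_cuboid_count(num_dimensions: int) -> int:
    """Cuboids when every combination of dimensions is materialized."""
    return 2 ** num_dimensions


def cuboid_count_with_hierarchy(num_free: int, hierarchy_levels: int) -> int:
    """Cuboids when `hierarchy_levels` dimensions form one hierarchy.

    Assumption: the free dimensions still combine freely (2 ** num_free),
    while the hierarchy only allows its prefixes, i.e. hierarchy_levels + 1
    combinations including "no hierarchy dimension at all".
    """
    return (2 ** num_free) * (hierarchy_levels + 1)


def dictionary_encode(values):
    """Map each distinct value to a small integer ID (sorted order).

    Returns the dictionary and the encoded column. The encoded column
    stores short IDs instead of repeating the full strings.
    """
    dictionary = {v: i for i, v in enumerate(sorted(set(values)))}
    encoded = [dictionary[v] for v in values]
    return dictionary, encoded


if __name__ == "__main__":
    # Ten independent dimensions versus seven after trimming three.
    print(full_cuboid_count(10), full_cuboid_count(7))

    # Ten dimensions where three of them form one hierarchy.
    print(cuboid_count_with_hierarchy(num_free=7, hierarchy_levels=3))

    # A repeated dimension column collapses to small integer IDs.
    mapping, column = dictionary_encode(["US", "CN", "US", "DE"])
    print(mapping, column)
```

Under these assumptions, ten independent dimensions give 1024 cuboids, trimming three unused dimensions brings that down to 128, and keeping all ten but declaring three of them as a hierarchy gives 512. Real cubes will differ once other pruning rules are applied, so treat the numbers as a way to reason about design choices rather than as a prediction.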