Dimension table in Data Warehousing

What is the definition of a dimension table in data warehousing models? In business intelligence, dimensions are axes of analysis. The same dimensions often come up in the same areas. A dimension is a tree, the objective is to join facts to a leaf. With this join, analysts can calculate an aggregate with a node, a branch or the whole tree.

A dimension in a star schema is a table that BI engineers linke to a fact table by foreign keys, at the finest level of granularity.

Definition of a dimension table in a Data Warehouse database model

Most of the time, in data warehousing, a dimension table is in a denormalized form. But always take care to respect the first normal form. It is preferable to use a generic physical key without any meaning of the auto-increment type. And this is for performance reasons.

Another approach, with a logical key, makes it easier to read and maintain due to the evolution of dimensions. It also simplifies joins with fact tables and reduces its data volume.

To update a dimension table, different types of processes are available. However, it is not advisable to delete a row. If you don’t want to see non-useful dimension elements displayed, then use a “status” field with, for example, statuses such as “Current”, “Obsolete”, etc.

Slowly changing dimensions (SCDs) are used to manage the evolution of dimension tables in a controlled manner.

In a Business Intelligence DWH a dimension table is an axis with hierarchies

In order to simplify the calculation of aggregates and the production of reports, create dimensions that respect the functional hierarchies in place in the company and the activity being modelled.

For example, the time dimension can have a hierarchy with years, quarters, months and days.

In the most common data warehousing models, like Ralph Kimball’s is the star schema where the fact tables have a central place and the dimension tables are linked with keys. Dimensions are widely used in Decision Support Systems.

In general, delimit the hierarchy of dimensions in a strict manner. On the other hand, the multidimensional approach is rather natural because it should reflect the actual structure of the company.

An organisation can have up to dozens of dimensions. With too many dimensions, the model becomes too complex.

For example, use attributes on the dimensions to distinguish certain elements and create alternative hierarchies.

Always keep in mind the added value of a multidimensional system. The relevance of the dimensions is the basis for the effectiveness of an information and decision support system. The intersection of these dimensions makes it possible to analyse the data stored in the fact tables and other KPIs.

Use dimensions and tables are used to build OLAP cubes.

ROLAP HOLAP and MOLAP differences

Leave a Comment

Your email address will not be published. Required fields are marked *