The next question is, of course: What’s the difference between dimension
and fact tables, and what makes something a fact table? As you will see
later, the distinction is not always clear, and there are even occasions when
a dimension table in one star schema can become a fact table in another star
schema. A simple explanation is that dimension tables contain information
about business entities (customers, products, stores) and fact tables about
business events (sales, shipments, orders). The most notable difference is in
the measureable columns such as revenue, cost, and items, which are part of
the fact tables. Also, all of the different angles and attributes that are required
to summarize these facts are stored in the dimension tables. It’s actually pretty
simple if you translate a typical report request such as ‘‘Show me the total
order value per month per product group’’ into a dimensional model.
The calculated item of interest (sum of order value) is a fact, month is a ‘‘time’’ attribute belonging to the time dimension, and ‘‘product group’’ is a product attribute belonging to the product dimension. A fact table therefore contains only foreign keys pointing to the dimension tables and attributes that can be aggregated (quantitative elements). Dimension tables contain all attributes that describe a certain organization perspective (qualitative elements).
The calculated item of interest (sum of order value) is a fact, month is a ‘‘time’’ attribute belonging to the time dimension, and ‘‘product group’’ is a product attribute belonging to the product dimension. A fact table therefore contains only foreign keys pointing to the dimension tables and attributes that can be aggregated (quantitative elements). Dimension tables contain all attributes that describe a certain organization perspective (qualitative elements).
This data mart gives you very little information because it contains only
a few measurements and two dimensions. Nevertheless, it’s a good starting
point to illustrate how exactly a dimensional model works:
You could argue that an order is also a business entity and only the order lines with the individual transactions are facts. In a way, that assumption is correct. This is a case where special attention is needed in modeling the data marts. Later in this chapter we’ll explain what special attention is needed and describe other more advanced dimensional modeling topics. For now, it suffices to keep the simple definition in mind.
- All fact rows are stored at the lowest possible granularity level. By granularity, we mean the grain of the data; a fact table at the date level has a lower granularity than a fact table at the month level.
- The granularity of the fact table is determined by the combination of the granularity of the dimension tables. In the example, this means that for each combination of product and date there is a possible fact row. Not all combinations need to be present, of course!
- All measures in the fact table can be rolled up or grouped by the elements of the dimension tables, so this little data mart can calculate revenue by year, month and product group, or by year and product type, or any other desired combination.
You could argue that an order is also a business entity and only the order lines with the individual transactions are facts. In a way, that assumption is correct. This is a case where special attention is needed in modeling the data marts. Later in this chapter we’ll explain what special attention is needed and describe other more advanced dimensional modeling topics. For now, it suffices to keep the simple definition in mind.