пятница, 19 сентября 2014 г.

Dimension Tables and Fact Tables


The next question is, of course: What’s the difference between dimension and fact tables, and what makes something a fact table? As you will see later, the distinction is not always clear, and there are even occasions when a dimension table in one star schema can become a fact table in another star schema. A simple explanation is that dimension tables contain information about business entities (customers, products, stores) and fact tables about business events (sales, shipments, orders). The most notable difference is in the measureable columns such as revenue, cost, and items, which are part of the fact tables. Also, all of the different angles and attributes that are required to summarize these facts are stored in the dimension tables. It’s actually pretty simple if you translate a typical report request such as ‘‘Show me the total order value per month per product group’’ into a dimensional model.

 The calculated item of interest (sum of order value) is a fact, month is a ‘‘time’’ attribute belonging to the time dimension, and ‘‘product group’’ is a product attribute belonging to the product dimension. A fact table therefore contains only foreign keys pointing to the dimension tables and attributes that can be aggregated (quantitative elements). Dimension tables contain all attributes that describe a certain organization perspective (qualitative elements).
 

This data mart gives you very little information because it contains only a few measurements and two dimensions. Nevertheless, it’s a good starting point to illustrate how exactly a dimensional model works:

  • All fact rows are stored at the lowest possible granularity level. By granularity, we mean the grain of the data; a fact table at the date level has a lower granularity than a fact table at the month level.
  • The granularity of the fact table is determined by the combination of the granularity of the dimension tables. In the example, this means that for each combination of product and date there is a possible fact row. Not all combinations need to be present, of course!
  • All measures in the fact table can be rolled up or grouped by the elements of the dimension tables, so this little data mart can calculate revenue by year, month and product group, or by year and product type, or any other desired combination.

You could argue that an order is also a business entity and only the order lines with the individual transactions are facts. In a way, that assumption is correct. This is a case where special attention is needed in modeling the data marts. Later in this chapter we’ll explain what special attention is needed and describe other more advanced dimensional modeling topics. For now, it suffices to keep the simple definition in mind.