пятница, 19 сентября 2014 г.

Microstrategy Best Practices for Data Modeling and Deployment

Third Normal Form and Dimensional Modeling
Teradata strongly advocates the use of fully normalized physical schemas or “3NF” schemas. The merits of this approach are fully discussed elsewhere, but the primary motivation is that a fully normalized schema provides the greatest flexibility in answering the questions an enterprise will ask now and in the future. Further, Teradata’s unique technical characteristics allow it to provide acceptable query performance that other RDBMS products may not be able to provide without using denormalization as a performance optimization technique.
Dimensional modeling is a logical modeling technique that is popular in the BI community. Sometimes Teradata’s recommendation to use a normalized schema is misinterpreted as a mutually exclusive alternative to dimensional modeling. This is unfortunate, because a dimensional model does not imply denormalization. Many dimensional modeling practitioners assume you have to denormalize in order to achieve acceptable query performance and consequently introduce denormalization by default in to their models. But in reality, it is entirely possible to have dimensional models that are also normalized. The classic “snowflake schema” is a perfect example.
The main point is that dimensional modeling does not imply denormalization; they are two independent topics of conversation.
MicroStrategy Preferences
MicroStrategy is mostly agnostic towards the use of denormalization. The MicroStrategy SQL Engine works with the full spectrum of highly normalized or denormalized schemas. When working with Teradata systems, denormalization techniques are often unnecessary.
MicroStrategy generally recommends logical modeling techniques based on the principles of dimensional modeling. The MicroStrategy SQL Engine itself is somewhat agnostic towards dimensional models versus general ER models. The SQL Engine supports dimensional models very well, and is able to support virtually any type of star, snowflake, or hybrid physical design. But the SQL Engine does not require a dimensional model and has built-in support for many constructs that are typical in ER schemas, but are not as common in dimensional models. Examples include support for compound keys, many-to-many relationships, fact extensions, joint attribute relationships, etc. (These are discussed at length elsewhere.) There are some data modeling constructs found in ER models that are not found in dimensional models that are difficult to model to the SQL Engine. Some tips for handling even these cases are provided in this paper. Overall, the SQL Engine supports a broad set of schemas along the spectrum of dimensional models to general ER models.
However, based on experience with large scale BI systems, MicroStrategy sees the practical benefits of dimensional modeling in terms of presenting a simpler view to end users than what is possible with a generic ER model. The concept of a cube of facts, accessible by varying levels of independent dimensions, is proven as an intuitive paradigm for formulating business questions. In short, 3NF ER models are good for flexibility, but can be incomprehensible to business users, even when modeled through a BI platform like MicroStrategy. Dimensional models typically are easier for end users to understand and map more closely to the way business users think about their business.
Best Of Both Worlds
When using MicroStrategy and Teradata, the best approach is to define a layer of application views in Teradata. MicroStrategy applications are then defined entirely in terms of the views, rather than the physical tables themselves. This provides the best of both worlds by implementing a 3NF physical schema and using the views to present a more dimensional model to MicroStrategy. The physical model is resilient to updates and changes in the enterprise and provides acceptable query performance. The business model presented to end users through MicroStrategy is more intuitive because it is based on an application model with more dimensional characteristics.
A good reference document is the Teradata white paper “Data Model Overview: Modeling for the Enterprise While Serving the Individual.”
Using Database Views
As described in theory above, many practical MicroStrategy-Teradata implementations do feature a set of database views on top of the physical tables in Teradata. MicroStrategy applications are then defined entirely in terms of the views, rather than the physical tables themselves. This additional layer of abstraction provides flexibility for the MicroStrategy Architect, who can tune the MicroStrategy model by making changes to the views rather than physical changes to the underlying tables.
Often the first set of views in this layer is a set of Base Views on the tables that simply select all of the data from the base tables using the LOCKING FOR ACCESS modifier. On top of the base views, customers can create a second layer of application views that are used to facilitate or simplify the application.
Using Teradata Datatypes
MicroStrategy supports the following Teradata data types:
BYTE, BYTEINT, CHARACTER, DATE, DECIMAL/NUMERIC, FLOAT/REAL/DOUBLE PRECISION,
INTEGER, LONG VARCHAR, SMALLINT, TIME, TIMESTAMP, VARBYTE, VARCHAR
MicroStrategy does not currently support the INTERVAL data types.
Using User-Defined Functions
Teradata supports user-defined functions (UDFs), which allow users to extend the SQL language by writing their own functions or installing UDF packages from third-parties. MicroStrategy supports any UDF defined on Teradata through the use of passthrough functions, such as ApplySimple. Using these functions, MicroStrategy generated SQL will call UDF’s just like any other function provided by Teradata.
The passthrough functions in MicroStrategy include ApplySimple, ApplyAgg, ApplyOLAP, ApplyLogical, and ApplyComparison. Different passthrough functions are used depending on the context in which it is used in MicroStrategy. For example, ApplySimple and ApplyAgg can be used in Metric expressions, while ApplyComparison is used in Filter expressions. Each passthrough function takes a string of SQL syntax as an input and a variable number of additional values that are substituted into the SQL string.
For example, consider a UDF Find_Text that takes two string arguments as input. This function can be used in a MicroStrategy expression using the ApplySimple function as follows:
ApplySimple(‘Find_Text(#0, #1)’, [Search String], [Pattern])