пятница, 19 сентября 2014 г.

Naming and Type Conventions


A production data warehouse can contain many tables and views, all with many columns of different sizes and types. As a best practice, use meaningful names, prefixes, and postfixes for all database objects. Furthermore, set up guidelines for the data types used in the data warehouse to enforce stan- dardization and prevent possible data loss due to conversion or truncation of values. Most organizations already have a set of database design guidelines readily available and you can either enhance those or develop your own. At a minimum, you should adhere to prescribed naming conventions if they are available.

All tables get a prefix (followed by an underscore) indicating their role and function in the data warehouse:

  • STG_ for staging tables
  • HIS_ for historical archive tables DIM_ for dimension tables
  • FCT_ for fact tables
  • AGG_ for aggregate tables
  • LKP_ for lookup tables

All dimension key columns are named after the table they belong to with- out the postfix and get a _key postfix (so the key column of dim_product is named product_key, and so on).
All dimension key columns are of the smallest unsigned integer type possible. MySQL has five different integer types, ranging from tiny to big. Integers can be defined as SIGNED or UNSIGNED, indicating whether they take negative and positive values, or positive values alone. For key columns, use unsigned integers.

  • TINYINT—1 byte, 2 ˆ 8 values (0–255)
  • SMALLINT—2 bytes, 2 ˆ 16 values (0–65532)
  • MEDIUMINT—3 bytes, 2 ˆ 24 values (0–16,777,215)
  • INT or INTEGER—4 bytes, 2 ˆ 32 values (0–4,294,967,295) BIGINT—8 bytes, 2 ˆ 64 values (0–18,446,744,073,709,551,615) 


  • Use meaningful names for the columns. We try to prefix all column names with the table name, unless it becomes impractical to do so due to extremely long column names.
  • Use standard names for audit columns. These are the columns indicating when and who or which process inserted the record or did the last update.
  • Avoid the use of reserved words for database objects as tables, columns, and views. Using reserved words such as group, time, view, order, field, update, and so on makes it harder to work with these objects in queries because they must be quoted to distinguish them from the same words in the SQL language.