28 Oct 2009
The evolution of the data warehouse data model
When Ralph Kimball wrote “The Data Warehouse Toolkit” (published 1996) it defined Dimensional Modelling in a way that immediately demanded attention by data warehouse practitioners worldwide. The book and the techniques it described were not new and were common the approach we had used for the better part of a decade, what the book did do that was foundational was to describe the approach in a consistent and considered with a terminology that could be used by everyone.
There are many similar challenges that data warehouse designers face on every project. For instance two challenges we are often called upon to decide how to handle changes to source system models and the proper handling of changes to reference and master data.
The former is usually handled by splitting logical entities when creating physical tables separating attributes and relationships that have a higher probability of changing. The latter is commonly handled in one of three ways. Method one sees non volatile and volatile attributes are split into two tables (with a one to many relationship) Method two has the current attribute values are held in one table with changes over time maintained in a second table (again one to many). Finally, method three has changes across a number of concepts tracked in an audit table which is only intended for forensic purposes.
On recent data warehouse projects, we are using a variant of method one that has been formalised as “The Data Vault”. The Data Vault techniques put forward by Dan Linstedt formalises both of these issues and makes sensible design recommendations. In particular, it adopts an approach using “hub”, “link” and “satellite” tables.
Originally, Linstedt attempted to patent these concepts, but this application was rejected and he has now adopted a free approach and is promoting his concepts through books, training and his web site: http://www.danlinstedt.com/


October 28th, 2009 at 10:14 pm
We have developed data modeling patterns optimized for Master Data Modeling. It would be an interesting exercise to merge the approaches: 3NF, DW modeling approaches, and MDM modeling approaches
October 30th, 2009 at 9:25 am
Larry, I am very interested in anything that combines approaches. While we all often use different techniques to solve problems we face, the biggest challenge we seem to face is standardisation of terminology and approach.
March 16th, 2010 at 9:16 pm
Hi Folks,
You can find out more about the Data Vault and why organizations are standardizing on the approach here: http://www.b-eye-network.com/blogs/linstedt you can also join (for free) the forums we have here: http://www.DataVaultInstitute.com
I am trying to get more information out for everyone to use, please let me know if you might be interested in a Data Vault Modeling book. Feel free to contact me directly.
Cheers,
Dan L
DanL@DanLinstedt.com
March 23rd, 2010 at 9:33 pm
Would love to answer your questions, I just refreshed my web-site to bring it up to date – make it more dynamic in nature. I’ve now got a wordpress blog going on. Please let me know how I can assist in spreading the knowledge.
By the way, the patent on the Data Vault was an attempt by my old company, and has been long since denied (much to my delight). I always believed that the Data Vault Modeling and Methodology techniques should be “open to everyone”
Cheers,
Dan Linstedt
March 27th, 2010 at 9:29 am
Dan,
Thank-you very much for your comment and your new site looks great. I’d be very keen to see a detailed Data Vault article up on MIKE2.0 which links back to your site and a similar link from your site. Would you be able to prepare something (doesn’t need to be long)? Interesting to note your comment about the patent and shows how complex IP issues can be!
Best regards,
Rob