The Open Source Standard for Information Management

Archive for the ‘Business Intelligence’ Category

Is your organisation really unique?

Wednesday, January 13th, 2010

While much of the discussion about information management centres on things that are new and exciting, it is easy to neglect some of the basic principles that the profession has learnt over the last decade.  Here are just five things that I think are among the most important to consider if your project is to be a success.

First, use a standard project plan.  MIKE2.0 has been available for some years now and provides a comprehensive work breakdown structure.  Such an approach allows you to involve contractors and multiple service providers without being locked into anyone’s proprietary method.

Second, use data models that have been published.  There are many of them around, ranging from low-cost publications by authors such as Len Silverston through to enterprise models provided by the major software vendors.  Even the most expensive model is typically much cheaper than the labour cost that it can save.

Third, borrow from Don Rumsfeld: “There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.”  The data warehouse is trying to manage the complexity of the entire business.  You can’t possibly know everything and hence requirements analysis should focus on the fundamental principles of the organisation and those things that are hard to undo later.

Fourth, the foundation of tomorrow’s enterprise data warehouse is unlikely to be today’s tactical solution.  Avoid the temptation to make the first iteration self-funding.  The organisation has to be prepared to make an investment, otherwise there are always cheaper short-term solutions.

Finally, ask yourself whether your organisation is really as unique as your stakeholders think it is.  One of the most common reasons given for the use of unusual architectures, or data models that don’t borrow from published materials, is that the business is unique.  Everyone is looking for a point of differentiation, but that doesn’t mean that you shouldn’t adopt standards where possible.  It is unlikely that the use of an unusual data warehouse architecture is going to enable a store to sell more toothpaste.  That same store might, however, gain a real edge by combining consumer and supplier data in a novel way, building on existing approaches to modelling the data.

The evolution of the data warehouse data model

Wednesday, October 28th, 2009

When Ralph Kimball wrote “The Data Warehouse Toolkit” (published 1996), it defined Dimensional Modelling in a way that immediately demanded attention from data warehouse practitioners worldwide. The techniques it described were not new; they were broadly the approach we had used for the better part of a decade. What the book did that was foundational was to describe the approach in a consistent and considered way, with a terminology that could be used by everyone.

There are many similar challenges that data warehouse designers face on every project. For instance, two things we are often called upon to decide are how to handle changes to source system models and how to handle changes to reference and master data properly.

The former is usually handled by splitting logical entities when creating physical tables, separating out the attributes and relationships that have a higher probability of changing. The latter is commonly handled in one of three ways. Method one splits non-volatile and volatile attributes into two tables (with a one-to-many relationship). Method two holds the current attribute values in one table, with changes over time maintained in a second table (again one-to-many). Finally, method three tracks changes across a number of concepts in an audit table which is only intended for forensic purposes.
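As a rough illustration of method two, the sketch below keeps current values in one table and moves superseded values into a second, one-to-many history table. It uses Python's built-in sqlite3 module purely for convenience; the table names, the columns and the `set_segment` helper are hypothetical and are not taken from any published model.

```python
import sqlite3


def build_store():
    """Create an in-memory database with a current table and a history table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: one row per customer holding the current value.
        CREATE TABLE customer_current (
            customer_id INTEGER PRIMARY KEY,
            segment     TEXT NOT NULL,
            valid_from  TEXT NOT NULL
        );
        -- Many rows per customer: one for each value that has been superseded.
        CREATE TABLE customer_history (
            customer_id INTEGER NOT NULL,
            segment     TEXT NOT NULL,
            valid_from  TEXT NOT NULL,
            valid_to    TEXT NOT NULL
        );
        """
    )
    return conn


def set_segment(conn, customer_id, segment, as_of):
    """Record a segment value, archiving the old value if it has changed."""
    row = conn.execute(
        "SELECT segment, valid_from FROM customer_current WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        # First time we see this customer: only the current table is touched.
        conn.execute(
            "INSERT INTO customer_current VALUES (?, ?, ?)",
            (customer_id, segment, as_of),
        )
        return
    old_segment, old_from = row
    if old_segment == segment:
        return  # Nothing changed, so there is nothing to archive.
    # Close off the old value in the history table, then overwrite the current one.
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?, ?)",
        (customer_id, old_segment, old_from, as_of),
    )
    conn.execute(
        "UPDATE customer_current SET segment = ?, valid_from = ? WHERE customer_id = ?",
        (segment, as_of, customer_id),
    )
```

Calling `set_segment` twice for the same customer with different values leaves the latest value in `customer_current` and one closed-off row in `customer_history`. Most reporting queries then only need the current table, at the cost of a slightly more involved load step.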

On recent data warehouse projects, we are using a variant of method one that has been formalised as “The Data Vault”.  The Data Vault techniques put forward by Dan Linstedt address both of these issues and make sensible design recommendations. In particular, the approach uses “hub”, “link” and “satellite” tables.
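To make the three table types concrete, here is a minimal sketch, again using Python's sqlite3 module. It assumes the commonly described roles: a hub holds a business key, a link relates hubs to one another, and a satellite holds the descriptive attributes that change over time. All table names, column names and helper functions are illustrative assumptions rather than Linstedt's own specification.

```python
import sqlite3

# Hypothetical schema: names and columns are for illustration only.
SCHEMA = """
CREATE TABLE hub_customer (
    customer_key    INTEGER PRIMARY KEY,
    customer_number TEXT NOT NULL UNIQUE,   -- business key
    load_date       TEXT NOT NULL
);
CREATE TABLE hub_product (
    product_key  INTEGER PRIMARY KEY,
    product_code TEXT NOT NULL UNIQUE,      -- business key
    load_date    TEXT NOT NULL
);
CREATE TABLE link_customer_product (
    customer_key INTEGER NOT NULL REFERENCES hub_customer (customer_key),
    product_key  INTEGER NOT NULL REFERENCES hub_product (product_key),
    load_date    TEXT NOT NULL,
    PRIMARY KEY (customer_key, product_key)
);
CREATE TABLE sat_customer (
    customer_key INTEGER NOT NULL REFERENCES hub_customer (customer_key),
    load_date    TEXT NOT NULL,
    name         TEXT NOT NULL,
    segment      TEXT NOT NULL,
    PRIMARY KEY (customer_key, load_date)
);
"""


def build_vault():
    """Create the hub, link and satellite tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_customer(conn, customer_number, name, segment, load_date):
    """Register the business key once, then append a satellite row."""
    conn.execute(
        "INSERT OR IGNORE INTO hub_customer (customer_number, load_date) VALUES (?, ?)",
        (customer_number, load_date),
    )
    (customer_key,) = conn.execute(
        "SELECT customer_key FROM hub_customer WHERE customer_number = ?",
        (customer_number,),
    ).fetchone()
    # Satellite rows are only ever inserted, so earlier values are preserved.
    conn.execute(
        "INSERT INTO sat_customer (customer_key, load_date, name, segment) "
        "VALUES (?, ?, ?, ?)",
        (customer_key, load_date, name, segment),
    )
    return customer_key


def latest_customer(conn, customer_number):
    """Return the most recently loaded (name, segment) for a business key."""
    return conn.execute(
        """
        SELECT s.name, s.segment
        FROM hub_customer AS h
        JOIN sat_customer AS s ON s.customer_key = h.customer_key
        WHERE h.customer_number = ?
        ORDER BY s.load_date DESC
        LIMIT 1
        """,
        (customer_number,),
    ).fetchone()
```

In this sketch the volatile attributes live only in the satellite, so a change to a customer's details adds a row there while the hub and any links remain untouched.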

Originally, Linstedt attempted to patent these concepts, but this application was rejected and he has now adopted a free approach and is promoting his concepts through books, training and his web site: http://www.danlinstedt.com/

Reporting from the future

Wednesday, July 15th, 2009

Jason Kolb provides a great post on predictive analytics. It’s powerful stuff, but technology is values-neutral and can cause issues just as it can help solve them. To see analytics applied for good, check out this presentation by Hans Rosling.

As with any complex system, it’s easy to get things wrong and you need to be really smart to really screw up. But investment banks won’t stop trying to predict the future and people won’t stop playing on the edge. The predictive analytics genie is definitely out of the bottle. The goal is to make sure these predictions are backed by some controls, an understanding of data lineage, and regulations around risk. That’s one of the reasons why I think Information Development is so important.
