In Small Worlds Data Transformation Measures, Rob wrote about the challenges of data modelling in today’s complex, federated enterprise. He explained these through an overview of Graph Theory, which provides the foundation for the relational modelling techniques first developed by Ted Codd over 30 years ago.
Relational Theory has been one of the stalwarts of software engineering. It is governed by Codd’s rules, which have fundamentally stayed intact despite the rapid advances in other areas of software engineering, a testament to their effectiveness and simplicity.
While the approach has evolved over time and some variations have emerged (e.g. dimensional modelling), the changes have built on the relational theory foundation and abided by its design principles.
But is it time for a change? Are some of the issues we are seeing today the result of the foundation starting to crumble due to complexity? Or is it that there are so many violations of Codd’s Rules? While the latter is certainly a contributing factor, it may be that relational theory is starting to wear under the weight of our modern systems infrastructures, and the issues will continue to get worse. While there does not yet appear to be an approach equivalent to relational theory that addresses the issues we see today, we think Small Worlds Theory and Web 2.0 may provide some ideas for a new approach.
Small Worlds Theory helps provide a rationale for a different approach to modelling information. It tells us that for a complex system to be manageable it must be designed as an efficient network, and that many systems (biological, social or technological) follow this approach. Although the information across organizations is highly federated, it does not inter-relate through an efficient network. Rather than building a single enterprise data model, a comprehensive approach should incorporate a services model that includes the modelling of “data in motion”.
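To make the idea of an “efficient network” a little more concrete, here is a minimal sketch in Python. It is an illustration under assumptions of our own, not a method from Rob’s post: systems are treated as nodes, each system initially talks only to its immediate neighbours, and the function names (`ring_lattice`, `add_shortcut`, `average_path_length`) are hypothetical. It measures the average number of hops between any two systems, a common small-world metric, before and after a few long-range links are added.

```python
from collections import deque


def ring_lattice(n, k):
    """Build a ring of n nodes, each linked to its k nearest neighbours per side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph


def add_shortcut(graph, a, b):
    """Add an undirected long-range link between nodes a and b."""
    graph[a].add(b)
    graph[b].add(a)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs, using BFS."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


# Hypothetical example: 20 systems that only talk to their direct neighbours.
local_only = ring_lattice(20, 1)

# The same 20 systems with three long-range links added (an arbitrary choice).
with_shortcuts = ring_lattice(20, 1)
add_shortcut(with_shortcuts, 0, 10)
add_shortcut(with_shortcuts, 5, 15)
add_shortcut(with_shortcuts, 3, 12)

print("average hops, local links only:", average_path_length(local_only))
print("average hops, with shortcuts:  ", average_path_length(with_shortcuts))
```

The second figure is lower than the first: a handful of well-placed links shortens the typical route between any two systems without connecting everything to everything. That is the property a federated information landscape lacks when its systems inter-relate only through local, point-to-point interfaces.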
In addition to better modelling of federated data, new techniques should also bring in unstructured content. This includes the information from the “informal network” such as that developed in wikis and blogs. While there are standards to add structure to unstructured content, their uptake has been slow. People prefer a quick and easy approach to classification, especially for content that is more informal in nature.
Therefore, the approach may involve the use of categories and taxonomies to bring together collaborative forms of communication and link them to the formal network. Both Andi Rindler and Jeremy Thomas have discussed some of the work we are doing in this area on the MIKE2.0 project in their blog posts. We’re also starting to see the implementation of some very cool ideas for dynamically bringing together tagging concepts, such as the Tagline Generator.
In summary, while a solution equivalent to Codd’s would require an approach based on a mathematical foundation, and there is a grand vision for a “semantic web”, we may chip away at the problem through a variety of techniques. Just as Search is already providing a common mechanism for data access, other techniques may help with information federation and unstructured content.