Archive for the ‘Information Development’ Category
Monday, September 10th, 2007
In Small Worlds Data Transformation Measures, Rob wrote about the challenges of data modelling in today’s complex, federated enterprise. This is explained through an overview of Graph Theory, which provides the foundation for the relational modelling techniques first developed by Ted Codd over 30 years ago.
Relational Theory has been one of the stalwarts of software engineering. It is governed by Codd’s rules, which have fundamentally stayed intact despite the rapid advances in other areas of software engineering – a testament to their effectiveness and simplicity.
While evolutions have taken place over time and there have been some variations to approach (e.g. dimensional modelling), the changes have built on the relational theory foundation and abided by its design principles.
But is it time for a change? Are some of the issues we are seeing today the result of the foundation starting to crumble due to complexity? Or is it that there are so many violations of Codd’s Rules? While the latter is certainly a contributing factor, it may be that relational theory is starting to wear under the weight of our modern systems infrastructures – and the issues will continue to get worse. Whereas there does not appear to be an equivalent approach to relational theory that will address the issues we see today, we think Small Worlds Theory and Web 2.0 may provide some ideas for a new approach.
Small Worlds Theory helps provide the rationale for a different approach to modelling information. It tells us that for a complex system to be manageable it must be designed as an efficient network, and that many systems (biological, social or technological) follow this approach. Although the information across organizations is highly federated, it does not inter-relate through an efficient network. As opposed to building a single enterprise data model, it is the services model that includes the modelling of “data in motion” that should be incorporated into the comprehensive approach.
In addition to better modelling of federated data, new techniques should also bring in unstructured content. This includes the information from the “informal network” such as that developed in wikis and blogs. While there are standards to add structure to unstructured content, their uptake has been slow. People prefer a quick and easy approach to classification, especially for content that is more informal in nature.
Therefore, the approach may involve the use of categories and taxonomies to bring together collaborative forms of communication and link them to the formal network. Both Andi Rindler and Jeremy Thomas have discussed some work we are doing in this area on the MIKE2.0 project in their blog posts. We’re also starting to see the implementation of some very cool ideas for dynamically bringing together tagging concepts such as the Tagline Generator.
In summary, whereas an approach based on a mathematical foundation is required to provide a solution equivalent to Codd’s and there is a grand vision for a “semantic web”, we may chip away at the problem through a variety of techniques. Just as Search is already providing a common mechanism for data access, other techniques may help with information federation and unstructured content.
Posted in Information Development, Web2.0, data modelling | No Comments »
Thursday, September 6th, 2007
If an organization is going to move towards a Center of Excellence model for Information Development, we are sometimes asked: “does it make sense for this to be done offshore?” This seems a logical question, as organizations are increasingly moving their delivery capabilities offshore, especially for large application development and systems integration projects.
Although we encourage organizations to think about Information Development as a competency analogous to application development, it isn’t just something you can give to a separate group – it is a cultural change that must go across the company. While expertise can certainly be brought in from the outside, it’s also a capability that must exist internally.
Offshore Information Development should incorporate the following principles:
- It is the governance standards, policies and processes that enable an Information Development approach. These are the same in an offshore or onshore model.
- An Information Development team can be physical (i.e. a dedicated team) or virtual (i.e. members have other significant roles). In most cases there is a combination of dedicated and shared resources.
- Any sizeable offshore team will need representation in the Information Development Center of Excellence.
- Information Development crosses business boundaries and requires participation from senior execs to line staff. Therefore, it is not a delivery capability that can be built completely offshore.
- The organizational model will evolve over time and individuals in assigned roles are typically needed to drive the transition to new organizational models.
In summary, organizations should make sure they have a strong onshore capability for Information Development, even if much of their development occurs offshore. Whatever the delivery model, the key to success is Information Governance through open and common standards, architectures and policies.
Posted in Information Development, information strategy | No Comments »
Friday, August 31st, 2007
As both business and government have matured in their understanding of the role of information in good governance, the question has become “what’s next”. The answer, of course, is to turn a reactive culture which deals with data issues into a proactive one. In the MIKE2.0 community we talk a lot about Information Development, which is an approach to thinking about data as part of the business function and application rather than as an afterthought when someone starts looking for reports.
Most users of this site would be familiar with the Maturity Model which describes the attributes of an organization that has evolved its capabilities to proactive and beyond. The obvious question then is, “how do I organize to optimize”? The answer is not, I believe, to adopt the organization model of an optimized organization, rather it is to evolve to the structure of the next level up. The Centre of Excellence articles make a variety of organization recommendations. Do the Information Management self-assessment and then pick the model that is a stretch but not too far from where you are now.
My observation is that there is a tight relationship between information maturity and the governance structure that is in place. Like the chicken and the egg though, some organisations mature and hence naturally evolve to the right model. Others take direct action and put these roles and responsibilities in place, which results in the corresponding level of maturity. Cause and effect don’t matter, but if you take the former path it is very helpful to have the “back of the book” answers handy.
Posted in Information Development, Information Governance | 2 Comments »
Wednesday, August 22nd, 2007
We’re often asked to compare approaches to managing structured and unstructured data and attempts to bridge the gap between the two. Traditionally, the technology practitioners who worried about unstructured data have been an entirely different group from those who worried about structured data.
In fact, there are three types of data: structured, unstructured and a hybrid (records-oriented) grouping of semi-structured. They have much in common and are all part of the enterprise information landscape. In order to look at ways to leverage the relative strengths of the different types of data, it is important to first understand how they are used.
There are three primary applications of data within most enterprises.
The first is in support of operational processes. In the case of structured data, these processes are usually complex from a system perspective but often quite transactional from a human perspective. In the case of semi-structured and unstructured data, there is often less system intervention or interpretation of the data with a heavy reliance on human interpretation.
Secondly, each of the three is used for analysis. In the case of structured, it is easy to understand how the analysis is undertaken. With semi-structured/record data, analysis can be divided into aggregation of the structured components and a manual analysis of the free-text. With unstructured, analysis is usually restricted to searching for like terms and manually evaluating the documents.
Finally, all three types of data are used as a reference to back-up decisions and provide an audit trail for operational processes.
MIKE2.0 recommends approaches to governance, architecture and integration which are independent of the structure of the data itself.
The majority of effort associated with all data, regardless of its form, is gaining access to it at the time when it’s needed. In all three cases, there are processes to look up or search the data: SQL for structured data, lookups for semi-structured and tree-oriented folders for unstructured. Increasingly, the techniques for finding all three types are converging in one set of processes called Enterprise Search.
Ironically, despite the power of search, successful implementations are really mandating the implementation of common metadata and the use of a single enterprise metadata model. Again, MIKE2.0 takes the information architect through these requirements in a lot of detail.
In the future, organisations can expect to keep all three forms of data (structured, semi-structured records and unstructured documents) in the same repositories. However, there is no need to wait for this future utopia to begin leveraging all three in the same applications and managing them in a common way.
Posted in Information Development, Information Management, MIKE2.0, information strategy | No Comments »
Saturday, August 11th, 2007
I’ve been reading a book called Why Not? by Yale professors Barry Nalebuff and Ian Ayres which provides “four simple tools that can help you dream up ingenious ideas for changing how we work, shop, live, and govern.” It’s a highly recommended read. I was fortunate enough to attend one of Barry’s lectures last week and he’s already given me a few new ideas.
One idea is that of a Devil’s Advocate for corporate governance. The book explains the religious origins of the term and the benefits of having a person who takes a counter-point for the sake of argument. This person is a trusted adviser and has a duty to take this contrarian view, so their argument is not seen as one of dissent. In the book, explanations are provided on how this technique could be applied to Corporate Governance, where strong-arm techniques can easily over-run outside opinions.
The Devil’s Advocate is an interesting role on the subject of Information Governance - possibly an architect assigned within the Information Development Organisation. Although this approach could be applied more generally to any solution, I think it makes particular sense for decisions related to Information Governance, as:
- Success requires concession and buy-in from multiple parties
- Solutions are complex so multiple viewpoints are important
- It is easy for one group to dominate, but as information flows horizontally across organizations the impact of issues can be asymmetric
- It is easy to get “stuck” when someone brings up a counterpoint due to emotion and frustration. Using a devil’s advocate, alternative viewpoints are quickly put on the table (it is their duty to identify issues)
We’ve all seen strong-arm techniques or argumentative competitors ruin projects. When attempting to re-design an organisation to take a stronger focus on managing information, the shift is bound to run into issues. A trusted and educated view that raises arguments without being seen as a dissident would certainly be productive as a way to identify ownership roles, responsibilities and possible solutions.
Posted in Information Development, information strategy | No Comments »
Tuesday, August 7th, 2007
I was speaking with a client this week who put forward the challenge that Information Management isn’t really as complicated as we in the profession make out. I stopped for a moment to think about how I could explain the intricacy of an entire body of practice and realised that I would need to pick just one example.
Given its prominence in the industry, I decided to use Master Data Management and particularly the process of matching between sets of master data.
I started with just two lists of people (set A and set B). I then explained how a typical algorithm would match individual records by creating a score and a threshold for matching. No problem, my client said; he could use a spreadsheet for that!
I then added a third list (set C). Most algorithms compare two lists at a time. That means there are three combinations: AB followed by ABC, AC followed by ACB, and BC followed by BCA. To see why it matters, consider the following situation.
In set A, we have a record: “Robert Hillard, email robert.hillard[at]bearingpoint.com”
In set B, we have a record: “Robert Hillard, phone number +61 412 396 036”
In set C, we have a record: “Robert Hillard, phone number +61 412 396 036, email: robert.hillard[at]bearingpoint.com”
A typical business rule might require two items of data to match before the threshold is reached. That means we need name and email, name and phone number or email and phone number to define a match.
In the first scenario we match AB first, followed by matching the resulting records with set C. In this example, the two “Robert Hillard” records are not matched in the first pass, because they share only a name. On the second pass, when we bring in set C, we can at best end up with two records when we match the two entries to the new Robert Hillard in set C. The final result is two instances of Robert Hillard.
In the second scenario we match AC first which results in a full match on Robert Hillard, which in turn when set B is brought in matches to the instance in that file as well. The final result is just one instance of Robert Hillard.
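The two scenarios above can be reproduced with a small sketch. It is only an illustration under stated assumptions, not the algorithm any particular MDM product uses: the score is simply the number of fields that agree, the threshold is two (the business rule described above), a matched record is merged into the first record it matches, and the names `score`, `merge`, `match_pass`, `run_order` and `count_by_order` are all hypothetical.

```python
from itertools import combinations

FIELDS = ("name", "email", "phone")
THRESHOLD = 2  # assumed business rule: two items of data must agree


def score(a, b):
    """Number of fields that are present in both records and equal."""
    return sum(1 for f in FIELDS if a.get(f) is not None and a.get(f) == b.get(f))


def merge(a, b):
    """Combine two matched records, keeping whichever value is present."""
    return {f: a.get(f) or b.get(f) for f in FIELDS}


def match_pass(left, right):
    """Match every record in `right` against `left`, two lists at a time."""
    result = [dict(r) for r in left]
    for rec in right:
        for i, existing in enumerate(result):
            if score(existing, rec) >= THRESHOLD:
                result[i] = merge(existing, rec)  # merge into first match
                break
        else:
            result.append(dict(rec))  # no match: keep as a separate record
    return result


def run_order(sets, order):
    """Match the first two sets, then bring in the remaining sets in turn."""
    first, second, *rest = order
    merged = match_pass(sets[first], sets[second])
    for name in rest:
        merged = match_pass(merged, sets[name])
    return merged


def count_by_order(sets):
    """Final record count for each choice of the first pair."""
    counts = {}
    for pair in combinations(sorted(sets), 2):
        rest = [k for k in sorted(sets) if k not in pair]
        order = (*pair, *rest)
        counts["".join(order)] = len(run_order(sets, order))
    return counts


EMAIL = "robert.hillard[at]bearingpoint.com"
PHONE = "+61 412 396 036"
SETS = {
    "A": [{"name": "Robert Hillard", "email": EMAIL}],
    "B": [{"name": "Robert Hillard", "phone": PHONE}],
    "C": [{"name": "Robert Hillard", "phone": PHONE, "email": EMAIL}],
}

print(count_by_order(SETS))
```

With these three sets, the AB-then-C order ends with two records, while AC-then-B and BC-then-A each end with one, so the same data and the same rule give different answers depending only on the order of matching.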
Now understanding the complexity, my client tried to add a kludge solution by creating a master record for each match during an individual pass. There isn’t enough space in this posting to explain why this doesn’t help as the number of sets increases; suffice it to say that each such band-aid solution actually adds to the complexity when more sets are added.
In summary, the more sets there are to match, the more combinations there are which will affect the outcome. For n sets there are, in fact, n!/2 (i.e., n factorial divided by two) combinations, since the first pair can be chosen in n(n-1)/2 ways and the remaining n-2 sets can then be added in any order. That gives the three combinations above for three sets, and each will usually give a different final result for a statistically significant number of entries. Imagine the problem facing the US government when trying to bring together lists of doctors, lawyers or other professionals across 50 state lists!
Posted in Information Development | 3 Comments »
Wednesday, August 1st, 2007
The authors of this blog have been pretty passionate for some years about Information Management and promoting the benefits of putting information and data at the center of an organization’s development processes – Information Development.
To help promote this approach and to promote discussion and debate in the Information Management profession, we were behind an initiative to launch an open approach to Information Management – titled MIKE2.0.
During the 1990s the volume of raw data held by enterprises grew exponentially. All of that data had to be put to some use, and it has been, both internally and externally. As a result, non-ledger data has taken on greater and greater importance in the management, oversight and assessment of companies. Unfortunately, the use of agreed processes and standards for the aggregation, measurement, quality and interpretation of the data has not moved at the same rate, with every enterprise free to use its own approaches. In some cases this results in innocent ambiguity while in other cases organizations have taken the opportunity to deliberately mislead their stakeholders.
The complexity of data is not generally well understood. Most often, it is assumed to be a set of static datasets which can be related to each other in an unambiguous way. The reality is that data is constantly changing across the enterprise 24 hours a day. With financial reporting, this constant change is generally well managed with ledger aggregation, group reporting and, most importantly, period-end closing. By agreeing to specific cut-offs, a point of reconciliation stabilizes all of this ongoing change. Although it is taken for granted, the processes followed to stabilise the data are non-trivial.
If non-ledger data is to be trusted to the same extent as financial data, then its complexity needs to be equally well managed in ways which are consistent across the industry. No one consulting firm and no one financial institution can find the “right” answer unless the approach is much more widely adopted. For this reason we have not only invested heavily in developing approaches to managing and measuring complex data, but have convinced our employer – BearingPoint – to donate it to the wider profession using a Creative Commons licensing model.
MIKE2.0 is that initiative and is larger than any one group of professionals. It is managed by a mix of industry professionals across end-user and consulting firms. It is designed as a multi-lingual collaboration that can link external reporting minimum standards with multiple internal data consolidation processes using a variety of technologies. MIKE2.0 is one of the initiatives that Information Management professionals looking to shape their industry can embrace, influence and extend.
Posted in Information Development | No Comments »
Wednesday, August 1st, 2007
I am continually struck by the lack of formal valuation models for information. Considering how much organizations spend on building and maintaining information assets and how valuable they are to the health of the business, you would think it would be an area that would receive more focus.
While I’ve seen a number of academic papers on assessing the Economic Value of Information, the practically implemented cases are few and far between. I have done some development on “Assessment-oriented” models that can be of value in formulating a strategy, such as the Economic Value of Information model in MIKE2.0.
An Information Value Assessment should provide a mechanism to assign an economic value to the information assets an organization holds and the resulting impacts of Information Governance practices on this value. It could also measure whether the return outweighs the cost and the time required to attain this return.
Governance models are just one way of assessing value. Other simple techniques could include:
- Mastering - how many systems hold this common data?
- Latency - if I load this data into a warehouse in an hourly fashion as opposed to weekly what are the gains?
- Quality - RI issues, accuracy issues
- Reach - how many people read my blog? who are the readers?
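As a rough illustration of the first technique in the list, the "Mastering" question can be answered by counting, for each data entity, how many systems hold it. The sketch below is hypothetical: the system names, the entity names and the function `mastering_counts` are made up for the example and are not taken from MIKE2.0.

```python
from collections import Counter


def mastering_counts(systems):
    """Count how many systems hold each data entity."""
    counts = Counter()
    for entities in systems.values():
        counts.update(set(entities))  # count each entity once per system
    return dict(counts)


# Hypothetical inventory: system name -> data entities it stores
inventory = {
    "crm": ["customer", "contact"],
    "billing": ["customer", "invoice"],
    "warehouse": ["customer", "invoice", "product"],
}

print(mastering_counts(inventory))
```

In this made-up inventory "customer" is held in three systems, which would make it the first candidate for a closer look at duplication and consistency.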
I think this is an area where industry models will greatly improve, similar to what has occurred in the past 10 years in the infrastructure space. The lack of models points to the immaturity of information management as a competency and the strict building of information with technology. I would welcome any other opinions on techniques.
Posted in Information Development, Information Management, MIKE2.0, information strategy, information value | No Comments »
Wednesday, July 25th, 2007
Welcome to Information Development - a blog dedicated to the subject of Information Management. Complementary to the MIKE2.0 Methodology, which provides a structured competency for Information Development, this is a collection of perspectives on how the management of information has tremendous impacts on business, technology and society.
Posted in Enterprise2.0, Information Development, Information Management, MIKE2.0 | No Comments »