Archive for the ‘Information Governance’ Category
Unexpected election results around the world have given the media the chance to talk about their favourite topic: themselves! With their experience running polls, the media are very good at predicting the winner out of two established parties or candidates but are periodically blindsided by outsiders or choices that break with convention. In most cases, there were plenty of warnings but it takes hindsight to make experts of us all.
Surprises are coming as thick and fast in business as they are in politics and similarly there are just as many who get them right with perfect hindsight! The same polling and data issues apply to navigating the economy as they do to predicting electoral trends.
The Oxford Dictionary picked “post-truth” as their 2016 word of the year. The term refers to the selective use of facts to support a particular view of the world or narrative. Many are arguing that the surprises we are seeing today are unique to the era we live in. The reality is that the selective use of data has long been a problem, but the information age makes it more common than ever before.
For evidence that poor use of data has led to past surprises, it is worth going way back to 1936 when a prominent US publication called The Literary Digest invested in arguably the largest poll of the time. The Literary Digest used their huge sample of more than two million voters to predict the Republican challenger would easily beat the incumbent, President Roosevelt. After Roosevelt won convincingly, The Literary Digest’s demise came shortly thereafter.
As humans, we look for patterns, but are guilty of spotting patterns first in data that validates what we already know. This is “confirmation bias” where we overemphasise a select few facts. In the case of political polls, the individuals or questions picked often reinforce a set of assumptions held by those who are doing the polling.
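The general point, that a very large sample drawn from the wrong group can mislead more than a small sample drawn from everyone, can be sketched in a few lines of Python. This is an illustration under invented assumptions, not a reconstruction of the 1936 poll: the electorate, the 60% and 40% support figures and the "mailing list" subgroup are all hypothetical.

```python
import random

def poll(voters, sample_size, rng):
    """Return the share of sampled voters who back the incumbent."""
    sample = rng.sample(voters, sample_size)
    return sum(sample) / sample_size

rng = random.Random(1936)  # fixed seed so the sketch is repeatable

# Hypothetical electorate: 1 = backs the incumbent, 0 = backs the challenger.
# Assumption: 60% support overall, but only 40% among voters who happen to
# be on a mailing list (an invented, unrepresentative subgroup).
listed = [1] * 120_000 + [0] * 180_000    # on the list: 40% support
unlisted = [1] * 480_000 + [0] * 220_000  # off the list
electorate = listed + unlisted            # 600,000 of 1,000,000 = 60%

true_share = sum(electorate) / len(electorate)
big_biased = poll(listed, 200_000, rng)      # huge sample, wrong population
small_random = poll(electorate, 2_000, rng)  # small sample, whole electorate

print(f"true {true_share:.3f}, biased {big_biased:.3f}, random {small_random:.3f}")
```

Under these assumptions the large poll should land close to 40% however many people it reaches, because sample size cannot correct for who was asked, while the much smaller random sample should land near the true 60%.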
This is as true within organisations as it is in the public arena. Information overload means that we have to filter much more than ever before. With Big Data, we are filtering using algorithms that increasingly depend on Artificial Intelligence (AI).
AI needs to be trained (another word for programming without programmers) on datasets that are chosen by us, leaving open exactly the same confirmation bias issues that have led the media astray. AI can’t make a “cognitive leap” to look beyond the world that the data it was trained on describes (see Your insight might protect your job).
This is a huge business opportunity. Far from seeing an explosion of “earn while you sleep” business models, there is more demand than ever for services that include more human intervention. Amazon Mechanical Turk is one such example where tasks such as categorising photos are farmed out to an army of contractors. Of course, working for the machines in this sort of model is also a path to low paid work, hardly the future that we would hope for the next generation.
The real opportunity in Big Data, even with its automated filtering, is the training and development of a new breed of professionals who will curate the data used to train the AI. Only humans can identify the surprises as they emerge and challenge the choice of data used for analysis.
Information overload is tempting organisations to filter available data, only to be blindsided by sudden moves in sales, inventory or costs. With hindsight, most of these surprises should have been predicted. More and more organisations are challenging the post-truth habits that many professionals have fallen into, broadening the data they look at, changing the business narratives and creating new opportunities as a result.
At the time of writing, automated search engines are under threat of a ban by advertisers sick of their promotions sitting alongside objectionable content. At the turn of the century human-curated search lost out in the battle with automation, but the war may not be over yet. As the might of advertising revenue finds voice, demanding something better than automated algorithms can provide, it may be that earlier models emerge again.
It is possible that the future is more human curation and less automation.
It’s almost impossible to live these days without a plethora of digital identities that enable us to do almost everything. Whether it be our television, gaming, social media, travel or family security, we depend on all of these things to make our lives work effectively.
Pretty quickly our homes have become as complex as almost any business of just a few years ago! Gone are the days when the most complex device in the home was the hi-fi system.
At the same time, the boundary between work and home has almost disappeared and a fragmented personal digital profile flows through to inefficiencies across our personal and professional lives.
While it might be tempting, few people have the luxury of starting their digital lives from scratch. We all have a technical legacy born of our past digital activities across technologies, family relationships and past jobs. No matter how disorganised, fragmented and out-of-control your digital life is, it is never too late to bring it back into order.
Not taking stock leaves you open to security risks, complexity, fragmentation and the lost opportunity to live the integrated promise of technology. Increasingly this means even more complexity in the relationship between our work lives and our personal technology.
Over future posts I will look at a number of aspects of our digital lives. In this instalment, I’ll tackle some of the foundations that should be put in place to bring our digital world to order.
The foundation email address
You sit at the centre of a number of circles: family, friends and work. There are a large number of systems and pieces of information that you share across all of these groups.
At the centre of your circles is an administrative email address. This email has the attribute of being the last resort for password recovery and other core account activities. It isn’t an address that you should share publicly; you don’t want it compromised by excessive spam, for instance.
You could make this email address a product of your Internet Service Provider, but it is better to pick an independent and free service. The more independent of the services that you are going to use in the future, the better.
Search the internet for comparisons of the free email account services and you’ll get a range of articles comparing the benefits of each of the providers. Now is also a good time to pick a foundation name for your digital world. It isn’t necessary for this to be meaningful and it certainly shouldn’t be one that you expect your contacts to be using.
Social media identity
Next you need to have some sort of presence on the major social media sites. Privacy settings can be as tight as you want, but the purpose of these is to act as a common login credential. See Login with social media.
Social media is also the main place to manage groupings which we will talk about in future posts. These groupings come in three categories.
The first are your dependants who don’t control their own online presence, typically your children (or potentially elderly relatives). If they are under age you will create some sort of presence (but not a social media account).
The second group are those you are most closely associated with, such as your spouse or adult children. You will be inviting them to your networks but they will be in control of their own credentials.
Your third group are your very close relatives and friends with whom you regularly share content. Keep this to your immediate contacts, but the techniques you use here will also extend to the work groups that you enter and leave.
Finally, you need a password management tool. Today’s cloud services are poorly integrated and lack consistent identity management. This is a real opportunity for improvement on the Internet, hence the push towards using social media as a tool of integration. However, the goal should be that your architecture is independent of individual services and should last the distance.
There are a number of very good tools out there, just search for password management tools and compare the benefits of each. The important thing is to have a cloud-based solution that is easy to use across devices.
Having a consistent email as a foundation for managing other accounts, social media for signing-in, a defined network of relationships and a tool for managing all of the accounts you work with will set the foundation for your digital life. When I next write on this topic, we’ll build on this foundation to start describing a complete architecture for our digital lives.
Before the advances of twentieth century medicines, doctors were often deliberately opaque. They were well known for prescribing remedies for patients that were little more than placebos. To encourage a patient’s confidence, much of what they wrote was intentionally unintelligible. As medicine has advanced, even as it has gotten more complicated, outcomes for patients are enhanced by increasing their understanding.
In fact, the public love to understand the services and products that they use. Diners love it when restaurants make their kitchens open to view. Not only is it entertaining, it also provides confidence in what’s happening behind the scenes.
As buildings have become smarter and more complex, far from needing to hide the workings, architects have gone in the opposite direction with an increasing number of buildings making their technology a feature. It is popular, and practical, to leave structural supports, plumbing and vents all exposed.
This is a far cry from the world of the 1960s and 1970s when cladding companies tried to make cheap buildings look like they were made of brick or other expensive materials. Today we want more than packaging, we want the genuine article underneath. We want honest architecture, machinery and services that we can understand.
I find it fascinating that so many people choose to wear expensive watches that keep time through mechanical mechanisms when the same function can be achieved through a great looking ten dollar digital watch. I think people are prepared to pay thousands when they believe in the elegance and function of what sits inside the case. Many of these watches actually hint at some of those mechanics with small windows or gaps where you can see spinning cogs.
The turnaround of Apple seemed to start with the iMac, a beautiful machine that had a coloured but transparent case, exposing to the world the workings inside.
So it is with business where there are cheap ways of achieving many goals. New products and services can be inserted into already cluttered offerings and it can all be papered over by a thin veneer of customer service and digital interfaces that try to hide the complexity. These are the equivalent of the ten dollar watch.
I had a recent experience of a business that was not transparent. After six months, I noticed a strange charge had been appearing on my telephone bill. The company listing the charges claimed that somewhere we had agreed to their “special offer”. They could not tell us how we had done it and were happy to refund the charges. The real question, of course, is how many thousands of people never notice and never claim the charges back?
Whether it is government, utilities, banking or retail, our interactions with those that provide us products and services are getting more complex. We can either hide the complexity by putting artificial facades over the top (such as websites with many interfaces) or embrace the complexity through better design. I have previously argued that cognitive analytics, in the form of artificial intelligence, would reduce the workforce employed to manage complexity (see Your insight might protect your job) but this will do nothing to improve the customer experience.
Far from making people feel that business is simpler, the use of data through analytics in this way can actually make them feel that they have lost even more control. Increasingly they choose the simpler option such as being a guest on a single purpose website rather than embracing a full service provider that they do not understand.
Target in the US had this experience when their data analytics went beyond the expectations of what was acceptable to their customers (see The Incredible Story Of How Target Exposed A Teen Girl’s Pregnancy).
In this age of Big Data, good data governance is an integral part of the customer experience. We are surrounded by more and more things happening that go beyond our expectation. These things can seem to happen as if by magic and lead us to a feeling of losing control in our interactions with businesses.
Just as there is a trend to open factories to the public to see how things are made, we should do the same in our intellectual pursuits. As experts in our respective fields, we need to be able to not only achieve an outcome but also demonstrate how we got there.
I explained last month how frustrating it is when customer data isn’t used (see Don’t seek to know everything about your customer). Good governance should seek to simplify and explain how both business processes and the associated data work and are applied.
The pressure for “forget me” legislation and better handling of data breaches will be alleviated by transparency. Even better, customers will enjoy using services that they understand.
Regardless of company size, Bring-Your-Own-Device (BYOD) has become quite popular. According to Gartner, half of employers surveyed say they’re going to require workers to supply their own devices at work by 2017. Spiceworks did a similar study, finding about 61% of small to medium sized businesses were using a BYOD policy for employee devices. Businesses of all sizes are taking BYOD seriously, but are there differences in how large and small companies handle their policies?
Gaining experience is important in learning how to implement and manage a mobile device policy. Small companies are increasingly supporting smartphones and tablets. Companies with fewer than 20 employees are leading – Spiceworks says 69% in a survey are supportive. By comparison, 16% of employers with more than 250 employees were as enthusiastic.
According to this study, small companies appear to be more flexible in adopting BYOD. There are certain aspects, however, where they may lag behind their larger counterparts. Here are some examples.
Mobile Device Management
Larger corporations often have more resources available to implement Mobile Device Management (MDM) systems. For example, Spiceworks said 56% of respondents were not planning to use MDM, mainly because their company does not see a big enough threat. Lost or stolen devices, or misuse by employees, are seen as substantial risks. On the other hand, 17% of the responding small businesses were engaging in active management and just 20% said they would within six months.
The perks of MDM include barriers against data theft, intrusion, and unauthorized use and access. It also helps prevent malware infections.
Larger businesses seem to be more understanding of the need for a proactive MDM system. They tend to possess more knowledge of the technology and the risks and face fewer budgetary hurdles. By comparison, many small companies lack knowledge, funds, and insight into the risks of connecting mobile devices to their network. Cloud-based MDM solutions are a growing alternative. The same Spiceworks study found 53% of respondents were going with a hosted device management solution.
The risks are clearly great for a company of any size. A BYOD policy can push both revenue and the cost of managing risk into the millions of dollars. Corporations usually have multiple layers of security. For a small business, it doesn’t take much to bring the company down. One single cyber-attack can be so costly the company won’t be able to survive.
Security, and the training that goes along with it, is costly for a small company. It might not be able to afford any of the tools necessary for adequate protection. Even if a company was going for savings, data breaches will make these seem like pennies. Such events can cause millions of dollars in damages for even the smallest businesses.
Data leakage is another security risk, besides cost. Mobile devices are prone to data theft without a good MDM system. Gartner highlights the fact that mobile devices are designed to support data sharing, but lack file systems for applications. This makes it easier for data to be duplicated and sent to applications in the cloud. It is up to IT to be up on the latest technologies and uses. Obviously, larger companies have the upper hand in this area as they have a better security posture.
Both large and small companies are using BYOD. The differences lie in the willingness to adopt comprehensive Mobile Device Management systems and security policies. These come with the obvious costs which smaller businesses must wrestle with. It often comes down to comparing the daily policy operating costs with those of the risks. When a breach happens, for example, a small business feels the pain and wishes it had had the right system in place. Cloud MDM systems are becoming more affordable. These are providing smaller entities with the resources of larger organizations. Only time will tell whether small and medium sized businesses will become as accepting of mobile device security and management as larger organizations.
Each generation over the last century has seen new technologies that become so embedded in their lives that their absence would be unimaginable. Early in the 20th century it was radio, which quickly became the entertainment of choice, then television, video and over the past two decades it has been the Internet.
For the generation who straddles the implementation of each, there have been format and governance debates which are quickly forgotten. Today, few remember the colour television format choice every country made between NTSC and PAL just as anyone who bought a video recorder in the early 1980s had to choose between VHS and Beta.
It is ironic that arguably the biggest of these technologies, the Internet, has been the subject of the least debate on the approach to governance, standards and implementation technology.
Just imagine a world where the Internet hadn’t evolved in the way it did. Arguably the connectivity that underpins the Internet was inevitable. However, the decision to arbitrarily open up an academic network to commercial applications undermined well-progressed private sector offerings such as AOL and Microsoft’s MSN.
That decision changed everything and I think it was a mistake.
While the private sector offerings were fragmented, they were well governed and had responsible owners.
Early proponents of the Internet dreamed of a virtual world free of any government constraints. Perhaps they were influenced by the end of the Cold War. Perhaps they were idealists. Either way, the dream of a virtual utopia has turned into an online nightmare which every parent knows isn’t safe for their children.
Free or unregulated?
The perception that the Internet is somehow free, in a way that traditional communications and sources of information are not, is misguided.
Librarians have long had extraordinary codes of conduct to protect the identity of borrowers from government eyes. Compare that to the obligation in many countries to track metadata and the access that police, security agencies and courts have to the online search history of suspects.
Telephone networks have always been open to tapping, but the closed nature of the architecture meant that those points are governed and largely under the supervision of governments and courts. Compare that to the Internet which does theoretically allow individuals to communicate confidentially with little chance of interception but only if you are one of the privileged few with adequate technical skill. The majority of people, though, have to just assume that every communication, voice, text or video is open to intercept.
Time for regulation
We need government in the real world and we should look for it on the Internet.
The fact that it is dangerous to connect devices directly to the internet without firewalls and virus protection is a failure of every one of us who is involved in the technology profession. The impact of the unregulated Internet on our children and the most vulnerable in our society reflects poorly on our whole generation.
It is time for the Internet to be properly regulated. There is just too much risk and (poor) regulation is being put in place by stealth anyway. Proper regulation and security would add a layer of protection for all users. It wouldn’t remove all risk; even the humble telephone has long been used as a vehicle for scams. However, with the telephone, remedies have been easier to achieve and law enforcement more structured.
The ideal of the Internet as a vehicle of free expression need not be lost and in fact can be enhanced by ethically motivated governance with the principle of free speech at its core.
Net neutrality is a myth
Strengthening the argument for regulation is the reality of the technology behind the Internet. Most users assume the Internet is a genuinely flat virtual universe where everyone is equal. In reality, the technology of the Internet is nowhere near the hyperbole. Net neutrality is a myth and we are still very dependent on what the Internet Service Providers (ISPs) or telecommunications companies do from an architecture perspective (see The architecture after cloud).
Because the Internet is not neutral, there are winners and losers just as there are in the real world. The lack of regulation means that they come up with their own deals and it is simply too complicated for consumers to work out what it all means for them.
Regulation can solve the big issues
The absence of real government regulation of the Internet is resulting in a “Wild West” and an almost vigilante response. There is every probability that current encryption techniques will be cracked in years to come, making it dangerous to transmit information that could be embarrassing in the future. This is leading to investment in approaches such as quantum cryptography.
In fact, with government regulation and support, mathematically secure communication is eminently possible. Crypto theory says that a truly random key that is as long as the message being sent cannot be broken without a copy of the key. Imagine a world where telecommunication providers working under appropriate regulations issued physical media similar to passports containing sufficient random digital keys to transmit all of the sensitive information a household would share in a year or even a decade.
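The scheme described here is usually called a one-time pad, and its mechanics are small enough to sketch. The example below is illustrative only and rests on some assumptions: the function names are invented, and Python's `secrets` module draws from the operating system's cryptographic random source, which stands in for the "truly random" key the theory requires. The guarantee also depends on each key being used once and then destroyed, which is the hard part in practice and the reason the physical distribution of keys matters.

```python
import secrets

def make_key(length: int) -> bytes:
    """Draw `length` random bytes from the OS cryptographic source."""
    return secrets.token_bytes(length)

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length; the same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = "Meet at noon".encode("utf-8")
key = make_key(len(message))           # assumption: used once, then discarded
ciphertext = xor_bytes(message, key)   # what would travel over the network
recovered = xor_bytes(ciphertext, key) # only a holder of the key can do this
print(recovered.decode("utf-8"))       # prints "Meet at noon"
```

Without the key, every message of the same length is an equally plausible decryption of the ciphertext, which is why the length of the key, rather than the cleverness of the algorithm, carries the security.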
We would effectively be returning to the model of traditional phone services where telecommunication companies managed the confidentiality of the transmission and government agencies could tap the conversations with appropriate (and properly regulated) court supervision.
Similarly, we would be mirroring the existing television and film model of rating all content on the Internet allowing us to choose what we want to bring into our homes and offices. Volume is no challenge with an army of volunteers out there to help regulators.
Any jurisdiction can start
Proper regulation of the internet does not need to wait for international consensus. Any one country can kick things off with almost immediate benefit. As soon as sufficient content is brought into line, residents of that country will show more trust towards local providers which will naturally keep a larger share of commerce within their domestic economy.
If it is a moderately large economy then the lure of frictionless access to these consumers will encourage international content providers to also fall into line given the cost of compliance is likely to be negligible. As soon as that happens, international consumers will see the advantage of using this country’s standards as a proxy for trust.
Very quickly it is also likely that formal regulation in one country will be leveraged by governments in others. The first mover might even create a home-grown industry of regulation as well as supporting processes and technology for export!
In a previous post, I discussed some data quality and data governance issues associated with open data. In his recent blog post How far can we trust open data?, Owen Boswarva raised several good points about open data.
“The trustworthiness of open data,” Boswarva explained, “depends on the particulars of the individual dataset and publisher. Some open data is robust, and some is rubbish. That doesn’t mean there’s anything wrong with open data as a concept. The same broad statement can be made about data that is available only on commercial terms. But there is a risk attached to open data that does not usually attach to commercial data.”
Data quality, third-party rights, and personal data were three grey areas Boswarva discussed. Although his post focused on a specific open dataset published by an agency of the government of the United Kingdom (UK), his points are generally applicable to all open data.
As Boswarva remarked, the quality of a lot of open data is high even though there is no motivation to incur the financial cost of verifying the quality of data being given away for free. The “publish early even if imperfect” principle also encourages a laxer data quality standard for open data. However, “the silver lining for quality-assurance of open data,” Boswarva explained, is that “open licenses maximize re-use, which means more users and re-users, which increases the likelihood that errors will be detected and reported back to the publisher.”
The issue of third-party rights raised by Boswarva was one that I had never considered. His example was the use of a paid third-party provider to validate and enrich postal address data before it is released as part of an open dataset. Therefore, consumers of the open dataset benefit from postal validation and enrichment without paying for it. While the UK third-party providers in this example acquiesced to open re-use of their derived data because their rights were made clear to re-users (i.e., open data consumers), Boswarva pointed out that re-users should be aware that using open data doesn’t provide any protection from third-party liability and, more importantly, doesn’t create any obligation on open data publishers to make sure re-users are aware of any such potential liability. While, again, this is a UK example, that caution should be considered applicable to all open data in all countries.
As for personal data, Boswarva noted that while open datasets are almost invariably non-personal data, “publishers may not realize that their datasets contain personal data, or that analysis of a public release can expose information about individuals.” The example in his post centered on the postal addresses of property owners, which without the names of the owners included in the dataset, are not technically personal data. However, it is easy to cross-reference this with other open datasets to assemble a lot of personally identifiable information that if it were contained in one dataset would be considered a data protection violation (at least in the UK).
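How little effort such cross-referencing takes can be shown with a minimal sketch. Both datasets below are hypothetical and every value is invented; the only assumption that matters is that each one carries the same address field, which then acts as a join key.

```python
# Hypothetical open dataset 1: property records with no owner names.
property_records = [
    {"address": "1 Example Street", "sale_price": 250_000},
    {"address": "2 Example Street", "sale_price": 410_000},
]

# Hypothetical open dataset 2: a public register listing names by address.
public_register = [
    {"address": "1 Example Street", "name": "A. Person"},
    {"address": "3 Example Street", "name": "B. Person"},
]

def link_on_address(left, right):
    """Join two lists of records on their shared 'address' field."""
    by_address = {row["address"]: row for row in right}
    return [
        {**row, **by_address[row["address"]]}
        for row in left
        if row["address"] in by_address
    ]

linked = link_on_address(property_records, public_register)
print(linked)
```

Neither input pairs a name with a sale price, but the linked record for 1 Example Street does, which is the kind of combined record that would raise data protection concerns if it were published as a single dataset.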
One of my favorite books is SuperFreakonomics by economist Steven Levitt and journalist Stephen Dubner, in which, as with their first book and podcast, they challenge conventional thinking on a variety of topics, often revealing counterintuitive insights about how the world works.
One of the many examples from the book is their analysis of the Endangered Species Act (ESA) passed by the United States in 1973 with the intention to protect critically imperiled species from extinction.
Levitt and Dubner argued the ESA could, in fact, be endangering more species than it protects. After a species is designated as endangered, the next step is to designate the geographic areas considered critical habitats for that species. After an initial set of boundaries is made, public hearings are held, allowing time for developers, environmentalists, and others to have their say. The process to finalize the critical habitats can take months or even years. This lag time creates a strong incentive for landowners within the initial geographic boundaries to act before their property is declared a critical habitat or out of concern that it could attract endangered species. Trees are cut down to make their land less hospitable or development projects are fast-tracked before ESA regulation would prevent them. This often has the unintended consequence of hastening the destruction of more critical habitats and expediting the extinction of more endangered species.
This made me wonder whether data governance could be endangering more data than it protects.
After a newly launched data governance program designates the data that must be governed, the next step is to define the policies and procedures that will have to be implemented. A series of meetings are held, allowing time for stakeholders across the organization to have their say. The process to finalize the policies and procedures can take weeks or even months. This lag time provides an opportunity for developing ways to work around data governance processes once they are in place, or ways to simply not report issues. Either way this can create the facade that data is governed when, in fact, it remains endangered.
Just as it’s easy to make the argument that endangered species should be saved, it’s easy to make the argument that data should be governed. Success is a more difficult argument. While the ESA has listed over 2,000 endangered species, only 28 have been delisted due to recovery. That’s a success rate of only about one percent. While the success rate of data governance is hopefully higher, as Loraine Lawson recently blogged, a lot of people don’t know if their data governance program is on the right track or not. And that fact in itself might be endangering data more than not governing data at all.
Stop reading now if your organisation is easier to navigate today than it was 3, 5 or 10 years ago. The reality that most of us face is that the general ledger that might have cost $100,000 to implement twenty or so years ago will now cost $1 million or even $10 million. Just as importantly, it is getting harder to implement new products, services or systems.
The cause of this unsustainable business malaise is the complexity of the technology we have chosen to implement.
For the general ledger it is the myriad of interfaces. For financial services products it is the number of systems that need to keep a record of every aspect of business activity. For telecommunications it is the bringing together of the OSS and BSS layers of the enterprise. Every function and industry has its own good reasons for the added complexity.
However good the reasons, the result is that it is generally easier to innovate in a small nimble enterprise, even a start-up, than in the big corporates that are the powerhouse of our economies.
While so much of the technology platform creates efficiencies, often enormous and essential to the productivity of the enterprise, it generally doesn’t support or even permit rapid change. It is really hard to design the capacity to change into the systems that support the organisation. The more complex an environment becomes the harder it is to implement change.
Most organisations recognise the impact of complexity and try to reduce it by implementing an enterprise architecture in one form or another. Supporting the architecture is a set of principles which, if implemented in full, will support consistency and dramatically reduce the cost of change. Despite the best will in the world, few businesses or governments succeed in realising their lofty architectural principles.
The reason is that, while architecture is seen as the solution, it is too hard to implement. Most IT organisations run their business through a book of projects. Each project signs up to an architecture but quickly implements compromises as challenges arise.
It’s no wonder that architects are perhaps the most frustrated of IT professionals. At the start of each project they get wide commitment to the principles they espouse. As deadlines loom, and the scope evolves, project teams make compromises. While each compromise may appear justified, together they have the cumulative effect of making the organisation more rather than less complex.
Complexity has a cost. If this cost is fully appreciated, the smart organisation can see the value in investing in simplification.
While architects have a clear vision of what “simple” looks like, they often have a hard time putting a measure against it. It is this lack of a measure that makes the economics of technology complexity hard to manage.
Increasingly though, technologists are realising that it is in the fragmentation of data across the enterprise that real complexity lies. Even when there are many interacting components, if there is a simple relationship between core information concepts then the architecture is generally simple to manage.
Simplicity can be achieved through decommissioning (see Value of decommissioning legacy systems) or by reducing the duplication of data. This can be measured using the Small Worlds measure as described in MIKE2.0 or chapter 5 of my book Information-Driven Business. The idea is further extended as “Hillard’s Graph Complexity” in Michael Blaha’s book, UML Database Modelling Workbook.
In summary, the measure looks at how many steps are required to bring together key concepts such as customer, product and staff. The more fragmented information is, the more difficult any business change or product implementation becomes.
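To make the idea concrete, here is a minimal sketch in Python. It assumes the data model can be drawn as an undirected graph whose nodes are entities and whose edges are relationships, and it averages the fewest hops needed to connect each pair of key concepts. The two toy models, their entity names and the function names are all hypothetical illustrations, and the calculation is a simplification rather than the exact formulation given in MIKE2.0 or Information-Driven Business.

```python
from collections import deque
from itertools import combinations


def steps_between(graph, start, goal):
    """Fewest relationship hops from start to goal (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # the two concepts are not connected at all


def average_separation(graph, key_concepts):
    """Mean number of steps needed to join each pair of key concepts."""
    distances = [steps_between(graph, a, b)
                 for a, b in combinations(key_concepts, 2)]
    if any(d is None for d in distances):
        raise ValueError("some key concepts cannot be joined")
    return sum(distances) / len(distances)


KEY_CONCEPTS = ("customer", "product", "staff")

# Hypothetical model: every key concept hangs off a single order entity.
simple_model = {
    "customer": ["order"],
    "product": ["order"],
    "staff": ["order"],
    "order": ["customer", "product", "staff"],
}

# Hypothetical model: the same concepts scattered across several systems.
fragmented_model = {
    "customer": ["crm_account"],
    "crm_account": ["customer", "billing_account", "hr_record"],
    "billing_account": ["crm_account", "invoice"],
    "invoice": ["billing_account", "product"],
    "product": ["invoice"],
    "hr_record": ["crm_account", "staff"],
    "staff": ["hr_record"],
}

simple_score = average_separation(simple_model, KEY_CONCEPTS)
fragmented_score = average_separation(fragmented_model, KEY_CONCEPTS)
```

In this toy example the hub-style model averages two steps between key concepts while the fragmented one averages four, which is the kind of difference the measure is meant to expose.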
Consider the general ledger discussed earlier. In its first implementation in the twentieth century, each key concept associated with the chart of accounts would have been managed in a master list. By the time we implement the same functionality today, there would be literally hundreds if not thousands of points where various parts of the chart of accounts are required to index interfaces to subsidiary systems across the enterprise.
One approach to realising these benefits is to have dedicated simplification projects. Unfortunately these are the first projects that get cut if short-term savings are needed.
Alternatively, imagine if every project that adds complexity (a little like adding pollution) needed to offset that complexity with equal and opposite “simplicity credits”. Having quantified complexity, architects are well placed to define whether each new project simplifies the enterprise or adds complexity.
Some projects simply have no choice but to add complexity. For example, a new marketing campaign system might have to add customer attributes. However, if they increase the complexity they should buy simplicity “offsets” a little like carbon credits.
The implementation of a new general ledger might provide a great opportunity to reduce complexity by bringing various interfaces together or it could add to it by increasing the sophistication of the chart of the accounts.
In some cases, a project may start off simplifying the enterprise by using enterprise workflow or leveraging a third-party cloud solution, but in the heat of implementation be forced to make compromises that make it a net complexity “polluter”.
The CIO has a role to act as the steward of the enterprise and measure this complexity. Project managers should not be allowed to forget their responsibility to leave the organisation cleaner and leaner at the conclusion of their project. They should include the cost of this in their project budget and purchase offsetting credits from others if they cannot deliver within the original scope due to complicating factors.
Those that are most impacted by complexity can pick their priority areas for funding. Early wins will likely reduce support costs and errors in customer service. Far from languishing in the backblocks of the portfolio, project managers will be queueing up to rid the organisation of many of these long-term annoyances to get the cheapest simplicity credits that they can find!
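The bookkeeping behind such a scheme could be as simple as the following sketch. It assumes each project's effect has already been expressed as a whole number of "complexity points" (positive when the project adds complexity, negative when it removes it). The class, the function names, the project names and the point values are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    complexity_delta: int   # positive adds complexity, negative removes it
    credits_bought: int = 0  # simplicity credits purchased from other projects


def credits_owed(project):
    """Complexity a project has added but not yet offset."""
    return max(0, project.complexity_delta - project.credits_bought)


def credits_earned(project):
    """Simplicity credits a project generates by reducing complexity."""
    return max(0, -project.complexity_delta)


def portfolio_net_complexity(projects):
    """Overall change in enterprise complexity across the book of projects."""
    return sum(p.complexity_delta for p in projects)


# Hypothetical portfolio: one "polluter" and one simplifier.
marketing = Project("marketing campaign system", complexity_delta=4,
                    credits_bought=4)
ledger = Project("general ledger consolidation", complexity_delta=-6)

net_change = portfolio_net_complexity([marketing, ledger])
```

Here the marketing project has fully offset the four points it added, the ledger project has earned six credits it can sell, and the portfolio as a whole leaves the enterprise two points simpler.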
Calls for increased transparency and accountability led government agencies around the world to make more information available to the public as open data. As more people accessed this information, it quickly became apparent that data quality and data governance issues complicate putting open data to use.
“It’s an open secret,” Joel Gurin wrote, “that a lot of government data is incomplete, inaccurate, or almost unusable. Some agencies, for instance, have pervasive problems in the geographic data they collect: if you try to map the factories the EPA regulates, you’ll see several pop up in China, the Pacific Ocean, or the middle of Boston Harbor.”
A common reason for such data quality issues in the United States government’s data is captured in what David Weinberger wrote about Data.gov: “The keepers of the site did not commit themselves to carefully checking all the data before it went live. Nor did they require agencies to come up with well-formulated standards for expressing that data. Instead, it was all just shoveled into the site. Had the site keepers insisted on curating the data, deleting that which was unreliable or judged to be of little value, Data.gov would have become one of those projects that each administration kicks further down the road and never gets done.”
Of course, the United States is not alone in either making government data open (about 60 countries have joined the Open Government Partnership) or having it reveal data quality issues. Victoria Lemieux recently blogged about data issues hindering the United Kingdom government’s Open Data program in her post Why we’re failing to get the most out of open data.
One of the data governance issues Lemieux highlighted was data provenance. “Knowing where data originates and by what means it has been disclosed,” Lemieux explained, “is key to being able to trust data. If end users do not trust data, they are unlikely to believe they can rely upon the information for accountability purposes.” Lemieux explained that determining data provenance can be difficult since “it entails a good deal of effort undertaking such activities as enriching data with metadata, such as the date of creation, the creator of the data, who has had access to the data over time. Full comprehension of data relies on the ability to trace its origins. Without knowledge of data provenance, it can be difficult to interpret the meaning of terms, acronyms, and measures that data creators may have taken for granted, but are much more difficult to decipher over time.”
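As a rough illustration of the metadata Lemieux lists, a provenance record might look something like the sketch below. The class, its field names, the dataset name and the dates are hypothetical, and a real open data catalogue would follow whatever metadata standard it has adopted rather than this simplified structure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ProvenanceRecord:
    dataset: str
    created_on: Optional[date] = None   # date of creation
    creator: Optional[str] = None       # the creator of the data
    # who has had access to the data over time, oldest first
    access_history: List[Tuple[str, date]] = field(default_factory=list)

    def record_access(self, who: str, when: date) -> None:
        """Append one access event to the history."""
        self.access_history.append((who, when))

    def gaps(self) -> List[str]:
        """Names of provenance fields that are still empty."""
        missing = []
        if self.created_on is None:
            missing.append("created_on")
        if not self.creator:
            missing.append("creator")
        if not self.access_history:
            missing.append("access_history")
        return missing


# Hypothetical dataset published without a named creator.
record = ProvenanceRecord(dataset="regulated_facilities",
                          created_on=date(2014, 1, 15))
record.record_access("publishing_team", date(2014, 2, 1))
unanswered = record.gaps()
```

A check like `gaps()` makes missing provenance visible before publication, when it is still cheap to ask the people who created the data.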
I think the bad press about open data is a good thing because open data is opening eyes to two basic facts about all data. One, whenever data is made available for review, you will discover data quality issues. Two, whenever data quality issues are discovered, you will need data governance to resolve them. Therefore, the reason we’re failing to get the most out of open data is the same reason we fail to get the most out of any data.
Anthropologist Robin Dunbar has used his research in primates over recent decades to argue that there is a cognitive limit to the number of social relationships that an individual can maintain and hence a natural limit to the breadth of their social group. In humans, he has proposed that this number is 150, the so-called “Dunbar’s number”.
In the modern organisation, relationships are maintained using data. It doesn’t matter whether it is the relationship between staff and their customers, tracking vendor contracts, the allocation of products to sales teams or any other of the literally thousands of relationships that exist, they are all recorded centrally and tracked through the data that they throw off.
Social structures have evolved over thousands of years using data to deal with the inability of groups of more than 150 to effectively align. One of the best examples of this is the 11th century Domesday Book ordered by William the Conqueror. Fast forward to the 21st century and technology has allowed the alignment of businesses and even whole societies in ways that were unimaginable 50 years ago.
Just as a leadership team needs to have a group of people that they relate to that falls within the 150 of Dunbar’s number, they also need to rely on information which allows the management system to extend that span of control. For the average executive, and ultimately for the average executive leadership team, this means that they can really only keep a handle on 150 “aspects” of their business, reflected in 150 “key data elements”. These elements anchor data sets that define the organisation.
Key Data Elements
To overcome the constraints of Dunbar’s number, mid-twentieth century conglomerates relied on a hierarchy with delegated management decisions whereas most companies today have heavily centralised decision making which (mostly) delivers a substantial gain in productivity and more efficient allocation of capital. They can only do this because of the ability to share information efficiently through the introduction of information technology across all layers of the enterprise.
This sharing, though, is dependent on the ability of an executive to remember what data is important. The same constraint that stops the human brain from knowing more than 150 people also applies to the use of that information. It is reasonable to argue that the information flows have the same constraint as social relationships.
Across hundreds of organisations observed over many years, the variety of key data elements is wide but their number is consistently in the range of one to a few hundred. The number perhaps tops out at 500, but the majority of well-run organisations have nearer to 150 elements dimensioning their most important data sets.
While decisions are made through metrics, it is the most important key data elements that make up the measures and allow them to be dimensioned.
Although organisations have literally hundreds of thousands of different data elements they record, only a very small number are central to the running of the enterprise. Arguably, the centre can only keep track of about 150 and use them as the core for managing the business.
Another way of looking at this is that the leadership team (or even the CEO) can really only have 150 close relationships. If each relationship has one assigned data set or key data element they are responsible for then the overall organisation will have 150.
Choosing the right 150
While most organisations have around 150 key data elements that anchor their most important information, few actually know what they are. That’s a pity because the choice of 150 tells you a lot about the organisation. If the 150 don’t encompass the breadth of the enterprise then you can gain insight into what’s really important to the management team. If there is little to differentiate the key data elements from those that a competitor might choose then the company may lack a clear point of difference and be overly dependent on operational excellence or cost to gain an advantage.
Any information management initiative should start by identifying the 150 most important elements. If the team can’t narrow the set down below a few hundred, they should be suspicious that they haven’t got to the core of what’s really important to their sponsors. They should then ask whether these key data elements span the enterprise or pick organisational favourites; whether they offer differentiation or are “me too”; and whether they are easy or hard for a competitor to emulate.
The identification of the 150 key data elements provides a powerful foundation for any information and business strategy, enabling a discussion on how the organisation is led and managed. While processes evolve quickly, the information flows persist. Understanding the 150 allows a strategist to determine whether the business is living up to its strategy or if its strategy needs to be adjusted to reflect the business’s strengths.