Posts Tagged ‘books’
One of my favorite books is SuperFreakonomics by economist Steven Levitt and journalist Stephen Dubner, in which, as with their first book and podcast, they challenge conventional thinking on a variety of topics, often revealing counterintuitive insights about how the world works.
One of the many examples from the book is their analysis of the Endangered Species Act (ESA) passed by the United States in 1973 with the intention to protect critically imperiled species from extinction.
Levitt and Dubner argued the ESA could, in fact, be endangering more species than it protects. After a species is designated as endangered, the next step is to designate the geographic areas considered critical habitats for that species. After an initial set of boundaries is made, public hearings are held, allowing time for developers, environmentalists, and others to have their say. The process to finalize the critical habitats can take months or even years. This lag time creates a strong incentive for landowners within the initial geographic boundaries to act before their property is declared a critical habitat or out of concern that it could attract endangered species. Trees are cut down to make their land less hospitable or development projects are fast-tracked before ESA regulation would prevent them. This often has the unintended consequence of hastening the destruction of more critical habitats and expediting the extinction of more endangered species.
This made me wonder whether data governance could be endangering more data than it protects.
After a newly launched data governance program designates the data that must be governed, the next step is to define the policies and procedures that will have to be implemented. A series of meetings is held, allowing time for stakeholders across the organization to have their say. The process to finalize the policies and procedures can take weeks or even months. This lag time provides an opportunity for developing ways to work around data governance processes once they are in place, or ways to simply not report issues. Either way, this can create the facade that data is governed when, in fact, it remains endangered.
Just as it’s easy to make the argument that endangered species should be saved, it’s easy to make the argument that data should be governed. Success is a more difficult argument. While the ESA has listed over 2,000 endangered species, only 28 have been delisted due to recovery. That’s a success rate of roughly one percent. While the success rate of data governance is hopefully higher, as Loraine Lawson recently blogged, a lot of people don’t know if their data governance program is on the right track or not. And that fact in itself might be endangering data more than not governing data at all.
Collaboration is often cited as a key success factor in many enterprise information management initiatives, such as metadata management, data quality improvement, master data management, and information governance. Yet it’s often difficult to engage individual contributors in these efforts because everyone is busy and time is a zero-sum game. However, a successful collaboration needn’t require a major time commitment from all contributors.
While a small core group of people must be assigned as full-time contributors to enterprise information management initiatives, success hinges on a large extended group of people making what Clive Thompson calls micro-contributions. In his book Smarter Than You Think: How Technology is Changing Our Minds for the Better, he explained that “though each micro-contribution is a small grain of sand, when you get thousands or millions you quickly build a beach. Micro-contributions also diversify the knowledge pool. If anyone who’s interested can briefly help out, almost everyone does, and soon the project is tapping into broad expertise.”
Wikipedia is a great example since anyone can click on the edit tab of an article and become a contributor. “The most common edit on Wikipedia,” Thompson explained, “is someone changing a word or phrase: a teensy contribution, truly a grain of sand. Yet Wikipedia also relies on a small core of heavily involved contributors. Indeed, if you look at the number of really active contributors, the ones who make more than a hundred edits a month, there are not quite thirty-five hundred. If you drill down to the really committed folks—the administrators who deal with vandalism, among other things—there are only six or seven hundred active ones. Wikipedia contributions form a classic long-tail distribution, with a small passionate bunch at one end, followed by a line of grain-of-sand contributors that fades off over the horizon. These hardcore and lightweight contributors form a symbiotic whole. Without the micro-contributors, Wikipedia wouldn’t have grown as quickly, and it would have a much more narrow knowledge base.”
MIKE2.0 is another great example since it’s a collaborative community of information management professionals contributing their knowledge and experience. While MIKE2.0 has a small group of core contributors, micro-contributions improve the breadth and depth of its open source delivery framework for enterprise information management.
The business, data, and technical knowledge about the end-to-end process of how information is being developed and used within your organization is not known by any one individual. It is spread throughout your enterprise. A collaborative effort is needed to make sure that important details are not missed—details that determine the success or failure of your enterprise information management initiative. Therefore, be sure to tap into the distributed knowledge of your enterprise by enabling and encouraging micro-contributions. Micro-contributions form a collaboration macro. Just as a computer macro is comprised of a set of instructions that are used to collectively perform a particular task, think of collaboration as a macro that is comprised of a set of micro-contributions that collectively manage your enterprise information.
What is more important, the process or its outcome? Information management processes, like those described by the MIKE2.0 Methodology, drive the daily operations of an organization’s business functions as well as support the tactical and strategic decision-making processes of its business leaders. However, an organization’s success or failure is usually measured by the outcomes produced by those processes.
As Duncan Watts explained in his book Everything Is Obvious: How Common Sense Fails Us, “rather than the evaluation of the outcome being determined by the quality of the process that led to it, it is the observed nature of the outcome that determines how we evaluate the process.” This is known as outcome bias.
While an organization is enjoying positive outcomes, such as exceeding its revenue goals for the current fiscal period, outcome bias bathes processes in a rose-colored glow. The assumption is that information management processes must be providing high-quality data to decision-making processes, which business leaders are using to make good decisions. However, when an organization is suffering from negative outcomes, such as a regulatory compliance failure, outcome bias blames it on broken information management processes and poor data quality that lead to bad decision-making.
“Judging the merit of a decision can never be done simply by looking at the outcome,” explained Jeffrey Ma in his book The House Advantage: Playing the Odds to Win Big In Business. “A poor result does not necessarily mean a poor decision. Likewise a good result does not necessarily mean a good decision.”
“We are prone to blame decision makers for good decisions that worked out badly and to give them too little credit for successful moves that appear obvious after the fact,” explained Daniel Kahneman in his book Thinking, Fast and Slow.
While risk mitigation is an oft-cited business justification for investing in information management, Kahneman also noted how outcome bias can “bring undeserved rewards to irresponsible risk seekers, such as a general or an entrepreneur who took a crazy gamble and won. Leaders who have been lucky are never punished for having taken too much risk. Instead, they are believed to have had the flair and foresight to anticipate success. A few lucky gambles can crown a reckless leader with a halo of prescience and boldness.”
Outcome bias triggers overreactions to both success and failure. Organizations that try to reverse engineer a single, successful outcome into a formal, repeatable process often fail, much to their surprise. Organizations also tend to abandon a new process immediately if its first outcome is a failure. “Over time,” Ma explained, “if one makes good, quality decisions, one will generally receive better outcomes, but it takes a large sample set to prove this.”
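Ma’s point about sample size can be illustrated with a small simulation. This is only a sketch under assumptions that are mine, not Ma’s: a hypothetical good process whose decisions work out 60% of the time, a hypothetical bad process whose decisions work out 40% of the time, and the helper names success_rate and good_looks_better. It estimates how often the good process actually shows the better track record after a given number of decisions.

```python
import random


def success_rate(p_success: float, n_decisions: int, rng: random.Random) -> float:
    """Observed fraction of good outcomes when each decision succeeds with p_success."""
    wins = sum(rng.random() < p_success for _ in range(n_decisions))
    return wins / n_decisions


def good_looks_better(n_decisions: int, trials: int, rng: random.Random,
                      p_good: float = 0.6, p_bad: float = 0.4) -> float:
    """Share of trials in which the good process has a strictly better track record.

    p_good and p_bad are hypothetical success probabilities chosen for illustration.
    """
    better = 0
    for _ in range(trials):
        good = success_rate(p_good, n_decisions, rng)
        bad = success_rate(p_bad, n_decisions, rng)
        if good > bad:
            better += 1
    return better / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
after_one = good_looks_better(n_decisions=1, trials=2000, rng=rng)
after_many = good_looks_better(n_decisions=200, trials=2000, rng=rng)

print(f"good process looks better after 1 decision:    {after_one:.0%} of the time")
print(f"good process looks better after 200 decisions: {after_many:.0%} of the time")
```

Judged on a single outcome, the good process comes out ahead only a minority of the time in this setup, because it needs its own decision to succeed and the other one to fail. Judged on a couple hundred outcomes, it comes out ahead almost always.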
Your organization needs solid processes governing how information is created, managed, presented, and used in decision-making. Your organization also needs to guard against outcomes biasing your evaluation of those processes.
In order to overcome outcome bias, Watts recommended we “bear in mind that a good plan can fail while a bad plan can succeed—just by random chance—and therefore judge the plan on its own merits as well as the known outcome.”
In his book Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation, Joel Gurin explained a type of Open Data called Smart Disclosure, which was defined as “the timely release of complex information and data in standardized, machine-readable formats in ways that enable consumers to make informed decisions.”
As Gurin explained, “Smart Disclosure combines government data, company information about products and services, and data about an individual’s own needs to help consumers make personalized decisions. Since few people are database experts, most will use this Open Data through an intermediary—a choice engine that integrates the data and helps people filter it by what’s important to them, much the way travel sites do for airline and hotel booking. These choice engines can tailor the options to fit an individual’s circumstances, budget, and priorities.”
Remember (if you are old enough) what it was like to make travel arrangements before websites like Expedia, Orbitz, Travelocity, Priceline, and Kayak existed, and you can imagine the immense consumer-driven business potential for applying Smart Disclosure and choice engines to every type of consumer decision.
“Smart Disclosure works best,” Gurin explained, “when it brings together data about the services a company offers with data about the individual consumer. Smart Disclosure includes giving consumers data about themselves—such as their medical records, cellphone charges, or patterns of energy use—so they can choose the products and services uniquely suited to their needs. This is Open Data in a special sense: it’s open only to the individual whom the data is about and has to be released to each person under secure conditions by the company or government agency that holds the data. It’s essential that these organizations take special care to be sure the data is not seen by anyone else. Many people may balk at the idea of having their personal data released in a digital form. But if the data is kept private and secure, giving personal data back to individuals is one of the most powerful aspects of Smart Disclosure.”
Although it sounds like a paradox, the best way to secure our personal data may be to make it open. Currently most of our own personal data is closed—especially to us, which is the real paradox.
Some of our personal data is claimed as proprietary information by the companies we do business with. Data about our health is cloaked by government regulations intended to protect it, but which mostly protect doctors from getting sued while giving medical service providers and health insurance companies more access to our medical history than we have.
If all of our personal data were open to us, and we controlled the authorization of secure access to it, our personal data would be both open and secure. This would simultaneously protect our privacy and improve our choice as consumers.
In his book Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation, Joel Gurin explained that Open Data and Big Data are related but very different.
While various definitions exist, Gurin noted that “all definitions of Open Data include two basic features: the data must be publicly available for anyone to use, and it must be licensed in a way that allows for its reuse. Open Data should also be in a form that makes it relatively easy to use and analyze, although there are gradations of openness. And there’s general agreement that Open Data should be free of charge or cost just a minimal amount.”
“Big Data involves processing very large datasets to identify patterns and connections in the data,” Gurin explained. “It’s made possible by the incredible amount of data that is generated, accumulated, and analyzed every day with the help of ever-increasing computer power and ever-cheaper data storage. It uses the data exhaust that all of us leave behind through our daily lives. Our mobile phones’ GPS systems report back on our location as we drive; credit card purchase records show what we buy and where; Google searches are tracked; smart meters in our homes record our energy usage. All are grist for the Big Data mill.”
Private and Passive versus Public and Purposeful
Gurin explained that Big Data tends to be private and passive, whereas Open Data tends to be public and purposeful.
“Big Data usually comes from sources that passively generate data without purpose, without direction, or without even realizing that they’re creating it. And the companies and organizations that use Big Data usually keep the data private for business or security reasons. This includes the data that large retailers hold on customers’ buying habits, that hospitals hold about their patients, and that banks hold about their credit card holders.”
By contrast, Open Data “is consciously released in a way that anyone can access, analyze, and use as he or she sees fit. Open Data is also often released with a specific purpose in mind—whether the goal is to spur research and development, fuel new businesses, improve public health and safety, or achieve any number of other objectives.”
“While Big Data and Open Data each have important commercial uses, they are very different in philosophy, goals, and practice. For example, large companies may use Big Data to analyze customer databases and target their marketing to individual customers, while they use Open Data for market intelligence and brand building.”
Big and Open Data
Gurin also noted, however, that some of the most powerful results arise when Big Data and Open Data overlap.
“Some government agencies have made very large amounts of data open with major economic benefits. National weather data and GPS data are the most often-cited examples. U.S. census data and data collected by the Securities and Exchange Commission and the Department of Health and Human Services are others. And nongovernmental research has produced large amounts of data, particularly in biomedicine, that is now being shared openly to accelerate the pace of scientific discovery.”
Data Open for Business
Gurin addressed the apparent paradox of Open Data: “If Open Data is free, how can anyone build a business on it? The answer is that Open Data is the starting point, not the endpoint, in deriving value from information.” For example, even though weather and GPS data have been available for decades, those same Open Data starting points continue to spark new ideas, generating new, and profitable, endpoints.
While data privacy still requires sensitive data not be shared without consent and competitive differentiation still requires an organization’s intellectual property not be shared, that still leaves a vast amount of other data which, if made available as Open Data, will make more data open for business.
While data visualization often produces pretty pictures, as Phil Simon explained in his new book The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions, “data visualization should not be confused with art. Clarity, utility, and user-friendliness are paramount to any design aesthetic.” Bad data visualizations are even worse than bad art since, as Simon says, “they confuse people more than they convey information.”
Simon explained how data scientist Melinda Thielbar recommends using data visualization to help an analyst communicate with a nontechnical audience, as well as help the data communicate with the analyst.
“Visualization is a great way to let the data tell a story,” Thielbar explained. “It’s also a great way for analysts to fool themselves into believing the story they want to believe.” This is why she recommends developing the visualizations at the beginning of the analysis to allow the visualizations that really illustrate the story behind the data to stand out, a process she calls “building windows into the data.” When you look through a window, you may not like what you see.
“Data visualizations may include bad, suspect, duplicate, or incomplete data,” Simon explained. This can be a good thing, however, since data visualizations “can help users identify fishy information and purify data faster than manual hunting and pecking. Data quality is a continuum, not a binary. Use data visualization to improve data quality.” Even when you are looking at what appears to be the pretty end of the continuum, Simon cautioned that “just because data is visualized doesn’t necessarily mean that it is accurate, complete, or indicative of the right course of action.”
Especially when dealing with the volume aspect of big data, data visualization can help find outliers faster. While detailed analysis is needed to determine whether the outlier is a business insight or a data quality issue, data visualization can help you shake those needles out of the haystack and into a clear field of vision.
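To make the outlier idea concrete, here is a minimal sketch of the rule a box plot conventionally uses to draw points beyond its whiskers: anything more than 1.5 times the interquartile range outside the quartiles gets flagged. The function name flag_outliers and the order totals are hypothetical, made up for illustration, and the sketch only surfaces candidates. Deciding whether each one is an insight or a data quality issue still takes the detailed analysis.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the first or third quartile."""
    q1, _median, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily order totals with one suspicious entry.
order_totals = [102, 98, 105, 97, 101, 99, 103, 100, 96, 104, 1000]
suspects = flag_outliers(order_totals)
print(suspects)
```

The flagged value might be a record sales day or a misplaced decimal point. The picture, or in this case the rule behind the picture, only tells you where to look.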
Among its many other uses, which Simon illustrates well in his book, finding ugly data with pretty pictures is one way data visualization can be used for improving data quality.
In 1928, the physicist Paul Dirac, while attempting to describe the electron in quantum mechanical terms, posited the theoretical existence of the positron, a particle with all the electron’s properties but of opposite charge. In 1932, the experiments of physicist Carl David Anderson confirmed the positron’s existence, a discovery for which he was awarded the 1936 Nobel Prize in Physics.
“If you had asked Dirac or Anderson what the possible applications of their studies were,” Stuart Firestein wrote in his 2012 book Ignorance: How It Drives Science, “they would surely have said their research was aimed simply at understanding the fundamental nature of matter and energy in the universe and that applications were unlikely.”
Nonetheless, 40 years later a practical application of the positron became a part of one of the most important diagnostic and research instruments in modern medicine when, in the late 1970s, biophysicists and engineers developed the first positron emission tomography (PET) scanner.
“Of course, a great deal of additional research went into this as well,” Firestein explained, “but only part of it was directed specifically at making this machine. Methods of tomography, an imaging technique, some new chemistry to prepare solutions that would produce positrons, and advances in computer technology and programming—all of these led in the most indirect and fundamentally unpredictable ways to the PET scanner at your local hospital. The point is that this purpose could never have been imagined even by as clever a fellow as Paul Dirac.”
This story came to mind since it’s that time of year when we try to predict what will happen next year.
“We make prediction more difficult because our immediate tendency is to imagine the new thing doing an old job better,” explained Kevin Kelly in his 2010 book What Technology Wants. That is why the first cars were called horseless carriages and the first cellphones were called wireless telephones. But as cars advanced we imagined more than transportation without horses, and as cellphones advanced we imagined more than making phone calls without wires. The latest generation of cellphones is now called smartphones, and cellphone technology has become a part of a mobile computing platform.
IDC predicts 2014 will accelerate the IT transition to the emerging platform (what they call the 3rd Platform) for growth and innovation built on the technology pillars of mobile computing, cloud services, big data analytics, and social networking. IDC predicts the 3rd Platform will continue to expand beyond smartphones, tablets, and PCs to the Internet of Things.
Among its 2014 predictions, Gartner included the Internet of Everything, explaining how the Internet is expanding beyond PCs and mobile devices into enterprise assets such as field equipment, and consumer items such as cars and televisions. According to Gartner, the combination of data streams and services created by digitizing everything creates four basic usage models (Manage, Monetize, Operate, Extend) that can be applied to any of the four internets (People, Things, Information, Places).
These and other predictions for the new year point toward a convergence of emerging technologies, their continued disruption of longstanding business models, and the new business opportunities that they will create. While this is undoubtedly true, it’s also true that, much like the indirect and unpredictable paths that led to the PET scanner, emerging technologies will follow indirect and unpredictable paths to applications as far beyond our current imagination as a practical application of a positron was beyond the imagination of Dirac and Anderson.
“The predictability of most new things is very low,” Kelly cautioned. “William Sturgeon, the discoverer of electromagnetism, did not predict electric motors. Philo Farnsworth did not imagine the television culture that would burst forth from his cathode-ray tube. Advertisers at the beginning of the last century pitched the telephone as if it was simply a more convenient telegraph.”
“Technologies shift as they thrive,” Kelly concluded. “They are remade as they are used. They unleash second- and third-order consequences as they disseminate. And almost always, they bring completely unpredicted effects as they near ubiquity.”
It’s easy to predict that mobile, cloud, social, and big data analytical technologies will near ubiquity in 2014. However, the effects of their ubiquity may be fundamentally unpredictable. One unpredicted effect that we all became painfully aware of in 2013 was the surveillance culture that burst forth from our self-surrendered privacy, which now hangs the Data of Damocles wirelessly over our heads.
Of course, not all of the unpredictable effects will be negative. Much like the positive charge of the positron powering the positive effect that the PET scanner has had on healthcare, we should charge positively into the new year. Here’s to hoping that 2014 is a happy and healthy new year for us all.
I like it when I stumble across examples of information management concepts. While working on a podcast interview with William McKnight discussing his new book Information Management: Strategies for Gaining a Competitive Advantage with Data, I asked William for a song recommendation to play as background music while I read his bio during the opening segment of the podcast.
After William emailed me an Apple iTunes audio file for the song “Mother North” off of the 1996 album Nemesis Divina by Norwegian black metal band Satyricon, I ran into an issue when I attempted to play the song on my computer. That issue illustrates two points about the information security aspects of information governance:
- The need to establish a way to enforce information security so that only authorized users can access protected information. In this case, the protected information is a song purchased from the Apple iTunes store, where purchases are associated with both an Apple ID and the computer used to purchase it. This establishes an information security policy that is automatically enforced whenever the information is accessed. If a security violation is detected, in this case by attempting to play the song on another computer, the policy prevents the unauthorized access.
- Information security policies also have to allow for unexpected, but allowable, exceptions; otherwise, security becomes too restrictive and inconveniences the user. In this case, Apple iTunes allows a song to be played on up to 5 computers associated with the Apple ID used to purchase it. This is an excellent example of the need to combine portability and security by embedding a security policy as the information’s travel companion. Apple does not just prevent you from playing the song, but offers the ability to prove you are authorized to play it on another computer by entering your Apple ID and password.
The goal of information security is to protect information assets against intrusion or inappropriate access. Comprehensive security must not be limited to the system of origination but must travel with the information, especially as today’s mobile users need to access information from multiple devices.
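The two points above can be sketched in a few lines of code. To be clear, this is not how Apple implements it, which I have no knowledge of. It is a hypothetical illustration in which the class ProtectedItem, its fields, and the credentials_ok flag (standing in for a real password check) are all my own invention. The idea it shows is the policy traveling with the information: the item itself carries its owner, its device limit, and its list of authorized devices.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedItem:
    """A piece of information that carries its own access policy (hypothetical)."""
    title: str
    owner_id: str
    max_devices: int = 5
    authorized_devices: set = field(default_factory=set)

    def can_access(self, device_id: str) -> bool:
        """Enforce the policy: only authorized devices may access the item."""
        return device_id in self.authorized_devices

    def authorize(self, device_id: str, account_id: str, credentials_ok: bool) -> bool:
        """Allow the exception: the owner can add a device, up to the limit."""
        if account_id != self.owner_id or not credentials_ok:
            return False  # not the owner, or could not prove it
        if device_id in self.authorized_devices:
            return True  # already authorized
        if len(self.authorized_devices) >= self.max_devices:
            return False  # device limit reached
        self.authorized_devices.add(device_id)
        return True


song = ProtectedItem("Mother North", owner_id="owner-1",
                     authorized_devices={"purchase-laptop"})
print(song.can_access("my-desktop"))                    # blocked at first
print(song.authorize("my-desktop", "owner-1", True))    # owner proves authorization
print(song.can_access("my-desktop"))                    # now allowed
```

The design choice worth noticing is that the check lives on the item rather than on any one computer, so copying the file somewhere else does not leave the policy behind.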
Much like the hills are alive with the sound of music, make sure that your information governance policies are alive with the sound of sound information security, so that your organization’s information assets, easily accessible yet appropriately protected, are music to your users’ ears.
In Why Does E=mc²? (And Why Should We Care?), Brian Cox and Jeff Forshaw explained “the energy released in chemical reactions has been the primary source of power for our civilization since prehistoric times. The amount of energy that can be liberated for a given amount of coal, oil, or hydrogen is at the most fundamental level determined by the strength of the electromagnetic force, since it’s this force that determines the strength of the bonds between atoms and molecules that are broken and reformed in chemical reactions. However, there’s another force of nature that offers the potential to deliver vastly more energy for a given amount of fuel, simply because it’s much stronger.”
That other force of nature is the strong nuclear force, which is what powers nuclear fusion, meaning any process that releases energy as a result of fusing together two or more nuclei. “Deep inside the atom lies the nucleus—a bunch of protons and neutrons stuck together by the glue of the strong nuclear force. Being glued together, it takes effort to pull a nucleus apart and its mass is therefore smaller, not bigger, than the sum of the mass of its individual proton and neutron parts. In contrast to the energy released in a chemical reaction, which is a result of electromagnetic force, the strong nuclear force generates a huge binding energy. The energy released in a nuclear reaction is typically a million times the energy released in a chemical reaction.”
We often ignore the psychology of collaboration when we say that a collaborative team, working on initiatives such as information governance, is bigger than the sum of its individual contributors.
“The reason that fusion doesn’t happen all the time in our everyday experience,” Cox and Forshaw explained, “is that, because the strong force operates only over short distances, it only kicks in when the constituents are very close together. But it is not easy to push protons together to that distance because of their electromagnetic repulsion.”
Quite often the reason successful collaboration doesn’t happen is that the algebra of collaboration also requires the collaborators subtract something from the equation—their egos, which generate a strong ego-magnetic repulsion making it far from easy to bind the collaborative team together.
Cox and Forshaw explained it’s because of the equivalence of mass and energy that a loss of mass manifests itself as energy. If we jettison the mass of our egos when forming the bonds of collaboration, then we is smaller than the sum of its me parts, and that loss of me-mass will manifest itself as the we-energy we need to bind our collaborative teams together.
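For anyone who wants to see the physics half of the analogy worked out, here is a small calculation of my own (it is not taken from the book) for a helium-4 nucleus, using standard reference masses rounded to six decimal places. Hydrogen-1 atomic masses are used for the protons so that the electron masses cancel against those in the helium atom. The variable names are just labels I picked.

```python
# Standard reference values in atomic mass units (u), rounded.
HYDROGEN_1_MASS_U = 1.007825  # one proton plus one electron
NEUTRON_MASS_U = 1.008665
HELIUM_4_MASS_U = 4.002602    # two protons, two neutrons, two electrons
MEV_PER_U = 931.494           # energy equivalent of 1 u, from E = mc^2

# Mass of the separate parts versus mass of the bound whole.
parts_mass_u = 2 * HYDROGEN_1_MASS_U + 2 * NEUTRON_MASS_U
mass_defect_u = parts_mass_u - HELIUM_4_MASS_U

# The missing mass is the binding energy that was released.
binding_energy_mev = mass_defect_u * MEV_PER_U

print(f"sum of parts:   {parts_mass_u:.6f} u")
print(f"bound nucleus:  {HELIUM_4_MASS_U:.6f} u")
print(f"mass defect:    {mass_defect_u:.6f} u")
print(f"binding energy: {binding_energy_mev:.1f} MeV")
```

The whole really is lighter than the sum of its parts, by a bit under one percent, and that small loss of mass corresponds to roughly 28 million electron-volts of binding energy.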
During a podcast with Dr. Alexander Borek discussing his highly recommended new book Total Information Risk Management, he explained that “information is increasingly becoming an extremely valuable asset in organizations. The dependence on information increases as it becomes more valuable. As value and dependence increase, so does the likelihood of the risk that arises from not having the right information of the required quality for a business activity available at the right time.”
Borek referred to risk as the anti-value of information, explaining how the consequence of ineffective information management is poor data and information quality, which will lower business process performance and create operational, strategic, and opportunity risks in the business processes that are crucial to achieving an organization’s goals and objectives.
Information risk, however, as Borek explained, “also has a positive side to it: the opportunities that can be created. Your organization collects a lot of data every single day. Most of the data is stored in some kind of database and probably never used again. You should try to identify your hidden treasures in data and information. Getting it right can provide you with almost endless new opportunities, but getting it wrong not only makes you miss out on these opportunities, but also creates risks all over your business that prevent you from performing well.”
Since risk is the anti-value of information, Borek explained, “when you reduce risk, you create value, and you can use this value proposition to make the business case for information quality initiatives.”
While most business leaders will at least verbally acknowledge the value of information as an asset to the organization, few acknowledge the risk that negates the value of this asset when information management and governance are not a business priority. This means not just talking the talk about how information is an asset, but walking the walk by allocating the staffing and funding needed to truly manage and govern information as an asset, and to mitigate the risk of information becoming a liability.
Whether your organization’s information maturity is aware, reactive, proactive, managed, or optimal, you must remain vigilant about information management and governance. If you need to assess your organization’s information maturity levels, check out the MIKE2.0 Information Maturity QuickScan.