Archive for the ‘Business Intelligence’ Category
Organisations are more complex today than ever before, largely because of the ability that technology brings to support scale, centralisation and enterprise-wide integration. One of the unpleasant side effects of this complexity is that it can take too long to get decisions made.
With seemingly endless amounts of information available to management, the temptation to constantly look for additional data to support any decision is too great for most executives. This is even without the added fear of making the wrong decision and hence trying to avoid any decision at all.
While having the right information to support a choice is good, in a world of Big Data where there are almost endless angles that can be taken, it is very hard to rule a line under the data and say “enough is enough”.
Anyone who has ever studied statistics or science would know that the interpretation of results needs to be based on criteria that have been agreed before the data is collected. Imagine if the efficacy of a drug could be defined after the tests had been conducted: inevitably the result would be open to subjective interpretation.
The application of game mechanics to business is increasingly popular with products in the market supporting business activities such as sales, training and back-office processing.
Decision making, and the associated request for supporting data, is another opportunity to apply game mechanics. Perhaps a good game metaphor to use is volleyball.
As most readers will know, volleyball allows a team to hit a ball around their own side of the court in order to set it up for the best possible return to the opposition. However, each team is only allowed to have three hits before returning it over the net, focusing the team on getting it quickly to the best possible position.
Management decision making should be the same. The team should agree up-front on the best target position for the decision “ball” and make sure that everyone is agreed on the best techniques to get the decision there. The team should also agree on a reasonable maximum number of “hits” or queries to be allowed before a final “spike” or decision is made.
That might mean for job interviews there will be no more than three interviews. For an investment case there will be no more than two meetings and three queries for new data. For a credit application there will be no more than two queries for additional paperwork.
The most important aspect of improving the speed of decision making is the setting of these rules before the data for the decision is received. It is too tempting once three interviews with a job candidate have been completed to think that “just one more” will answer all the open questions. There will always be more data that could make an investment case appear more watertight. It is always easier to ask another question than to provide an outright yes or no to a credit application or interview candidate.
But simply setting rules doesn’t leverage the power of gamification. There needs to be a spirit of a shared goal. Everyone on the decision side of the court needs to be conscious that decision volleyball means that each time they are “hitting” the decision to the next person they are counting down to one final “spike” of a yes or no answer. The team are all scored regardless of whether they are the first to hit the “decision ball” or the last to make the “decision spike”.
Games, and management, are about more than a simple score. They are also about shared goals and an overarching narrative. The storyline needs to be compelling and keep participants engaged in between periods of intense action.
For management decisions, it is important that the context has an engaging goal that can be grasped by all. In the game of decision volleyball, this should include the challenge and narrative of the agile organisation. The objective is not just to make the right decision, but also along the way to set things up for the final decision maker to achieve the “spike” with the best chance of a decision which is both decisive and right.
The game of decision volleyball also has the opportunity to bring the information providers, such as the business intelligence team, into the story as well. Rather than simply providing data upon request, without any context, they should be engaged in understanding the narrative that the information sits in and how it will set the game up for that decisive “spike”.
Recently a W3C Candidate Recommendation caught my eye: the Data Cube Vocabulary, which claims to be both very general and useful for data sets such as survey data, spreadsheets and OLAP. The Vocabulary is based on SDMX, an ISO standard for exchanging and sharing statistical data and metadata among organizations, so there’s a strong international impetus towards consensus about this important piece of Internet 3.0.
Linked Open Data (LOD) is, as the W3C says, “an approach to publishing data on the web, enabling datasets to be linked together through references to common concepts.” This is an approach based on the W3C’s Resource Description Framework (RDF), of course. So the foundational ontology for actually implementing this worthwhile but grandiose LOD vision is likely to be this Data Cube Vocabulary, which very handily enables the exchange of semantically annotated HTML tables. Note that relational government data tables can now be published as “LOD data cubes” which are shareable with a public adhering to this new world-standard ontology.
But as the key logical layer in this fast-coming semantic web, Data Cubes may very well affect the way an enterprise ontology is designed. Start with the fact that Data Cubes are themselves built upon several more basic RDF-compliant ontologies:
The Data Cube Vocabulary says that every Data Cube is a Dataset that has a Dataset Definition (more a “one-off ontology” specification). Any dataset can have many Slices of a metaphorical, multi-dimensional “pie” of the dataset. Within the dataset itself and within each slice are unbounded masses of Observations. Each observation has values for not only the measured property itself but also any number of applicable key values. That’s all there is to it, right?
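To make the dataset, slice and observation relationship concrete, here is a minimal sketch in plain Python. It is not the vocabulary's RDF representation: the class names, the `slice` method and the population figures are all made up for illustration, and a real data cube would be expressed as RDF triples.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    """One cell of the cube: key values for every dimension plus one measured value."""
    dimensions: tuple  # sorted (dimension_name, value) pairs
    measure: float


@dataclass
class Dataset:
    """A toy stand-in for a data cube (hypothetical, not the W3C class itself)."""
    dimension_names: tuple
    measure_name: str
    observations: list = field(default_factory=list)

    def add(self, measure, **dims):
        # Every observation must supply a value for each declared dimension.
        if set(dims) != set(self.dimension_names):
            raise ValueError(f"expected dimensions {self.dimension_names}")
        self.observations.append(Observation(tuple(sorted(dims.items())), measure))

    def slice(self, **fixed):
        """Fix some dimensions and return the observations that share those key values."""
        return [
            obs for obs in self.observations
            if all(dict(obs.dimensions).get(name) == value for name, value in fixed.items())
        ]


# Made-up figures, purely illustrative.
cube = Dataset(dimension_names=("region", "year"), measure_name="population_thousands")
cube.add(10.0, region="north", year=2012)
cube.add(12.0, region="north", year=2013)
cube.add(20.0, region="south", year=2012)

north = cube.slice(region="north")
north_total = sum(obs.measure for obs in north)
```

Fixing `region` yields a slice that still varies by `year`, which is the sense in which a slice is one cut through the multi-dimensional “pie”.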
Think of an HTML table. A “data cube” is a “table” element, whose columns are “slices” and whose rows are “keys”. “Observations” are the values of the data cells. This is clear but now the fun starts with identifying the TEXTUAL VALUES that are within the data cells of a given table.
Here is where “Dataset Descriptions” come in. These are associated with an HTML table element, that is, with an LOD dataset. These describe all possible dataset dimension keys and the different kinds of properties that can be named in an Observation. Text attributes, measures, dimensions, and coded properties are all provided, and all are sub-properties of
This is why a Dataset Description is a “one-off ontology”, because it defines only text and numeric properties and, importantly, no classes of functional things. So with perfect pitch, the Data Cube Vocabulary virtually requires Enterprise Ontologies to ground their property hierarchy with the measure, dimension, code, and text attribute properties.
Data Cubes define just a handful of classes, like “Dataset”, “Slice”, “SliceKey” and “Observation”. How are these four classes best inserted into an enterprise’s ontology class hierarchy? “Observation” is easy: it should be the base class of all observable properties, that is, all and only textual properties. “SliceKey” is a Role that an observable property can play. A “Slice” is basically an annotated rdf:Bag, mediated by skos:Collection at times.
A “Dataset” is a hazy term applicable to anything classifiable as data objects or as data structures, that is, “a set of data” is merely an aggregate collection of data items just as a data object or data structure is. Accordingly, a Data Cube “dataset” class might be placed at or near the root of a class hierarchy, but it’s clearer to establish it as a subclass of an
There’s more to this topic saved for future entries — all those claimed dependencies need to be examined.
ITWorld recently ran a great article on the perils of data visualization. The piece covers a number of companies, including Carwoo, a startup that aims to make car buying easier. The company has been using dataviz tool Chartio for a few months. From the article:
Around a year ago, Rimas Silkaitis, a product manager at Carwoo, started looking for a better way to handle the many requests for data visualizations that his co-workers were making.
He looked at higher end products, like those from GoodData and Microstrategy. “Then I realized, hey, we’re a startup, we don’t have that kind of money,” he said. “That’s when we found Chartio.”
Now, most of the 40-person company–except sales and customer service, which have their own tools–have access to Chartio.
Silkaitis said he worries a bit about users misinterpreting data and creating bad visualizations, but he’s implemented procedures that seem to be working so far.
It starts with new hires. “Anybody that comes on new to the company, I sit them down and walk them through our data model and give them a tutorial on how Chartio works,” he said.
There are several key lessons in this piece related to intelligent data management, dataviz, and Big Data. Let’s review them.
DataViz Is Easier Than Ever
Over the last ten years, we have seen a proliferation of easy-to-use tools in many areas, and dataviz is no exception. Today, one need not be a coder or work in the IT department to build powerful, interactive data visualizations. Dragging and dropping and slicing and dicing are more prevalent than ever. Chartio is just one of dozens or hundreds of user-friendly applications that can make data come to life.
DataViz Can Be Abused
Often we look at visual representations of data and the required decision or trend seems obvious. But is it? Is the data or the dataviz masking what’s really going on? Are we seeing another example of Simpson’s Paradox?
Even with Small Data, there was tremendous potential for statistical abuse. You can multiply that by 1,000 thanks to Big Data.
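For readers who haven't met Simpson's Paradox, a small sketch may help. The numbers below are made up for illustration (two hypothetical campaigns, A and B, measured across two customer segments), and the function names are my own. Campaign A converts better within every segment, yet B looks better once the segments are lumped into a single chart.

```python
# Made-up figures: (conversions, visitors) per campaign and customer segment.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def segment_rates(campaign):
    """Conversion rate for each segment of one campaign."""
    return {seg: wins / total for seg, (wins, total) in results[campaign].items()}


def overall_rate(campaign):
    """Conversion rate after pooling all segments together."""
    wins = sum(w for w, _ in results[campaign].values())
    total = sum(t for _, t in results[campaign].values())
    return wins / total


# A beats B inside each segment...
a_wins_every_segment = all(
    segment_rates("A")[seg] > segment_rates("B")[seg] for seg in ("small", "large")
)
# ...but B beats A in the aggregate, because the segment mix differs.
b_wins_overall = overall_rate("B") > overall_rate("A")

print(f"A wins every segment: {a_wins_every_segment}")
print(f"B wins overall:       {b_wins_overall}")
```

A dashboard that shows only the pooled rate would point to B, while a per-segment view would point to A. Neither chart is wrong; the question is which one answers the decision at hand.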
Democratized DataViz Will Result in Some Bad Visualizations…and More
Some people lament the state of book publishing. Andrew Keen is one of them. Now that anyone can do it, everyone is doing it. One of the results: many self-published books look downright awful.
And the same holds true with data visualization. There are many truly awful ones out there. All else being equal, a bad dataviz will result in a bad decision. Period.
DataViz Guarantees Nothing
Even organizations that deploy powerful contemporary dataviz solutions guarantee nothing. The “right” decision still needs to be executed correctly and in a reasonable period of time.
But even if all of these dominoes fall, an organization still falls far short of anything near 100-percent certainty of success. The world doesn’t stand still and plenty of other business realities should shatter existing delusions.
Simon Says: DataViz Requires Effective Communication and Education
Kudos to Silkaitis for understanding the need for employee training and education around Carwoo’s data. Without the requisite background, it’s easy for employees to abuse data–and make poor business decisions as a result. User-friendly tools are fine and dandy, but don’t think for a minute even the friendliest of tools obviates the need for occasional in-person communication.
What say you?
Many organizations are wrapping their enterprise brain around the challenges of business intelligence, looking for the best ways to analyze, present, and deliver information to business users. More organizations are choosing to do so by pushing business decisions down in order to build a bottom-up foundation.
However, one question coming up more frequently in the era of big data is what should be the division of labor between computers and humans?
In his book Emergence: The Connected Lives of Ants, Brains, Cities, and Software, Steven Johnson discussed how the neurons in our human brains are only capable of two hundred calculations per second, whereas the processors in computers can perform millions of calculations per second.
This is why we should let the computers do the heavy lifting for anything that requires math skills, especially the statistical heavy lifting required by big data analytics. “But unlike most computers,” Johnson explained, “the brain is a massively parallel system, with 100 billion neurons all working away at the same time. That parallelism allows the brain to perform amazing feats of pattern recognition, feats that continue to confound computers—such as remembering faces or creating metaphors.”
As the futurist Ray Kurzweil has written, “humans are far more skilled at recognizing patterns than in thinking through logical combinations, so we rely on this aptitude for almost all of our mental processes. Indeed, pattern recognition comprises the bulk of our neural circuitry. These faculties make up for the extremely slow speed of human neurons.”
“Genuinely cognizant machines,” Johnson explained, “are still on the distant technological horizon, and there’s plenty of reason to suspect they may never arrive. But the problem with the debate over machine learning and intelligence is that it has too readily been divided between the mindless software of today and the sentient code of the near future.”
But even if increasingly more intelligent machines “never become self-aware in any way that resembles human self-awareness, that doesn’t mean they aren’t capable of learning. An adaptive information network capable of complex pattern recognition could prove to be one of the most important inventions in all of human history. Who cares if it never actually learns how to think for itself?”
Business intelligence in the era of big data and beyond will best be served if we let both the computers and the humans play to their strengths. Let’s let the computers calculate and the humans cogitate.
Like many, I’m one who’s been around since the cinder block days, once entranced by shiny Tektronix tubes stationed near a dusty card sorter. After years using languages ranging from Assembler to Scheme, I’ve come to believe the shift these represented, from procedural to declarative, has greatly improved the flexibility of the software organizations produce.
Interest has now moved towards an equally flexible representation of data. In the ‘old’ days when an organization wanted to collect a new data item about, say, a Person, then a new column would first be added by a friendly database administrator to a Person Table in one’s relational database. Very inflexible.
The alternative, now widely adopted, reduces databases to a simple formulation, one that eliminates Person and other entity-specific tables altogether. These “triple-stores” basically have just three columns (Subject, Predicate and Object) in which all data is stored. Triple-stores are often called ‘self-referential’ because first, the type of a Subject of any row in a triple-store is found in a different row (not column) in the triple-store and second, definitions of types are found in different rows of the triple-store. The benefits? Not only is the underlying structure of a triple-store unchanging, but also stand-alone metadata tables (tables describing tables) are unnecessary.
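A toy in-memory version shows the idea. This is only a sketch under my own assumptions: the `TripleStore` class, its `match` method and the sample identifiers (`alice`, `twitterHandle`) are hypothetical, and a production triple-store would add indexing, persistence and a query language such as SPARQL.

```python
class TripleStore:
    """A toy store where every fact is one (subject, predicate, object) row."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples fitting the pattern, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()

# The definition of a type is itself just another row.
store.add("Person", "rdf:type", "rdfs:Class")

# The type of a subject lives in a row, not in a table name or a column.
store.add("alice", "rdf:type", "Person")
store.add("alice", "name", "Alice")

# A brand-new data item needs no schema change: it is one more row.
store.add("alice", "twitterHandle", "@alice")

alice_types = [o for _, _, o in store.match("alice", "rdf:type")]
alice_facts = store.match(subject="alice")
```

Adding `twitterHandle` required no administrator and no new column, which is the flexibility being contrasted with the Person table above.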
Why? Static relational database tables do work well enough to handle transactional records whose data items are usually well-known in advance; the rate of change in those business processes is fairly low, so that the cost of database architectures based on SQL tables is equally low. What, then, is driving the adoption of triple-stores?
The scope of business functions organizations seek to automate has enlarged considerably: the source of new information within an organization is less frequently “forms” completed by users and more frequently raw text from documents, tweets, blogs, emails, newsfeeds, and other ‘social’ web and internal sources, which have been produced, received, and/or retrieved by organizations.
Semantic technologies are essential components of Natural Language Processing (NLP) applications which extract and convert, for instance, all proper nouns within a text into harvestable networks of “information nodes” found in a triple-store. In fact during such harvesting, context becomes a crucial variable that can change with each sentence analyzed from the text.
This brings us to my primary distinction between really semantic and non-semantic applications: really semantic applications mimic a human conversation, where the knowledge of an individual in a conversation is the result of a continuous accrual of context-specific facts, context-specific definitions, even context-specific contexts. As a direct analogy, Wittgenstein, a modern giant of philosophy, calls this phenomenon Language Games to connote that one’s techniques and strategies for analysis of a game’s state and one’s actions are not derivable in advance; they come only during the play of the game, i.e., during processing of the text corpora.
Non-semantic applications on the other hand, are more similar to rites, where all operative dialogs are pre-written, memorized, and repeated endlessly.
This analogy to human conversations (to ‘dynamic semantics’) is hardly trivial; it is a dominant modelling technique among ontologists as evidenced by development of, for instance, Discourse Representation Theory (among others, e.g., legal communities have a similar theory, simply called Argumentation) whose rules are used to build Discourse Representation Structures from a stream of sentences that accommodate a variety of linguistic issues including plurals, tense, aspect, generalized quantifiers, anaphora and others.
“Semantic models” are an important path towards a more complete understanding of how humans, when armed with language, are able to reason and draw conclusions about the world. Relational tables, however, in themselves haven’t provided similar insight or re-purposing in different contexts. This fact alone is strong evidence that semantic methods and tools must be prominent in any organization’s technology plans.
In his recent Harvard Business Review blog post Are You Data Driven? Take a Hard Look in the Mirror, Tom Redman distilled twelve traits of a data-driven organization, the first of which is making decisions at the lowest possible level.
This is how one senior executive Redman spoke with described this philosophy: “My goal is to make six decisions a year. Of course that means I have to pick the six most important things to decide on and that I make sure those who report to me have the data, and the confidence, they need to make the others.”
“Pushing decision-making down,” Redman explained, “frees up senior time for the most important decisions. And, just as importantly, lower-level people spend more time and take greater care when a decision falls to them. It builds the right kinds of organizational capability and, quite frankly, appears to create a work environment that is more fun.”
I have previously blogged about how a knowledge-based organization is built upon a foundation of bottom-up business intelligence with senior executives providing top-down oversight (e.g., the strategic aspects of information governance). Following Redman’s advice, the most insightful top-down oversight is driving decision-making to the lowest possible level of a data-driven organization.
With the speed at which decisions must be made these days, organizations cannot afford to risk causing a decision-making bottleneck by making lower-level employees wait for higher-ups to make every business decision. While faster decisions aren’t always better, a shorter decision-making path is.
Furthermore, in the era of big data, speeding up your data processing enables you to integrate more data into your decision-making processes, which helps you make better data-driven decisions faster.
Well-constructed policies are flexible business rules that empower employees with an understanding of decision-making principles, trusting them to figure out how to best apply them in a particular context.
If you want to pull your organization, and its business intelligence, up to new heights, then push down business decisions to the lowest level possible. Arm your frontline employees with the data, tools, and decision-making guidelines they need to make the daily decisions that drive your organization.
The traditional notion of data warehousing is the increasing accumulation of structured data, which distributes information across the organization, and provides the knowledge base necessary for business intelligence.
In a previous post, I pondered whether a contemporary data warehouse is analogous to an Enterprise Brain with both structured and unstructured data, and the interconnections between them, forming a digital neural network with orderly structured data firing in tandem, while the chaotic unstructured data assimilates new information. I noted that perhaps this makes business intelligence a little more disorganized than we have traditionally imagined, but that this disorganization might actually make an organization smarter.
Business intelligence is typically viewed as part of a top-down decision management system driven by senior executives, but a potentially more intelligent business intelligence came to mind while reading Steven Johnson’s book Emergence: The Connected Lives of Ants, Brains, Cities, and Software.
Fifteen years ago, the presentation of data typically fell under the purview of analysts and IT professionals. Quarterly or annual meetings entailed rolling data up into now quaint diagrams, graphs, and charts.
My, how times have changed. Today, data is everywhere. We have entered the era of Big Data and, as I write in Too Big to Ignore, many things are changing.
Big Data: Enterprise Shifts
In the workplace, let’s focus on two major shifts. First, today it’s becoming incumbent upon just about every member of a team, group, department, and organization to effectively present data in a compelling manner. Hidden in the petabytes of structured and unstructured data are key consumer, employee, and organizational insights that, if unleashed, would invariably move the needle.
Second, data no longer needs to be presented on an occasional or periodic basis. Many employees are routinely looking at data of all types, a trend that will only intensify in the coming years.
The proliferation of effective data visualization tools like Ease.ly and Tableau provides tremendous opportunity. (The latter just went public with the übercool stock symbol $DATA.) Sadly, though, not enough employees (and, by extension, organizations) maximize the massive opportunity presented by data visualization. Of course, notable exceptions exist, but far too many professionals ignore DV tools. The result: they fail to present data in visually compelling ways. Far too many of us rely upon old standbys: bar charts, simple graphs, and the ubiquitous Excel spreadsheet. One of the biggest challenges to date with Big Data: getting more people to actually use the data–and the tools that make that data dance.
This begs the question: Why the lack of adoption? I’d posit that two factors are at play here:
- Lack of knowledge that such tools exist among end users.
- Many end users who know of these tools are unwilling to use them.
Simon Says: Make the Data Dance
Big Data in and of itself guarantees nothing. Presenting findings to senior management should involve more than poring over thousands of records. Yes, the ability to drill down is essential. But starting with a compelling visual represents a strong start in gaining their attention.
Big Data is impossible to leverage with traditional tools (read: relational databases, SQL statements, Excel spreadsheets, and the like). Fortunately, increasingly powerful tools allow us to interpret and act upon previously unimaginable amounts of data. But we have to decide to use them.
What say you?
In his 1938 collection of essays World Brain, H. G. Wells explained that “it is not the amount of knowledge that makes a brain. It is not even the distribution of knowledge. It is the interconnectedness.”
This brought to my brain the traditional notion of data warehousing as the increasing accumulation of data, distributing information across the organization, and providing the knowledge necessary for business intelligence.
But is an enterprise data warehouse the Enterprise Brain? Wells suggested that interconnectedness is what makes a brain. Despite Ralph Kimball’s definition of a data warehouse being the union of its data marts, more often than not a data warehouse is a confederacy of data silos whose only real interconnectedness is being co-located on the same database server.
Looking at how our human brains work in his book Where Good Ideas Come From, Steven Johnson explained that “neurons share information by passing chemicals across the synaptic gap that connects them, but they also communicate via a more indirect channel: they synchronize their firing rates, what neuroscientists call phase-locking. There is a kind of beautiful synchrony to phase-locking—millions of neurons pulsing in perfect rhythm.”
The phase-locking of neurons pulsing in perfect rhythm is an apt metaphor for the business intelligence provided by the structured data in a well-implemented enterprise data warehouse.
“But the brain,” Johnson continued, “also seems to require the opposite: regular periods of electrical chaos, where neurons are completely out of sync with each other. If you follow the various frequencies of brain-wave activity with an EEG, the effect is not unlike turning the dial on an AM radio: periods of structured, rhythmic patterns, interrupted by static and noise. The brain’s systems are tuned for noise, but only in controlled bursts.”
Scanning the radio dial for signals amidst the noise is an apt metaphor for the chaos of unstructured data in external sources (e.g., social media). Should we bring order to chaos by adding structure (or at least better metadata) to unstructured data? Or should we just reject the chaos of unstructured data?
Johnson recounted research performed in 2007 by Robert Thatcher, a brain scientist at the University of South Florida. Thatcher studied the vacillation between the phase-lock (i.e., orderly) and chaos modes in the brains of dozens of children. On average, the chaos mode lasted for 55 milliseconds, but for some children it approached 60 milliseconds. Thatcher then compared the brain-wave scans with the children’s IQ scores, and found that every extra millisecond spent in the chaos mode added as much as 20 IQ points, whereas longer spells in the orderly mode deducted IQ points, but not as dramatically.
“Thatcher’s study,” Johnson concluded, “suggests a counterintuitive notion: the more disorganized your brain is, the smarter you are. It’s counterintuitive in part because we tend to attribute the growing intelligence of the technology world with increasingly precise electromechanical choreography. Thatcher and other researchers believe that the electric noise of the chaos mode allows the brain to experiment with new links between neurons that would otherwise fail to connect in more orderly settings. The phase-lock [orderly] mode is where the brain executes an established plan or habit. The chaos mode is where the brain assimilates new information.”
Perhaps the Enterprise Brain also requires both orderly and chaos modes, structured and unstructured data, and the interconnectedness between them, forming a digital neural network with orderly structured data firing in tandem, while the chaotic unstructured data assimilates new information.
Perhaps true business intelligence is more disorganized than we have traditionally imagined, and perhaps adding a little disorganization to your Enterprise Brain could make your organization smarter.
In my last post, I discussed the sin of pride and information management (IM) projects. Today, let’s talk about envy, defined as “a resentful emotion that occurs when a person lacks another’s (perceived) superior quality, achievement or possession and wishes that the other lacked it.”
I’ll start off by saying that, much like lust, envy isn’t inherently bad. Wanting to do as well as another employee, department, division, or organization can spur improvement, innovation, and better business results. Yes, I’m channeling my inner Gordon Gekko: Greed, for lack of a better word, is good.
With respect to IM, I’ve seen envy take place in two fundamental ways: intra-organizational and inter-organizational. Let’s talk about each.
Intra-organizational envy takes place when employees at the same company resent the success of their colleagues. Perhaps the marketing folks for product A just can’t do the same things with their information, technology, and systems that their counterparts representing product B can. Maybe division X launched a cloud-based CRM or wiki and this angers the employees in division Y.
At its core, intra-organizational envy stems from the inherently competitive and insecure nature of certain people. These envious folks have an axe to grind and typically have some anger issues going on. Can someone say schadenfreude?
Inter-organizational envy takes place between employees at different companies. Let’s say that the CIO of hospital ABC sees what her counterpart at hospital XYZ has done. The latter has effectively deployed MDM, BI, or cloud-based technologies with apparent success. The ABC CEO wonders why his company is so ostensibly behind its competitor and neighbor.
I’ve seen situations like this over my career. In many instances, organization A will prematurely attempt to deploy more mature or Enterprise 2.0 technologies simply because other organizations have already done so–not because organization A itself is ready. During these types of ill-conceived deployments, massive corners are cut, particularly with respect to data quality and IT and data governance. The CIO of ABC will look at the outcome of XYZ (say, the deployment of a new BI tool) and want the same outcome, even though the two organizations’ challenges are unlikely to be the same in type and magnitude.
Envy is a tough nut to crack in large part because it’s part of our DNA. I certainly cannot dispense pithy advice to counteract thousands of years of human evolution. I will, however, say this: Recognize that envy exists and that it’s impossible to eradicate. Don’t be Pollyanna about it. Try to minimize envy within and across your organization. Deal with outwardly envious people sooner rather than later.
What say you?
Next up: gluttony.