Open Framework, Information Management Strategy & Collaborative Governance | Data & Social Methodology - MIKE2.0 Methodology

Posts Tagged ‘data warehousing’

by: Phil Simon
19  Aug  2013

Mike Tyson, Big Data, and the Perils of the Project Mentality

Everybody has a plan until they get punched in the face.

–Mike Tyson

Who should do what on a Big Data project?

It seems like a logical and even necessary question, right? After all, Big Data is a big deal, and requires assistance from each line of business, the top brass, and IT, right?

Defining Roles

Matt Ariker, Tim McGuire, and Jesko Perrey recently wrote an HBR post attempting to answer this question. In Five Roles You Need on Your Big Data Team, the three advocate five “important roles to staff your advanced analytics bureau”:

  1. Data Hygienists
  2. Data Explorers
  3. Business Solution Architects
  4. Data Scientists
  5. Campaign Experts

To be sure, not everyone can or should do everything in an era of Big Data. I can’t tell you for certain that dividing roles the way the authors recommend won’t work. Still, I just don’t buy the argument that Big Data lends itself to everything fitting neatly into traditional roles.

Take data quality, for instance. As Jim Harris writes:

The quality of the data in the warehouse determines whether it’s considered a trusted source, but it faces a paradox similar to “which came first, the chicken or the egg?” Except for the data warehouse it’s “which comes first, delivery or quality?” However, since users can’t complain about the quality of data that hasn’t been delivered yet, delivery always comes first in data warehousing.

Agreed. Traditional data warehousing projects could be thought of in a more linear fashion. In most cases, organizations were attempting to aggregate–and report on–their own data (read: data internal to the enterprise). Once a source was added, maintenance was fairly routine, at least compared to today’s datasets. These projects tended to be more predictable.

But what happens when much if not most relevant data stems from outside of the enterprise? What do we do when new data sources start popping up faster than ever? Mike Tyson’s quote at the top of this post has never been more apropos.

Simon Says: Big Data Is Not Predictable

My point is that IT projects have start and end dates, and Big Data does not. Amazon, Apple, Facebook, Twitter, Google, and other successful companies don’t view Big Data as an “IT project”; treating it as one is a potentially lethal mistake. For its part, Netflix views both Big Data and data visualization as ongoing processes; they are never finished. I make the same point in my last book.

When you start thinking of Big Data as an initiative or project with traditionally defined roles, you’re on the road to failure. Don’t make “data hygiene” or “data exploring” the sole purview of a group, department, or individual. Encourage others to step out of their comfort zones, notice things, test hypotheses, and act upon them.


What say you?

Category: Data Quality, Information Management

by: Ocdqblog
27  Jun  2013

Bottom-Up Business Intelligence

The traditional notion of data warehousing is the increasing accumulation of structured data, which distributes information across the organization, and provides the knowledge base necessary for business intelligence.

In a previous post, I pondered whether a contemporary data warehouse is analogous to an Enterprise Brain with both structured and unstructured data, and the interconnections between them, forming a digital neural network with orderly structured data firing in tandem, while the chaotic unstructured data assimilates new information.  I noted that perhaps this makes business intelligence a little more disorganized than we have traditionally imagined, but that this disorganization might actually make an organization smarter.

Business intelligence is typically viewed as part of a top-down decision management system driven by senior executives, but a potentially more intelligent business intelligence came to mind while reading Steven Johnson’s book Emergence: The Connected Lives of Ants, Brains, Cities, and Software.


Category: Business Intelligence

by: Ocdqblog
14  May  2013

Data Dictatorship versus Data Democracy

It seems rather obvious to state that choosing between a dictatorship and a democracy is such an easy choice it doesn’t even need to be discussed.  However, when it comes to data, it seems like the choice is not so obvious, since for as long as I can remember the data management industry has been infatuated with the notion of instituting some form of a Data Dictatorship.

Providing the organization with a single system of record, a single version of the truth, a single view, a golden copy, or a consolidated repository of trusted data has long been the rallying cry and siren song of data warehousing, and more recently, of master data management.

I admit a data dictatorship has its appeal, especially since most information development concepts are easier to manage and govern when you only have to deal with one official source of enterprise data.

Of course, the reality is that most organizations are a Data Democracy, which means that other data sources, both internal and external, will be used.  During a recent Twitter chat, one discussion thread noted how the democratization of data has led to the consumerization of data, which has made the silo-ification of data easier than ever.  Cloud-based services (and other consumerization of IT trends) make rolling your own data silo simple and inexpensive, at least in terms of financial cost, but arguably expensive in terms of splintering the enterprise’s data asset.

Forging one big data silo in the cloud to rule them all might soon be pitched as a new form of data dictatorship.  For this to happen, users must surrender the freedoms that consumerization brought them, but history has shown that reverting to dictatorship after democracy is difficult, if not impossible.

As Winston Churchill famously said, “no one pretends that democracy is perfect or all-wise.  Indeed, it has been said that democracy is the worst form of government except all those other forms that have been tried from time to time.”

No one pretends that data democracy is perfect or all-wise.  Indeed, most data professionals would say that data democracy is the worst form of data governance except all those other forms that have been tried from time to time.  But perhaps it’s simply time to stop pursuing any form of data dictatorship.

Category: Enterprise Data Management, Information Development, Information Governance, Master Data Management

by: Ocdqblog
18  Apr  2013

The Enterprise Brain

In his 1938 collection of essays World Brain, H. G. Wells explained that “it is not the amount of knowledge that makes a brain.  It is not even the distribution of knowledge.  It is the interconnectedness.”

This brought to my brain the traditional notion of data warehousing as the increasing accumulation of data, distributing information across the organization, and providing the knowledge necessary for business intelligence.

But is an enterprise data warehouse the Enterprise Brain?  Wells suggested that interconnectedness is what makes a brain.  Despite Ralph Kimball’s definition of a data warehouse being the union of its data marts, more often than not a data warehouse is a confederacy of data silos whose only real interconnectedness is being co-located on the same database server.

Looking at how our human brains work in his book Where Good Ideas Come From, Steven Johnson explained that “neurons share information by passing chemicals across the synaptic gap that connects them, but they also communicate via a more indirect channel: they synchronize their firing rates, what neuroscientists call phase-locking.  There is a kind of beautiful synchrony to phase-locking—millions of neurons pulsing in perfect rhythm.”

The phase-locking of neurons pulsing in perfect rhythm is an apt metaphor for the business intelligence provided by the structured data in a well-implemented enterprise data warehouse.

“But the brain,” Johnson continued, “also seems to require the opposite: regular periods of electrical chaos, where neurons are completely out of sync with each other.  If you follow the various frequencies of brain-wave activity with an EEG, the effect is not unlike turning the dial on an AM radio: periods of structured, rhythmic patterns, interrupted by static and noise.  The brain’s systems are tuned for noise, but only in controlled bursts.”

Scanning the radio dial for signals amidst the noise is an apt metaphor for the chaos of unstructured data in external sources (e.g., social media).  Should we bring order to chaos by adding structure (or at least better metadata) to unstructured data?  Or should we just reject the chaos of unstructured data?

Johnson recounted research performed in 2007 by Robert Thatcher, a brain scientist at the University of South Florida.  Thatcher studied the vacillation between the phase-lock (i.e., orderly) and chaos modes in the brains of dozens of children.  On average, the chaos mode lasted for 55 milliseconds, but for some children it approached 60 milliseconds.  Thatcher then compared the brain-wave scans with the children’s IQ scores, and found that every extra millisecond spent in the chaos mode added as much as 20 IQ points, whereas longer spells in the orderly mode deducted IQ points, but not as dramatically.

“Thatcher’s study,” Johnson concluded, “suggests a counterintuitive notion: the more disorganized your brain is, the smarter you are.  It’s counterintuitive in part because we tend to attribute the growing intelligence of the technology world with increasingly precise electromechanical choreography.  Thatcher and other researchers believe that the electric noise of the chaos mode allows the brain to experiment with new links between neurons that would otherwise fail to connect in more orderly settings.  The phase-lock [orderly] mode is where the brain executes an established plan or habit.  The chaos mode is where the brain assimilates new information.”

Perhaps the Enterprise Brain also requires both orderly and chaos modes, structured and unstructured data, and the interconnectedness between them, forming a digital neural network with orderly structured data firing in tandem, while the chaotic unstructured data assimilates new information.

Perhaps true business intelligence is more disorganized than we have traditionally imagined, and perhaps adding a little disorganization to your Enterprise Brain could make your organization smarter.

Category: Business Intelligence
