11 Nov 2012
“Errors using inadequate data are much less than those using no data at all.”
In my last post, I discussed wrath from an information management (IM) perspective. Today, I conclude this series by looking at the data management implications of gluttony:
Gluttony, from a Latin root meaning to gulp down or swallow, means over-indulgence and over-consumption of food, drink, or wealth items to the point of extravagance or waste. In some Christian denominations, it is considered one of the seven deadly sins: a misplaced desire for food or its withholding from the needy.
Years ago, one could persuasively argue that an organization could have too much data. Data storage used to cost much more than it does today. What’s more, without getting too techie, structured or transactional data could not be compressed nearly as much as its unstructured equivalent (this is still true today). Against that backdrop, many organizations made horrible decisions about what to do with all of the data they had.
Count among them an ex-employer of mine. When the company implemented its ERP system, it made the truly awful decision to split its data between two different environments. Rather than splurge on decent hardware that would support keeping key enterprise data in single tables, management spent ungodly sums on other “nice-to-have” systems. The net result for less technical end users: if they wanted to report on five years’ worth of financial transactions, payroll history, or sales, they’d have to run the same report twice, once in each environment, and manually merge the results.
Few did and, as a result, many decisions were based upon hunches or incomplete or inaccurate data.
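For the curious, here is roughly what that manual merge amounts to. This is only a minimal sketch, not what my ex-employer actually ran: the `merge_reports` function, the `fiscal_year` and `amount` field names, and the sample figures are all made up for illustration. It assumes each environment can export the same report as rows, and that a year straddling the cutover shows up in both extracts.

```python
from collections import defaultdict


def merge_reports(archive_rows, current_rows, key="fiscal_year", value="amount"):
    """Combine two extracts of the same report into one total per key.

    Hypothetical helper: rows are plain dicts, and a key that appears in
    both extracts (a year split across environments) is summed.
    """
    totals = defaultdict(float)
    for row in list(archive_rows) + list(current_rows):
        totals[row[key]] += row[value]
    # Sort by key so the combined report reads in chronological order.
    return dict(sorted(totals.items()))


# Made-up extracts: older years live in one environment, recent years in the other.
archive = [
    {"fiscal_year": 2008, "amount": 120.0},
    {"fiscal_year": 2009, "amount": 135.5},
    {"fiscal_year": 2010, "amount": 90.0},   # first part of the cutover year
]
current = [
    {"fiscal_year": 2010, "amount": 60.0},   # remainder of the cutover year
    {"fiscal_year": 2011, "amount": 150.0},
    {"fiscal_year": 2012, "amount": 162.25},
]

five_year_sales = merge_reports(archive, current)
print(five_year_sales)
```

A dozen lines of script is trivial for a developer, but the end users in question had spreadsheets and copy-and-paste, which is exactly why so few of them bothered.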
Gluttony and Big Data
Today, Big Data has arrived. The amount of unstructured data dwarfs its structured counterpart. To be sure, many old-school CIOs won’t get it. They worry about the inputs required to make Big Data hum (increased employee training, hardware upgrades, new software purchases, data transformation efforts) while the outputs are uncertain. Yes, open-source tools like Hadoop are free, but think free as in speech, not free as in beer, as the old saying goes: the license costs nothing, but putting the software to work does not.
But there’s good news for those who think, “This data makes me look fat.” Data storage costs have tumbled, data compression has advanced, and new tools like Hadoop, NoSQL, and columnar databases allow organizations to “pig out” on data. Big data warehouses used to contain terabytes of data; now some of the largest contain petabytes.
Bottom line: today we can be as gluttonous as we want with our data. Diet if you wish, but new tools and technologies allow organizations to finally harness the power of unstructured data, and lots of it.
What say you?