Posts Tagged ‘data visualization’
While data visualization often produces pretty pictures, as Phil Simon explained in his new book The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions, “data visualization should not be confused with art. Clarity, utility, and user-friendliness are paramount to any design aesthetic.” Bad data visualizations are even worse than bad art since, as Simon says, “they confuse people more than they convey information.”
Simon explained how data scientist Melinda Thielbar recommends using data visualization to help an analyst communicate with a nontechnical audience, as well as help the data communicate with the analyst.
“Visualization is a great way to let the data tell a story,” Thielbar explained. “It’s also a great way for analysts to fool themselves into believing the story they want to believe.” This is why she recommends developing the visualizations at the beginning of the analysis to allow the visualizations that really illustrate the story behind the data to stand out, a process she calls “building windows into the data.” When you look through a window, you may not like what you see.
“Data visualizations may include bad, suspect, duplicate, or incomplete data,” Simon explained. This can be a good thing, however, since data visualizations “can help users identify fishy information and purify data faster than manual hunting and pecking. Data quality is a continuum, not a binary. Use data visualization to improve data quality.” Even when you are looking at what appears to be the pretty end of the continuum, Simon cautioned that “just because data is visualized doesn’t necessarily mean that it is accurate, complete, or indicative of the right course of action.”
Especially when dealing with the volume aspect of big data, data visualization can help find outliers faster. While detailed analysis is needed to determine whether the outlier is a business insight or a data quality issue, data visualization can help you shake those needles out of the haystack and into a clear field of vision.
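To make the outlier idea concrete, here is a minimal sketch of the rule a standard box plot conventionally draws: anything more than 1.5 interquartile ranges beyond the quartiles gets flagged. The `daily_orders` figures and the `flag_outliers` helper are invented for illustration; they do not come from Simon's book.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily order counts with one suspicious entry.
daily_orders = [102, 98, 105, 110, 97, 101, 99, 1040, 103, 100]
print(flag_outliers(daily_orders))  # prints [1040]
```

The flag only says "look here." Whether 1040 is a record sales day or a mistyped 104 still takes the detailed analysis described above.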
Among its many other uses, which Simon illustrates well in his book, finding ugly data with pretty pictures is one way data visualization can be used for improving data quality.
“I saw the angel in the marble and carved until I set him free.”
The era of Big Data has arrived, yet relatively few organizations seem to recognize it. Platitudes from CXOs are all fine and dandy, but how many have invested in Hadoop or hired a data scientist? Not too many, in my view. (See “Much Hadoop About Nothing.”)
Brass tacks: The hype around Big Data today is much greater than the reality–and it probably will be for some time.
This is unfortunate, as many organizations already have within their walls very valuable data that could be turned into information and knowledge with the right tools. Because of their unwillingness to adopt more contemporary Big Data and dataviz applications, though, that knowledge effectively hides in plain sight. The ROI question still paralyzes many CXOs afraid to jump into the abyss.
I know something about the notion of hiding in plain sight. It is one of the major themes of my favorite TV show, Breaking Bad.
To some extent, I understand the reluctance surrounding Hadoop. After all, it’s a fundamentally different way of thinking about data, modeling, and schema. Most IT professionals are used to thinking about data in orderly and relational terms, with tables and JOIN statements. Those with the skills to work with this type of data are in short supply, at least for the time being. “Growing” data scientists and a new breed of IT professionals doesn’t happen overnight. The same thing happens with lawyers and doctors.
Overcoming stasis isn’t easy, especially in budget-conscious, risk-averse organizations. To that end, here are a few tips on getting started with Big Data:
- Don’t try to boil the ocean. Small wins can be huge, to paraphrase from the excellent book The Power of Habit: Why We Do What We Do in Life and Business.
- Communicate successes. Getting people to come to you is much easier than forcing them. The carrot is more effective than the stick.
- Under-promise and over-deliver.
Compared to a year ago, I have seen progress with respect to Big Data adoption. Increasingly, intelligent people and companies are doing more with new forms of data—and getting more out of it. As a result, data visualization has become a big deal. To paraphrase Michelangelo, they are starting to set the data free.
What say you?
Is contemporary dataviz really new?
Some would argue no. After all, many of the same reporting, business intelligence, and analytics applications also provide at least rudimentary levels of data visualization and have for some time now. Yes, there are “pure” dataviz tools like Tableau, but clear lines of demarcation among these terms just do not exist. In fact, lines between terms blur considerably.
But I would argue that modern-day dataviz really is new. This begs the natural question: How is contemporary dataviz fundamentally different than KPIs, dashboards, and other reporting tools?
In short, dataviz is about data exploration and discovery, not traditional reporting. To me, those tried-and-true terms always implied that the organization, department, group, or individual employee knew exactly what to ask and measure. Examples included:
- How many sales per square foot are we seeing?
- What’s employee turnover?
- What’s our return on assets?
These are still important questions, even in an era of Big Data. But contemporary dataviz is less, well, certain. There’s a genuine curiosity at play when you don’t know exactly what you’re looking for, much less what you’ll find.
In keeping with the data discovery theme of this post, why not try to answer my question about dataviz using dataviz? While it’s only a proxy, I find Google Trends to be a very useful tool for answering questions about what’s popular/new, where, when, and how things are changing. For instance, consider the searches taking place on “data visualization” over the past four years throughout the world:
Since I live in the US, I was curious about how my home country broke down. In other words, is dataviz more popular in different parts of the country? With Google Trends, that’s not hard to see:
Note here that new and popular are not necessarily one and the same. Again, this was meant to serve as a proxy–and to illustrate the fact that dataviz doesn’t necessarily lead to a particular next step. I was exploring the data and, if I really wanted, I could keep going.
Data discovery doesn’t necessarily lead to a logical outcome–and that’s fine.
What say you?
ITWorld recently ran a great article on the perils of data visualization. The piece covers a number of companies, including Carwoo, a startup that aims to make car buying easier. The company has been using dataviz tool Chartio for a few months. From the article:
Around a year ago, Rimas Silkaitis, a product manager at Carwoo, started looking for a better way to handle the many requests for data visualizations that his co-workers were making.
He looked at higher end products, like those from GoodData and Microstrategy. “Then I realized, hey, we’re a startup, we don’t have that kind of money,” he said. “That’s when we found Chartio.”
Now, most of the 40-person company–except sales and customer service, which have their own tools–have access to Chartio.
Silkaitis said he worries a bit about users misinterpreting data and creating bad visualizations, but he’s implemented procedures that seem to be working so far.
It starts with new hires. “Anybody that comes on new to the company, I sit them down and walk them through our data model and give them a tutorial on how Chartio works,” he said.
There are several key lessons in this piece related to intelligent data management, dataviz, and Big Data. Let’s review them.
DataViz Is Easier Than Ever
Over the last ten years, we have seen a proliferation of easy-to-use tools in many areas, and dataviz is no exception. Today, one need not be a coder or work in the IT department to build powerful, interactive data visualization tools. Dragging and dropping and slicing and dicing are more prevalent than ever. Chartio is just one of dozens or hundreds of user-friendly applications that can make data come to life.
DataViz Can Be Abused
Often we look at visual representations of data and the required decision or trend seems obvious. But is it? Is the data or the dataviz masking what’s really going on? Are we seeing another example of Simpson’s Paradox?
Even with Small Data, there was tremendous potential for statistical abuse. You can multiply that by 1,000 thanks to Big Data.
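For readers who haven’t run into Simpson’s Paradox, a small sketch may help. The conversion counts below are entirely hypothetical (two page designs, two device types), chosen so that design A wins within every segment yet loses once the segments are lumped together. A single aggregate bar chart would point you to the wrong design.

```python
# Hypothetical (successes, trials) per page design and device type.
results = {
    "A": {"desktop": (90, 100), "mobile": (300, 1000)},
    "B": {"desktop": (800, 1000), "mobile": (20, 100)},
}


def rate(successes, trials):
    """Conversion rate as a fraction between 0 and 1."""
    return successes / trials


def segment_rates(design):
    """Conversion rate for each device segment of one design."""
    return {seg: rate(*counts) for seg, counts in results[design].items()}


def overall_rate(design):
    """Conversion rate after pooling all segments of one design."""
    successes = sum(s for s, _ in results[design].values())
    trials = sum(t for _, t in results[design].values())
    return rate(successes, trials)


for design in results:
    print(design, segment_rates(design), round(overall_rate(design), 3))
```

Design A converts better on desktop (90 percent vs. 80) and on mobile (30 percent vs. 20), but B looks far better overall because most of its traffic comes from the high-converting desktop segment. The mix of traffic, not the design, drives the aggregate number.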
Democratized DataViz Will Result in Some Bad Visualizations…and More
Some people lament the state of book publishing. Andrew Keen is one of them. Now that anyone can do it, everyone is doing it. One of the results: many self-published books look downright awful.
And the same holds true with data visualization. There are many truly awful ones out there. All else being equal, a bad dataviz will result in a bad decision. Period.
DataViz Guarantees Nothing
Even organizations that deploy powerful contemporary dataviz solutions guarantee nothing. The “right” decision still needs to be executed correctly and in a reasonable period of time.
But even if all of these dominoes fall, an organization still falls far short of anything near 100-percent certainty of success. The world doesn’t stand still and plenty of other business realities should shatter existing delusions.
Simon Says: DataViz Requires Effective Communication and Education
Kudos to Silkaitis for understanding the need for employee training and education around Carwoo’s data. Without the requisite background, it’s easy for employees to abuse data–and make poor business decisions as a result. User-friendly tools are fine and dandy, but don’t think for a minute even the friendliest of tools obviates the need for occasional in-person communication.
What say you?
On October 28, 2012, the Oklahoma City Thunder traded star sixth-man James Harden to the Houston Rockets. The move was not entirely unexpected, as the team had been unable to work out a long-term extension with Harden. Fans were disappointed, as this trade broke up the young core of the Western Conference champions. (Harden was looking for a max contract and the Thunder had two max players signed long-term already.*)
While the move itself wasn’t entirely unexpected, the data behind it was more surprising.
Rockets’ GM Daryl Morey comes from the Moneyball school of sports management. That is, all else equal, it’s better to make decisions based upon data than gut instinct. To this end, Morey had long coveted Harden, an incredibly efficient player.
As the following chart from HotShotCharts demonstrates, Harden naturally navigates to places on the floor that lend themselves to high expected values.
You can noodle for days on the HSC site, looking at visual data from different teams, players, and arenas. For his part, Harden generally takes shorter three-pointers and layups. (See the red dots above.) He avoids long two-pointers because they have lower expected values. Note the low shot counts inside the arc but outside of the paint.
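The expected-value logic is simple arithmetic: points for the shot multiplied by the chance of making it. Here is a rough sketch. The make probabilities are illustrative round numbers that I’ve assumed for the example; they are not Harden’s actual percentages or figures from HotShotCharts.

```python
# Hypothetical (points, make probability) for three kinds of shots.
shot_types = {
    "layup": (2, 0.60),
    "long two-pointer": (2, 0.38),
    "three-pointer": (3, 0.36),
}


def expected_points(points, make_probability):
    """Average points produced per attempt of a given shot."""
    return points * make_probability


# Rank shot types from most to least valuable per attempt.
ranked = sorted(
    shot_types,
    key=lambda name: expected_points(*shot_types[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: {expected_points(*shot_types[name]):.2f} points per attempt")
```

Under these assumed numbers, the long two-pointer is worth well under a point per attempt, while the three-pointer comes out ahead despite going in slightly less often, because it pays 50 percent more when it does.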
What’s more, field goal percentage (FG%) is a better gauge of player effectiveness than raw point totals. Players like Kobe Bryant, Allen Iverson, and Carmelo Anthony score a bunch of points, but they typically take far too many shots. (Even I would score ten points per game if you gave me enough shots, and I’m not very good at hoops.)
Data is permeating every facet of business and, I’d argue, life. While not a complete substitute for common sense, we are seeing dataviz tools crystallize differences among companies, products, and even NBA players.
Relying exclusively on old standbys like Microsoft Excel leaves money on the table. Why not look at different ways to view your data? You may well be surprised at what you find.
What say you?
* The Thunder offered Harden $55.5 million over four years–$4.5 million less than the max deal Harden coveted and will get from the Rockets, sources told ESPN The Magazine.
Fifteen years ago, the presentation of data typically fell under the purview of analysts and IT professionals. Quarterly or annual meetings entailed rolling data up into now quaint diagrams, graphs, and charts.
My, how times have changed. Today, data is everywhere. We have entered the era of Big Data and, as I write in Too Big to Ignore, many things are changing.
Big Data: Enterprise Shifts
In the workplace, let’s focus on two major shifts. First, today it’s becoming incumbent upon just about every member of a team, group, department, and organization to effectively present data in a compelling manner. Hidden in the petabytes of structured and unstructured data are key consumer, employee, and organizational insights that, if unleashed, would invariably move the needle.
Second, data no longer needs to be presented on an occasional or periodic basis. Many employees are routinely looking at data of all types, a trend that will only intensify in the coming years.
The proliferation of effective data visualization tools like Easel.ly and Tableau provides tremendous opportunity. (The latter just went public with the übercool stock symbol $DATA.) Sadly, though, not enough employees (and, by extension, organizations) maximize the massive opportunity presented by data visualization. Of course, notable exceptions exist, but far too many professionals ignore DV tools. The result: they fail to present data in visually compelling ways. Far too many of us rely upon old standbys: bar charts, simple graphs, and the ubiquitous Excel spreadsheet. One of the biggest challenges to date with Big Data: getting more people to actually use the data–and the tools that make that data dance.
This begs the question: Why the lack of adoption? I’d posit that two factors are at play here:
- Lack of knowledge that such tools exist among end users.
- Many end users who know of these tools are unwilling to use them.
Simon Says: Make the Data Dance
Big Data in and of itself guarantees nothing. Presenting findings to senior management should involve more than poring over thousands of records. Yes, the ability to drill down is essential. But starting with a compelling visual represents a strong start in gaining their attention.
Big Data is impossible to leverage with traditional tools (read: relational databases, SQL statements, Excel spreadsheets, and the like). Fortunately, increasingly powerful tools allow us to interpret and act upon previously unimaginable amounts of data. But we have to decide to use them.
What say you?
When most people think of data, images of complex Microsoft workbooks and spreadsheets come to mind. Tables with rows and columns of structured data like dates, stock prices, sales, home sales, and invoices.
Historically, many analysts and execs alike have had to think about data in this rather pedestrian way. To some extent, BI projects started in the mid to late 1990s changed that, although many organizations never “got around” to them. Excel was the killer app for this type of thing: simple, relatively powerful, and good enough.
These days, however, data visualization tools like Tableau and others allow users at all levels within an organization to think of data in a fundamentally different way. To paraphrase from the classic Peter Frampton album, data is starting to come alive.
Stories Over Spreadsheets
Are we talking about the death of the spreadsheet? Of course not. I just don’t see that happening anytime soon. However, no longer is Excel with attendant charts and pivot tables the sole means by which to present data, particularly to decision makers.
In the words of Kris Hammond, CTO of Narrative Science, a joint research project at Northwestern University Schools of Engineering and Journalism, “For some people, a spreadsheet is a great device. For most people, not so much so. The story. The paragraph. The report. The prediction. The advisory. Those are much more powerful objects in our world, and they’re what we’re used to.”
No argument here, but simple Excel charts can’t possibly do justice to certain types of data. Look at the following figure:
One could make the argument that this is the equivalent of data art.
Get out of the “data is boring” mind-set. It doesn’t have to be. SaaS-based and open-source tools allow even cash-strapped organizations to make data interactive, informative, and dare I say exciting. Forget new colors, fonts, or superficial treatments. More than ever, it’s easy to make your data tell a story, to learn new things from visualized data that would otherwise be lost in plain-Jane columns and rows.
Without question, data can be turned into information and, ultimately, knowledge. Old school employees and execs need to realize that decisions for the most part today should be made based upon solid data, but the presentation of that data need not be boring.
What say you?