John T. Behrens, VP-Advance Computing & Data Science Lab, Pearson
Data visualization used to be the realm of data scientists and technical specialists. Now it is the primary interface for employee and customer interactions with data. It is how the end user experiences data-driven business, and for many it will constitute how they experience the CIO in their day-to-day work.
Data visualization is your best shot at marketing IT's evolution directly to the whole organization. Data and data visualization software are evolving rapidly, so earlier assumptions and tool choices may need to be reconsidered. Below I present some key ideas on how to think about data and data visualization.
The vast majority of “data” displays communicate the results of analytic summaries or statistics, not raw data. This is an essential point because the friendliness of data visualization can distract us from the fact that a long road has been taken from the data to the visualization. While certain important points are highlighted in data visualization, other points are obscured. Data visualization systems should be integrated into an overall process that allows tracking of the provenance of the information and the ability to unpack hidden assumptions quickly. When the CEO says “I see the average, but how far does that vary?” you don’t want to spend three days rummaging through spreadsheets.
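One way to picture this is a summary object that carries its own raw values and a provenance tag, so the follow-up question about variation can be answered on the spot. The sketch below is only an illustration under stated assumptions: the `Summary` class, the `source` tag, and the sample numbers are all made up, not part of any particular visualization product.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Summary:
    """An analytic summary that keeps a link back to the data behind it."""
    label: str
    source: str          # hypothetical provenance tag, e.g. a table name
    values: list[float]  # the raw values the summary was computed from

    @property
    def average(self) -> float:
        return mean(self.values)

    def spread(self) -> dict[str, float]:
        """Answer 'how far does that vary?' without leaving the summary."""
        return {
            "min": min(self.values),
            "max": max(self.values),
            "stdev": stdev(self.values),  # sample standard deviation
        }


# Made-up numbers for illustration only.
order_values = Summary(
    label="Average order value",
    source="orders_table",
    values=[100.0, 120.0, 80.0, 140.0, 60.0],
)

print(order_values.label, order_values.average, order_values.spread())
```

Because the raw values and the source travel with the summary, the display can unpack its own assumptions instead of sending someone back to the spreadsheets.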
The purposes of data visualization can be split into the goals of communication and exploration. Communication is what we consider when we think about presenting charts, graphs, and dashboards. This requires a clear picture of what needs to be communicated. The bigger opportunity in data visualization centers on dynamic updating and interactive data analysis. Current models of visualization conceptualize the graphic not as a static picture, but as an interactive object. Click on the bar chart for customers from New England, and all the other graphics immediately highlight those customers as well. Click the subset button, and all the displays are updated to show only customer data from New England. In modern data analysis, the display is the input control.
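The New England example can be sketched as a single shared selection that every display reads from. This is a minimal illustration, not how any specific tool implements it: the `LinkedViews` class, its method names, and the customer records are hypothetical.

```python
from collections import Counter

# Made-up customer records for illustration only.
CUSTOMERS = [
    {"id": 1, "region": "New England", "segment": "Retail", "revenue": 120.0},
    {"id": 2, "region": "New England", "segment": "Enterprise", "revenue": 300.0},
    {"id": 3, "region": "Midwest", "segment": "Retail", "revenue": 90.0},
    {"id": 4, "region": "South", "segment": "Enterprise", "revenue": 210.0},
]


class LinkedViews:
    """Holds one shared selection that every display reads from."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.selected_ids = set()

    def click_region(self, region):
        """Clicking a bar selects its rows; other views highlight them."""
        self.selected_ids = {r["id"] for r in self.rows if r["region"] == region}

    def subset_to_selection(self):
        """The subset button: keep only the selected rows in every view."""
        self.rows = [r for r in self.rows if r["id"] in self.selected_ids]

    def segment_counts(self):
        """Data for a second chart: (total, highlighted) counts per segment."""
        total = Counter(r["segment"] for r in self.rows)
        highlighted = Counter(
            r["segment"] for r in self.rows if r["id"] in self.selected_ids
        )
        return {seg: (total[seg], highlighted[seg]) for seg in total}


views = LinkedViews(CUSTOMERS)
views.click_region("New England")
highlight = views.segment_counts()     # all rows shown, selection highlighted
views.subset_to_selection()
after_subset = views.segment_counts()  # only New England rows remain
```

The design choice worth noting is that the click changes shared state rather than one chart, which is what lets the display act as the input control.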
A Chance for Process Improvement
A few years ago my team worked with a group that had to create a product by balancing trade-offs across a number of variables at once. The process was a traditional batch process:
We would send a report about the data underlying the proposed product given a certain configuration of variables (what if X1 was high and X2 was low and X3 was…), they would review it and make another request. At that rate the process was going to take months. To decrease time to market and empower the end user, we created a simple data visualization using slider bars. The end user could move the sliders to the appropriate location (X1 is high, X2 is low and X3 is…). After each slider movement, the statistical model of the data was updated, a graphical visualization was updated, and a table outlining the implications of the product configuration was updated. With a simple interactive visualization tool (that took two days to create) we empowered the end user to cut this process from months to weeks, and we were free to move on. This was not simply about creating a data visualization; it was about combining the power of visual communication with interactive and dynamic computing to bring users close to their data.
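The slider loop described above can be sketched in a few lines. This is not the tool my team built; it is a minimal illustration that assumes a simple linear model, with invented coefficients and hypothetical names (`WhatIfPanel`, `move_slider`), to show how each slider movement refreshes the dependent outputs.

```python
from dataclasses import dataclass, field

# Hypothetical linear model: the coefficients are invented for illustration.
COEFFICIENTS = {"x1": 2.0, "x2": -1.0, "x3": 0.5}
INTERCEPT = 10.0


@dataclass
class WhatIfPanel:
    """Slider positions plus the outputs that depend on them."""
    sliders: dict = field(
        default_factory=lambda: {"x1": 0.0, "x2": 0.0, "x3": 0.0}
    )
    prediction: float = INTERCEPT
    table: list = field(default_factory=list)

    def move_slider(self, name: str, value: float) -> None:
        """Called on every slider movement; refreshes all dependent outputs."""
        if name not in self.sliders:
            raise KeyError(f"unknown slider: {name}")
        self.sliders[name] = value
        self._refresh()

    def _refresh(self) -> None:
        # Recompute the model output and the implications table together,
        # so the number and the table never disagree.
        contributions = {n: COEFFICIENTS[n] * v for n, v in self.sliders.items()}
        self.prediction = INTERCEPT + sum(contributions.values())
        self.table = sorted(contributions.items())


panel = WhatIfPanel()
panel.move_slider("x1", 3.0)  # "X1 is high"
panel.move_slider("x2", 1.0)  # "X2 is low"
print(panel.prediction, panel.table)
```

In a real tool the `_refresh` step would also redraw the chart; the point of the sketch is that one user action replaces one round trip of the batch process.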
Not Just for Analysts Anymore
Sometimes we think about data visualization as a feature of the business intelligence suite or the analytics platform. The power of data visualization is much greater than that and should be considered broadly in all end-user experience contexts. As applications become increasingly mobile and data-generating, customers increasingly see data generation and access to data as a given, even for “non-data” products. Data visualization is about empowering end users writ large, not just business analysts.
“We don’t want visualizations and analysis tools, we want actionable insights.”
I keep this quote from an internal customer on my desk to remind me that even the most technically excellent and compelling visualizations are useless if they do not meet the customer where they are. By far the most important issue is: who will be using the tools and how will they interpret them? It is essential to consider the personas of your end users and approach the issue as a user-experience concern as well as an analytic concern. The tools of data visualization need to be aligned with the goals and background of the end users. Luckily, many current visualization tools allow for multiple levels of customization and access control: lock down specific displays for some end users, allow customization for others, and allow a range of exploration for still others.
I recently encountered an analytics package that was so statistically intelligent it would pick the “right” predictive model for you and provide a scatterplot display of the raw time series as well as the prediction path. For some sets of data it worked rather well, but for others it worked very poorly, a fact that was easily visible in the scatterplot. Unfortunately, the package could neither tell me which type of algorithm was used nor allow me to change it. It simply did a bad job, with no way for me to interact with it. Luckily, the simple scatterplot visualization was easily able to argue against its own fancy statistical add-on. Visualization should promote intimacy with your data and analysis, not detachment. Visualization software will continue to become increasingly intelligent, but we should evaluate changes cautiously.
Data visualization will continue to evolve as a standard method to enable end users by helping them own and interact with the data. It’s your chance to reach out directly to end users and help them into the age of data-driven business. It’s the last mile in the IT pipeline and your chance to shine.