Interactive Data Visualization & Exploration
I recently attended the Rakuten Technology Conference 2016 and saw a very interesting presentation from Erik Berlow titled “Next-Generation Interfaces for Thinking with Data”, which was about data scope interfaces.
I’m familiar with data science, data mining, and data visualization, and have worked with a few tools, from RapidMiner to cloud-based ML platforms such as IBM Watson, Microsoft, Amazon, and Google.
Data Scope Interfaces
Erik opened my eyes to some new ways of looking at the problem of constructing visualization and analysis tools in an increasingly data-saturated world. He introduced the concept of Data Scopes: interactive tools for exploring multidimensional data and its interrelationships, using the human visual system as the pattern recognition engine. The data can be pre-processed with machine learning to slice, dice, connect, and categorize it, providing a rich content source for the visual interfaces to explore.
Living in quake city (Tokyo), I decided to start with the NOAA Global Significant Earthquake Database, 2150 BC to present, for this exercise.
As anyone who has worked with data science will tell you, massaging the data into shape is usually the most time-consuming and painful part of the process. Garbage in, garbage out, as the old computer saying goes! I actually asked Erik in the Q&A what his processes were for cleaning the data, and he made the interesting comment that Data Scopes are also useful for rapidly detecting anomalous data. That’s why this earthquake data set is a great example: it has lots of missing data in addition to lots of useful data.
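Before looking for gaps visually, it can help to count them. The snippet below is only a rough Python sketch of that idea, not the code behind the Quakescope: the field names and the three sample rows are made up for illustration and are not the actual NOAA column names or values.

```python
def count_missing(records, fields):
    """Count how many records have no usable value for each field."""
    missing = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return missing


# Made-up rows for illustration only; these are not real NOAA records.
sample = [
    {"year": "1000", "magnitude": "6.5", "total_injured": ""},
    {"year": "1500", "magnitude": "7.0", "total_injured": "120"},
    {"year": "2000", "magnitude": "", "total_injured": "45"},
]

report = count_missing(sample, ["year", "magnitude", "total_injured"])
print(report)
```

A field with a high count here is a good candidate for closer inspection in the visual explorer.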
Have a play with the Data Explorer below and see what patterns and relationships you can detect. See if you can spot where the data is missing, and where it looks a little strange (usually, I suspect, because records become less accurate the further back in time they go). It was a fun project and I learned a lot, which I hope to apply to some more explorations.
Disclaimer: I did a small amount of data manipulation for the purposes of this example. Many of the largest & oldest quake events are missing data in key categories, so I added a baseline of 10000 to total injured so that some of them would display in the bubble chart.
Open the Quakescope in a new tab: https://sonicviz.github.io/quakescope/
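To make that adjustment concrete, here is a minimal Python sketch of one way it could be done. It assumes the baseline is added to every record and that a missing count is treated as zero; the function names are hypothetical and this is not necessarily how the Quakescope itself applies it.

```python
BASELINE = 10000  # baseline added to total injured, as described above


def parse_count(raw):
    """Convert a raw field to an int, treating blank or missing values as 0."""
    if raw is None or str(raw).strip() == "":
        return 0
    return int(str(raw).strip())


def adjusted_injured(raw, baseline=BASELINE):
    """Return the value used for bubble sizing: the parsed count plus the baseline.

    Assumption: the baseline is applied to every record, so events with no
    recorded injuries still get a visible bubble.
    """
    return parse_count(raw) + baseline


print(adjusted_injured(""))     # missing count
print(adjusted_injured("250"))  # recorded count
```

The trade-off is that the bubble sizes no longer reflect the raw counts, which is why the manipulation is worth disclosing.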
Source code available on GitHub: https://github.com/sonicviz/quakescope