June 25 was the OpenDNS engineering quarterly hackathon, a time when everyone gets to drop everything and work on one project in a single marathon session. For this hackathon, the data visualization team created a new timeline visualization.
The idea for the project came from internal interviews with the research team. The purpose of these interviews was to understand the researchers’ typical workflow and uncover their needs regarding the data stored within OpenDNS Security Graph. To give a little more context, Security Graph is a tool built by the Security Labs team, sourced from the 70+ billion DNS requests that OpenDNS handles each day. Given the size of the data, however, one ongoing challenge is deciding how to represent it in a way that makes the work of security researchers and analysts easier. The interviews gave us a number of great insights. One key finding was that the team would like a quick view of the important events for a particular domain, in addition to how its features evolve over time. The reason is that malicious behavior may surface when the data is visualized correctly and succinctly.
This finding led us to the idea of creating a timeline visualization. Our goal was to answer the question, “What does the life of a domain look like?” More specifically, our project would visualize the time stamps of key events and turning points for a domain, such as whenever its IP address or name server changes (e.g., is created or updated). The design is guided by Shneiderman’s Mantra of data visualization: Overview First, Zoom and Filter, then Details-on-Demand. The timeline gives the user a look at the whole history of a domain, the ability to zoom in and out, and the ability to select the dataset of interest. Furthermore, the user can click on any event on the timeline to expand it and view more detail.
Here is an image of what the timeline looks like, from the original repository.
With the original design and prepared libraries at hand, we developed everything within that single day. The following two figures show how we built it and what we made: the overall process flow and our timeline visualization. As you may notice from Figure 1, all data comes directly from our Investigate APIs.
Figure 1. Process Flow
Figure 2. Timeline Visualization
In Figure 2, “Key Events” displays the dates on which events involving the target domain occurred, for example when the domain was registered, updated, or tagged by OpenDNS. “IP Addresses” represents the domain’s historical IP records, so that the user can see how frequently the domain has changed its IP address over a certain period. (Note: as of today, windows7download.com is on our block list and is considered a malicious domain. Please do not attempt to browse to it.)
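To make the “IP Addresses” row more concrete, here is a minimal Python sketch of how change events could be derived from historical records. The `IpRecord` shape and the `ip_change_events` helper are hypothetical names chosen for illustration; they are not the Investigate API response format or the code we wrote during the hackathon.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IpRecord:
    """Hypothetical record: one IP address and the date range it was observed."""
    ip: str
    first_seen: date
    last_seen: date


def ip_change_events(
    records: List[IpRecord],
) -> List[Tuple[date, Optional[str], str]]:
    """Return (date, old_ip, new_ip) for every point where the IP differs
    from the previous record. The first record yields old_ip=None."""
    ordered = sorted(records, key=lambda r: r.first_seen)
    events = []
    previous = None
    for record in ordered:
        if previous is None or record.ip != previous.ip:
            old_ip = previous.ip if previous else None
            events.append((record.first_seen, old_ip, record.ip))
        previous = record
    return events
```

Counting the returned events over a date range gives a rough measure of how often a domain has moved between IP addresses, which is the kind of signal the row is meant to surface.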
During development, we faced two problems. The first is overlapping event circles. Because the visualization supports zooming in, this does not matter much as long as two events do not happen at exactly the same time. However, hours and minutes are not always available in the data, so two events are treated as simultaneous whenever only the date is available. The second problem is that it was hard to represent the start and end dates of an event. If we simply used a rectangle or a line for this, another overlapping issue would arise: multiple rectangles or lines may overlap each other, making it difficult to tell exactly when an event starts or ends.
Since our visualization was built within a single day, these two issues remain unresolved. Solving them will be part of our future work on this project.
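As an illustration only, one possible direction for both issues can be sketched in Python: group date-only events by day so they can be drawn as a single stacked marker, and assign overlapping date ranges to separate lanes with a greedy pass. The function names `group_by_day` and `assign_lanes` are hypothetical, and this is not a solution we have implemented or evaluated.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple


def group_by_day(events: List[Tuple[date, str]]) -> Dict[date, List[str]]:
    """Collect the labels of all events that share the same calendar day."""
    groups = defaultdict(list)
    for day, label in events:
        groups[day].append(label)
    return dict(sorted(groups.items()))


def assign_lanes(
    intervals: List[Tuple[date, date]],
) -> List[Tuple[date, date, int]]:
    """Greedy lane assignment: place each (start, end) interval in the first
    lane whose previous interval ended before this one starts."""
    lane_ends: List[date] = []  # end date of the last interval in each lane
    placed = []
    for start, end in sorted(intervals):
        for lane, lane_end in enumerate(lane_ends):
            if lane_end < start:
                lane_ends[lane] = end
                placed.append((start, end, lane))
                break
        else:
            # No existing lane is free, so open a new one.
            lane_ends.append(end)
            placed.append((start, end, len(lane_ends) - 1))
    return placed
```

With this approach, intervals in the same lane never overlap, so each rectangle's start and end stay visible at the cost of extra vertical space.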
We hope a new version will solve these problems, and that this visualization will help customers get quicker, better insight into a domain’s history.