The universe is big, mysterious and full of secrets.

Every day, servers exchange enormous amounts of data. Usually, this data is kept and archived for a defined period of time. As we store more and more information, our desire to understand its behavior grows. The reason is simple: knowledge is powerful. So, if we identify a pattern in our past, we master the present – then we can predict the future.

Today’s data scientists are the modern oracles, always trying to discover ingenious ways to analyze information in order to identify new patterns and anomalies. Just like the first astronomers raised their eyes to the sky and clouds to predict the seasons, we are constantly analyzing a deluge of digital messages to monitor the general state of the system. It is crucial to step back and take a look at the big picture and understand that abstraction is the key to mastering the present. 

Visualize the universe

When it comes to patterns, semantic graphs are one of the most beautiful data structures out there. They can represent anything and can be applied to a wide range of problems, from social networks to artificial brains. Graph nodes can define any type of information and edges can model arbitrary connections between them.
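To make the idea concrete, here is a minimal sketch of such a structure in Python. The class name `SemanticGraph`, its methods, and the sample domain names are all hypothetical and chosen only for illustration; this is not the data model used by our engine.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SemanticGraph:
    """Hypothetical container: nodes carry free-form attributes, edges carry a label."""
    nodes: dict[str, dict[str, Any]] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node_id: str, **attributes: Any) -> None:
        # Any kind of information can be attached to a node.
        self.nodes[node_id] = attributes

    def add_edge(self, source: str, target: str, relation: str) -> None:
        # An edge models an arbitrary, named connection between two known nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, target, relation))

    def neighbors(self, node_id: str) -> set[str]:
        # Treat edges as undirected when looking up neighbors.
        result = set()
        for source, target, _ in self.edges:
            if source == node_id:
                result.add(target)
            elif target == node_id:
                result.add(source)
        return result


graph = SemanticGraph()
graph.add_node("a.example", kind="domain")
graph.add_node("192.0.2.1", kind="ip")
graph.add_edge("a.example", "192.0.2.1", "resolves_to")
```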

Sure, it sounds simple in theory, but in practice we are dealing with billions and billions of nodes and edges! When it comes to visualization, there is a tremendous challenge in building an engine that can handle the vast amount of data that the universe holds. Even more importantly, the graph has to be dynamic and constantly changing over time. The goal is to build a robust and stable tool to establish, try, and confirm our hypotheses.

How does it work? The engine recreates a physical force system where connected nodes attract each other and disconnected nodes repel each other. At first, all the nodes are placed at approximately the same position, forming a point of high density. Of course, at this stage it’s still impossible to see anything clearly. But then, we heat the node particles to a high temperature and let the physics engine do the rest!

Temperature creates a random particle movement and the messy network structure expands in space at high speed, throwing nodes in every direction.

That’s right, an explosion of data!

After this stage, the structure progressively reaches equilibrium. This is the true beauty of this method: we artificially reimplemented a natural force-directed system, and therefore natural structures emerge. We cool down the temperature progressively, and as the particle acceleration decreases, the network structure crystallizes.
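The heat-then-cool process described above can be sketched in a few lines of Python. This is a simplified, Fruchterman-Reingold-style layout written for illustration, not the code of our engine. It rests on several assumptions: repulsion is applied to every pair of nodes (not only disconnected ones), the temperature simply caps how far a node may move per step, cooling is linear, and the function name `force_layout` and its parameters `k`, `t0`, and `seed` are hypothetical.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, k=1.0, t0=1.0, seed=0):
    """Return {node: (x, y)} after a simple 2D force-directed simulation."""
    nodes = list(nodes)
    rng = random.Random(seed)
    # Start all nodes at nearly the same position: one point of high density.
    pos = {n: [rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01)] for n in nodes}

    for step in range(iterations):
        # Assumed linear cooling schedule: hot at first, almost frozen at the end.
        temperature = t0 * (1.0 - step / iterations)
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion between every pair of nodes (magnitude k^2 / distance).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force

        # Attraction along each edge (magnitude distance^2 / k).
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force

        # Move each node along its net force, but never farther than the temperature.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1])
            if length > 0.0:
                move = min(length, temperature)
                pos[n][0] += disp[n][0] / length * move
                pos[n][1] += disp[n][1] / length * move

    return {n: (p[0], p[1]) for n, p in pos.items()}


layout = force_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

While the temperature is high, nodes are thrown apart by up to `t0` per step, which corresponds to the explosion phase. As the cap shrinks, movements get smaller and the layout settles. The all-pairs loop is quadratic in the number of nodes, so a sketch like this is only suitable for small graphs.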

Demonstration

Since a picture is worth a thousand words, we’re sharing with you a WebGL application and a couple of screenshots of our visualization tool to illustrate this algorithm.

You may note that the process has been intentionally slowed down to make the attraction-repulsion forces more obvious. 

Controls:

  • Click play to start animation
  • Navigate with the mouse and keyboard arrows
  • Double-click a node to zoom in and target it
  • Click anywhere to go back to FPS view
  • Select various display modes in the menu

NOTE: The engine may take a while to load. Please be patient.

http://thibaultreuille.github.io/OpenDNS/raindance-2013-08-09-blog/ 

Umbrella Security Graph Data Sets

Using this visualization method, we can process the massive amounts of data flowing into the Umbrella Security Graph at any given time, watching for new malicious patterns or any other unusual activity. It’s always interesting to navigate around certain unknown domains, exploring their infected neighbors to determine the likelihood of the domains becoming compromised as well. Below, you will see actual data sets extracted from the Umbrella Security Graph database, representing relationships between domains.

[Screenshot: domain neighborhood with a depth of 3]

[Screenshots: domain neighborhood with a depth of 4]
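A "neighborhood with a depth of N" can be read as every domain reachable from a starting domain in at most N hops. The sketch below shows one way to extract such a subgraph with a breadth-first search. It assumes the relationships are available as a plain adjacency dictionary; the function name `neighborhood` and the sample domain names are hypothetical, and this is not the query used against the Umbrella Security Graph database.

```python
from collections import deque


def neighborhood(adjacency, start, depth):
    """Return {node: hops_from_start} for all nodes within `depth` hops of `start`."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # Do not expand beyond the requested depth.
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Hypothetical relationships between domains, for illustration only.
adjacency = {
    "a.example": ["b.example"],
    "b.example": ["a.example", "c.example"],
    "c.example": ["b.example", "d.example"],
    "d.example": ["c.example"],
}
nearby = neighborhood(adjacency, "a.example", 2)
```

The resulting node set, together with the edges between its members, is what would then be handed to the layout engine.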

References:

http://en.wikipedia.org/wiki/Force-directed_graph_drawing

http://cs.brown.edu/~rt/gdhandbook/chapters/force-directed.pdf
