Cisco Umbrella

Security

Visualizing the Evolution

Chen Ye
Updated — October 15, 2020 • 6 minute read

Evolutionary data is a record of past events and circumstances. Understanding it can be extremely valuable, because it reveals history, brings insight into the present, and often helps forecast the future. In this post we’ll outline some useful techniques for visualizing evolutionary data and provide tips for making a powerful impact.

The data

At OpenDNS, we possess a huge amount of evolutionary data for domains. First of all, we see every query our customers make, which shows the query volume and the changes in infrastructure for a domain over time. We also keep a record of key timestamps, for example the time when a domain first appears in the query logs. Malicious domains usually have more interesting data still, e.g., the time when we blocked them. Lastly, we have the most complete record of whois data about a domain, from the moment it is registered to its expected expiration date.
For continuous evolutionary data, such as a domain’s query volume over time, a simple line chart is effective enough. In this post, we are particularly interested in the other type of evolutionary data: time-to-event data, for instance the time when a domain is created, or the time when a domain is flagged by us or by another source. Security researchers at OpenDNS have used this data in their work before.
We’ve previously published a blog post about visualizing the life of a single domain, but it’s also important to look at things at a large scale. By visualizing the time-to-event data for a group of domains, we are able to find patterns and outliers, and re-examine our models.

The visualization

There are two types of graphs in this visualization tool. The first is a group timeline (see figure 1). The idea is simple: first we draw a timeline for every event type; then, for each domain, we mark the timestamp of each event on that event’s timeline with a circle; finally, we connect all of the domain’s events (circles) with a line. Each line on this graph therefore represents one domain’s evolution, and its slope indicates the chronological order of events. In fact, when you only have two events, it’s a slopegraph, a chart type introduced by Edward Tufte.

Figure 1. Group Timeline
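
As a rough illustration of the construction described above, the sketch below turns each domain’s event timestamps into the points of one line on the group timeline. The event names, the sample domains, and the dates are hypothetical; the tool’s real data model is not shown in this post.

```python
from datetime import date

# Event types in the order their timelines are stacked, top to bottom.
# These names and the sample data below are hypothetical, for illustration only.
EVENT_ORDER = ["registered", "first_seen", "first_tag"]

def domain_polyline(events, event_order=EVENT_ORDER):
    """Return [(x, y), ...] for one domain.

    x is the event date as a day ordinal and y is the index of the
    event's timeline. Events the domain lacks are skipped.
    """
    return [
        (events[name].toordinal(), row)
        for row, name in enumerate(event_order)
        if name in events
    ]

def is_vertical(polyline):
    """True when every event falls on the same day, i.e. the line is vertical."""
    return len({x for x, _ in polyline}) == 1

sample = {
    "example-a.test": {"first_seen": date(2016, 3, 1), "first_tag": date(2016, 3, 1)},
    "example-b.test": {"first_seen": date(2016, 3, 1), "first_tag": date(2016, 3, 4)},
}
lines = {domain: domain_polyline(ev) for domain, ev in sample.items()}
```

Here `example-a.test` produces a vertical line (both events on the same day), while `example-b.test` produces a sloped one, because its second event falls three days after the first.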

The second type of graph, the box plot, is a standard statistical way of showing the distribution of numerical data. To draw a box plot you need to calculate five statistics of your data: minimum, first quartile, median, third quartile, and maximum (see figure 2 below). Here the minimum and maximum are the ends of the whiskers rather than the absolute extremes of the data; a common convention places them at the most extreme data points within 1.5 times the interquartile range of the box. Data points outside the range from minimum to maximum are outliers.
A box plot shows rich information within limited space, and is particularly useful for comparing multiple sets of data. On the other hand, first-time viewers need some effort to interpret it.

Figure 2. A box plot
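
For readers who want to compute these statistics themselves, here is a minimal sketch using only the Python standard library. It assumes the common 1.5 × IQR whisker convention, which may differ from what the tool uses, and the sample values are made up.

```python
import statistics

def box_plot_stats(values, whisker=1.5):
    """Five statistics for a box plot, plus the outliers.

    Assumes the common Tukey convention: whiskers end at the most extreme
    data points within `whisker` * IQR of the box; anything beyond is an
    outlier. Needs at least two values.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "minimum": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Made-up intervals in days, with one strongly negative value.
stats = box_plot_stats([-9, 0, 0, 0, 1, 1, 2, 2, 3])
```

With this sample, the quartiles are 0, 1, and 2, the whiskers end at 0 and 3, and -9 is reported as an outlier.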

Using this tool, you can choose any pair of event types from your dataset, and it will calculate the time interval (number of days) between them, and draw the box plot.
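
The interval calculation itself is simple. The sketch below shows one way to do it, assuming each domain’s events are stored as a dictionary of timestamps; the function name, event names, and sample data are hypothetical.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400

def interval_days(events, start_event, end_event):
    """Days from start_event to end_event, or None if either is missing.

    Positive means end_event happened after start_event; negative means
    it happened before.
    """
    if start_event not in events or end_event not in events:
        return None
    delta = events[end_event] - events[start_event]
    return delta.total_seconds() / SECONDS_PER_DAY

# Hypothetical sample data.
domains = {
    "example-a.test": {
        "first_seen": datetime(2016, 3, 1, 8, 0),
        "first_tag": datetime(2016, 3, 1, 8, 0),
    },
    "example-b.test": {
        "first_seen": datetime(2016, 3, 1, 8, 0),
        "first_tag": datetime(2016, 3, 2, 20, 0),
    },
    "example-c.test": {"first_seen": datetime(2016, 3, 1, 8, 0)},
}
intervals = {
    name: interval_days(events, "first_seen", "first_tag")
    for name, events in domains.items()
}
```

Domains that lack one of the two events, like `example-c.test` here, yield `None` and would simply be left out of the box plot.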

Use Cases

Model Efficacy Analysis

Our security researcher Jeremiah implemented NLP-Rank, a model that is good at catching phishing domains. Let’s visualize a sample of malicious domains NLP-Rank has caught recently (see figure 3).

Figure 3. ODNS first_seen to ODNS first_tag

On the timeline, you will notice that most of the lines are vertical, indicating no latency between when OpenDNS observes a domain for the first time (ODNS first_seen) and when we block it (ODNS first_tag), because NLP-Rank is able to flag a phishing domain the moment it appears in our logs. The box plot echoes this result, with all of its statistics equal to 0 days.
However, the lines with an irregular slope on the timeline (highlighted in orange) stand out; they correspond to the negative outliers on the box plot. There are a few reasons behind their existence. The main one is that a phishing domain was sometimes first observed before it was serving content, so we cached the domain, and only afterwards did it start a phishing attack. The second reason is that processing large logs can introduce some latency: sometimes a phishing domain is live for just a couple of hours, and by the time we try to retrieve its content, it’s already down. Lastly, we might also miss things during the gaps when researchers maintain and upgrade the model.

Now let’s cross-check OpenDNS’s response against VirusTotal’s feed by adding another event type from VirusTotal: the first time VirusTotal reports positive detections for a malicious URL (VT_FirstFlag).

Figure 4. OpenDNS first_tag to VirusTotal first_flag

First of all, out of 267 domains, VirusTotal returns first_flag results for 165 of them (about 62%).
In the box plot in figure 4, the box (from the lower to the upper quartile) is located on the negative side of the scale, along with more negative outliers, which indicates that, statistically, NLP-Rank caught these domains earlier. Accordingly, these domains have lines with a negative slope on the group timeline. However, there are also positive outliers, which are worth further investigation.

Exploit Kits Analysis
Figure 5. Exploit Kit Domains

Following the same method, let’s analyze some recent exploit kit (EK) domains. In figure 5, we can quickly identify a few things:

  1. We usually block an EK domain a few hours after we first see it.
  2. Our results are basically aligned with VirusTotal’s: in the second box plot, most of the statistics are zero.
Figure 6. From Creation to Becoming Active

Now if we add another ingredient, the time a domain was registered (Registered), the correlation becomes interesting. In figure 6, most domains were only “put into use” some time after they were registered, anywhere from a few days up to almost a year. This happens because attackers hijack the accounts of benign domain registrants and then create subdomains for their malicious content, a technique also referred to as domain shadowing. However, a small family of domains does not use domain shadowing; instead, these domains were freshly created for the dedicated delivery of exploit kits. They are highlighted in orange in figure 6 and listed below in case you are interested:

  • bqa2h6.f298wh[.]top
  • jw1f0y.wkfroa[.]top
  • qp5gwu.masihae[.]top
  • rn58cb.f298wh[.]top
  • yca6j8.masihae[.]top
  • zx5wlc.wkfroa[.]top
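
The six hostnames above sit under just three parent domains, which a few lines of code make easy to see. This is a minimal sketch: the `parent_domain` helper is hypothetical and naively takes the last two labels, which works for `.top` but would need the Public Suffix List for suffixes such as `.co.uk`.

```python
from collections import defaultdict

# Hostnames as listed above, kept in their defanged form ([.] instead of .).
HOSTS = [
    "bqa2h6.f298wh[.]top",
    "jw1f0y.wkfroa[.]top",
    "qp5gwu.masihae[.]top",
    "rn58cb.f298wh[.]top",
    "yca6j8.masihae[.]top",
    "zx5wlc.wkfroa[.]top",
]

def parent_domain(defanged_host):
    """Naive parent domain: the last two labels, returned defanged.

    Assumption: the public suffix is a single label (true for .top).
    """
    labels = defanged_host.replace("[.]", ".").split(".")
    return ".".join(labels[-2:]).replace(".", "[.]")

by_parent = defaultdict(list)
for host in HOSTS:
    by_parent[parent_domain(host)].append(host)
```

Each of the three parent domains (`f298wh[.]top`, `masihae[.]top`, and `wkfroa[.]top`) ends up with two hostnames.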

Design

Through the above use cases, we demonstrated how visualizations such as box plots and slopegraphs can be used in security research. To make them even more engaging, these visualizations can be extended with a few interaction techniques that help people explore their own datasets.

Filtering & Sorting

Visualizations often provide the big picture and, ideally, point to your next step. In this tool, filtering allows you to cross out events that are not relevant and focus on the “real meat” of the analysis.
In addition, the ability to sort the timelines helps users find interesting correlations between the events on two adjacent timelines, which would be difficult to find in a text-only format.

Details on demand

Instead of calculating a box plot for every pair of events up front, the tool provides this extra information when you ask for it, and you are free to select any pair of events that matters to you.
Hovering over a line to highlight it and view the domain’s details is also supported.

Coloring

Although we didn’t apply colors in the use cases above, coloring can give users more useful information. You can use color to represent different sets of domains, making them easy to compare. You can also use a linear color scale to represent quantitative data. In figure 1, the color actually encodes the creation date: the older the domain, the redder the line.
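
A linear color scale of this kind can be sketched in a few lines. The example below maps a creation date to a hex color so that the oldest domain is pure red; the light gray used for the newest domain is an arbitrary choice for this sketch, not the palette used in figure 1.

```python
from datetime import date

def creation_color(created, oldest, newest):
    """Map a creation date to a hex color on a linear scale.

    The oldest date maps to pure red and the newest to light gray; both
    end colors are arbitrary choices for this sketch.
    """
    span = (newest - oldest).days
    # t is 0.0 for the oldest domain and 1.0 for the newest.
    t = 0.0 if span == 0 else (created - oldest).days / span
    t = min(1.0, max(0.0, t))
    old_rgb, new_rgb = (255, 0, 0), (221, 221, 221)
    r, g, b = (round(o + (n - o) * t) for o, n in zip(old_rgb, new_rgb))
    return f"#{r:02x}{g:02x}{b:02x}"

# Hypothetical date range.
oldest, newest = date(2015, 1, 1), date(2016, 1, 1)
old_color = creation_color(oldest, oldest, newest)
new_color = creation_color(newest, oldest, newest)
```

Dates outside the given range are clamped to the nearest end of the scale, so a stray timestamp cannot produce an invalid color.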

Next Steps

In this post, we analyzed phishing domains and exploit kit domains. Another interesting analysis would be to compare the evolution patterns of different sets of domains, such as domains grouped by attack, threat type, or classifier.
Furthermore, if we could put this specific slice of data into a larger context and combine it with other types of data, for example the query volume over time or the neighborhood connections within a large graph, it would reveal a clearer picture of the domains in question. However, that will certainly bring more design challenges, as we want to display more information while maintaining simplicity and a great user experience.
