Crime scene evidence of an infected site: Predicting malware by examining server software

By OpenDNS Security Research
Posted on June 13, 2013
Updated on October 15, 2020

Every day, OpenDNS discovers thousands of websites serving malicious content, by harnessing massive amounts of DNS data.

Beyond what DNS-level data can tell us, examining the type of server software cybercriminals use also helps increase the accuracy of our algorithms.

In this experiment, we collected 50,000 domain names that have been actively serving malware between March 6th and June 6th, and 50,000 popular domain names that we never saw involved in malicious activities.

In all the following charts, the inner ring represents malicious domain names, whereas the outer ring represents data from supposedly benign domains.

Web server software

[Chart: web server software]

As of today, Apache remains the most popular web server software, though Nginx is clearly on the rise.

That said, malicious domains run Apache more often (62.88%) than benign domains do (41.64%), whereas for Nginx the opposite holds (10.87% of malicious domains versus 26% of benign ones).

Another interesting observation is that, compared to malicious domains, benign domains clearly tend to obfuscate or hide the server software they are running. Our data show that malicious domains typically use one of nine different “Server:” header signatures. A staggering 95.27% of domains serving malware match these signatures, whereas benign domains match the same signatures only 17.23% of the time.
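As a rough illustration of such a signature check, the sketch below maps a raw “Server:” header value onto a coarse software family. The family list is borrowed from the feature list later in this post, which happens to name nine server families; the post does not confirm that these are the same nine signatures. The function name and the whole-word, case-insensitive matching are assumptions made for this example.

```python
import re
from typing import Optional

# Families named in this post's feature list. Treating them as the
# "signatures" is an assumption, not something the post states.
KNOWN_FAMILIES = ("iis", "apache", "nginx", "litespeed", "oversee",
                  "lighttpd", "ats", "varnish", "tengine")

# Whole-word match so that short tokens such as "ats" do not fire
# inside unrelated product names.
_FAMILY_PATTERN = re.compile(
    r"\b(" + "|".join(KNOWN_FAMILIES) + r")\b", re.IGNORECASE
)


def server_family(server_header: Optional[str]) -> Optional[str]:
    """Return the first known family found in a Server header, else None."""
    if not server_header:
        return None
    match = _FAMILY_PATTERN.search(server_header)
    return match.group(1).lower() if match else None
```

A header that is missing, or that names software outside the list, yields `None`, which corresponds to the hidden or unrecognized case described above.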

Some websites are also taking advantage of Content Delivery Networks (CDNs). However, we couldn’t find any domains currently serving malware using Akamai, Bitgravity, Cachefly, Chinacache, or Limelight.

Though a malicious site behind one of these CDNs is not inconceivable, one can assume that websites using them are much less likely to be malicious.

However, 0.2% of malicious domains were using Cloudflare, and 0.1% were using Microsoft Azure.

X-Powered-By header

[Chart: X-Powered-By header]

The next thing we examined was the “X-Powered-By” header, which is also an identifier for the software running a web application or site.

Although the difference is not significant, Plesk is found more often on compromised websites than on benign ones (5.67% vs 1.49%). But perhaps the most important signal here is an “X-Powered-By” header that names something other than Plesk, ASP, or PHP.

Web servers running Ruby (Rack), Node.js (Express), Mono, and Java-based application servers (JBoss/Tomcat) are clearly used less often for malware distribution than other software stacks.

Cookies

Cookies are a good indication of whether a website needs to somehow track a user, and also a good indication of what framework or application is running.

In order to ignore cookies sent by third-party services, like ad servers, we only analyzed the home page of each website, and discarded cross-domain content.
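A minimal sketch of that filtering step, assuming the crawler has already recorded each response as a pair of URL and raw “Set-Cookie” header values. The function name, the input shape, and the exact-host comparison are all assumptions for illustration; the post does not describe how cross-domain content was actually discarded.

```python
from http.cookies import SimpleCookie
from typing import Iterable, List, Set, Tuple
from urllib.parse import urlsplit


def first_party_cookie_names(
    site_host: str,
    responses: Iterable[Tuple[str, List[str]]],
) -> Set[str]:
    """Collect cookie names set by responses served from site_host only.

    `responses` holds (url, [raw Set-Cookie header values]) pairs.
    Exact host equality is an assumption; a real crawler may want to
    treat "www.example.com" and "example.com" as the same site.
    """
    names: Set[str] = set()
    for url, set_cookie_values in responses:
        if urlsplit(url).hostname != site_host.lower():
            continue  # cross-domain content (for example an ad server)
        for raw in set_cookie_values:
            jar = SimpleCookie()
            jar.load(raw)
            names.update(jar.keys())
    return names
```

An empty result corresponds to a site that serves no first-party cookies on its home page.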

[Chart: cookies]

Approximately half of the visited benign websites don’t serve any cookies. Compare that to 77.58% of malicious websites that don’t serve cookies.

Benign websites also tend to have a higher diversity of cookie names than malicious websites.

This can be partly explained by the fact that cybercriminals will often target applications that are easier to compromise, and hosting services that are malware-friendly often offer similar operating systems and software stacks.

WordPress

Not all WordPress instances send cookies on the first visit.

A more reliable way to detect sites powered by WordPress than inspecting cookies is to look for specific files. The one tested here is /wp-includes/wlwmanifest.xml.
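The check can be sketched as follows. The path is the one named above; everything else is an assumption for illustration, including the function names, the caller-supplied `fetch_status` callable (so the example does not depend on a particular HTTP client), and the choice to treat an HTTP 200 as proof that the file exists.

```python
from typing import Callable
from urllib.parse import urljoin

# Path taken from the post.
WLW_MANIFEST_PATH = "/wp-includes/wlwmanifest.xml"


def manifest_url(base_url: str) -> str:
    """Build the manifest URL at the root of the site that serves base_url."""
    return urljoin(base_url, WLW_MANIFEST_PATH)


def looks_like_wordpress(base_url: str,
                         fetch_status: Callable[[str], int]) -> bool:
    """Return True if the manifest file appears to exist.

    `fetch_status` is a hypothetical helper that takes a URL and returns
    the HTTP status code. Treating 200 as "present" is an assumption;
    sites that answer 200 for missing pages would be misclassified.
    """
    return fetch_status(manifest_url(base_url)) == 200
```

In practice `fetch_status` would wrap whatever HTTP client the crawler already uses.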

[Chart: WordPress]

According to this test, no less than 19.50% of malicious/compromised sites are running WordPress.

But WordPress is also omnipresent on sites that haven’t been compromised (yet): the file was also found on 13.92% of the benign websites from our training set.

Last-Modified

Looking at the “Last-Modified” header when requesting the home page is a good way to see whether a website is regularly updated.

Plotting the cumulative distribution function (CDF) of both classes of domains shows that sites whose home page hasn’t been recently updated are more likely to be malicious or compromised than sites containing more dynamic content.

[Chart: CDF of days since Last-Modified]
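For readers who want to reproduce this kind of plot, here is a minimal sketch of the two steps involved: turning a “Last-Modified” value into an age in days, and building an empirical CDF from a list of ages. The function names are hypothetical, and treating a timestamp without zone information as UTC is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import List, Sequence, Tuple


def age_in_days(last_modified: str, now: datetime) -> float:
    """Days elapsed between a Last-Modified header value and `now`.

    `now` must be timezone-aware.
    """
    modified = parsedate_to_datetime(last_modified)
    if modified.tzinfo is None:
        # Assumption: timestamps without zone information are UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return (now - modified).total_seconds() / 86400.0


def empirical_cdf(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (value, fraction of samples <= value) pairs sorted by value.

    Tied values appear several times; the last pair for a given value
    carries the CDF at that value.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]
```

Computing one such CDF per class and drawing both on the same axes gives the comparison described above.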

Content-Length

The length of the content is also a useful feature. We examined only the HTML code of each site’s home page.

In this training set, none of the benign examples served HTML code larger than 2 MB on the home page, at least according to the Content-Length header.

[Charts: Content-Length]

Large HTML documents were always found either on sites directly serving malware payloads or on compromised sites serving obfuscated JavaScript that leads to an exploit.

A few examples as of today:

hxxp://portail-bassin-arcachon.com 11,255,479 bytes
hxxp://portail-cote-azur.com 9,542,640 bytes
hxxp://location-mer.eu 8,437,761 bytes
hxxp://portail-cote-vendeenne.com 6,934,555 bytes
hxxp://portail-toulousain.com 6,914,263 bytes
hxxp://grupokarion.com 6,272,172 bytes
hxxp://portail-sologne.com 6,079,756 bytes
hxxp://unoshn.com 4,373,545 bytes
hxxp://portail-vallee-des-rois.com 4,355,293 bytes
hxxp://lacajareiki.com 2,854,292 bytes

SSH

[Chart: SSH]

More than 20% of web servers are also running an SSH server on the same IP address. This holds true both for benign and malicious servers.

FTP

[Chart: FTP]

The figures are quite different when it comes to FTP servers.

No less than 36.65% of web servers serving malicious content are running an FTP server. That’s nearly twice the rate of servers for which we didn’t observe any malicious activity (18.57%).

In both cases, Pure-FTPd is the most popular FTP server software, with a 46.5% share, mainly because it ships with cPanel.

POP

[Chart: POP]

A POP server usually doesn’t share the same IP as a benign web server. Only 13.3% of benign web servers are also listening on port 110.

However, a POP server runs alongside 23.15% of malicious websites.

The distribution of the POP server software is similar in both benign and malicious cases, with Dovecot being by far the most popular option.

SMTP

[Chart: SMTP]

As expected, SMTP servers also tend to be more frequently found on web servers hosting malicious content than on benign ones: 25.03% vs 17.49%.
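The post does not say how these services were detected. One simple approach is a plain TCP connection attempt against the web server’s IP address, sketched below. Port 110 for POP comes from the text above; ports 21, 22, and 25 are the conventional defaults and are assumed here, as are the function names and the timeout.

```python
import socket
from typing import Dict, Mapping

# 110 is mentioned in the post; the other ports are conventional
# defaults and are an assumption.
SERVICE_PORTS = {"ftp": 21, "ssh": 22, "smtp": 25, "pop": 110}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, DNS failure
        return False


def probe_services(host: str,
                   ports: Mapping[str, int] = SERVICE_PORTS,
                   timeout: float = 2.0) -> Dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port, timeout)
            for name, port in ports.items()}
```

A successful connection only shows that something is listening. Identifying the software behind it (Pure-FTPd, Dovecot, and so on) would additionally require reading the service banner, which this sketch does not do.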

Using this data for classification

After this analysis, we used the data to extract simple binary features:

  • Server: *Apache*
  • Server: *nginx*
  • Server: !*(IIS or Apache or Nginx or Litespeed or Oversee or Lighttpd or ATS or Varnish or Tengine)*
  • Server: *Akamai*
  • X-Powered-By: *(Plesk or ASP or PHP)*
  • The presence of cookies
  • Set-Cookie: *(wordpress or ci_session or uid or PHPSESSID or PHP_SESSION_ID or virtuemart or VisitorID)*
  • Last-Modified date older than 1 day
  • Content-Length >= 2,000,000
  • The presence of an FTP server
  • The presence of an SSH server
  • The presence of an SMTP server
  • The presence of a POP server
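The list above can be turned into a feature vector along the following lines. This is a sketch rather than the code behind the post: the function name, the input shapes (lowercase header names, a precomputed Last-Modified age, a set of detected services), and the decision to treat a missing “Server:” header as “other” are all assumptions. The `*…*` patterns are read as case-insensitive substring matches.

```python
from typing import Dict, Iterable, Mapping, Optional, Sequence, Set

KNOWN_SERVERS = ("iis", "apache", "nginx", "litespeed", "oversee",
                 "lighttpd", "ats", "varnish", "tengine")
POWERED_BY = ("plesk", "asp", "php")
COOKIE_TOKENS = ("wordpress", "ci_session", "uid", "phpsessid",
                 "php_session_id", "virtuemart", "visitorid")


def _contains_any(text: str, needles: Iterable[str]) -> bool:
    """Case-insensitive substring test against several needles."""
    lowered = text.lower()
    return any(needle in lowered for needle in needles)


def extract_features(
    headers: Mapping[str, str],          # header names assumed lowercase
    cookie_names: Sequence[str],         # first-party cookie names
    last_modified_age_days: Optional[float],
    open_services: Set[str],             # e.g. {"ftp", "smtp"}
) -> Dict[str, bool]:
    """Build the 13 binary features listed in the post."""
    server = headers.get("server", "")
    powered = headers.get("x-powered-by", "")
    length = headers.get("content-length", "")
    return {
        "server_apache": "apache" in server.lower(),
        "server_nginx": "nginx" in server.lower(),
        # Assumption: a missing Server header also counts as "other".
        "server_other": not _contains_any(server, KNOWN_SERVERS),
        "server_akamai": "akamai" in server.lower(),
        "powered_by_plesk_asp_php": _contains_any(powered, POWERED_BY),
        "has_cookies": bool(cookie_names),
        "known_cookie_name": _contains_any(" ".join(cookie_names),
                                           COOKIE_TOKENS),
        "last_modified_over_1_day": (last_modified_age_days is not None
                                     and last_modified_age_days > 1),
        "content_length_ge_2m": (length.isdigit()
                                 and int(length) >= 2_000_000),
        "has_ftp": "ftp" in open_services,
        "has_ssh": "ssh" in open_services,
        "has_smtp": "smtp" in open_services,
        "has_pop": "pop" in open_services,
    }
```

Keeping every feature boolean makes the result directly usable by a decision tree without any scaling or encoding step.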

[Figure: decision tree]

A decision tree trained with these features on 2/3 of our examples leads to the following ROC curve:

[Figure: ROC curve]
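The post does not say how the curve was computed. As a generic sketch, an ROC curve can be built from held-out examples by sorting them by classifier score (for a decision tree, typically the fraction of malicious training examples in the leaf an example falls into) and sweeping a threshold from high to low. The function names below are hypothetical, and labels are assumed to be 1 for malicious and 0 for benign.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    Labels are 1 (malicious) or 0 (benign); a higher score means
    "more likely malicious". Examples with equal scores are grouped
    into a single threshold step.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

The curve always starts at (0, 0) and ends at (1, 1); an area close to 1 indicates good separation, and an area of 0.5 is no better than chance.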

This classifier is simple and extremely fast, but it clearly doesn’t perform well enough on its own for our security needs. Furthermore, collecting test data is a network-intensive operation.

However, we have many models currently tagging domain names as suspicious or not according to different algorithms.

Some of these models have very high precision, and the domains they flag are added to the list we are blocking after a quick manual review. For instance, newly registered domains acting as fast-flux fall into this category.

The output of other models needs extra votes before we are confident enough to blocklist the domains they flag and thereby protect our customers. This new classifier is going to play a significant role in this regard.
