In October, the OpenDNS research team was in Europe presenting new threat detection models at two renowned security conferences. First, Security Researcher Thomas Mathew and I (Dhia Mahjoub) presented “Unified DNS View to Track Threats” at BruCON on Oct. 9. Then, a couple of weeks later, on Oct. 22, I presented “A Collective View of Current Trends in Criminal Hosting Infrastructures” at Hack.lu.
SPRank at BruCON
In the talk “Unified DNS View to Track Threats,” we discussed a new model dubbed Spike Rank or “SPRank” that leverages DNS traffic below recursive resolvers. We call this data “recursive DNS data.” Unlike previous models, which primarily emphasized features like ASN, BGP prefixes, and WHOIS information, SPRank treats traffic signals as its primary input. We decided to move away from relying heavily on ASN, IP, and WHOIS information as identifiers of threats after a new series of threats began to emerge over the past year. The increase in exploit kit campaigns and “domain-shadowing” usage rendered many of the classical domain reputation or IP reputation methods ineffective. Traditionally, a domain reputation model will assign different scores to a domain based on the reputation of its IP host. These reputation scores are based on the historical “goodness” of an IP range or domain. Exploit kits using compromised domains pose problems for these models, because compromised domains that have a great historical reputation will easily fool a reputation system. Furthermore, cheap hosting makes it difficult to assign meaningful IP reputation scores, because new ranges appear with no historical context on which to base a score. SPRank avoids these issues by analyzing the DNS request patterns to a domain.
SPRank detects domains that show a sudden surge, or spike, in DNS queries issued from our 65 million worldwide clients towards our resolvers. These domains exhibit what we call “spike behavior.” This behavior is typical of domains used for malware campaigns such as exploit kits, DGAs, fake software, Browlock, and phishing. But it can also be associated with spam domains, domains victimized for DNS amplification attacks, and a slew of other suspicious and even benign uses.
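To make the idea of “spike behavior” concrete, here is a minimal sketch of one way to flag a spike in a domain's hourly query counts. It is not the SPRank implementation: the trailing-window z-score, the function name `find_spikes`, and the `window`, `threshold`, and `min_queries` values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def find_spikes(hourly_counts, window=24, threshold=5.0, min_queries=100):
    """Return the indexes of hours whose query count is far above the
    trailing baseline. All parameters are illustrative assumptions."""
    spikes = []
    for i in range(window, len(hourly_counts)):
        current = hourly_counts[i]
        # Ignore hours with too little traffic to be meaningful.
        if current < min_queries:
            continue
        baseline = hourly_counts[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline)
        # Floor sigma at 1.0 so a perfectly flat baseline does not
        # cause a division by zero.
        score = (current - mu) / max(sigma, 1.0)
        if score >= threshold:
            spikes.append(i)
    return spikes


# A domain that is almost silent for a day, then suddenly receives 500 queries.
counts = [2] * 24 + [500, 3]
spike_hours = find_spikes(counts)
```

A detector like this only says that traffic surged. As noted above, a surge alone can be benign, which is why further filtering is needed before a domain is classified.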
A major breakthrough of this model is that it separates the detected domains into benign, suspicious, and malicious classes. Within the malicious class, we focus mainly on exploit kit domains. Exploit kits are currently the most efficient and widespread delivery method for financially motivated malware.
The other main advantage of SPRank is that it pinpoints inherent features of malware domains that criminals cannot easily change. Because of OpenDNS’s unique perspective of the Internet and its domains, we can distinguish between acquired or assigned features and inherent features. The assigned features of a domain include its lexical makeup, its DGA setup (seed, algorithm), and its hosting and registration choices. Adversaries control these features and can change or update them when needed. Inherent features, on the other hand, are related to traffic patterns that emerge globally from clients querying malware domains, and they are harder for the adversary to obfuscate or change. We are talking here about features such as the distribution of clients across IP space and geography, the geography of resolvers being used, query types, query volumes, domain traffic patterns, etc.
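As a rough illustration of what inherent, traffic-derived features can look like, the sketch below summarizes a batch of queries for one domain. The input format (tuples of client IP, client country, and query type), the function names, and the choice of Shannon entropy as a measure of spread are assumptions made for this example, not a description of the features SPRank computes.

```python
import math
from collections import Counter


def shannon_entropy(counter):
    """Entropy in bits of a frequency table; 0.0 for an empty table."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counter.values())


def traffic_features(queries):
    """queries: iterable of (client_ip, client_country, qtype) tuples.
    Only IPv4 client addresses are handled in this sketch."""
    queries = list(queries)
    countries = Counter(country for _, country, _ in queries)
    qtypes = Counter(qtype for _, _, qtype in queries)
    # Approximate "distribution across IP space" by counting client /24s.
    slash24s = {ip.rsplit(".", 1)[0] for ip, _, _ in queries}
    return {
        "query_volume": len(queries),
        "unique_clients": len({ip for ip, _, _ in queries}),
        "client_slash24s": len(slash24s),
        "country_entropy": shannon_entropy(countries),
        "qtype_entropy": shannon_entropy(qtypes),
    }


sample = [
    ("192.0.2.1", "US", "A"),
    ("192.0.2.2", "US", "A"),
    ("198.51.100.7", "DE", "A"),
    ("203.0.113.9", "FR", "AAAA"),
]
features = traffic_features(sample)
```

The point of the example is that none of these values is set by the operator of the domain. They emerge from who queries it, from where, and how.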
The SPRank system consists of a few main subsystems:
- Spike Detection
- Domain History Filter
- QType Filter
- Domain Records Filter
- Expansion of threat intelligence via IP, prefix, ASN, hoster, fingerprint, and email pivoting
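The stages listed above can be read as a chain of filters applied to the spiking domains. The sketch below only illustrates that chaining idea. Every field name, predicate, and threshold in it is a hypothetical placeholder, not the rule SPRank actually applies at each stage, and the expansion stage is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    domain: str
    days_seen_before: int      # hypothetical: days with traffic before the spike
    qtype_counts: dict         # hypothetical: e.g. {"A": 950, "MX": 3}
    resolved_ips: list = field(default_factory=list)


def history_filter(c):
    # Assumption: keep domains with little or no prior traffic history.
    return c.days_seen_before <= 1


def qtype_filter(c):
    # Assumption: keep domains queried almost exclusively for A records.
    total = sum(c.qtype_counts.values())
    return total > 0 and c.qtype_counts.get("A", 0) / total >= 0.9


def records_filter(c):
    # Assumption: keep domains that currently resolve to at least one IP.
    return len(c.resolved_ips) > 0


STAGES = [history_filter, qtype_filter, records_filter]


def run_filters(candidates):
    """Apply each stage in order and return the surviving candidates."""
    survivors = list(candidates)
    for stage in STAGES:
        survivors = [c for c in survivors if stage(c)]
    return survivors


candidates = [
    Candidate("a.example", 0, {"A": 100}, ["203.0.113.5"]),
    Candidate("old.example", 400, {"A": 100}, ["203.0.113.6"]),
    Candidate("amp.example", 0, {"ANY": 90, "A": 10}, ["203.0.113.7"]),
    Candidate("nx.example", 0, {"A": 50}, []),
]
remaining = [c.domain for c in run_filters(candidates)]
```

Keeping each stage as a separate function makes it easy to reorder stages or measure how many candidates each one removes.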
To learn more about the motivation behind SPRank, its details, components, and results, we invite you to check out our video of the talk at BruCON. The results of the model are very promising. It detects the most current and virulent malware campaigns such as the Angler, RIG, and Nuclear exploit kits, in addition to DGAs, fake software, and phishing. Current exploit kit campaigns drop malware payloads ranging from crypto-ransomware, banking Trojans, and info stealers to bots used for DDoS, spam, or click-fraud, as the diagram below shows.
IP Space Monitoring at Hack.lu
In the talk “A Collective View of Current Trends in Criminal Hosting Infrastructures” at Hack.lu, I discussed a two-year research effort I have been conducting on malware hosting IP infrastructures. The talk covered a selection of hosting patterns identified from analyzing DNS, IP space, BGP prefixes, and ASN peering relationships. These patterns have been adopted by suspicious and bulletproof hosting providers to harbor malicious content and deliver malware campaigns on a large scale. Among these patterns, we distinguish between botnet-based hosting infrastructures and dedicated hosting providers. In the first category, I discussed a “hosting as a service” infrastructure used to host fast flux malware CnCs. In the second category, I covered eight recorded hosting patterns:
- Compromised domains, i.e. “domain shadowing”
- Domain shadowing on multiple hosting IPs
- Sibling peripheral ASNs and bulk malware IP setup
- Leaf ASNs
- Offshore registration and diversification of IP space
- Rogue ASN and affiliated hosters
- Abuse of large hosting providers
- Shady hosts within larger hosting providers
We have been tracking “domain shadowing” for a few years now [1][2][3], and Cisco discussed it this year [4]. Despite being a widely known pattern, it is still being used by adversaries for delivering exploit kits, Browlock, and other suspicious content. We have also been tracking several variants of rogue and bulletproof hosting providers. A very noticeable pattern here is that rogue or bulletproof providers register businesses in offshore jurisdictions in the Caribbean islands, Central America, or the Indian Ocean, and they diversify their IP space across both ARIN (North America) and RIPE (Europe) for resiliency and evasion. Below, we show the example of QHoster, a Bulgarian hoster registered in Belize with IP space in ARIN and RIPE, which has been hosting exploit kits and phishing campaigns for some time.
The combination of these patterns helps us design a model to monitor IP space usage for malicious purposes, and identify in a predictive fashion IP ranges that will be used for malware campaigns even before any domain is hosted on the IPs. This model has been successful in mitigating exploit kit campaigns such as Angler, RIG, and Nuclear.
The major advantage of this model is that it analyzes IP space with a much finer granularity than conventional IP, BGP, and ASN reputation scoring methods. We focus on IP ranges that are smaller than the BGP prefix and that are operated by rogue hosts or purchased by criminal customers. We also analyze IP fingerprints to single out servers that share the same configurations and that are purchased in bulk and set up in advance to deliver malware and exploit kit campaigns.
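One simple way to picture this finer granularity is to bucket hosts into small subnets and look for buckets where many servers share a fingerprint. The sketch below does that with Python's standard `ipaddress` module. The /27 bucket size, the `min_hosts` cutoff, and the idea of a fingerprint as an opaque string (for example, a hash of open ports and service banners) are assumptions for illustration, not the model's actual parameters.

```python
import ipaddress
from collections import defaultdict


def bulk_setups(hosts, sub_prefix_len=27, min_hosts=3):
    """hosts: iterable of (ip, fingerprint) pairs.
    Returns {(subnet, fingerprint): [ips]} for every small subnet in which
    at least min_hosts servers share the same fingerprint."""
    groups = defaultdict(list)
    for ip, fingerprint in hosts:
        # strict=False lets us derive the enclosing subnet from a host address.
        subnet = ipaddress.ip_network(f"{ip}/{sub_prefix_len}", strict=False)
        groups[(str(subnet), fingerprint)].append(ip)
    return {
        key: sorted(ips, key=ipaddress.ip_address)
        for key, ips in groups.items()
        if len(ips) >= min_hosts
    }


observed = [
    ("198.51.100.5", "fpA"),
    ("198.51.100.6", "fpA"),
    ("198.51.100.7", "fpA"),
    ("198.51.100.40", "fpA"),   # same fingerprint, different /27
    ("198.51.100.8", "fpB"),    # same /27, different fingerprint
]
suspicious = bulk_setups(observed)
```

In this example only the three identically configured servers inside 198.51.100.0/27 are grouped, even though all five addresses would sit inside the same, larger BGP prefix.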
Furthermore, if we confirm that IP ranges, ASNs, or hosts match several of these patterns at the same time, we can flag them with a high confidence as rogue, bulletproof, or heavily abused. We can then quarantine or block their IP space. Additional validation also comes from monitoring hosted content over time.
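The idea of gaining confidence from several simultaneous pattern matches can be sketched as a simple count. The pattern labels below are shorthand for the eight patterns listed earlier, and the cutoff of three matches for “high” confidence is an arbitrary assumption. The text above does not state how many matches are required.

```python
# Shorthand labels for the eight dedicated-hosting patterns listed earlier.
HOSTING_PATTERNS = frozenset({
    "domain_shadowing",
    "multi_ip_domain_shadowing",
    "sibling_peripheral_asns",
    "leaf_asn",
    "offshore_registration",
    "rogue_asn",
    "large_provider_abuse",
    "shady_host_in_large_provider",
})


def assess(observed, high_confidence_at=3):
    """Return (verdict, matched_patterns) for an IP range, ASN, or host.
    high_confidence_at is an illustrative threshold, not a published value."""
    matched = sorted(set(observed) & HOSTING_PATTERNS)
    if len(matched) >= high_confidence_at:
        verdict = "high"
    elif matched:
        verdict = "review"
    else:
        verdict = "none"
    return verdict, matched


verdict, matched = assess(["leaf_asn", "offshore_registration", "rogue_asn"])
```

A “high” verdict here would correspond to the case where quarantining or blocking is considered, while “review” leaves room for the additional validation from monitoring hosted content over time.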
Stay tuned for future blog posts in which we will discuss these hosting patterns in more detail.
The final takeaway is that “SPRank” and “IP Monitoring” can work separately, but they are much more effective when they operate in tandem, providing higher coverage and accuracy in detecting threats. Because a massive amount of DNS and IP space data flows in real time through our worldwide infrastructure, SPRank is crucial for finding entry points, or seeds, of malware domains. These seeds are used for immediate blocking, and “IP Monitoring” also uses them to drill further into associated indicators by pivoting around IPs, fingerprints, prefixes, ASNs, hosters, emails, content, etc., which expands the intelligence graph and helps mitigate attacks proactively, before they occur. At the same time, “IP Monitoring” can function in a standalone fashion by sweeping and scrutinizing IP ranges picked up by other models or feeds.