A parked domain is a domain name that has been registered and is serving temporary content, is being held for future use, or is being used for monetization purposes. Some parked domains serve custom 404 pages, redirects, or advertisements.
Parked domains often serve ads to visitors as a way of generating revenue for the domain owner. Because more visits mean more ad impressions, and more ad impressions mean more money for the domain owner, parked domains meant for monetization often use tricks to increase traffic. These tricks include typosquatting, SEO trickery and search term mining, as well as name “guesses” (e.g. if a user is looking for spare car parts, they may type “carparts.example.com” into their browser).
How to Determine If a Domain Name is Parked
There are a few ways to identify whether a domain is parked. As Dhia Mahjoub pointed out previously, comparing how the domain name resolves with how a random subdomain of it resolves can show whether a single domain name is parked. Comparing the content the server returns for varying HTTP referer headers is another way, as parked domains often tailor ad content to search engine queries. For example, if you end up visiting a parked domain and the referer is a Google search for puppies, the ads on the parked domain may be for things like dog food or dog toys.
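The random-subdomain comparison can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the tooling used by the research team; the function names `resolve_ipv4` and `looks_wildcarded` are hypothetical, and the "any shared address" rule is an assumption about how to compare the two answers.

```python
import secrets
import socket


def resolve_ipv4(name):
    """Return the set of IPv4 addresses for name, or an empty set on failure."""
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        # Covers socket.gaierror and socket.herror (name does not resolve).
        return set()
    return set(addresses)


def looks_wildcarded(domain, resolver=resolve_ipv4):
    """Heuristic: does a random subdomain resolve like the domain itself?

    The resolver argument is injectable so the check can be exercised
    against canned answers instead of live DNS.
    """
    random_label = secrets.token_hex(8)  # 16 hex chars, very unlikely to exist
    base = resolver(domain)
    probe = resolver(f"{random_label}.{domain}")
    # Any shared address suggests a wildcard record pointing at the same server.
    return bool(base & probe)
```

A wildcard answer alone does not prove parking, since some legitimate sites also use wildcard records, so this result is best treated as one signal among several.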
Another indicator that a single domain is parked is how many other domains resolve to the same IP address. A large number of domain names resolving to a single IPv4 address often indicates a parking IP or a shared hosting provider. This technique makes use of passive DNS data. Another technique that uses passive DNS data is to look at the number of domains for which the name servers of the domain in question are authoritative. Name servers that have been delegated a large number of domain names (something like 15,000 or more) are often authorities for parked domains. Lastly, the number of third-party locations referenced in the HTML source served by the domain is another indicator, as parked domains contain mostly dynamically loaded advertisements.
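The two passive DNS heuristics above can be sketched as a simple fan-in count. The input format here (lists of `(domain, ip)` and `(domain, name_server)` pairs) is a hypothetical stand-in for whatever a passive DNS source returns, and `IP_DOMAIN_THRESHOLD` is an assumed placeholder, since only the name server figure of roughly 15,000 comes from the text.

```python
from collections import defaultdict

# From the text: name servers delegated roughly 15,000+ domains are suspect.
NS_DOMAIN_THRESHOLD = 15_000
# Assumed placeholder for "many domains on one IP"; the text gives no number.
IP_DOMAIN_THRESHOLD = 1_000


def normalize(name):
    """Lowercase a DNS name and drop any trailing root dot."""
    return name.lower().rstrip(".")


def fan_in(pairs):
    """Map each target (IP or name server) to the set of domains using it."""
    domains_by_target = defaultdict(set)
    for domain, target in pairs:
        domains_by_target[normalize(target)].add(normalize(domain))
    return domains_by_target


def parking_signals(domain, a_records, ns_records,
                    ip_threshold=IP_DOMAIN_THRESHOLD,
                    ns_threshold=NS_DOMAIN_THRESHOLD):
    """Report which passive DNS heuristics fire for a domain."""
    domain = normalize(domain)
    by_ip = fan_in(a_records)
    by_ns = fan_in(ns_records)
    crowded_ip = any(domain in names and len(names) >= ip_threshold
                     for names in by_ip.values())
    busy_ns = any(domain in names and len(names) >= ns_threshold
                  for names in by_ns.values())
    return {"crowded_ip": crowded_ip, "busy_name_server": busy_ns}
```

Because shared hosting providers also put many domains on one address, a crowded IP is weaker evidence than a busy name server, and the two are reported separately for that reason.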
A paper recently published at the 2015 Network and Distributed System Security (NDSS) Symposium outlined ways of identifying parked domains based on DNS records and HTTP content. The paper is a very good read and the accompanying GitHub repository is fantastic. We on the research team have adapted some of the techniques used in this paper to classify parked domains.
Notes of Interest
While exploring the techniques used in the paper mentioned above, we came up with a novel mechanism for comparing two domains’ HTML content. Using RabbitMQ, Celery, and Flask, we built a basic web service that renders a page using PhantomJS and returns the HTML as a string. Using the HTML parsing code from Python’s lxml module, we created a tree of the HTML elements; this is essentially a DOM tree. We then converted the DOM tree to a networkx graph and used matplotlib to visualize the tree. We also converted the DOM tree to a tree structure specific to the zss module, which implements the Zhang-Shasha algorithm. Doing so for two different domain names allows us to calculate a tree edit distance (similar to a string edit distance) between the two DOMs.
Here is a script which reads HTML from files (of full saved web pages) and does the comparison.
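As a rough illustration of the idea (not the research team's script), the sketch below builds a tag-only tree from HTML and computes an edit distance between two such trees. To stay runnable without extra packages it swaps in the standard library's `html.parser` for lxml and a simpler top-down, Selkow-style edit distance for the Zhang-Shasha algorithm from zss; the simpler measure can only insert or delete whole subtrees, so it is never smaller than the Zhang-Shasha distance. `TagNode`, `DomTreeBuilder`, `parse_dom`, and `tree_distance` are hypothetical names.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagNode:
    """A DOM node reduced to its tag name and child elements."""

    def __init__(self, label):
        self.label = label
        self.children = []

    def size(self):
        return 1 + sum(child.size() for child in self.children)


class DomTreeBuilder(HTMLParser):
    """Builds a tag-only tree; text and attributes are ignored on purpose."""

    def __init__(self):
        super().__init__()
        self.root = TagNode("#document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = TagNode(tag)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for index in range(len(self._stack) - 1, 0, -1):
            if self._stack[index].label == tag:
                del self._stack[index:]
                break


def parse_dom(html_text):
    builder = DomTreeBuilder()
    builder.feed(html_text)
    builder.close()
    return builder.root


def tree_distance(a, b):
    """Top-down edit distance: relabel costs 1, and inserting or deleting
    a subtree costs one unit per node in that subtree."""
    relabel = 0 if a.label == b.label else 1
    rows, cols = len(a.children), len(b.children)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        table[i][0] = table[i - 1][0] + a.children[i - 1].size()
    for j in range(1, cols + 1):
        table[0][j] = table[0][j - 1] + b.children[j - 1].size()
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            left, right = a.children[i - 1], b.children[j - 1]
            table[i][j] = min(
                table[i - 1][j] + left.size(),       # delete left subtree
                table[i][j - 1] + right.size(),      # insert right subtree
                table[i - 1][j - 1] + tree_distance(left, right),
            )
    return relabel + table[rows][cols]
```

To compare saved pages, read each file into a string (for example with `pathlib.Path(path).read_text(errors="replace")`), pass both strings through `parse_dom`, and hand the two roots to `tree_distance`. Since text and attributes are dropped, two pages built from the same template score 0 even when their ad content differs.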
Below are some interesting images showing the DOM structure of three parked domain names that had very similar DOM trees regardless of the content on the page (dynamically served ads often change on each page load).
The same 3 domains can also be viewed within OpenDNS Investigate. The following WHOIS information shows us all registrant and nameserver history for each domain:
It also confirms the parking location at Bodis, a known parking provider.
Below are two images showing the DOM trees of Google searches. They too are very similar to each other because the actual content of the pages is irrelevant.
What is the Impact of Parked Domains on Your Network?
There is no legitimate reason for anyone to visit a parked domain. By definition, parked domains serve back useless content. Additionally, the strong focus on dynamically serving ads to browsers makes parked domains a great vehicle for malvertising. We’ve also noticed that many of the domain shadowing names the Angler exploit kit (EK) has repurposed in the past were originally parked.
Comparing DOM trees is a very telling method of grouping similar HTML content together. Because parked domains often reuse a set of templates for displaying advertisements to users, comparing DOM trees eliminates the noise that the ads would otherwise introduce when comparing the HTML source of two web pages.