Attackers have long used typosquatting, brandjacking, and similar methods to deceive users into unknowingly visiting malicious websites. To keep the asymmetric battle going, defenders have taken up searching newly registered domain names to uncover these threats before they materialize. In this post we’ll share the common patterns we see, plus a technique for automating discovery so you can one-up those pesky attackers.
Prepending or appending common words
The pentesters out there will agree that one tried-and-true recipe for a successful phishing exercise is to send an email with “mandatory” in the subject line and a link to a malicious page hosted at “companyname-support.com”. Prepending or appending common tech words to a brand is one of the most common techniques used by attackers; it works because users find something inherently trustworthy about a familiar string such as the company name or internally used jargon. Searching for occurrences of these is simple:
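As a minimal sketch of that search, the snippet below builds a regular expression that matches a brand with a common word stuck on either side, with or without a hyphen. The word list, the function name `affix_pattern`, and the candidate domains are all made up for illustration; swap in the jargon your own users see.

```python
import re

# Hypothetical word list; tune it to the terms attackers pair with your brand.
COMMON_WORDS = ["support", "login", "secure", "mail", "help", "portal"]

def affix_pattern(brand, words=COMMON_WORDS):
    """Match the brand with a common word prepended or appended (hyphen optional)."""
    alternation = "|".join(re.escape(word) for word in words)
    brand_re = re.escape(brand)
    return re.compile(
        rf"^(?:(?:{alternation})-?{brand_re}|{brand_re}-?(?:{alternation}))\.",
        re.IGNORECASE,
    )

pattern = affix_pattern("companyname")
candidates = [
    "companyname-support.com",
    "logincompanyname.net",
    "companyname.com",   # the legitimate domain, should not match
    "example.org",
]
matches = [domain for domain in candidates if pattern.search(domain)]
print(matches)
```

Anchoring the pattern at the start and at the first dot keeps the legitimate domain out of the results, at the cost of missing brands buried deeper in a longer name. Loosen the anchors if you would rather review more candidates.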
Substituting visually similar characters is another common method. Here the attacker may use letters like uppercase “i” or lowercase “L” interchangeably, or replace letters with numbers in a leetspeak fashion. We can expand our regex to incorporate this pretty easily as well:
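One way to express this is to turn each substitutable letter into a character class that lists the letter plus its look-alikes. The brand `examplebank` and the specific substitutions below are illustrative assumptions, not an exhaustive mapping.

```python
import re

# Hypothetical brand "examplebank": each class holds the real letter
# followed by look-alike characters an attacker might swap in.
LOOKALIKE = re.compile(r"[e3]x[a4]mp[l1i][e3]b[a4]nk", re.IGNORECASE)

candidates = [
    "examp1ebank.com",        # digit 1 in place of l
    "exampiebank-login.com",  # i in place of l (looks the same when capitalized)
    "3xamplebank.net",        # leetspeak 3 in place of e
    "example.org",            # unrelated, should not match
]
hits = [domain for domain in candidates if LOOKALIKE.search(domain)]
print(hits)
```

Note that the pattern also matches the legitimate spelling, so filter your own domains out of the results.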
Finally, repeated characters in a brand are often mistakenly left out, duplicated, or overlooked. For instance, an attacker may register facebok.com, or perhaps a user may mistakenly type faceboook.com. A minor change to our regex can adjust for repeated characters:
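A simple way to sketch this is to collapse each run of identical letters in the brand to a single letter and follow it with `+`, so any run length of one or more matches. The helper name `repeat_tolerant_pattern` is made up for this example.

```python
import re
from itertools import groupby

def repeat_tolerant_pattern(brand):
    """Collapse runs of the same letter, then allow each letter to repeat."""
    collapsed = [char for char, _run in groupby(brand.lower())]
    return re.compile("".join(re.escape(char) + "+" for char in collapsed),
                      re.IGNORECASE)

pattern = repeat_tolerant_pattern("facebook")  # compiles f+a+c+e+b+o+k+

for domain in ["facebok.com", "faceboook.com", "faccebook.com", "facebak.com"]:
    print(domain, bool(pattern.search(domain)))
```

The first three domains match and the last does not, because dropping or repeating a letter is tolerated while swapping one letter for another is not.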
While our classification algorithms have automated the broad discovery of these types of attacks at a much more advanced level, responders can still benefit from basic pattern searching on the brands they need to worry about. The Investigate API is the perfect tool for facilitating this type of automation, and pyinvestigate makes interacting with the API painless. To search for a string within the last 24 hours, you can:
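The sketch below wraps that search in a small helper. It assumes the client exposes a `search(pattern, start=..., limit=...)` method that accepts a `datetime.timedelta` for `start` and returns a dictionary with a `matches` list of `{"name": ...}` entries, which is how we understand pyinvestigate's `Investigate` object to behave; check the library's README before relying on those details. The helper name `search_recent` and the default limit are our own choices.

```python
from datetime import timedelta

def search_recent(client, pattern, hours=24, limit=1000):
    """Return domain names matching `pattern` first seen in the last `hours` hours.

    `client` is assumed to behave like pyinvestigate's Investigate object:
    search(pattern, start=timedelta, limit=int) -> {"matches": [{"name": ...}, ...]}
    """
    response = client.search(pattern, start=timedelta(hours=hours), limit=limit)
    return [match["name"] for match in response.get("matches", [])]

# Typical use, assuming the `investigate` package is installed and
# INVESTIGATE_KEY holds your API token (both are assumptions here):
#
#   import os
#   import investigate
#   inv = investigate.Investigate(os.environ["INVESTIGATE_KEY"])
#   for name in search_recent(inv, "companyname.*"):
#       print(name)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub while you are still tuning your patterns.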
If you’re monitoring just a single brand, you might be able to get away with creating static regular expressions. However, to support searching for many brands, you’ll need to implement a basic regex generator. Here’s an example of one that will look up static mappings for each letter of the brand and return a list of character substitutions:
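A minimal sketch of such a generator follows. The `SUBSTITUTIONS` table and the function name `brand_regex` are illustrative assumptions; the mapping is deliberately small and you will likely want to extend it.

```python
import re

# Static mapping from a letter to itself plus its look-alikes.
# Illustrative only; extend it with whatever substitutions you observe.
SUBSTITUTIONS = {
    "a": "a4",
    "e": "e3",
    "i": "i1l",
    "l": "l1i",
    "o": "o0",
    "s": "s5",
    "t": "t7",
}

def brand_regex(brand):
    """Build a regex string in which each mapped letter becomes a character class."""
    parts = []
    for char in brand.lower():
        options = SUBSTITUTIONS.get(char)
        if options:
            parts.append("[" + options + "]")
        else:
            parts.append(re.escape(char))  # letters without a mapping stay literal
    return "".join(parts)

brands = ["facebook", "companyname"]
patterns = {brand: brand_regex(brand) for brand in brands}
print(patterns["facebook"])  # f[a4]c[e3]b[o0][o0]k
```

Because the function returns a plain string, you can append suffixes such as `.*` or combine it with other patterns before handing it to the search.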
It makes sense to search within an extended time period when you first search for imposter domains so that you can get a feel for what’s out there. Once you are aware of all of the domains matching your brand, you can speed up the process by constraining your search to just the last 24 hours and running it at least once every 24 hours, so that no new registrations slip through between runs.
Brand Watch is a Python application which implements the methods described in this blog post. You can access its source here:
Running it is simple:
To exclude certain domains you have already investigated, provide a text file to the -e command line argument and they won’t be shown:
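Conceptually, the exclusion step is a set difference between the search results and the domains you have already reviewed. The sketch below illustrates that idea only; it is not Brand Watch's actual code. It assumes one domain per line in the exclusion file, and the `#` comment handling and function names are our own additions.

```python
def load_exclusions(lines):
    """Build a set of lowercase domains from an iterable of text lines.

    Assumes one domain per line; blank lines and lines starting with '#'
    are skipped (the comment syntax is an assumption of this sketch).
    """
    excluded = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            excluded.add(entry)
    return excluded

def filter_new(domains, excluded):
    """Keep only the domains that are not in the exclusion set."""
    return [domain for domain in domains if domain.lower() not in excluded]

# Works with any iterable of lines, including an open file object:
#   with open("exclusions.txt") as handle:
#       excluded = load_exclusions(handle)
excluded = load_exclusions(["# already reviewed\n", "companyname-support.com\n", "\n"])
print(filter_new(["companyname-support.com", "companyname-login.net"], excluded))
```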
Now you can set this up as an Upstart job or cron job to detect new domains. You could even have it shoot off a daily email if you were so inclined 🙂