When we’re not busy threat hunting, we enjoy a good superhero movie. Just as the DC franchise is on a movie release spree, malware follows a similar pattern: it is either dormant and biding its time, or taking the world by storm.
Knock, Knock, the Doc!
Worth a Thousand Words
Macro-based malware lurking in PDFs or documents has been around for so long that it deserves a fresh look. You could call it “old wine in a new bottle”, but sticking with the movie theme, we like to think of it as a reboot of old classics like Die Hard or Jumanji.
Malware that hides in the EXIF headers of images was reported by Sucuri a few years ago, so it’s not new, but we are seeing new implementations. For example, a Cisco Umbrella user reported receiving a seemingly legitimate email that contained a URL to an image, which looked something like: maliciousexample.com/agagag/3egdha.jpg
When we get samples from our customers, the analysis is pretty straightforward. We closely review the document, any linked URLs or PDFs, and inspect the resources for malicious components such as macros for Word documents, or web page content and domain names for phishing campaigns. In the case of the email pointing to a single .JPG, that analysis breaks down a little since it doesn’t appear suspicious right off the bat. We may review the headers of the email or the domain of the link trying to identify what is malicious, but we ordinarily don’t assume that the .JPG itself is the vector for malware.
We’re not the only ones to miss this vector. Online sandboxes can also come up empty, depending on how they’re configured when analyzing the submission:
Just because our first attempt at a sandbox analysis didn’t find anything doesn’t mean there’s nothing to find. Why would our customer get an email pointing to a .JPG for no apparent reason? The actor that sent this email wants something, and it’s up to us to dig deeper. All we have to go on at this point is an image file. It’s possible steganography is being used to conceal malicious code, a technique known as stegosploiting.
Downloading the .JPG and running it through steganographic libraries didn’t reveal anything in this case. There was no hidden pattern or marker in the image to trigger a malicious attack.
But when we analyzed the image file through a sandbox environment configured differently than the first, the service identified the image as a trojan. At this point, suspicions were definitely raised. We know the customer received this mail from a source that they do not trust, which implies a malicious actor. Blocking the host domain and noting the file hash is a solid step, but maybe we can find evidence of something hidden in the binary of the .JPG.
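One simple way to look for something hidden in the binary is to pull the printable strings out of the raw bytes, much like the Unix `strings` utility does. The following is a minimal Python sketch of that idea, not the tooling used in this investigation. The function names, the six-character threshold, and the demo bytes are all assumptions made for illustration.

```python
import re
from typing import List

# Runs of six or more printable ASCII bytes. The threshold is an arbitrary
# choice for this sketch; lower it to see more (and noisier) output.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def printable_strings(data: bytes) -> List[str]:
    """Return the printable ASCII runs found in raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def strings_in_file(path: str) -> List[str]:
    """Read a file in binary mode and return its printable strings."""
    with open(path, "rb") as fh:
        return printable_strings(fh.read())


# Fabricated bytes standing in for an image file; NOT the sample from this post.
demo = (
    b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00"
    b"\x01\x02 hidden text here \x00\xff\xd9"
)
found = printable_strings(demo)
print(found)
```

On a real download, `strings_in_file("sample.jpg")` (a hypothetical filename) would be the call to make. Metadata fields are stored as plain text, so anything odd in them tends to stand out in this kind of listing.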
Now we’re onto something! .JPG files commonly have metadata to go along with the images, textual information that can include the name of the photo or photographer, where it was taken, the time and date that the image was made, and many other snippets of useful data. Extracting the metadata of an image is easy, and in this case, it turns out to be exactly what we’re looking for:
Look at the strange “Make” and “Model” values. The “Make” has a value of “/.*/e” and the “Model” is an eval function! It evaluates the decoded base64 string that is present. This is a big clue as to how this malware functions. If you don’t know by now, it’s very rarely a good idea for programs to evaluate a decoded base64 string.
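A check like this is easy to automate once the metadata has been extracted. The sketch below is a rough heuristic in Python that flags tag values resembling the ones described here. It assumes the tags are already available as a dictionary of strings from whatever extraction tool you prefer. The indicator list, the function name, and the demo values (including the `<base64 string>` placeholder and the made-up “Software” tag) are illustrative assumptions, not the actual sample.

```python
import re
from typing import Dict, List

# Heuristic indicators; this list is illustrative, not exhaustive.
INDICATORS = {
    # A value that is itself a delimited regex carrying an "e" modifier.
    "regex with /e modifier": re.compile(r"^/.*/[a-z]*e[a-z]*$", re.IGNORECASE),
    "eval call": re.compile(r"\beval\s*\(", re.IGNORECASE),
    "base64_decode call": re.compile(r"\bbase64_decode\s*\(", re.IGNORECASE),
}


def suspicious_tags(tags: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each tag name to the indicators its value matched, if any."""
    hits: Dict[str, List[str]] = {}
    for name, value in tags.items():
        matched = [
            label
            for label, pattern in INDICATORS.items()
            if pattern.search(value.strip())
        ]
        if matched:
            hits[name] = matched
    return hits


# Demo values modeled on the description above. The base64 payload is a
# placeholder and "Software" is a made-up benign tag for contrast.
demo_tags = {
    "Make": "/.*/e",
    "Model": 'eval(base64_decode("<base64 string>"));',
    "Software": "ExampleCam 1.0",
}
report = suspicious_tags(demo_tags)
print(report)
```

A camera make or model that parses as code is a strong signal on its own, so even a short indicator list like this one should catch the pattern while leaving ordinary values alone.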
So let’s see what this base64 string decodes to:
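The decoding step itself can be sketched in a few lines of Python. This sketch only decodes and displays the string and never executes it. The encoded value below is a harmless placeholder built for illustration, not the string from the sample, and the function name and the 16-character minimum are likewise assumptions.

```python
import base64
import re
from typing import List

# Quoted runs of base64 characters, at least 16 long (arbitrary threshold
# chosen for this sketch), with optional "=" padding.
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{16,}={0,2})["']""")


def decode_base64_literals(text: str) -> List[str]:
    """Decode every quoted base64 literal in text, for inspection only."""
    decoded = []
    for match in B64_LITERAL.finditer(text):
        try:
            raw = base64.b64decode(match.group(1), validate=True)
        except ValueError:
            continue  # looked like base64 but did not decode cleanly
        decoded.append(raw.decode("utf-8", errors="replace"))
    return decoded


# Harmless placeholder standing in for the real payload.
placeholder = "placeholder payload, not the real one"
encoded = base64.b64encode(placeholder.encode()).decode()
model_value = f'eval(base64_decode("{encoded}"));'

result = decode_base64_literals(model_value)
print(result)
```

Keeping the decoded text as data, and only ever printing it, is the important design choice: the analyst gets to read what the attacker intended to run without running it.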
This is the last piece of the puzzle for us. Putting the pieces together, we can deduce the following: The malware works in stages. The first stage of the malware comes from the domain that was infected and compromised. The second stage is the search and replace function hidden in EXIF headers in the .JPG file.
The first stage site was taken down quickly, and we could not retrieve the code for that step. Assuming a typical multi-stage delivery of malware, we can expect that the following could have happened:
The site that hosted the malicious JPG could have contained this:
$exif = exif_read_data('/home/path/images/dir1/gagagate/3ecfgagsag.jpg');
The function “exif_read_data” reads the EXIF header from an image file; in our example it returns the “Make” and “Model” values described above. Presumably those values are then passed to a search and replace call as the pattern and the replacement, at which point the base64 string is decoded and evaluated, and the resulting code runs whatever arrives in the POST variable ‘zz’.
The key aspect here is that the code does not look malicious at all. Instead, it looks more like a search and replace function, which is why the sandbox environments may not have detected it as malicious. Searching and replacing by itself isn’t something that would be flagged. Additionally, nothing happens until the attacker sends a proper POST request that supplies malicious instructions in the variable “zz”.
Window to the Malware
So how do these otherwise benign sites get compromised to act as backdoors? One way is out-of-date software plug-ins. Old versions of WordPress and Joomla may allow attackers to get access to sites through known security vulnerabilities. The sites then end up hosting stealthy malicious images or even phishing URLs.
Small-scale or low-key shopping websites are often the victims, but it isn’t limited to retail shops. Even enterprise sites and blogs get compromised frequently. One of the major problems is failure to update plug-ins regularly. Plug-ins are not something you can buy once, install, and never worry about again. Doing so is like buying an elevator, never servicing it, and expecting it to run error-free for the life of the building.
A Picture of Health
JPG malware is not that common, but it can be very nasty. Attackers can target stock images that are common in PowerPoint presentations, either by embedding malicious code with stegosploit or by infecting the site that hosts the stock images.
When these pictures are added into presentations, this could create a widespread issue, as presentations are usually shared between many people. One stage of the code can connect to these compromised websites or to websites that are hosted by bulletproof hosting providers. This could be used to drop malicious payloads onto systems.
Umbrella protects users from connecting to malicious sites on the internet and analyzes over 180 billion DNS requests daily. The sheer volume of DNS requests gives our researchers a unique view of the internet to better identify trends on threats, faster.
Umbrella uses statistical models to hunt for domains tied to malicious infrastructure. In this way, Umbrella can stop infections before they happen and help you stay one step ahead of malicious actors.