Exactly one month ago my blog was hit by a spam injection attack whose consequences are still with me today. Even though I am a web developer with years of experience and a sound approach to security, I was brought to my knees for days without even knowing it.
In this post I will explain what happened, what to look for, and how to keep it from happening to you.
Tell-tale signs
The first sign that something was going on was that my feed, which I subscribe to myself, started showing a list of spammy keywords at the end of each post. They didn’t have any links, just the words. I checked my blog and everything seemed okay, so I thought maybe this was some sort of problem with Feedburner (which has had a lot of issues lately, so why not one more, I thought).

A couple of days later, I started noticing some strange ads popping up on my blog. They had to do with health and pills, which seemed odd, as most of my content is centered around social media and technology. I was starting to get worried but had no clue what was going on: I would look at the HTML code and my WordPress installation and find nothing strange. Again, I thought maybe some health company had purchased space on my blog (and unfortunately you’ll see the same ads in this post as well, as Google thinks this post is about that).
Because I was protected with Akismet for my comments, and had FTP turned off for my blog, I was 100% sure I wasn’t infected with anything strange.
The truth was that I was infected.
The final discovery
After some research, I found out about some clever code injections that are pushed either via templates or via plugins downloaded from non-WordPress sites. I remembered I had downloaded a couple of plugins from external sites and went into panic mode.
The first recommended check was a Google site search for spammy keywords. I ran it and was in for a rude surprise: they were all there.

The funny thing was that if I clicked on the link to my site, I didn’t see them. The only way to see them was to go to the “Cached” version and then see the results in text mode.
I also went to my Google Webmaster Tools account (if you haven’t set this up, you should do so immediately) and saw all the spammy keywords in the content analysis.

I was totally infected. This had been happening for days, if not weeks.
How the injection works
The spam keywords and links are hidden in chunks of code that are not human readable. They are usually encoded in Base64 (PHP provides the functions base64_encode and base64_decode for this), which turns the HTML into a string of letters and digits that can be decoded later.
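To show what that encoding does, here is a minimal Python sketch of a Base64 round trip. The spam link is a made-up example, not the code that was actually injected into my blog, and real injections typically do the decoding in PHP rather than Python.

```python
import base64

# Hypothetical spam snippet, invented for illustration only.
spam_html = '<a href="http://spam.example/pills">cheap pills</a>'

# Encoding turns the HTML into an opaque run of letters, digits and a few
# symbols (+, / and =), so nothing in it looks like a link or a keyword.
encoded = base64.b64encode(spam_html.encode("utf-8")).decode("ascii")

# Decoding recovers the original HTML exactly.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

A search of your files for the spam words themselves will therefore find nothing, because only the encoded form is stored on disk.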
But when are they decoded? This is the smart part: when you visit your own site, the spammy code reads your browser’s user-agent string and renders nothing. But when Googlebot or another crawler requests the page, the code decodes the spam and prints it out.
Other times it decodes the spam at random, so only some visitors see it.
One way to check how this is triggered is to fetch your site with cURL, a tool that’s available on most Linux installations, while pretending to be a bot. When I ran the following command, I could see the spam links in my footer section:
curl --no-sessionid --user-agent "Googlebot/2.1 (+http://www.googlebot.com/bot.html)" http://jungleg.com
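The same check can be scripted. The following Python sketch fetches a page twice, once with a browser-like user agent and once with the Googlebot string from the curl command, and lists the lines that only the bot receives. The function names and the browser user-agent string are assumptions made for this example.

```python
import urllib.request

# Assumed browser-like user agent; any ordinary browser string would do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"
# Same user-agent string as in the curl command above.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def fetch(url, user_agent):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def bot_only_lines(browser_html, bot_html):
    """Return the non-empty lines served to the bot but not to the browser."""
    seen = {line.strip() for line in browser_html.splitlines()}
    return [
        line.strip()
        for line in bot_html.splitlines()
        if line.strip() and line.strip() not in seen
    ]


def check_site(url):
    """Fetch the page with both user agents and report the difference."""
    return bot_only_lines(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA))
```

Calling check_site with your blog’s URL returns an empty list when both visitors get the same page. Dynamic pages can differ between two requests for harmless reasons (timestamps, rotating ads), so treat any output as a hint to look closer rather than as proof of infection.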
Steps to solve it
You can try to pinpoint which of the functions is triggering the spam links. In my case, I just backed up the blog database and installed a fresh WordPress folder from scratch, adding plugins and templates only from the official WordPress site.
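If you do want to hunt for the offending code, one possible starting point is to scan your PHP files for the functions that injections commonly rely on. This Python sketch walks a folder and reports matching lines. The list of function names is an assumption, not an exhaustive signature set, and legitimate code uses these functions too, so every hit is a lead to review rather than a verdict.

```python
import os
import re

# Assumed list of functions worth a second look; extend it as needed.
SUSPICIOUS = re.compile(r"\b(base64_decode|eval|gzinflate)\s*\(", re.IGNORECASE)


def find_suspicious_lines(root):
    """Yield (path, line_number, text) for PHP lines that match SUSPICIOUS."""
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if SUSPICIOUS.search(line):
                        yield path, number, line.strip()
```

Pointing find_suspicious_lines at your blog folder and printing each result gives you a short list of files to open by hand. Files you don’t recognize, or files sitting where a clean WordPress download has none, deserve the closest look.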
It is very important to notify Google about the attack as soon as you can. For me it was too late: my PageRank had gone down from 3 to zero. I wrote a reconsideration request, and even though I haven’t heard back from them, my blog did get back to a PR of 2 and most of the spammy content is gone, although I still see those pesky health ads every so often.

Monitoring: the hard part

In theory, we would all have to do this monitoring every day, hopefully before Googlebot hits our site. But who has time to run the cURL command or keep checking their own site’s Google search results? What if only one of your older posts is affected?
As a developer, I thought this would be a good tool to write, and on Saturday I released version 1 of it to the blogosphere: it’s called SpamCheckr.
SpamCheckr crawls your site posing as one of a handful of bots to surface spammy keywords, and shows you the text content the bot sees. Since Saturday, 84 people have checked their sites, and at least 2 of them found some sort of spam content.
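The keyword check itself can be sketched in a few lines of Python. This is not SpamCheckr’s actual code, only an illustration of the idea: reduce the HTML a bot receives to its text content, then count whole-word matches against a keyword list. The keyword list and function names are hypothetical.

```python
from html.parser import HTMLParser
import re

# Hypothetical keyword list; a real one would be much longer.
SPAM_KEYWORDS = ["viagra", "cialis", "pills", "casino"]


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, skipping script and style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def page_text(html):
    """Return the page's text content with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def spam_hits(html, keywords=SPAM_KEYWORDS):
    """Return {keyword: count} for keywords found as whole words in the text."""
    text = page_text(html).lower()
    hits = {}
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
        count = len(re.findall(pattern, text))
        if count:
            hits[keyword] = count
    return hits
```

Feeding spam_hits the HTML fetched with a bot user agent gives a quick yes-or-no answer per page. A blog that legitimately writes about health or gambling would need a keyword list tuned to avoid false alarms.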
As time permits, I will write a second version of the tool that crawls your blog on a schedule and alerts you via email or SMS if it finds spam, hopefully before Google indexes the content and ruins your hard-earned PageRank and ad revenue.
Have you been infected by blog spam? Tell me your war stories!
11 comments
The Aftermath of a Wordpress Spam Injection (and a Tool to Prevent it) http://ff.im/-2f2yz
This comment was originally posted on Twitter
The Aftermath of a Wordpress Spam Injection (and a Tool to Prevent it) http://ff.im/-2f4H0
This comment was originally posted on Twitter
The Aftermath of a Wordpress Spam Injection (and a Tool to Prevent …: For me it was too late, my PageRank had .. http://bit.ly/lKakw
This comment was originally posted on Twitter
http://bit.ly/E4l46 great reason to be careful when you install WP plugins
This comment was originally posted on Twitter
The Aftermath of a Wordpress Spam Injection (and a Tool to Prevent …: For me it was too late, my PageRank had .. http://bit.ly/iU3Nn
This comment was originally posted on Twitter
Very cool tool. My only problem is the name. I would have spelled it with an “er”.
I don’t run a blog, but I had a similar situation recently. In the last month or so, my installation of roundcube was compromised. The attacker was able to execute code that replaced my index page with a redirect to a phishing bank site. It had been that way for about 10 days before I noticed. This installation is shared by all my clients, but no one complained, so I assume that no one had tried using webmail in that time. My wife brought it to my attention when she was trying to use roundcube from her computer and complained that it kept sending her to a weird site. I thought for sure she was infected with some sort of malware. Then I tried from my machine and had the same problem, so I knew it wasn’t her computer. I went into panic mode thinking my server was hacked. After some investigation, it turned out my webmail vhost was the only one affected. So I deleted everything and installed the latest version from scratch. The new version is supposed to include some security updates that may or may not have been related to how my installation was compromised.
It’s the first time I’ve commented here, and I must say you share genuine, quality information for bloggers! Good job.
p.s. You have a very good template for your blog. Where did you find it?
I just found this same sort of thing on my Wordpress install – version 2.8.2. There was a file in the wp-includes directory called feed-atom2.php and included the Base64_decode for a remote user. I’ve saved the file if you’d like to have it :)
I couldn’t have found the problem without your post. Thank you.
Another good idea is to set up Google Alerts for spammy keywords on your domain. Ideally you catch the problem before it starts setting off alerts, but if not, it’s a nice, free fail-safe.
> “you’ll see the same ads in this post as well, as Google thinks this post is about that”
FYI regarding the health ads on this page, you can use HTML comments to provide hints to Google’s Ad crawler of which portions of the page should be emphasized or de-emphasized. To suppress the health-related keywords, you’d surround the paragraphs with those keywords with the following:
<!-- google_ad_section_start(weight=ignore) -->
…
<!-- google_ad_section_end -->
Google’s description of the technique is here:
https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=23168
> curl --no-sessionid --user-agent "Googlebot/2.1 …
Might the next escalation in the spambot war be for them to start checking not only the user-agent, but also that the IP address resolves back to the google.com domain?