Clever New Comment Spammer

I think I’ve been hit with a new kind of insidious comment spam. At about four this morning I got a comment on an old entry that said:

Well, I just wanted to sign a blog on the first time in my life :))

Kind of cute, right? Isn’t it nice that some guy, “James Hatchkinson,” came across my site and was so enamored that he decided to leave a comment, his first ever? Well, two minutes later the exact same comment, URL, and name were left on the WordPress blog. Clue #1.

The URL he left with his comment is, which I’m not going to link because this may be this spam’s whole point. I clicked the URL from the comment before realizing it was probably just a newbie way of saying “I don’t have a site yet.” People I know have left similar things for their URL in the past. Well, the link takes you to some sort of web company with a hideous Flash intro and an equally mediocre website. Hmmmmm. Clue #2.

Clue #3: the two comments came from radically different IP addresses. Let’s give this guy the incredible benefit of the doubt and say just maybe he was a newbie user who came upon an old entry, left a silly comment with what he thought was a fake website, and then continued browsing to another one of my sites, went to a slightly old entry, and left the same comment. So why did his IP change? The first comment came from, which resolves to, and the second from, which is a proxy of some sort. Most users, especially the type that would leave this sort of comment, don’t randomly start using proxies mid-browsing. Strike three.
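The pattern in clues #1 and #3 (the same name, URL, and text arriving from different IP addresses) can be checked mechanically. What follows is only a minimal sketch: the `Comment` record, the site labels, the URL, and the IP addresses are all placeholders invented for illustration, not values from the real comments or logs.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Comment(NamedTuple):
    # Hypothetical record shape; a real blog database will differ.
    site: str
    author: str
    url: str
    body: str
    ip: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits still match."""
    return " ".join(text.lower().split())


def find_cross_ip_duplicates(comments: List[Comment]) -> List[List[Comment]]:
    """Group comments by (author, URL, body); keep groups seen from 2+ IPs."""
    groups: Dict[Tuple[str, str, str], List[Comment]] = defaultdict(list)
    for c in comments:
        key = (normalize(c.author), normalize(c.url), normalize(c.body))
        groups[key].append(c)
    return [g for g in groups.values() if len({c.ip for c in g}) > 1]


# Placeholder data: the IPs come from reserved documentation ranges and the
# URL is made up. None of these values come from the real logs.
SPAM_TEXT = "Well, I just wanted to sign a blog on the first time in my life :))"
comments = [
    Comment("blog-a", "James Hatchkinson", "http://example.com/", SPAM_TEXT, "192.0.2.10"),
    Comment("blog-b", "James Hatchkinson", "http://example.com/", SPAM_TEXT, "198.51.100.7"),
    Comment("blog-a", "Someone Else", "", "Nice post!", "203.0.113.5"),
]

suspicious = find_cross_ip_duplicates(comments)
for group in suspicious:
    print(sorted({c.ip for c in group}), group[0].body)
```

A check like this only helps when one person can see the comments from several sites at once, and a spammer who varies the text slightly would slip past it.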

Finally, I decided to look up this guy’s IP in my access logs to see what pages he visited. There were no records of his IP visiting any pages on either site in my PHP/JavaScript-based logging software, which means whatever client was used to leave this comment supports neither JavaScript nor the <noscript> tag and images. Time to grep the raw logs. No referrer, none of the usual signs you would see in a log entry. Here are the relevant lines from my logs:

- - [18/Sep/2003:04:03:50 -0500] "GET /p644 HTTP/1.0" 301 303 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
- - [18/Sep/2003:04:03:54 -0500] "GET /p644 HTTP/1.0" 200 15796 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
- - [18/Sep/2003:04:03:56 -0500] "POST / HTTP/1.1" 302 5 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"

And from
- - [18/Sep/2003:04:01:35 -0500] "GET /development/archives/39 HTTP/1.0" 200 7220 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
- - [18/Sep/2003:04:01:40 -0500] "POST /development/ HTTP/1.0" 302 0 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
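The manual grep above can be approximated with a short script. This is a rough sketch that assumes the server writes the common "combined" log format; it flags any host that sent a POST with no referrer and never requested an image, stylesheet, or script. The host addresses in the sample are placeholders from reserved documentation ranges, and the second pair of sample lines is invented purely for contrast.

```python
import re

# Assumed layout: host ident user [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Paths that a normal browser fetches while rendering a page.
ASSET_RE = re.compile(r'\.(?:css|js|gif|jpe?g|png|ico)(?:\?|$)', re.IGNORECASE)


def suspicious_hosts(lines):
    """Return hosts that POSTed without a referrer and never fetched an asset."""
    posted_blind = set()
    fetched_asset = set()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        host = m["host"]
        if ASSET_RE.search(m["path"]):
            fetched_asset.add(host)
        if m["method"] == "POST" and m["referer"] in ("", "-"):
            posted_blind.add(host)
    return sorted(posted_blind - fetched_asset)


# Placeholder hosts; the last two lines are hypothetical "normal" traffic.
sample = [
    '192.0.2.10 - - [18/Sep/2003:04:03:54 -0500] "GET /p644 HTTP/1.0" 200 15796 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.10 - - [18/Sep/2003:04:03:56 -0500] "POST / HTTP/1.1" 302 5 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"',
    '198.51.100.7 - - [18/Sep/2003:05:00:00 -0500] "GET /style.css HTTP/1.1" 200 1200 "http://example.com/p644" "Mozilla/5.0"',
    '198.51.100.7 - - [18/Sep/2003:05:00:30 -0500] "POST / HTTP/1.1" 302 5 "http://example.com/p644" "Mozilla/5.0"',
]

flagged = suspicious_hosts(sample)
print(flagged)
```

Browsers configured to hide referrers or served entirely from cache would also be flagged, so the result is a list to review by hand rather than a list to block.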

There’s got to be a good story behind this. If this is indeed malicious comment spam then this is the most clever I’ve seen yet. If I hadn’t been the author of two posts he spammed and gotten the email notification I never would have suspected a thing. Has anyone else seen this?

What’s worrying about this whole thing is that neither the IP filtering (reactive) techniques usually used to block comment spam nor the content filtering (proactive) techniques we’ve been experimenting with on WordPress would catch this guy. In fact I can’t think of any good way to preemptively block this sort of thing. If Google didn’t give blogs so much credence we wouldn’t be having this problem. I suppose now we have to watch every comment with an eagle eye, on the lookout for anything suspicious.

Update: I got it reversed above, “he” commented on the WordPress blog first and then here.

8 thoughts on “Clever New Comment Spammer”

  1. Not yet, but I’ll keep my eyes peeled. I did get hit by the penis-enlargement spammer again, though. I’m probably going to work up a quick hack for IP and URL blocking. And I think that after we go gold with 0.72 of WP, I might work on something more official, with new database tables and an admin interface.

  2. Yeah, I fell for it too! Doh!
    I recently noticed the same thing in the guest book of another site I manage: several innocent-looking comments which said things like “I like your site” and “you have the same name as me”, supposedly from different visitors, but with the URLs being for French commercial sites with lots of banner adverts which looked to be run by the same people.

  3. Hi, I was so enamored with the CSS Zen place that I followed links all the way here and found this rather interesting post. I know how the guy is doing it: he’s probably using a Perl script with the LWP module, and since most of these blog sites are all the same in this comment area, automation is a breeze. And the Google thought is probably correct as well. Having done quite a bit of time with the Googlies, I suspect that he’s looking for outside links to point to his site and raise his PageRank.

    Really burns me, people like this. But here’s what you do: your pages need to check for things inside the browser, things like JavaScript and maybe a plugin or two. I would have to give this some thought, because there are Perl modules like Mechanize that create all the browser responses I can think of right now.

    For me, this is a curiosity at the moment, and I’ll have it in the back of my mind until I find an answer to solve this, but if you are really interested in a little effort out of me, send an email saying that you would really like some protection from this. I don’t know if you do or not, and like I said, I was just following links.

    If I come up with an answer and you sent an email, I’ll send it to you when I’m done. I use Perl, PHP, and JavaScript, no ASP at all, but perhaps someone could translate it.

    Cool site, by the way.
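One way to read the “check for things inside the browser” suggestion in the last comment: the page carries a signed token that a small script must copy into a hidden form field, and the server rejects comments that arrive without a valid token or implausibly fast. The sketch below shows only the server side, in Python rather than the PHP or Perl mentioned above. The function names, the secret, and both time limits are assumptions made up for illustration, and as the commenter notes, a tool that imitates a browser well enough could still get through.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-site secret
MIN_SECONDS = 3     # assumed: a person needs a few seconds to type a comment
MAX_SECONDS = 3600  # assumed: forms older than an hour are rejected


def _sign(post_id: str, issued: int) -> str:
    """HMAC over the post id and issue time, as a hex string."""
    msg = f"{post_id}:{issued}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def issue_token(post_id: str, now: Optional[float] = None) -> str:
    """Token to embed in the page; a script must copy it into the form."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(post_id, issued)}"


def verify_token(post_id: str, token: str, now: Optional[float] = None) -> bool:
    """Accept only a correctly signed token of plausible age."""
    try:
        issued_str, sig = token.split(":", 1)
        issued = int(issued_str)
    except ValueError:
        return False  # missing or malformed token
    expected = _sign(post_id, issued)
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False  # forged token, or token issued for another post
    age = (time.time() if now is None else now) - issued
    return MIN_SECONDS <= age <= MAX_SECONDS


# Usage with fixed clock values so the behaviour is reproducible.
token = issue_token("644", now=1000)
print(verify_token("644", token, now=1010))  # valid token, plausible delay
print(verify_token("644", token, now=1001))  # submitted too quickly
print(verify_token("644", "", now=1010))     # client never filled the field
```

Requiring a script to fill the field shuts out visitors who browse with JavaScript disabled, so this is a trade-off rather than a free fix.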