Category Archives: Spam

The fight against spam on the web.

TagJag Thoughts

I had a very brief comment during Chris' session "Should TagJag get funded?" On the stage with Chris and Rick Segal were two of my favorite members of the venture community, Brad Feld and Jeff Clavier. My feedback may have been phrased more negatively than I meant it to be, but what I was trying to constructively criticise is that TagJag would be a lot more unique and valuable to me if, beyond merely listing the results pages of the different services it aggregates, it presented the results in interesting and timesaving ways. For example: better categorization of time-based vs. authority-based sources; combining different results into a single list; de-duping and filtering results; filtering the spam that the different providers seem to be unable to catch; providing different notification thresholds and mediums beyond RSS and HTML, like email, SMS, IM. All of these would provide value to me beyond what the individual services provide, save me time, and provide something greater than the sum of its parts.
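To make the "single list, de-duped and filtered" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Result` fields, the `normalize` rule, and the `is_spam` callback are hypothetical and say nothing about how TagJag or the services it aggregates actually work.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Result:
    """One hit from one search service (hypothetical shape)."""
    source: str
    url: str
    title: str
    timestamp: float  # seconds since the epoch


def normalize(url: str) -> str:
    """Reduce a URL to a comparison key: lowercase host, path without a
    trailing slash, plus the query string if there is one."""
    parts = urlsplit(url.strip())
    key = parts.netloc.lower() + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(*result_lists, is_spam=lambda result: False):
    """Combine several result lists into one, newest first.

    Results flagged by is_spam are dropped. When two services return the
    same page, only the most recent copy is kept.
    """
    newest = {}
    for results in result_lists:
        for result in results:
            if is_spam(result):
                continue
            key = normalize(result.url)
            if key not in newest or result.timestamp > newest[key].timestamp:
                newest[key] = result
    return sorted(newest.values(), key=lambda r: r.timestamp, reverse=True)
```

Sorting by timestamp suits the time-based sources; an authority-based list would need a different sort key, which is part of why separating the two kinds of source matters.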

Commercial Akismet

Blog Herald asks about WP plugging a commercial project, namely Akismet. One of the lessons I learned from Ping-O-Matic is that web services like this can grow far beyond what you anticipated, need a lot of attention, and can be expensive to maintain. (Akismet has to be really fast otherwise it bugs people and delays commenting.) You also have a social contract with all of your users to continue to provide a service they’ve all come to rely on. When Akismet first got started, I wasn’t at all worried about the technology — I was using it myself and it worked great. I spent most of my brain cycles planning out how the service could be economically independent and self-sustaining in the future, so it could thrive and provide a great service to the public without relying on charity. I had to balance this with my desire to just give everything away (as I usually do).

I’m happy with where it eventually ended up. The Pro-Blogger limit was set very high and the vast majority (over 99.9%) of people use Akismet at no cost whatsoever. I’m able to justify devoting my time to the service while still putting bread on the table and the larger blogger community can stop dealing with disgusting spam on their blogs. The technology has scaled incredibly well and even before the Yahoo deal Akismet had a bright future. Also the API and the plugin itself are completely open so people could clone the API or modify the plugin if they wanted. The service just hit its first major milestone, has been embraced by the development community, and I’m confident now that it will continue as a public service. I think it’s also providing something pretty valuable, as evidenced by the people who have been buying Pro-Blogger licenses just to support it, not because they fall under the commercial terms.

Akismet Stops Spam

Akismet is a new web service that stops comment and trackback spam. (Or at least tries really hard to.) The service is usable immediately as a WordPress plugin and the API could also be adapted for other systems.
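For anyone adapting the API to another system, a comment check boils down to one form-encoded POST. The Python sketch below only builds the request and interprets the reply, it does not send anything. The endpoint shape and field names follow my understanding of the public Akismet documentation and should be verified there before use; the helper names `build_comment_check` and `is_spam` are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_comment_check(api_key, blog_url, user_ip, user_agent, content, author=""):
    """Build (but do not send) a comment-check request.

    Assumption: the key-as-subdomain endpoint and these field names match
    the published API docs; check them before relying on this.
    """
    fields = {
        "blog": blog_url,          # front page of the site being protected
        "user_ip": user_ip,        # IP address of the commenter
        "user_agent": user_agent,  # commenter's browser user agent
        "comment_type": "comment",
        "comment_author": author,
        "comment_content": content,
    }
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    return Request(
        url,
        data=urlencode(fields).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def is_spam(response_body: str) -> bool:
    """The service is expected to answer with the literal text true or false."""
    return response_body.strip().lower() == "true"
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and feeding the decoded body to `is_spam`.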

I must say, this has been one of the more rewarding things I’ve worked on lately — when people tell you they’re able to spend more time with their family because they’re not spending 30 minutes a day dealing with spam it really puts things in perspective. If nothing else, I hope this makes blogging more joyful for at least one person.

Anyway, try it out, install it for a friend, link it on your blog. The more you use it the more effective it becomes. It’s a virtuous cycle that will hopefully curb the spam arms race.

Update: The reviews are starting to come in. Here’s one with stats (from when the service was still in development).

Update Phishing

I just got a spam/phishing email that looks exactly like a Windows Update notification, and every link in the email is to a real Microsoft site, save one. The download link, which I must “Install now to maintain the security of your computer from these vulnerabilities, the most serious of which could allow an attacker to run code on your computer,” goes to a file named Windows-KB835935-SP2-ENU.exe on the domain windowsupdatenow.net. I’m sure the exe will do awful things to whoever falls for this. I hope Microsoft/Scoble get their lawyers on whoever is behind this. I’ll admit that until I noticed the download link domain, the email seemed totally legit.
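The one giveaway here was the link domain, and that check is easy to automate. Below is a rough Python sketch that pulls every link out of an HTML email and flags the ones whose host is not on a trusted list. The function name and the trusted-domain list are assumptions for illustration, and a clean result proves nothing about whether a message is safe.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def suspicious_links(html, trusted_domains):
    """Return links whose host is neither a trusted domain nor a subdomain
    of one. Links with no host at all (relative links) are flagged too."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for link in collector.links:
        host = (urlsplit(link).hostname or "").lower()
        trusted = any(
            host == domain or host.endswith("." + domain)
            for domain in trusted_domains
        )
        if not trusted:
            flagged.append(link)
    return flagged
```

Called with `trusted_domains=["microsoft.com"]`, an email like the one above would come back with only the windowsupdatenow.net link flagged. Matching on the full host suffix matters: a host such as microsoft.com.example.net is not a Microsoft subdomain and gets flagged.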

SpamAssassin 3

I hit some bumps setting the new SpamAssassin up but now that it’s running I’m amazed. I was getting to the point again where a few spams were getting through every day, and that number seems to have been going up, but since installing 3.0 I haven’t seen a single spam in my inbox. The Bayesian identification seems a lot faster and more accurate. Because of a procmail typo this morning about a dozen emails got lost, so if you think one of those may have been you please re-send.

Contact Spam

My contact form, which sends mail to a whitelisted address so I don’t miss any messages, is getting absolutely hammered by spambots. They’re not hitting my comments and the contact form is something I wrote from scratch, but it has received over 200 spams in the past hour. The more they do stupid stuff like this the more data I have to block them in the future.

News.com Leads Blog Communication

This is the coolest thing I’ve seen all year. Check out the HTML of this article I linked a few days ago. Notice anything at the top?

<link rel="pingback" href="http://tb.news.com/p2t.cgi/2100-1032-5368454" />

Houston, we have Pingback support! Let’s dig deeper:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
<rdf:Description
rdf:about="http://news.com.com/2100-1032-5368454.html"
dc:title="Microsoft flip-flop may signal blog clog"
dc:identifier="http://news.com.com/2100-1032-5368454.html"
trackback:ping="http://tb.news.com/tb.cgi/2100-1032-5368454" />
</rdf:RDF>

Ugly as sin, but that’s trackback. It gets better…

A little URI hacking takes us to this page which lists all trackbacks and pingbacks the article received. How cool is that?

It’s my understanding that even though they’ve had the trackback autodiscovery code for a while they’ve been receiving mostly pingbacks, which makes sense given that it’s more fully and elegantly automatic. It would be cool if they could add support for the nascent rel="trackback" discovery method and save themselves the trouble of the RDF hack. Hopefully spammers won’t exploit their trackback server too soon and they can support legacy systems that don’t implement Pingback yet.
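To show why Pingback discovery feels cleaner, here is a rough Python sketch of what a linking blog might do with a page it has already fetched: look for a `<link rel="pingback">` element, and fall back to scraping the `trackback:ping` attribute out of the embedded RDF. The `discover` function and its return shape are hypothetical, and a fuller client would also check the X-Pingback HTTP header, which this sketch ignores.

```python
import re
from html.parser import HTMLParser


class PingbackFinder(HTMLParser):
    """Remembers the href of the first <link rel="pingback"> element."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "pingback" in rels and self.endpoint is None:
            self.endpoint = attributes.get("href")


# The trackback RDF is often wrapped in an HTML comment, so search the raw
# text instead of relying on a real RDF or XML parser.
TRACKBACK_PING = re.compile(r'trackback:ping="([^"]+)"')


def discover(html):
    """Return the pingback and trackback endpoints advertised by a page,
    using None for whichever one is missing."""
    finder = PingbackFinder()
    finder.feed(html)
    match = TRACKBACK_PING.search(html)
    return {
        "pingback": finder.endpoint,
        "trackback": match.group(1) if match else None,
    }
```

The pingback half reads one ordinary HTML element; the trackback half has to fish an attribute out of a block the browser never looks at, which is the "RDF hack" in a nutshell.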

The implications of this are fairly large. News.com is obviously bootstrapping code that will involve their readers with the blog conversation surrounding their articles. How long for other sites to catch up? Will they plug into Technorati or Pubsub next? As far as I know this is the first major media organization to implement Trackback and Pingback. The team at News.com should be commended for their effort and leadership in this area.

Bloggers Declare Bore

Online Journalism Review writes Bloggers Declare War on Comment Spam, but Can They Win? I’m not sure what that has to do with journalism, but they talk to the same old people and read the same old sites and (not surprisingly) come to the same old tired conclusions. I’m trying to figure it out because I like everyone the article refers to and the article itself is well-written, but it feels very contrived. I think it may be because it draws a lot from blog material a year or more old, and selectively, like the writer had an agenda and Googled until there were enough quotes to fill the space. For example Mark Pilgrim’s blog is called “comment-free” when the entry on the front page for the last three weeks clearly has comments. Is it too much to ask to look at the front page of a blog you’re quoting? The article talks about Blogger redirecting URIs but not about Blogger’s registration aspect. It talks about Typekey but not the PATRIOT act. (Totally kidding there.)

You probably saw this coming from me, but most of all I think it’s silly that they don’t mention a single one of the dozens of other blogging systems that deal effectively with these issues every day. You can’t discuss the Movable Type spam epidemic without talking about people like Molly who tried everything out there including MT-Blacklist to no avail, then switched software and got on with their lives. There is a lot more to the story, but that’s been the conversation over the past year and a lot has come of it. The essence of blogging is communication and comments are here to stay; it’s just a matter of moderation.