Michael Krotscheck has an interesting post called Friends don’t let Friends use TypePad, which apparently ruffled some feathers and elicited a pretty venomous response from a Six Apart Vice President. I guess this is part of their new plan to “compete” but statements like “TypePad simply blows WordPress.com away on SEO” and “On WordPress.com, you’re kind of moving into a bad neighborhood — by their own admission, one-third of the blogs on WordPress.com are spam” don’t exactly lend credibility. Michael responded eloquently in a comment and then again in a follow-up post. Lloyd has jumped in with some specific facts on Typepad’s (lack of) SEO. In the meantime we just turned on sitemaps for everybody on WordPress.com, a popular user request.
Category Archives: Spam
SecurityFocus SQL Injection Bogus
Since people are asking, this so-called alert on Security Focus appears to be completely false and has no information that an attacker or the WordPress developers could use. It is completely content-free, except for making claims that every version of WP since 2.0 is vulnerable.
Online, apparently, it’s fine for someone to run into a crowded theatre and yell “fire” and the less basis there is in fact the more people link to them. It’s not uncommon to see crying-wolf reports like the above several times in a week, and a big part of what the WP security team does is sift through things to see what’s valid or not.
A valid security report looks like this: it usually includes sample code and a detailed description of the problem. The WP security team was notified of the KSES problem and it was fixed in 2.5. You can impress your friends by saying whether a security report is valid or not, so it’s a good critical faculty to pick up.
All that said, there is a wave of attacks going around targeting old WordPress blogs, particularly those on the 2.1 or 2.2 branch. They’re exploiting problems that have been fixed for a year or more. This typically manifests itself through hidden spam being put on your site, either in a post or in a directory, and people notice when they get dropped from Google. (Google will drop your site if it contains links they consider spammy; you’ll remember this is one of the main reasons I came out against sponsored themes.) Google also has some guidelines on what to do if your site is hacked. If I were to suggest WordPress-specific ones, I would say:
OpenID and Spam
Magnolia is going to be restricting their signups to only OpenID users:
Why? Because 75% of new accounts being created there lately have been created by spammers using automated tools. Spammers took over Ma.gnolia. Now, the company is using OpenID as a system of 3rd party verified identity and using the superior spam blocking skills of services like Yahoo! and AIM to clean up the Ma.gnolia ranks. Spamfighting could be the incentive that puts many other vendors over the edge to leverage OpenID.
At best this is a Club solution, meaning it’ll be effective as long as Magnolia is not a worthwhile enough target or not enough people use the technique.
Anyone advocating that a Yahoo, Google, or AOL account is going to stop spam signups, sploggers, or anything of the sort is out of touch with the dark side of the internet. The going rate for a valid Google account is about a penny each. For $100 you get a text file with 10,000 valid logins and passwords, and go to town. We used to require email verification to sign up for WordPress.com, and the vast majority of splogs were coming from Gmail or Yahoo email addresses, hundreds of thousands of them. MySpace and ICQ are both good examples of completely closed identity systems with registration barriers that are still overrun with spam.
Each of the big guys probably has an anti-abuse team larger than all of Magnolia fighting these spam signups, but it obviously hasn’t been effective. In theory you could blacklist OpenID providers, but who’s going to block Google and Yahoo? Even if they did, they’re just pushing the problem outward, to the point where spammers eventually run their own identity providers. And if you think they won’t come from millions of unique registered domains, look at your comment spam queue.
OpenID has a ton of promise for the web — let’s not hurt it by setting people up for disappointment by telling them it’s a spam blocker when it’s not. Regardless of registration, identity verification, or CAPTCHA, you still need something working at the content level to block spam.
Percentage of Splogs
I’ve been quoted in a few places as saying a third of blogs are spam. Someone came up with this from my saying we’ve axed around 800,000 splogs on WordPress.com, and comparing that with our number of blogs, which is 2.5m.
As for what percentage of the total blogosphere, reported by Technorati as north of 100 million blogs, is splogs, I’d say the number is much higher, probably 80%. This isn’t as bad as it sounds; I just think spammers are very effective at creating hundreds of thousands to millions of blogs, they tend to stick around, and I feel like Technorati’s number doesn’t adequately scrub these out.
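The arithmetic behind the one-third figure is presumably something like the sketch below. Whether the 2.5m count still includes the deleted splogs isn't stated, so both readings are shown as assumptions, and the function name is just for illustration.

```python
def splog_share(removed: int, total: int) -> float:
    """Fraction of `total` blogs accounted for by `removed` splogs."""
    return removed / total


removed_splogs = 800_000   # "around 800,000 splogs" axed
blog_count = 2_500_000     # "number of blogs, which is 2.5m"

# Reading 1 (assumption): the 2.5m figure still counts the removed splogs.
share_if_included = splog_share(removed_splogs, blog_count)

# Reading 2 (assumption): the 2.5m figure excludes them, so add them back
# to get the number of blogs ever created.
share_if_excluded = splog_share(removed_splogs, blog_count + removed_splogs)

print(f"{share_if_included:.0%}")  # 32%
print(f"{share_if_excluded:.0%}")  # 24%
```

Either way, the number describes blogs that were caught and removed on one service, not the share of live blogs that are spam.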
While I’m making data-less estimates, I’d say there are about 25-30 million non-spam blogs, and about 8-14 million of those are active in terms of getting traffic or new posts. You could cover a meaningful portion of the blogosphere by just indexing 4 or 5 million blogs.
Splogs and blogger attrition are two problems no one really talks about, but that’s okay because I don’t think either is hindering anyone’s growth as measured by metrics that matter, like pageviews or uniques. (Though many of the services supporting so many splogs must have an inordinate amount of resources devoted to them.)
See also: Blog Ping and Spam Statistics, WordPress.com February wrap-up.
TechCrunch’s Social Responsibility
Mike Arrington on TechCrunch did an interesting thing a few days ago: he asked their readers if they should accept advertising from PayPerPost/Izea. Their readers made the right decision and voted that it would be disingenuous to accept advertising from a company that, in Michael’s words, pollutes the blogosphere. He also notes that TechCrunch is being held to a higher standard than most mainstream media would be:
The comments that are most interesting to me are the ones that say we’re selling out if we take their advertising. I understand that we are held to a certain standard (and we hold ourselves to that standard), but it’s interesting that we supposed to do things that would never be asked of MSM.
While I’m sure there are mainstream media outlets that turn away advertisers for social reasons, the point that we should hold flagship blogs to high standards is a good one.
On that point, I would encourage the crew at TechCrunch to re-examine their advertising and implicit endorsement of Text Link Ads, which pollutes the blogosphere in the same way PayPerPost does, by selling links with the intention of gaming Google. PayPerPost “posties” were recently penalized by Google, and PageRank was one of the criteria that advertisers looked for when choosing which bloggers to give money to; Text Link Ads has been doing the same thing for years and has just been more explicit about it. (And their corporate site has been penalized in Google for a long time.)
I should also note that if TechCrunch decides that the reasons it decided not to accept advertising from Izea also apply to Text Link Ads, it’ll be operating at a higher standard than Google itself, which, even though its business is directly impacted by the search engine spamming both of these companies practice, allows both TLA and PPP to advertise via AdWords and AdSense.
Guardian on Splogs
The Guardian: Why Google is the service of choice for sploggers examines spam, splogs, Blogger, and WordPress.com. As you may tell from the title, it’s overly harsh on Google, but nonetheless has some interesting commentary and information. Like I said last time someone wrote about this, I would never suggest WP.com is splog-free because I delete too many of them myself, but it is a problem we take very seriously and are ever vigilant against.
Love and Hate
One of my favorite funny graphics from the on-hiatus Creating Passionate Users was this one from the entry Be brave or go home. Because of this entry on my blog a few days ago, the part of the blogosphere that makes money from ad-embedded themes has been viciously attacking me personally. Attempted assassinations are never fun, at least for the person on the receiving end, but overall I’m happy for a few reasons:
- Some of the paid links in themes are to the same URLs I see in Akismet, so I know that there is at least some overlap between the people financing these themes and attacking our blogs, and any way we can fight them is good.
- I know that this is something the majority of the WordPress community has voted for.
- I am hopeful we’ll stop seeing threads like this in the support forum. “I installed the ecologici theme found here [link to wordpress.net] I customized it, no problems. I went to add my scripts to the footer and found this code…”
- The attacks sting less when they’re from people who have significant financial interests in seeing sponsored themes continue. They’re just trying to protect their money.
- That they’re making so much noise is an indication we’re doing something meaningful.
- The attacks sting less when they’re from people with questionable personal practices. [1]
Still, there is a lot of hard work ahead.
[1] For example, while reading one attack post from “Franky” on a blog called Wisdump (didn’t that used to be run by the awesome Paul Scrivens?), I noticed it was loading a little slowly, and then I saw pingomatic.com in my address bar. I looked at his source and saw he had embedded a 1×1 pixel iframe loading the ping page for Ping-O-Matic on every one of his pages. I must admit this is clever: it utilizes the distributed network of everyone who visits your site to attack Ping-O-Matic and spam the ping servers, and of course IP blocking is useless because it’s coming from the regular folks on your site. But it is also extremely skeevy. (And I believe a little bit of JS on the ping page should fix that right up.)
Startup Essentials Update
Today I talked with Kevin Olsen from Sun’s Startup Essentials program a bit about what happened in our case. He said they were pretty overwhelmed with the first couple of weeks of inquiries, and it sounds like they’re doing them by hand. About 20% of their emails didn’t get through to applicants because they were caught in spam filters. (They’ve started calling and snail mailing to get around that.) Finally, someone had mistyped the name of our company as Automatic instead of Automattic (two Ts), and a company that was older than 4 years had applied as “Automatic” and was rejected, and somehow our application got caught up in that.
Wikipedia Nofollows
Wikipedia has decided to nofollow all external links to help offset people spamming the service. In theory this should work perfectly, but in practice all major blogging tools did this two years ago and comment and trackback spam is still 100 times worse now. In hindsight, I don’t think nofollow had much of an effect, though I’m still glad we tried it.
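For anyone who hasn't looked at it closely, nofollow is just a rel attribute added to a link. Here is a minimal sketch of the idea in Python. The helper name and the regex approach are assumptions made for illustration, not how Wikipedia or any blogging tool actually implements it, and it assumes ordinary <a ...> tags with quoted href values.

```python
import re
from urllib.parse import urlparse

A_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
HREF = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)
REL = re.compile(r"\brel\s*=", re.IGNORECASE)


def nofollow_external(html: str, own_host: str) -> str:
    """Add rel="nofollow" to links pointing away from own_host (hypothetical helper)."""

    def fix(match):
        tag = match.group(0)
        href = HREF.search(tag)
        if href is None:
            return tag  # no quoted href, nothing to do
        host = urlparse(href.group(1)).netloc.lower()
        if host in ("", own_host.lower()):
            return tag  # relative or same-site link, leave it followed
        if REL.search(tag):
            return tag  # already has a rel attribute; this sketch leaves it alone
        return tag[:-1].rstrip() + ' rel="nofollow">'

    return A_TAG.sub(fix, html)


sample = '<a href="/wiki/Spam">in</a> <a href="http://example.com/">out</a>'
print(nofollow_external(sample, "en.wikipedia.org"))
```

Only the second link in the sample is rewritten, since the first is a relative link on the same site.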
Spammers Hack Blogs
Blog spammers have sunk to new lows.
Nivi, a blog I’m subscribed to, was showing dozens and dozens of entries being updated even though there was no discernible difference. However, as I started looking closer, I noticed that if you view the source, for example on this post, there are a ton of spam links there. You can click the screenshot to the left.
The implications of this are disturbing. His blog was hacked (which isn’t unusual and could have been for a thousand reasons, like another account on his server being hacked, or an old version of phpBB or other software), but instead of doing anything obvious to disturb the content of the site they invisibly modified his posts using CSS-hidden text. He has probably had hundreds of posts modified. I can’t imagine cleaning it up will be pleasant.
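Checking posts for this kind of damage can be sketched roughly as follows. This is a minimal illustration, not the method used on the blog in question. It only looks at inline style attributes (display:none or visibility:hidden), and the class and function names are hypothetical. Real hidden spam can also hide behind stylesheets, off-screen positioning, or tiny fonts, which this will miss.

```python
import re
from html.parser import HTMLParser

HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.IGNORECASE)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenLinkFinder(HTMLParser):
    """Collects hrefs of links that sit inside an inline-hidden element."""

    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, hidden_by_own_style) for each open element
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        hidden_here = bool(HIDDEN_STYLE.search(attrs.get("style") or ""))
        inside_hidden = any(hidden for _, hidden in self.stack)
        if tag == "a" and attrs.get("href") and (hidden_here or inside_hidden):
            self.hidden_links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((tag, hidden_here))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating unclosed tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def find_hidden_links(post_html: str) -> list:
    finder = HiddenLinkFinder()
    finder.feed(post_html)
    finder.close()
    return finder.hidden_links


post = ('<p>Real content</p>'
        '<div style="display:none"><a href="http://spam.example/pills">pills</a></div>'
        '<a href="http://ok.example/">a normal link</a>')
print(find_hidden_links(post))
```

Run over every post body pulled from the database, a check like this would at least give a list of entries to inspect by hand.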
bbPress Goes Gold
The big news today is bbPress 0.72 “Bix” has been released, the first officially released version of the blazingly fast forum software 2 years in the making. It includes some of the things WordPress has become well-known for, like spam protection, easy extensibility, and WP-like customization.
Wikipedia Spam
Sometimes I’m amazed at how much manual labor the Wikipedia uses. For example, how long can this type of spam protection go on before it becomes overwhelming?
Typepad Splogs
I just wanted to give a quick kudos to the Typepad folks for being one of the best in the industry when it comes to dealing with splogs. Since they’re a paid service I don’t come across splogs on Typepad very often, but when I do their support is easy to contact, very responsive (I had a reply from “Carla” within 3 hours), and they obviously understand the problem and how to deal with it.
Plaxo Revisited
It recently became more important for me to sync my address book across several computers on various platforms. Solutions like LDAP seemed like a pain and had bad support in Thunderbird. I don’t want to go to a hosted app like Joyent or Zimbra, and I need to be able to work offline. Anyway in my searches I came across Plaxo. In the past I grew to hate the Plaxo contact update spam I used to get every day, so I had pretty much permanently written it off.
However, this time when I saw they had support for Thunderbird, Mac OS X address book, and Yahoo, I got pretty excited. I tried it out, and I am now syncing a Mac Mini, a Powerbook, a Macbook, my Windows desktop, and a Vaio laptop to a single address book. It cleaned up dupes pretty well, and the online interface is surprisingly usable as well. This is also the best way I know of to get Thunderbird to use the OS X address book, so you get integration with all the other apps like Adium which feed off that.
What could be improved? Sync is really hard, and few do it well. My experience with Plaxo has been pretty good thus far—I think I’ve avoided spamming anyone for contact updates—and I’d love to connect other bits and pieces into the Plaxo cloud. They should open up their API so developers can start to integrate the system into other products and services, and it can become a de facto standard.
Update: They do have an API, I had just missed it. Cool!
MySpace Spam
This is an example of a MySpace spam profile. It’s very convincing; see if you can spot the ad. I think this phenomenon is under-reported. They are using data from your profile (location, age, romantic preferences) to highly target messages and “adds.” Seventeen hundred friends. It would be interesting to know whether the growth of spam on social networks like MySpace is as high as on email or comments. The incentives are there.