Search Engine Markshowdown

I decided to run the web page analyzer (excellent tool) against the front pages of a few of the latest and greatest search engines and also do a little analysis of my own. Here are some of the results in one of the only tables you’ll ever see on this site:

  Feedster Technorati Google Yahoo Search
HTML 6.11 3.72 1.18 7.82
Ext. CSS 11.47 11.63 0 1.45
Other 9.10 6.70 15.10 1.72
Total 26.70 22.05 16.27 11.00
Compressed No No Yes No

Numbers are kilobytes, and may not add up exactly due to rounding. CSS is external, linked files. “Other” includes images and javascript.

Yahoo was the surprise winner here. Their HTML was alright but I think could be reduced quite a bit without losing anything. You’ll note they have the heaviest HTML of the bunch, heavier than other sites showing quite a bit more on their front page. They should probably talk to Doug. Overall though I think Yahoo has consistently been doing great nearly-standards-compliant work in their new designs. Yahoo could save about 67% of their HTML size with compression. Interestingly, Yahoo was the only site to specify ISO-8859-1 encoding, all the others claimed UTF-8.

Google was optimized to the hilt, but it’s kind of silly that they put so much effort into their markup but couldn’t go the last inch and make it valid HTML 4. They could probably make it a bit smaller with some more intelligent CSS usage. At least they don’t have font tags anymore. I think under normal circumstances they would have won but they have an olympic logo right now that’s pretty heavy. Google was the only site that used gzip compression for their HTML, but even uncompressed they only weighed in at about 2.4 kilobytes, still the lightest of the group.

Technorati clearly had the smartest markup of the group, and was the only one that validated. (An impressive feat for any website in this day and age.) Their markup is clean as a whistle with excellent structure and logic, and their numbers aren’t bad when you consider that they have a lot of stuff on their front page. This isn’t too surprising since Tantek did it. Their CSS, however, is pretty heavy. It’s strange because it’s very optimized in some ways but bloated in others, I think they could cut a few K from it pretty easily. One smart thing they did is have the CSS named with the date, so it’s name versioned and they can update it monthly without caching issues. All that said, they’re so far ahead of everything else they don’t need to worry about much. Technorati could save about 53% of their XHTML size with compression.

Feedster has its heart in the right place, but the implementation falls far short. For example it has a XHTML 1.1 doctype but then has the needless XML declaration at the top throwing IE into quirks mode. They use CSS in places, but then they have a table with 75 non-breaking spaces in it for positioning. There’s a ton needless markup, including a full kilobyte of HTML comments. On the bright side, they have the most room to improve. Feedster could save about 61% of their XHTML size with compression.

8 thoughts on “Search Engine Markshowdown

  1. You’ll note they have the heaviest HTML of the bunch, heavier than other sites showing quite a bit more on their front page. They should probably talk to Doug.

    Hey, what about lil’ ol’ me? (sniff, sniff)

  2. Eric, did you leave off the closing <blockquote> to spite me? 🙂 Doug’s Yahoo markover came to mind when writing this but any company serious about getting their code up to snuff should ping you. (I know a few interesting ones you can’t mention have.)

  3. That’s all fine and dandy, but what about cached files? For search engines explicitly, 99% of their hits are returning users. Thus Google wins by a landslide at a mere 1.18k and putting Yahoo down at dead last place. External CSS files and images are cached. Seems to me a vital piece of information that was unaccounted for =/

  4. I like Google’s interface, but now when I look at Yahoo!, it looks quite similar too. Cache is also available for Yahoo!. I don’t mind using either of them now.