I decided to run the web page analyzer (excellent tool) against the front pages of a few of the latest and greatest search engines and also do a little analysis of my own. Here are some of the results in one of the only tables you’ll ever see on this site:
Yahoo was the surprise winner here. Their HTML was alright but I think could be reduced quite a bit without losing anything. You’ll note they have the heaviest HTML of the bunch, heavier than other sites showing quite a bit more on their front page. They should probably talk to Doug. Overall though I think Yahoo has consistently been doing great nearly-standards-compliant work in their new designs. Yahoo could save about 67% of their HTML size with compression. Interestingly, Yahoo was the only site to specify ISO-8859-1 encoding, all the others claimed UTF-8.
Google was optimized to the hilt, but it’s kind of silly that they put so much effort into their markup but couldn’t go the last inch and make it valid HTML 4. They could probably make it a bit smaller with some more intelligent CSS usage. At least they don’t have font tags anymore. I think under normal circumstances they would have won but they have an olympic logo right now that’s pretty heavy. Google was the only site that used gzip compression for their HTML, but even uncompressed they only weighed in at about 2.4 kilobytes, still the lightest of the group.
Technorati clearly had the smartest markup of the group, and was the only one that validated. (An impressive feat for any website in this day and age.) Their markup is clean as a whistle with excellent structure and logic, and their numbers aren’t bad when you consider that they have a lot of stuff on their front page. This isn’t too surprising since Tantek did it. Their CSS, however, is pretty heavy. It’s strange because it’s very optimized in some ways but bloated in others, I think they could cut a few K from it pretty easily. One smart thing they did is have the CSS named with the date, so it’s name versioned and they can update it monthly without caching issues. All that said, they’re so far ahead of everything else they don’t need to worry about much. Technorati could save about 53% of their XHTML size with compression.
Feedster has its heart in the right place, but the implementation falls far short. For example it has a XHTML 1.1
doctype but then has the needless XML declaration at the top throwing IE into quirks mode. They use CSS in places, but then they have a table with 75 non-breaking spaces in it for positioning. There’s a ton needless markup, including a full kilobyte of HTML comments. On the bright side, they have the most room to improve. Feedster could save about 61% of their XHTML size with compression.