Sign up for the Google Newsletter, the
Google-Friends mailing list is powered by groups.yahoo.com.
Get Custom
The Get Custom plugin is an easy interface to WordPress’ infinitely flexible custom fields system.
Matrix Reloaded Cup
On Milk
I didn’t realize how fast time passes until I started buying milk.
Search Engine Markshowdown
I decided to run the web page analyzer (excellent tool) against the front pages of a few of the latest and greatest search engines and also do a little analysis of my own. Here are some of the results in one of the only tables you’ll ever see on this site:
Metric | Feedster | Technorati | Google | Yahoo Search |
---|---|---|---|---|
HTML | 6.11 | 3.72 | 1.18 | 7.82 |
Ext. CSS | 11.47 | 11.63 | 0 | 1.45 |
Other | 9.10 | 6.70 | 15.10 | 1.72 |
Total | 26.70 | 22.05 | 16.27 | 11.00 |
Compressed | No | No | Yes | No |
Numbers are in kilobytes, and may not add up exactly due to rounding. CSS counts only external, linked files. “Other” includes images and JavaScript.
Yahoo was the surprise winner here. Their HTML was alright but I think it could be reduced quite a bit without losing anything. You’ll note they have the heaviest HTML of the bunch, heavier than other sites showing quite a bit more on their front page. They should probably talk to Doug. Overall though I think Yahoo has consistently been doing great nearly-standards-compliant work in their new designs. Yahoo could save about 67% of their HTML size with compression. Interestingly, Yahoo was the only site to specify ISO-8859-1 encoding; all the others claimed UTF-8.
Google was optimized to the hilt, but it’s kind of silly that they put so much effort into their markup but couldn’t go the last inch and make it valid HTML 4. They could probably make it a bit smaller with some more intelligent CSS usage. At least they don’t have font tags anymore. I think under normal circumstances they would have won but they have an Olympic logo right now that’s pretty heavy. Google was the only site that used gzip compression for their HTML, but even uncompressed they only weighed in at about 2.4 kilobytes, still the lightest of the group.
Technorati clearly had the smartest markup of the group, and was the only one that validated. (An impressive feat for any website in this day and age.) Their markup is clean as a whistle with excellent structure and logic, and their numbers aren’t bad when you consider that they have a lot of stuff on their front page. This isn’t too surprising since Tantek did it. Their CSS, however, is pretty heavy. It’s strange because it’s very optimized in some ways but bloated in others; I think they could cut a few K from it pretty easily. One smart thing they did is have the CSS named with the date, so it’s name versioned and they can update it monthly without caching issues. All that said, they’re so far ahead of everything else they don’t need to worry about much. Technorati could save about 53% of their XHTML size with compression.
Feedster has its heart in the right place, but the implementation falls far short. For example, it has an XHTML 1.1 doctype but then has the needless XML declaration at the top, throwing IE into quirks mode. They use CSS in places, but then they have a table with 75 non-breaking spaces in it for positioning. There’s a ton of needless markup, including a full kilobyte of HTML comments. On the bright side, they have the most room to improve. Feedster could save about 61% of their XHTML size with compression.
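For anyone who wants to estimate compression savings like the ones quoted above, here is a minimal sketch using Python's standard `gzip` module. The function name and the sample markup are made up for illustration, and this is not necessarily how the web page analyzer computes its figures.

```python
import gzip


def compression_savings(markup: str) -> float:
    """Return the percentage of bytes saved by gzip-compressing the markup."""
    raw = markup.encode("utf-8")
    if not raw:
        return 0.0
    packed = gzip.compress(raw)
    return 100.0 * (1 - len(packed) / len(raw))


# Hypothetical sample: repetitive markup padded with non-breaking spaces.
cell = "<td>" + "&nbsp;" * 75 + "</td>\n"
page = "<table>\n<tr>\n" + cell * 20 + "</tr>\n</table>\n"

print(f"{compression_savings(page):.0f}% saved")
```

Repetitive markup like this compresses far better than a real page would, so treat the number as an upper bound rather than a prediction.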
Wired Markup
Campaign Game Mimics Real Life was a decent article and the game looks fun, but what most impressed me was that they used the <cite> element to mark up the game name.
Spam Whoops
I had forgotten to check the spam folder on one of my accounts for a while. Over 67,000 spams caught by SpamAssassin!
IBM Blogs
So Sun uses Jroller and IBM uses WordPress. As Carthik says, this looks like a staging setup.
Adsense Idea
Wouldn’t it be neat if instead of requiring a crawler bot to visit the page the adsense javascript could actually scrape the page text that the user sees (like a screen reader like JAWS) and use that to formulate the ad in real-time? That would prevent people from cloaking pages for the adsense bot and would allow it to be used on pages where you must be logged in to view the content. I know it’s probably impossible, but it’d still be neat.
Platform Buzz
Breakdown of publishing platform buzz, shiny graphs! Forgot “word press”.
Danish Politician Blogs
David Hansson writes in: “Thanks for the link to Loud Thinking. You must have a pretty popular blog because you’re sending tons of people my way 😉 Also thanks for WordPress. I used it to put the first Danish politician in parliament online with a blog at http://auken.nextangle.com.” It’s a small world, and blogs are making it smaller.
Ping-o-Matic!
Pushing a cool ten million. The new database system is working out great.
Blog Appeal
“I’ll have you know, that WordPress is very sexy. Just ask any WP site owner, they’ll say that their sex appeal has increased by a factor of 2 since they moved to WordPress. And you’ve never been moved until you’ve been moved by someone like me.” I can attest to the first part. Shelley is doing WordPress/blog consulting to raise money for a new camera. If you’re in the market for that sort of thing, you may want to drop her a note.
Link Thanks
I just wanted to take a moment to thank those people who give proper attribution (aka a hat tip) when they post about something they found here. More and more lately I’m seeing things that I know started here show up on blogs of people I know and respect with nary a note or link back. Taking the time to properly attribute things can be a drag sometimes, but I think it’s important to maintain the credibility of weblogging as a medium and to reward those who bring new things to light. If you are someone who does properly credit things please know that I appreciate it quite a bit, and I hold you in higher esteem than more “professional” blogs who are sloppy at best with their attribution.
WordPress.org Search
I’ve ripped out the guts and redone the search on the WordPress.org support forums in the hopes of making it something more people will use. Try it out! The new system searches the wiki (hosted on a different machine), thread titles, recent posts, and does a FULLTEXT post search for the most relevant posts. It has contextual search highlighting (like Google).
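Contextual search highlighting can be sketched in a few lines. This is a rough Python illustration, not the code running on the forums: the function name, the `radius` parameter, and the choice of `<strong>` for emphasis are all assumptions.

```python
import html
import re


def highlight_snippet(text: str, query: str, radius: int = 40) -> str:
    """Return an escaped excerpt around the first match, with terms in <strong>."""
    terms = [re.escape(t) for t in query.split() if t]
    if not terms:
        return html.escape(text[: 2 * radius])
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    first = pattern.search(text)
    if first is None:
        return html.escape(text[: 2 * radius])

    # Cut a window of context around the first hit.
    start = max(0, first.start() - radius)
    end = min(len(text), first.end() + radius)
    excerpt = text[start:end]

    # Escape the text between matches separately so the <strong> tags survive.
    parts = []
    pos = 0
    for m in pattern.finditer(excerpt):
        parts.append(html.escape(excerpt[pos:m.start()]))
        parts.append("<strong>" + html.escape(m.group(0)) + "</strong>")
        pos = m.end()
    parts.append(html.escape(excerpt[pos:]))

    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + "".join(parts) + suffix


print(highlight_snippet("How do I change my permalink structure?", "permalink"))
```

Escaping before wrapping matters here, since forum posts routinely contain markup that should not be rendered inside a results page.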
When I have some time to get back to this, every section will have a “more of this” link to take you to more results (paged). It does this currently with the wiki search, counting the total results and linking to the wiki search directly if there are more than 5 results. There are probably still a few bugs to work out. The fulltext query was taking over two seconds to run until I tweaked the JOIN type to get the MySQL optimizer to use the proper index and join order. Everything should validate as XHTML.
A new system is also in place to inject custom results at the top of the page. We’ve been logging searches for the last few months (over 129,000 so far, about 43,000 unique searches) and I’m going to be working closely with the documentation team to identify which searches are most common and what tailored information would be best to present the user with when they search for targeted terms, be it a blog post, an external resource, someplace on WordPress.org itself, a wiki page, or a specific thread. We can watch trends and spikes in searches to identify any problems in the application itself or features that may be insufficiently documented or hard to use.
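Counting total and unique searches, and finding the most common ones, might look something like this in Python. The log format, the normalization rules, and the sample queries are assumptions for illustration; the real log lives in a database.

```python
from collections import Counter


def summarize_searches(queries):
    """Return (total, unique, counts) for an iterable of logged search strings."""
    # Normalize case and whitespace so "Permalinks " and "permalinks" count together.
    normalized = [" ".join(q.lower().split()) for q in queries if q.strip()]
    counts = Counter(normalized)
    return len(normalized), len(counts), counts


# Hypothetical log entries.
log = ["permalinks", "Permalinks", "import from MT", "rss feed", "permalinks  "]
total, unique, counts = summarize_searches(log)
print(total, unique, counts.most_common(1))
```

The `most_common` list is the natural starting point for deciding which terms deserve a hand-picked result at the top of the page.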
The work is far from finished, but I think it’s a strong first step into fully integrating search as a support mechanism and bringing the WordPress team even closer to the pulse of the users.
Keyword Idea
Idea of the day: create ad-hoc keyword caches using referrers from search engines for use with internal searches.
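A rough Python sketch of the idea: pull the search terms out of each referrer and tally them per page, so the internal search can later favor pages that people actually arrived at with those words. The host-to-parameter table, function names, and paths below are illustrative assumptions.

```python
from collections import Counter, defaultdict
from urllib.parse import parse_qs, urlparse

# Assumed mapping of search-engine hosts to the query parameter they use.
QUERY_PARAMS = {"google.com": "q", "search.yahoo.com": "p"}

# page path -> Counter of keywords that led visitors to it
keyword_cache = defaultdict(Counter)


def keywords_from_referrer(referrer: str) -> list:
    """Extract lowercase search keywords from a search-engine referrer URL."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    param = QUERY_PARAMS.get(host)
    if param is None:
        return []
    values = parse_qs(parsed.query).get(param, [])
    return values[0].lower().split() if values else []


def record_visit(path: str, referrer: str) -> None:
    """Add the referrer's keywords to the cache entry for this page."""
    keyword_cache[path].update(keywords_from_referrer(referrer))


record_visit("/docs/permalinks/", "http://www.google.com/search?q=wordpress+permalinks&hl=en")
print(keyword_cache["/docs/permalinks/"])
```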
Google Doesn’t Read XHTML
Google doesn’t index application/xhtml+xml pages, geez what a mess. I prefer the rigidity of XHTML syntax but I don’t want the mess that comes along with it. Hat tip: Petroglyphs. Update: See comments.
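One common workaround is to send application/xhtml+xml only to clients that explicitly say they accept it, and text/html to everyone else, crawlers included. Here is a minimal Python sketch of that check; the function name is made up, and it deliberately ignores wildcards and relative q-value ordering.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick application/xhtml+xml only when the client lists it with q > 0."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "application/xhtml+xml"
    # Wildcards such as */* are not treated as XHTML support.
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(choose_content_type("*/*"))
```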
Gallery: 8-18-2004
Auto-imported from old gallery:
Leap seconds
Dealing with leap seconds, I remember the days when 60 * 60 * 24 was good enough for anyone.
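To see where 60 * 60 * 24 falls short, here is a small Python sketch. Python's `datetime` ignores leap seconds, so the code adds them back from a hand-written table. The table is deliberately partial (two entries only) and the function name is made up; a real implementation would need the full published list.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 60 * 60 * 24  # 86400, good enough for most days

# Partial table (assumption: NOT a complete list). Each entry is the UTC
# instant immediately after an inserted leap second.
LEAP_SECONDS_AFTER = [
    datetime(1997, 7, 1, tzinfo=timezone.utc),
    datetime(1999, 1, 1, tzinfo=timezone.utc),
]


def elapsed_seconds(start: datetime, end: datetime) -> int:
    """Elapsed whole seconds between two UTC datetimes, counting known leap seconds."""
    naive = int((end - start).total_seconds())
    leaps = sum(1 for t in LEAP_SECONDS_AFTER if start < t <= end)
    return naive + leaps


day_start = datetime(1998, 12, 31, tzinfo=timezone.utc)
day_end = datetime(1999, 1, 1, tzinfo=timezone.utc)
print(SECONDS_PER_DAY, elapsed_seconds(day_start, day_end))
```

December 31, 1998 ended with an inserted leap second, so that UTC day was one second longer than the constant suggests.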
WordPress and Smarty
Donncha looks at Smarty and WP, again. It’s been done before, but it’d be interesting to have a method that could automatically stay current with the native PHP methods.