11 Comments

  • Matt August 18, 2004 @ 12:29 am

    Thanks to Dinah for pointing out the XHTML error in this post.

  • Jacques Distler August 18, 2004 @ 1:39 am

    That’s old news, now fixed.

    You can easily check that with any old Google search which yields hits on my blog. For a while, none of my pages made it into the Google cache (they were, in Google-parlance, “partially-indexed”). Now all are there. The fix came about 4 months ago, IIRC.

    There are many reasons to fear sending application/xhtml+xml.

    This isn’t one of them.

  • Anne August 18, 2004 @ 1:54 am

    Indeed. Try this: search! Fortunately, it seems Google doesn’t support hexadecimal references :-)

  • Anne August 18, 2004 @ 1:55 am

    Well, actually, it doesn’t really have native support, I guess (unknown format), but it does index the content a bit. (I’m not sure about recognizing markup and giving more value to H1 et cetera.)

  • Matt August 18, 2004 @ 1:56 am

    Thank goodness. :) Thanks for the corrections.

  • Dave August 18, 2004 @ 4:37 am

    I’m confused, Anne’s link shows:

    Contact
    File Format: Unrecognized – View as HTML
    Contacteer ons. Limpid. > Rozenstraat 12-B. > 3702 VN Zeist. > Tel.:
    +31 (0)6 228 362 87. > info@li …
    annevankesteren.nl/test/ examples/css/advanced/limpid-2.xhtml – Similar pages

    Which would leave me feeling a little worried that Google isn’t properly indexing the content….
    Seeing a result like this in Google would confuse a novice user too. Hell, I think I would be turned off from clicking it…

  • Anne August 18, 2004 @ 8:26 am

    You get that result since you didn’t search for anything specific. If you use keywords instead of the location, it will give back something more useful. Note also that the page in question is a bit weird.

  • Wayne August 18, 2004 @ 8:58 pm

    Google seems to be indexing application/xhtml+xml files, but they’re clearly handling them differently. Files on my server with the .xhtml extension are always served as application/xhtml+xml. Google extracts metadata from my site’s index.html file — including the title and a content summary — but provides no information about my .xhtml alternative. In addition, Google has properly cached the .html version, but not the .xhtml version.

  • Peter J. August 19, 2004 @ 5:57 pm

    OK, there’s something funny going on. From request to request Google seems to change its behaviour. It’s been by the URL in my original post three times, the first time getting a 406, the second a 200, and the third a 406 again. All told yesterday I received seven requests that resulted in a 406 and twenty that resulted in 200, across 24 unique pages. Are any of the other a/x+x senders seeing similar behaviour?

  • Chris Watson a.k.a. "The Bicycling Guitarist" October 21, 2004 @ 6:30 pm

    Before converting my site from HTML 4 transitional to XHTML 1.0 strict code, Google listed dozens of my pages. Soon after I did that, it dropped me completely. It’s been a year and a half and Google STILL doesn’t crawl my site. Googlebot says 406 error when it goes to my home page, even though the bots of many other search engines have no trouble. My code is stricter than strict, ultra validated, and served as text/html according to the rules of Appendix C for backwards compatibility. If interested, my site is http://www.TheBicyclingGuitarist.net
    Chris Watson a.k.a. “The Bicycling Guitarist”

  • Aleksandersen August 6, 2006 @ 11:23 pm

    Is this issue addressed by Google now?

    Does Googlebot support application/xhtml+xml now?
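
Several comments above describe a crawler getting a 406 from a site that negotiates on the Accept header. As a rough illustration only, here is a minimal sketch of negotiation that falls back to text/html instead of answering 406 when the client does not explicitly list application/xhtml+xml. The function names `parse_accept` and `choose_content_type` are hypothetical, and this is not a reconstruction of any commenter's actual server setup.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header):
    """Return {media_type: q} for an HTTP Accept header value (simplified)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header):
    """Pick XHTML only when the client names it explicitly; never refuse."""
    prefs = parse_accept(accept_header or "")
    if prefs.get(XHTML, 0.0) > 0.0:
        return XHTML
    # Assumption: text/html is a safe fallback (e.g. Appendix C style markup),
    # so a client that sends no Accept header, or only */*, still gets a 200.
    return HTML
```

A client sending `text/html,application/xhtml+xml;q=0.9` would be given application/xhtml+xml under this sketch, while one sending only `*/*` or no Accept header at all would be given text/html rather than an error.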
