Monthly Archives: September 2003

Keeping Links Kosher

As part of the revamp I’ve put together a 404 script that emails me whenever it’s called. This has certainly been an eye-opener as to the misguided traffic that this site gets. Getting an email is so different from just seeing the hits in your logs, and I would recommend that anyone serious about maintaining a site do something similar.

There are a few links to my old curly quotes entry that point to a rather funky perversion of a fly-by-night URI scheme that has long since gone by the wayside. These links worked just fine until I deleted the file that was keeping things going. Now it’s time to move things into the magic .htaccess file.

Let’s take a look at the URI in question:
http://www.photomatt.net/archives/m/200209?p=186

My first thought was to just take the latter part of the request and create a rule just for this link, like so:

Redirect Permanent /archives/m/200209?p=186 http://photomatt.net/p186

Didn’t work, never matched. Next try I decided to go for something a little broader:

RedirectMatch 301 .*p=([0-9]+) http://photomatt.net/p$1

Didn’t work, never matched. Some research found that the problem lies in the query string: the Apache redirect directives match only the path part of the URI and never see the query string. So let’s give mod_rewrite a go:

RewriteRule ^.*p=([0-9]+) http://photomatt.net/p$1 [R=301,L]

Still no luck. (For those who wonder, the HTTP response code 301 indicates that the resource has been permanently moved. “Permanent” in the first try is just a synonym of “301”.) It looks like the RewriteRule pattern of the magical mod_rewrite doesn’t match query strings either. Some more research turned up that while Redirect doesn’t match or rewrite query strings, it does pass them along untouched. So we are left with:

Redirect Permanent /archives/m/200209 http://photomatt.net/

Which, counter-intuitively, works. The ?p=186 on the end is just passed to the root of the site, which gives it to my index file, which knows just what to do with it. I would like to eliminate the query string entirely and forward the URI to http://photomatt.net/p186, but while that would be trivial in any scripting language I can’t nail down how to do it on the Apache or mod_rewrite level. So one option now is to add something to the global header to catch p=something query strings and redirect them, but I’d like to keep that file clean. More likely I’ll start adding some URI management code to the 404 handler and make that file more sophisticated over time. We’ll see.
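
For what it’s worth, mod_rewrite does appear to offer a way in. The RewriteRule pattern only ever sees the path, but a RewriteCond can test the %{QUERY_STRING} variable, and whatever it captures is available to the rule as %1, %2 and so on. What follows is a minimal sketch, not something tested against this site: it assumes mod_rewrite is enabled, that the rules live in the .htaccess at the site root, and that they replace the Redirect line above rather than sit next to it.

```apache
# Sketch only: assumes mod_rewrite is enabled and this sits in the
# site root's .htaccess, in place of the Redirect line for this path.
RewriteEngine On

# RewriteRule patterns see only the path, so test the query string here.
# %2 refers to the second group of this condition: the digits after p=.
RewriteCond %{QUERY_STRING} (^|&)p=([0-9]+)(&|$)

# No leading slash in .htaccess context. The trailing ? on the target
# drops the original query string from the redirect.
RewriteRule ^archives/m/200209$ http://photomatt.net/p%2? [R=301,L]
```

With something like that in place, a request for /archives/m/200209?p=186 should come back as a 301 pointing at http://photomatt.net/p186, with no query string left hanging off the end.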