RSS Requests and Browsers

Out of curiosity I ran some stats on the different RSS versions I offer here. The results were pretty much what I expected:

  1. 47 % — RSS 2
  2. 39% — RSS .92
  3. 14% — RDF 1.0

Also as an update to a previous look, I don’t know what it was about Mozilla that month (August). Here’s what is happening currently:

  1. 38% — Internet Explorer
  2. 36% — Netscape/Mozilla
  3. 6% — Googlebot
  4. 3% — Safari

That’s pretty much par for the course. One interesting note is that IE6 users seem to spend the most time on the site, for whatever that’s worth. As before, let me know how these things stack up in your neck of the woods.

Wildcard DNS and Sub Domains

What follows is what I consider to be best practice for my personal sites and a guide for those who wish to do the same. Months ago I dropped the www. prefix from my domain in part because I think it’s redundant and also because I wanted to experiment with how Google treated valid HTTP redirect codes. The experiment has been a great success. Google seems to fully respect 301 Permanent Redirects, and since the change my previously split PageRank has been combined and I am now at 7. There are other factors that have contributed to this, of course, and people still continue to link to my site and posts with a www. (or worse) in front of it, but overall it just feels so much cleaner to have one URI for one resource, all the time. I’m sure that’s the wrong way to say that, but the feeling is there nonetheless.

Now for the meat. What’s a good way to do this? Let’s look at our goals:

  • No links should break.
  • Visitors should be redirected using a permanent redirect, HTTP code 301, meaning that the address bar should update and intelligent user agents may change a stored URI.
  • It should be transparent to the user.
  • It should also work for mistyped “sub domains” such as ww. or wwww. (I still get hits from Carrie’s bad link)

So we need a little magic in DNS and in our web server. In my case these are Bind and Apache. I am writing about this because at some point the code I put in to catch any subdomain stopped working and while I reimplemented it I decided to write about what I was doing. This method also works with virtual hosts on shared IPs where my previous method did not.

In Bind you need to set up a wildcard entry to catch anything that a misguided user or bad typist might enter in front of your domain name. Just as you use an asterisk (or splat) to match any number of any characters when searching or using regular expressions, the same thing applies in Bind. So at the end of my zone DB file (/var/named/photomatt.net.db) I added the following line:

*.photomatt.net. 14400 IN A 64.246.62.114

Note the period after my domain. The IP is my shared IP address. That’s all you need. Now restart Bind. (For me, /etc/init.d/named restart.)
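To make the wildcard idea concrete, here is a rough Python sketch of the lookup logic. It is not how Bind is implemented, and real DNS wildcard rules have more corner cases; the ZONE dictionary and the resolve function are made up for illustration, and only the wildcard line and the IP come from the zone file above.

```python
# Hypothetical, simplified zone: an exact name wins, the wildcard catches the rest.
ZONE = {
    "photomatt.net.": "64.246.62.114",
    "*.photomatt.net.": "64.246.62.114",
}


def resolve(name, zone=ZONE):
    """Return the A record for name, falling back to a wildcard entry."""
    name = name.lower()
    if not name.endswith("."):
        # Treat the name as fully qualified, like the trailing period in the zone file.
        name += "."
    if name in zone:
        return zone[name]
    # Walk up the name: ww.photomatt.net. -> *.photomatt.net. -> *.net.
    labels = name.split(".")
    for i in range(1, len(labels) - 1):
        candidate = "*." + ".".join(labels[i:])
        if candidate in zone:
            return zone[candidate]
    return None
```

With this sketch, resolve("ww.photomatt.net") and resolve("wwww.photomatt.net") both land on the shared IP, while a name outside the zone returns None.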

Now you need to set up Apache to respond to requests on any hostname under photomatt.net. Before, I just used the convenience of having a dedicated IP for this site and having the redirect VirtualHost entry occur first in my httpd.conf file. That works, but I have a better solution now. So we want to tell Apache to respond to any request on any subdomain (that does not already have an existing subdomain entry) and redirect it to photomatt.net. Here’s what I have:

<VirtualHost 64.246.62.114>
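    # Catch www. and any other hostname under photomatt.net that has no entry of its own
    ServerName www.photomatt.net
    ServerAlias *.photomatt.net
    DocumentRoot /home/photomat/public_html
    BytesLog domlogs/photomatt.net-bytes_log
    CustomLog domlogs/photomatt.net combined
    # Permanently (301) redirect every path to the same path on photomatt.net
    RedirectMatch 301 (.*) http://photomatt.net$1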
</VirtualHost>

The two magic lines are the ServerAlias directive, which is self-explanatory, and the RedirectMatch line, which permanently redirects every request to the same path on photomatt.net.
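If the RedirectMatch line looks cryptic, this small Python sketch mimics what it does with a request path. It is only an illustration of the pattern and substitution; the function name redirect_for is invented here, and Apache of course does this internally.

```python
import re

CANONICAL = "http://photomatt.net"  # target taken from the RedirectMatch line


def redirect_for(path):
    """Mimic `RedirectMatch 301 (.*) http://photomatt.net$1` for a request path."""
    match = re.match(r"(.*)", path)      # (.*) captures the whole path
    location = CANONICAL + match.group(1)  # $1 is replaced by the captured path
    return 301, location
```

So a request for /p644 on www.photomatt.net would be answered with a 301 pointing at http://photomatt.net/p644, which is what keeps old links from breaking.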

There is a catch, though. The redirecting VirtualHost entry must come after any valid subdomain VirtualHost entries you may have. For example, I have one for cvs.photomatt.net, and I had to move that entry up in httpd.conf because Apache just moves down the file and uses the first entry it comes to that matches. The wildcard should be last.
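The ordering catch is easier to see in a toy model. The Python sketch below assumes the first-match behaviour described above; the VHOSTS list and pick_vhost are hypothetical stand-ins for the entries in httpd.conf, not anything Apache exposes.

```python
from fnmatch import fnmatch

# Hypothetical list mirroring the order of VirtualHost entries in httpd.conf.
VHOSTS = [
    ("cvs.photomatt.net", "cvs site"),
    ("photomatt.net", "main site"),
    ("*.photomatt.net", "redirect to photomatt.net"),  # wildcard goes last
]


def pick_vhost(host, vhosts=VHOSTS):
    """Return the label of the first entry whose pattern matches the host."""
    for pattern, label in vhosts:
        if fnmatch(host.lower(), pattern):
            return label
    return None
```

With the wildcard last, pick_vhost("cvs.photomatt.net") finds the cvs site. Move the wildcard entry to the top of the list and the same call returns the redirect instead, which is exactly the breakage described above.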

That is it. I’m open to comments and suggestions for improvement.

Beginning On PhotoStack

It’s an entirely pleasant, rainy day, so I thought it would be a wonderful time to get going with Noel’s PhotoStack. I grabbed the latest version available and uploaded it to the server.

Trying to be as true as possible to how I would actually use a program like this, I didn’t read any of the documentation. Plus there’s a readme file, but it has no extension so opening it means no less than three or four dialogs in Windows XP. A .txt extension wouldn’t hurt anybody. It gave me a message that the storage directory wasn’t set up properly, which told me that I probably need to edit a configuration file of some sort. So I fire up SSH. An ls -lah (which I have aliased as ll) shows a config.php, which I guess is what I’m looking for.

I fire up the one true editor. There seems to be a little more at the top than necessary and it doesn’t say much, but that’s a personal peeve. The variable names seem logical (some camelCase going on) but the description above each is not always helpful. Mostly it’s just $photosName. I’ve never used the program before, and the description “The name of your Photos section.” makes sense to me as an English sentence but I don’t quite grok its significance.

Next up is the path information, which could possibly be streamlined. First we have $dirRoot, where PhotoStack seems to want the absolute path to the script. It recommends “$_SERVER['DOCUMENT_ROOT'].'/photos' may work for you.” That makes perfect sense, but I’ve dealt a lot with this in WordPress. More people have messed up DOCUMENT_ROOTs than you could ever imagine, and there are a few other solutions that may be better. One I’ve had good success with is dirname(__FILE__), which works like a charm for finding the absolute path of the directory containing the current file. realpath() may also be helpful, but we use the first trick in WordPress. The next variable is the URI of where PhotoStack is located, with the instruction “no trailing slash.” This is another pet peeve, but an instruction like this should be avoided at all costs. No trailing slash there, so should I have a trailing slash on $dirRoot? It didn’t say anything. It causes confusion. It’s programmatically trivial to detect and remove a trailing slash on this variable, so why even bother the user? Don’t make me think.
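To show how little code the two ideas above take, here is a sketch in Python rather than PHP. The function names are invented for this example and are not part of PhotoStack or WordPress; script_dir is only a rough analogue of dirname(__FILE__).

```python
import os


def script_dir(file_path):
    """Absolute directory containing file_path, roughly dirname(__FILE__) in PHP."""
    return os.path.dirname(os.path.abspath(file_path))


def normalize_base(value):
    """Drop any trailing slashes so the user never has to think about them."""
    return value.rstrip("/")
```

In a real script you would call script_dir(__file__), and run every configured path or URI through normalize_base before using it. Then it no longer matters whether someone typed http://example.com/photos or http://example.com/photos/.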

There are a lot more configuration options, a lot. It suggests replacing “no” with “yes” or vice versa to change the value. While this is probably more intuitive than boolean values of true or false, I think spelling out “yes” or “no” several times is a little patronizing. I know, impossible to please.

Okay so I’m done with the configuration file, I reload the URI. Still doesn’t work! I’m guessing it’s time to go to the readme, probably the storage directory needs to have permissions set or something. There is no wrapping in the file, which means each paragraph stretches really far and to read it I’ll have to scroll horizontally in my editor. Wrapping at 72 characters would probably not be a bad idea. Fortunately I know a shortcut (ctrl + j) to fix this but that’s just luck. I find out the software is licensed under the Creative Commons Attribution-NoDerivs-NonCommercial license, which I suppose means I can never use this for a paying client, and I can’t improve on the code and release the changes without explicit permission from Noel. If Noel fell off a cliff I suppose the code would be locked under the license and development would halt? I don’t know exactly, but that’s what goes through my mind. Generally I’m much more comfortable with more liberal licenses, be it MIT or GPL or Artistic or anything Free. Next it says:

Templates in PhotoStack are not licensed as part of PhotoStack. Therefore, they are not subject to the licensing terms of PhotoStack. I’m placing this decision in the realm of Mr. Allen… If anyone can lend some clarification that would be great.

That doesn’t inspire the greatest confidence, but I’m planning to modify the templates anyway so maybe I should worry. Finally at the end it tells me to chmod 777 the storage directory. Ah, what I needed to know. I have heard some people complain about the liberal chmod requirement before, but it’s really only necessary because of the way most web servers are set up to execute PHP, which I suppose could change in the future. It’s no problem to me. I secure things at a much lower level.

I’m already at the command line so it’s a simple matter to modify the directory. No errors but the pictures don’t load. Whoops, I missed setting the $webDir variable, probably because I was planning to talk about it but went off on a tangent. The default value of this is “http://yoursite.com/photos”, and this is splitting hairs but there are several domains reserved expressly for this purpose (example.com, for one) and an RFC to back them up (RFC 2606). It’s a good practice because since yoursite.com isn’t reserved, it could theoretically be taken by some unsavory character that used its ubiquity in examples in some malicious fashion. You never know.

It loads! However I click on the sample album and there’s something funky going on with the layout. Perhaps it has something to do with the size I set my thumbnails at (150×150, though I would like to just be able to say something like 150 px on its longest side and allow it to keep its proportion, or just 150 px wide all the time. I’m not crazy about every thumbnail being square). Whatever it is, it will have to wait until tomorrow because it is past my bedtime. Hopefully tomorrow I can start loading this thing up with photos.
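The thumbnail sizing wished for above is just a bit of arithmetic. This Python sketch shows both variants; the function names are made up, it is not how PhotoStack sizes anything, and it uses integer division so heights and widths round down.

```python
def fit_longest_side(width, height, longest=150):
    """Scale (width, height) so the longer side equals `longest`, keeping proportion."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    biggest = max(width, height)
    # Integer arithmetic keeps the result predictable; never go below 1 px.
    return (max(1, width * longest // biggest),
            max(1, height * longest // biggest))


def fit_width(width, height, target=150):
    """Scale (width, height) so the width equals `target`, keeping proportion."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return target, max(1, height * target // width)
```

A 1600×1200 landscape photo comes out at 150×112 either way, while a 1200×1600 portrait becomes 112×150 with the first function and 150×200 with the second.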

“Matthew! How could you possibly be so nitpicky with this poor guy’s project? How would you like that if someone did that to your project?” Actually, I would be thankful and flattered. If I didn’t like Noel and think PhotoStack could be great I wouldn’t be spending time documenting my thoughts on it. Constructive feedback is golden to an open-minded developer.

Fray Day 7

I’m going to be participating in Fray Day 7 here in Houston tonight at 8 PM. I’m a “featured speaker” tonight and I’m going to be telling a story I call “The Little Red Button That Changed My Life.” Several friends have already expressed an interest in coming and I’m looking forward to seeing everyone. It should be a lot of fun for everyone involved. I know Robert Nagle has worked very hard in putting all this together, and my Dad told me he heard the event mentioned on the radio Friday morning.

To be honest though I’m scared to death.

Where? The Nexus Cafe (Walden Internet Village) is located on 2828 Rogerdale, 2nd floor (between Richmond and Westheimer).

Two Great Shows

Radiohead was really exciting. The Chronicle has a review, but like most stories there it’s painful to read and I doubt that link will last as long as this entry is on my front page. Our seats were on the lawn and we were a bit to the left and back. We had a good clear view of the stage but couldn’t see too many details, though certainly everything came through. I would have liked to have been closer to see how some of the effects were done, but maybe next time. They went through old and new songs, starting with some of the latest ones from Hail to the Thief and moving forward. There were a few flubs, such as Yorke skipping a section on 2 + 2 = 5 and one or two other minor things that I doubt too many people noticed. My only complaint would be that several songs ended with a solo, usually from the guitarist on the right (I can’t think of his name at the moment), but you could tell it was the end of the song and the energy was dying around him as he was trying to build up his solo. It would have been nice if they took a cue from jazz and went from a solo back into the melody or some sort of chorus to end the tune and keep the energy up.

The Lincoln Center Jazz Orchestra at Jones Hall last night was one of those musical experiences that will stay indelibly burned in my memory for a long time. I had been looking for tickets and the day before my uncle called asking if I’d like to go with him, row B right in the center. Close enough to hear the musicians’ sounds and not just the amplification, I was blown away. Every soloist and every piece was top-notch. The highlight of the evening, besides of course Houston native and HSPVA grad Andre Hayward’s music, was Eric Lewis’ piano. I had never heard of this man before, nor can I find anything on the web. Throughout the concert whenever Wynton introduced him he prefaced his name with what sounded like “Top Professor” which I’m sure means something, but I’m not sure what. Lewis’ solo on A Love Supreme’s Resolution was so intense and captivating that I was completely taken away by it in a way that music affects you only a few times in your life. The personnel of the group was different in several regards from the program, but that’s to be expected with the dynamics of a touring group and the fact that the programs are printed months before. If you have a chance to check out the Lincoln Center Jazz Orchestra, do so. Highly recommended.

Ongoing

I usually write entries in my head before I put my fingers to the keyboard. The problem with this is that the longer I go between entries, the more that I try to cram into my mental post and inevitably the more that’s lost.

When you last left your Author he was gearing up for the second night of the Kemah Jazz Festival. It was fantastic, as expected, and he had good fun with the company. Tim Hagans made a guest appearance on Woody Witt’s set and it was the highlight of the night. Ended up leaving a little bit early due to tiredness, and slept well.

Saturday started with leftover pizza from Star Pizza, which, in hindsight, was most likely bad. Your Author was very, very hungry and ignored the fact that it tasted a little funny (it was vegetarian “gourmet” pizza anyway) and he was already on the way to rehearsal. By the end of the dress rehearsal with Steve Fulton things were queasy. But not too queasy to miss Kathy and Christine’s birthday party that night, to which he was accompanied by Elissa. Too queasy to eat much there save a taste of really nice meat stick from Coffee “BBQ” Mike and a slice of cake, both of which were sorely regretted later.

Saturday night and Sunday morning were very harsh, and will not be discussed. Many thanks to my angel of a mother who helped smooth things over.

Sunday the Author was still sick, but knew he couldn’t miss the gig at Kemah, so went and played anyway. It went well, and many thanks to those such as Cody, Elissa, Greg, Sarah, and the others that attended. Food was still a bad idea though, and the trip had an early end. That night the fever came back strong and not much sleep was had.

Monday was a day of recovery. Tuesday was a return to normal affairs and catching up with things.

Which brings us to today. Things are very busy with many projects, but that’s par for the course. Tonight is the Radiohead concert which I’ve been looking forward to for months it seems. The weather is gorgeous. Can’t wait.

Kemah Jazz 2003

As I may have told some of you already, I’m performing in the Kemah Boardwalk Jazz Festival again this year. This year I’m performing with a different (and honestly much better) group than I did in previous years. So on Sunday from 2:10–3:00 I’ll be on the Kemah stage, jazzin’ it up. An interesting note about this performance is that I’m going to be making my public debut on flute. So if you can make it down come and say hi to me before or after the gig and we’ll chat.

If you’re interested in seeing some of the other performances as well this weekend probably the closest thing to a good schedule online is at JazzHouston. I have some pictures from last night’s performance forthcoming. All of the music was fantastic, and I really mean that. One of the non-music highlights of the night for me (besides the beautiful sky and good pizza) was Dennis Dotson saying my name from the stage as part of troubleshooting some amplification problems. It’s the little things. 🙂

If you need any more information, feel free to contact me. Don’t be shy, no one else is: several days ago I got a call on my cell phone from a number I didn’t recognize. I said hello and an unfamiliar voice asked me if I had this year’s schedule for the festival. Apparently from a search engine she came across last year’s schedule and assumed I was the authority to contact, on my phone no less, for this year’s. I directed her to a website or two that would have more current information than mine. It was certainly an interesting experience, but now I’m faced with problems of transparency. I want it to be as easy as possible for my readers to get in contact with me, except when I don’t. We’ll see how this works out. Of course it never would have been a problem in the first place if the festival had a decent website or the schedule available in a non-graphic form. What if a blind person wants to go and can only get to the schedule through the web? I guess they’re out of luck. The festival organizers are very open to suggestions though, and I’m sure if this issue is brought to their attention they’ll address it. It just probably hasn’t occurred to them.

When Jish Comes to Town

It seems lately that the prolific Jish can’t get enough of Houston. He was just here last month and this Wednesday he made another appearance. (It’s even rumored he may be back next month.) Last time we all met at the Crazy Cantina and that was the plan for this month but apparently they were a little too crazy over there so the Cantina is no more. Christine luckily found this out last week and moved the party down the street to Cabo’s. There still wasn’t quite enough air conditioning (or maybe the company and conversation was just too hot) but it was all in all pretty cool as we got an entire room to ourselves. (For better or worse, some people had trouble finding it.)

There was just too much going on that night to even begin to cover it all but I’ll add that my photos from the night are now online.

Other posts about the night:

Note: I posted the first of what I hope to be many stealth disco movies to that album, but it’s too dark and I can’t figure out how I could adjust the levels to lighten it. If you have any suggestions regarding this please let me know.

Design SIG Meeting Tonight

I just wanted to let everyone know I’m going to be presenting tonight at the web design special interest group at HAL-PC. I’m going to be covering advanced CSS layout techniques, why CSS is easier for making sites, Topstyle 3, and we’re going to do a brief makeover of someone’s site at the meeting. I’ll be going through my personal methodology in making a standards-based website and redesigning legacy sites. If you have been struggling in trying to move away from table-based design or if you’d just like a free critique of your website, come out. The meeting is free and open to everybody.

Here are the details:

When? Tonight, September 18th, 7:00 PM.
Where? HAL-PC Headquarters, 4543 Post Oak Place Drive, Suite 200, Houston, TX 77027-3103.
Why? To learn and have fun.
How do I get there? Use the map linked above. Easy directions: Take 610 and exit San Felipe, head inside the loop (East) and then take a left at the first intersection. You will come to a stop sign, then pass Microcenter, and then at the second stop sign take a right. Go on that street until it dead ends in a loop, on the right end of the loop there’ll be a driveway going to an underground garage. Park, take the elevator to the second floor, and then the only office there is the HAL-PC offices. Once you’re in the office if you have any trouble finding the room we’ll be in just ask any of the friendly volunteers and they’ll point you in the right direction.

If you have any questions just drop me an email before the meeting and I’ll send you any additional information you might need.

Clever New Comment Spammer

I think I’ve been hit with a new kind of insidious comment spam. At about four this morning I got a comment on an old entry that said:

Well, I just wanted to sign a blog on the first time in my life :))

Kind of cute, right? Isn’t that nice that some guy, “James Hatchkinson,” came across my site and was so enamored that he decided to leave a comment, his first ever. Well, two minutes later the exact same comment, URL, and name were left on the WordPress blog. Clue #1.

The URL he left with his comment is nositeyet.com, which I’m not going to link because this may be this spam’s whole point. I clicked the URL from the comment before realizing it was probably just a newbie way of saying “I don’t have a site yet.” People I know have left similar things for their URL in the past. Well, the link takes you to some sort of web company with a hideous flash intro and an equally mediocre web site. Hmmmmm. Clue #2.

Clue #3: each comment came from a radically different IP address. Let’s give this guy incredible benefit of the doubt and say just maybe he was a newbie user who just came upon an old entry, left a silly comment with what he thought was a fake website, and then continued browsing to another one of my sites, went to a slightly old entry, and left the same comment. So why did his IP change? The first comment came from 195.200.168.250, which resolves to anaconda.pacwan.net, and the second from 80.58.4.44, which is a proxy of some sort. Most users, especially the type that would leave this sort of comment, don’t randomly start using proxies mid-browsing. Strike three.

Finally, I decided to look up this guy’s IP in my access logs, to see what pages he visited. There were no records of his IP visiting any pages on either site in my PHP/JavaScript based logging software, which means whatever client was used to leave this comment doesn’t support JavaScript or the <noscript> tag and images. Time to grep the raw logs. No referrer, none of the usual signs you would see in a log entry. Here are the relevant lines from my photomatt.net logs:

80.58.4.44.proxycache.rima-tde.net - - [18/Sep/2003:04:03:50 -0500] "GET /p644 HTTP/1.0" 301 303 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
80.58.4.44.proxycache.rima-tde.net - - [18/Sep/2003:04:03:54 -0500] "GET /p644 HTTP/1.0" 200 15796 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
80.58.4.44.proxycache.rima-tde.net - - [18/Sep/2003:04:03:56 -0500] "POST /b2comments.post.php HTTP/1.1" 302 5 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"

And from wordpress.org:

anaconda.pacwan.net - - [18/Sep/2003:04:01:35 -0500] "GET /development/archives/39 HTTP/1.0" 200 7220 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
anaconda.pacwan.net - - [18/Sep/2003:04:01:40 -0500] "POST /development/b2comments.post.php HTTP/1.0" 302 0 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"
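For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch that picks out comment POSTs arriving with no referrer. It assumes Apache’s combined log format as shown above. The regular expression and function names are my own, and a missing referrer is only a hint, not proof of spam.

```python
import re

# Combined log format: host, two ident fields, [time], "request", status, size, "referrer", "agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Sample lines copied from the logs above.
SAMPLE_LINES = [
    '80.58.4.44.proxycache.rima-tde.net - - [18/Sep/2003:04:03:54 -0500] "GET /p644 HTTP/1.0" 200 15796 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"',
    '80.58.4.44.proxycache.rima-tde.net - - [18/Sep/2003:04:03:56 -0500] "POST /b2comments.post.php HTTP/1.1" 302 5 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"',
    'anaconda.pacwan.net - - [18/Sep/2003:04:01:40 -0500] "POST /development/b2comments.post.php HTTP/1.0" 302 0 "-" "Mozilla/4.0(compatible; MSIE 6.0; Windows NT 5.1)"',
]


def parse(line):
    """Return the fields of one combined-format log line, or None if it does not parse."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def comment_posts_without_referrer(lines):
    """Collect POSTs to the comment script that arrived with an empty ("-") referrer."""
    hits = []
    for line in lines:
        entry = parse(line)
        if (entry
                and entry["method"] == "POST"
                and entry["path"].endswith("b2comments.post.php")
                and entry["referrer"] == "-"):
            hits.append(entry)
    return hits
```

Running comment_posts_without_referrer(SAMPLE_LINES) returns the two POST entries, one per host, both with the identical user agent string.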

There’s got to be a good story behind this. If this is indeed malicious comment spam then this is the most clever I’ve seen yet. If I hadn’t been the author of two posts he spammed and gotten the email notification I never would have suspected a thing. Has anyone else seen this?

What’s worrying about this whole thing is that neither the IP filtering (reactive) techniques that are usually used to block comment spam nor the content filtering (proactive) techniques which we’ve been experimenting with on WordPress would catch this guy. In fact I can’t think of any good way to preemptively block this sort of thing. If Google didn’t give blogs so much credence we wouldn’t be having this problem. I suppose now we have to watch every comment with an eagle eye, on the lookout for anything suspicious.

Update: I got it reversed above, “he” commented on the WordPress blog first and then here.

Assorted

A little of what’s going on in the corner of the world wide web I frequent: