Lately it is less often that news from the White House sends a chill up my spine, yet it seems the White House is using technical means to prevent the spidering and archival of key documents. This is, to say the least, highly questionable. I hope there is a good reason for it, or that it will be reversed quickly, but it is hard to believe such a deliberate action was taken by someone without a stake in the outcome, like a web lackey.
Of course, this is a situation that could be addressed by technical means. A spidering robot that did not follow the robot exclusion rules could crawl a number of public government web pages at set intervals, say twice a day, archive the results, and offer a summary of the differences between versions as a public transparency service. WhiteHouse.gov would certainly be worth watching, and others, such as the Fed, could be interesting as well. It's not a trivial task, but I would imagine one of the groups interested in such things would have no problem funding the development and maintenance of such a tool. For complete transparency, the tool itself could be open source. I can't think of a legitimate objection the operators of the websites in question could raise against such a service; the bandwidth used would be trivial compared to the traffic those sites must get every day.
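To make the idea concrete, here is a minimal sketch of what such a watcher might look like, in Python. The URLs, the archive layout, and the diff format are all my own assumptions; the twice-daily schedule would be handled by cron or something similar rather than by the script itself.

```python
import difflib
import hashlib
import time
import urllib.request
from pathlib import Path

# Pages to watch; whitehouse.gov is the obvious candidate, the rest are placeholders.
WATCHED_URLS = [
    "https://www.whitehouse.gov/",
    "https://www.federalreserve.gov/",
]

ARCHIVE_DIR = Path("archive")


def fetch(url: str) -> str:
    """Fetch a page as text. Note that urllib does not consult robots.txt,
    so the crawl is not bound by the robot exclusion rules."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def snapshot_and_diff(url: str) -> str:
    """Archive the current copy of `url` and return a diff against the previous copy."""
    # One directory per watched URL, keyed by a hash of the URL.
    site_dir = ARCHIVE_DIR / hashlib.sha1(url.encode()).hexdigest()
    site_dir.mkdir(parents=True, exist_ok=True)

    current = fetch(url)
    snapshots = sorted(site_dir.glob("*.html"))
    previous = snapshots[-1].read_text() if snapshots else ""

    # Store the new snapshot under a timestamped name for the public archive.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    (site_dir / f"{stamp}.html").write_text(current)

    # Summarize what changed since the last crawl.
    diff = difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="previous", tofile="current", lineterm="",
    )
    return "\n".join(diff)


if __name__ == "__main__":
    for url in WATCHED_URLS:
        changes = snapshot_and_diff(url)
        print(f"=== {url} ===")
        print(changes if changes else "(no changes since last crawl)")
```

A real service would of course need to handle binary documents, follow links rather than watch a fixed list, and publish the archive and diffs somewhere durable, but the core of it really is this simple: fetch, store, compare.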