At 11:30 PM 9/1/99 -0400, Jim wrote:
Rick wrote:
I had concluded, naively as it turns out, that the program was using a simpler algorithm, something akin to this:
Traverse the tree top down, where a node is a link that has not been seen before, is not offsite, and is not above the current level. Repeat with the new node. If there are no more nodes at this level, pop up a level and continue with the next node. If at the top level, quit.
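The traversal described above can be sketched roughly as follows. This is a minimal Python sketch, not the program under discussion (which was written in MacPerl). The `crawl` function, the `links` dictionary, and the reading of "above the current level" as "outside the directory of the current page" are all assumptions made for illustration, and every href is assumed to be already resolved to an absolute site path.

```python
from posixpath import dirname
from urllib.parse import urlparse


def crawl(links, start):
    """Depth-first walk over page links.

    links: hypothetical dict mapping a site path to the hrefs found on that page.
    Returns a list of (depth, page) pairs in visiting order.
    """
    seen = {start}
    order = []

    def visit(page, depth):
        order.append((depth, page))
        # Directory of the current page, with a trailing slash.
        prefix = dirname(page).rstrip("/") + "/"
        for href in links.get(page, []):
            if urlparse(href).netloc:          # offsite link
                continue
            if href in seen:                   # link seen before
                continue
            if not href.startswith(prefix):    # link above the current level
                continue
            seen.add(href)
            visit(href, depth + 1)             # repeat with the new node
        # Falling out of the loop "pops up a level".

    visit(start, 0)
    return order
```

Because a page reachable only through an upward link is never visited, a walk like this can miss files, which is one reason the link structure does not reduce cleanly to a tree.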
The trouble is that the desired conceptual tree doesn't really exist. The collection of links (viewed as pointers from one file to another) does not produce a tree. The directory hierarchy is a tree, but it
Not to belabor it, but are you sure? Have you tried it?
isn't quite the organization one wants, because it is a disk map, not a site map. So I took the middle road, using the links projected onto the disk hierarchy (or the other way around, if that makes more sense to you), the result being a true site map with a structure partially imposed by the disk hierarchy. It's really a cute scheme; perhaps I should patent it.
Anyway, given these circumstances, what is your vision for the future of the sitemap file? It appears that it cannot reliably be made in full detail programmatically. One could hand-edit it to make it right, but then this calls for continual maintenance whenever a new file is created in the future. But is anyone really ready to sign up for this job?
Actually the program I wrote is handling it pretty well. For the most part people have been pretty good about naming language variants of the same file in a sensible manner. The program is pretty good about the few exceptions to the rule. It reads hints to the exceptions from the sitemap file itself <!-- encoded as simple HTML comments --> and writes these hints back to its output in the same format. So one can hand-edit hints into the sitemap and the hints stay in later generations of the sitemap as long as they make sense. (If the files to which the hints refer disappear, the hints go away.)
There are about eight hints in the file as I have it currently. This isn't bad at all, especially since they don't have to be maintained any more, unless new anomalies are created. Most of them are hints not to parse a file (like the site map itself, which otherwise would become the mother of all files).
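The hint round trip described above might look something like the sketch below. The actual hint syntax is not given in this message, so the `<!-- hint: KIND PATH -->` format, the `noparse` keyword, and the function names are all hypothetical. The sketch only shows the idea of reading hints from the old sitemap, dropping those whose files have disappeared, and writing the rest back in the same form.

```python
import re

# Hypothetical hint syntax: <!-- hint: KIND PATH -->
HINT_RE = re.compile(r"<!--\s*hint:\s*(\S+)\s+(\S+)\s*-->")


def read_hints(sitemap_html):
    """Return (kind, path) pairs for every hint comment in the sitemap text."""
    return HINT_RE.findall(sitemap_html)


def live_hints(hints, existing_files):
    """Keep only hints that refer to files still present on the site."""
    return [(kind, path) for kind, path in hints if path in existing_files]


def format_hints(hints):
    """Write hints back out as HTML comments in the same (assumed) format."""
    return "\n".join(f"<!-- hint: {kind} {path} -->" for kind, path in hints)
```

Since the output format matches the input format, hand-edited hints survive from one generated sitemap to the next without any separate configuration file.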
In the latest version I've generated here, I exclude image files from the outline (standalone images were included before) and catch some files I had been missing before.
So my vision for this is that I make the program platform-independent (I took a couple of shortcuts in MacPerl) and ship it over to Rainer and Arthur. They install a cron task to run it once a week or so. And then someone should keep an eye on it once in a while to make sure
Who is going to bell this cat?
that people are naming files reasonably, and if not, install hints in the sitemap file itself.
If that doesn't work, I can always do the cron task here and e-mail the results. This is less desirable in my view.
One side effect of this exercise is that I have learned that the FAQs listing /gene/faqs/FAQ.html is referenced only once on our server, in one of the SwissGen pages. Somehow this doesn't seem right. I may be able to find other such nearly forgotten pages. If so, I'll make you aware of them.
OK
Of course, all this is really designed so that Rainer can't figure out what the blazes is going on. :-) Who'da thunk that an Ami would be imposing _more_ order on a German system? (really big :-)
:) Well, I won't try to stop you doing it, but I can't help having an "it's your funeral" reaction. I have other reasons for naming files the way that I do and I don't want to feel constrained by additional rules that I consider unnecessary. I'm not saying I will go out of my way to name things in ways that are not useful, but I still don't consider the filenames part of our public interface. And I don't personally want to work on making this scheme, which I consider built on shaky foundations, work. And I do wish you'd put your name at the bottom of the GTA page. ;) (Gratuitous throw-in there. ;))
Rick
-- =Jim Eggert EggertJ@LL.mit.edu