Re: [webpages-l] sitemap_m.html

18 Sep 1999

      At 11:30 PM 9/1/99 -0400, Jim wrote:
...
Rick wrote:
...
I had concluded, naively as it turns out, that the program was
using a simpler algorithm something akin to this:
Traverse the tree top down
       where a node is a link not seen before
       and not a link offsite
       and not a link which is above the current level.
       Repeat with the new node.
       If no more nodes at this level, pop up a level
       and continue with the next node.
       If at top level, quit.
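
[Editor's sketch] The traversal described above could look roughly like the following. This is only an illustration under stated assumptions, not the program discussed in this thread: the `links` mapping (page path to the links found on it) is a hypothetical, pre-extracted structure, "offsite" is approximated as "not a site-absolute path", and "above the current level" is read as "outside the directory of the page being visited".

```python
import posixpath


def is_at_or_below(link, level_dir):
    """True if link lives in level_dir or in one of its subdirectories."""
    prefix = level_dir.rstrip("/") + "/"
    return link.startswith(prefix)


def build_outline(links, root):
    """Depth-first walk returning (depth, page) pairs in visiting order.

    links: dict mapping a site-absolute path to the list of links on that
    page (an assumed input; no HTML parsing or fetching happens here).
    """
    seen = {root}
    outline = [(0, root)]

    def visit(page, depth):
        level_dir = posixpath.dirname(page)  # the "current level"
        for link in links.get(page, []):
            if link in seen:                 # link already seen
                continue
            if not link.startswith("/"):     # assumption: anything else is offsite
                continue
            if not is_at_or_below(link, level_dir):
                continue                     # link points above the current level
            seen.add(link)
            outline.append((depth + 1, link))
            visit(link, depth + 1)           # repeat with the new node
        # falling out of the loop is the "pop up a level" step

    visit(root, 0)
    return outline
```

Because every link is accepted at most once, the result depends on the order in which links are encountered, which is one reason the outline need not match the conceptual tree one has in mind.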
The trouble is that the desired conceptual tree doesn't really exist.
The collection of links (viewed as pointers from one file to another)
does not produce a tree.  The directory hierarchy is a tree, but it
Not to belabor it, but are you sure?  Have you tried it?
...
isn't quite the organization one wants, because it is a disk map, not
a site map.  So I took the middle road, using the links projected onto
the disk hierarchy (or the other way around, if that makes more sense
to you), the result being a true site map with a structure partially
imposed by the disk hierarchy.  It's really a cute scheme; perhaps I
should patent it.
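
[Editor's sketch] One possible reading of "links projected onto the disk hierarchy" is: keep only pages that are actually reached through links, but nest them by the directories they live in. The thread does not show the real program, so the function names and the nested-dict representation below are assumptions made purely for illustration.

```python
def project_onto_dirs(linked_pages):
    """Nest linked pages by directory.

    linked_pages: iterable of site-absolute paths that were reached via links.
    Returns a nested dict; directory keys end in "/", pages map to None.
    """
    tree = {}
    for page in sorted(linked_pages):
        parts = page.strip("/").split("/")
        node = tree
        for directory in parts[:-1]:
            node = node.setdefault(directory + "/", {})
        node[parts[-1]] = None  # leaf entry: a page
    return tree


def render(tree, depth=0):
    """Flatten the nested dict into indented outline lines."""
    lines = []
    for name, child in tree.items():
        lines.append("  " * depth + name)
        if child is not None:
            lines.extend(render(child, depth + 1))
    return lines
```

Files that exist on disk but are never linked do not appear, so the result is a site map rather than a disk map, while the nesting still comes from the directory layout.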
...
Anyway, given these circumstances, what is your vision for the
future of the sitemap file?  It appears that it cannot reliably be
made in full detail programmatically.  One could hand-edit it to
make it right, but then this calls for continual maintenance
whenever a new file is created in the future.  But is anyone really
ready to sign up for this job?
Actually the program I wrote is handling it pretty well.  For the most
part people have been pretty good about naming language variants of
the same file in a sensible manner.  The program is pretty good about
the few exceptions to the rule.  It reads hints to the exceptions from
the sitemap file itself <!-- encoded as simple HTML comments --> and
writes these hints back to its output in the same format.  So one can
hand-edit hints into the sitemap and the hints stay in later
generations of the sitemap as long as they make sense.  (If the files
to which the hints refer disappear, the hints go away.)
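
[Editor's sketch] The hint round trip described above (read hints from comments in the old sitemap, drop those whose files are gone, write the rest back) might be sketched as follows. The actual hint syntax is not given in this thread; the `sitemap-hint:` marker, the `noparse` keyword, and all function names here are hypothetical.

```python
import re

# Hypothetical hint syntax: <!-- sitemap-hint: KIND /path/to/file.html -->
HINT_RE = re.compile(r"<!--\s*sitemap-hint:\s*(\S+)\s+(\S+)\s*-->")


def read_hints(sitemap_html):
    """Collect (kind, path) hints from HTML comments in the old sitemap."""
    return [(m.group(1), m.group(2)) for m in HINT_RE.finditer(sitemap_html)]


def live_hints(hints, existing_files):
    """Keep only hints whose target file still exists."""
    return [(kind, path) for kind, path in hints if path in existing_files]


def write_hints(hints):
    """Emit hints in the same comment format so the next run can read them."""
    return "\n".join(
        f"<!-- sitemap-hint: {kind} {path} -->" for kind, path in hints
    )
```

Since the output format equals the input format, a hand-edited hint survives each regeneration until the file it names disappears.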
There are about eight hints in the file as I have it currently.  This
isn't bad at all, especially since they don't have to be maintained
any more, unless new anomalies are created.  Most of them are hints
not to parse a file (like the site map itself, which otherwise would
become the mother of all files).
In the latest version I've generated here, I exclude image files from
the outline (standalone images were included before) and catch some
files I had been missing before.
So my vision for this is that I make the program platform-independent
(I took a couple of shortcuts in MacPerl) and ship it over to Rainer
and Arthur.  They install a cron task to run it once a week or so.
And then someone should keep an eye on it once in a while to make sure
Who is going to bell this cat?
...
that people are naming files reasonably, and if not, install hints in
the sitemap file itself.
If that doesn't work, I can always do the cron task here and e-mail
the results.  This is less desirable in my view.
One side effect of this exercise is that I have learned that the FAQs
listing /gene/faqs/FAQ.html is referenced only once on our server, in
one of the SwissGen pages.  Somehow this doesn't seem right.  I may be
able to find other such nearly forgotten pages.  If so, I'll make you
aware of it.
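
[Editor's sketch] Finding "nearly forgotten" pages amounts to counting, for each page, how many other pages link to it. A minimal illustration, assuming a hypothetical `links` mapping from each page to the links it contains (the thread does not say how the actual program stores this):

```python
from collections import Counter


def inbound_counts(links):
    """Count how many distinct pages refer to each target."""
    counts = Counter()
    for page, targets in links.items():
        for target in set(targets):  # a page counts once, however often it links
            if target != page:       # ignore self-references
                counts[target] += 1
    return counts


def rarely_referenced(links, threshold=1):
    """Pages with at most `threshold` inbound references, sorted by path.

    Pages with zero inbound references (typically the top page) are
    included as well, since Counter returns 0 for missing keys.
    """
    counts = inbound_counts(links)
    all_pages = set(links) | set(counts)
    return sorted(p for p in all_pages if counts[p] <= threshold)
```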
OK
...
Of course, all this is really designed so that Rainer can't figure out
what the blazes is going on.  :-)  Who'da thunk that an Ami would be
imposing _more_ order on a German system?  (really big :-)
:)

Well, I won't try to stop you doing it, but I can't help having a
"it's your funeral" reaction.  I have other reasons for naming files
the way that I do and I don't want to feel constrained by additional
rules that I consider unnecessary.  I'm not saying I will go out of
my way to name things in ways that are not useful, but I still don't
consider the filenames part of our public interface.  And I don't
personally want to work on making this scheme, which I consider built
on shaky foundations, work.  And I do wish you'd put your name at the
bottom of the GTA page. ;)  (Gratuitous throw-in there. ;))

Rick
...
-- 
=Jim Eggert   EggertJ@LL.mit.edu

Richard Heli