
[MacPerl] An HTML page indexer program (MacPerl of course!)



Here is a utility I have developed in MacPerl that creates
an alphabetical index of all HTML files in a directory and
its subdirectories (recursively), using the text of each file's
<TITLE>...</TITLE> tags. The user chooses a starting directory,
and the index of pages is written to a file called "cindex.htm"
in that directory.
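
For readers without MacPerl, the recursive search described above can be sketched in portable Perl 5 with the standard File::Find module. This is only an illustration, not the posted program: the helper name find_html_files is made up here, and it assumes the platform's native paths rather than Mac colon paths.

```perl
use strict;
use warnings;
use File::Find;

# Return the paths of all files under $root whose names end in .htm or .html.
# find_html_files is a hypothetical helper name, not part of the posted program.
sub find_html_files {
    my ($root) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # with no_chdir, $_ holds the full path of each entry
            wanted   => sub {
                push @found, $_ if -f $_ && /\.html?$/i;
            },
        },
        $root
    );
    my @sorted = sort @found;
    return @sorted;
}

# Example: list the HTML files under the directory given on the command line.
if (@ARGV) {
    print "$_\n" for find_html_files($ARGV[0]);
}
```

Anchoring the pattern with \. and $ means a name such as "rhythm.txt" is not picked up just because it contains "htm".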


======================================================================

This is what the final (sample) output looks like in your browser:

A

ACTA (Software)
Albert Einstein - Quotations and Ascii Art
Applied Imagination (Book Review)
Axon Idea Processor

B

Book titles in the Creativity Bookshelf
Braindancing (Book Review)
Brainstorming (Technique)
Buckminster Fuller

C

CK Modeller (Software)
CM1 (Software)
CPSI Conference 1995

======================================================================

The HTML generated by the program looks like:

<H2>A</H2>

<A HREF="Software/ACTA.htm">ACTA (Software)</A><BR>
<A HREF="Genius/einstein.htm">Albert Einstein - Quotations and Ascii Art</A><BR>
<A HREF="Books/B6058.htm">Applied Imagination (Book Review)</A><BR>
<A HREF="Software/axon.htm">Axon Idea Processor</A><BR>

<H2>B</H2>

<A HREF="Books/index.html">Book titles in the Creativity Bookshelf</A><BR>
<A HREF="Books/B52738.htm">Braindancing (Book Review)</A><BR>
<A HREF="Techniques/brainstorm.htm">Brainstorming (Technique)</A><BR>
<A HREF="Genius/bucky.htm">Buckminster Fuller</A><BR>

<H2>C</H2>

<A HREF="Software/CKModeller.htm">CK Modeller (Software)</A><BR>
<A HREF="Software/CM1.htm">CM1 (Software)</A><BR>
<A HREF="Resources/cpsi95.htm">CPSI Conference 1995</A><BR>

======================================================================
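
The grouping shown in the sample can be sketched on its own: sort the titles, and emit a new <H2> heading whenever the first character changes. The sketch below is portable Perl 5 and is an assumption-laden illustration only. The name build_index is hypothetical, and keying a hash by title alone means two pages with the same title would collide, which the posted program avoids by including the filename in the key.

```perl
use strict;
use warnings;

# Build the index body from a list of title => relative URL pairs.
# build_index is a hypothetical name; the posted program writes to a file instead.
sub build_index {
    my (%url_for) = @_;
    my $html = '';
    my $prev = '';
    for my $title (sort keys %url_for) {
        my $first = substr($title, 0, 1);
        if ($first ne $prev) {    # first title under a new letter: start a section
            $prev = $first;
            $html .= "\n<H2>$first</H2>\n\n";
        }
        $html .= "<A HREF=\"$url_for{$title}\">$title</A><BR>\n";
    }
    return $html;
}

print build_index(
    'ACTA (Software)'           => 'Software/ACTA.htm',
    'Axon Idea Processor'       => 'Software/axon.htm',
    'Brainstorming (Technique)' => 'Techniques/brainstorm.htm',
);
```

The default sort is a plain character-code comparison, so a title starting with a lowercase letter would sort after all uppercase ones and get its own heading.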


The program:-

# titles.pl  Build an alphabetical index of HTML documents from
# their TITLE fields.   Written 19th October 1996 by
#   Charles Cave  charles@jolt.mpx.com.au
#
require "GUSI.ph" ;

$basedir = &MacPerl'Choose(&GUSI'AF_FILE, 0,
        "", "", &GUSI'CHOOSE_DIR);

$level = 1;
%titlename = ();   # title * filename values

&listdir($basedir, $level);
&outputindex;

#-----------------------------------------------------------

sub listdir {
 local($cwd, $level) = @_;
 local($indent) = "  " x $level;
 chdir($cwd) || die "couldn't change directory to $cwd\n";
 opendir(THISDIR,$cwd) || die "couldn't open $cwd\n";
 local(@filenames) = readdir(THISDIR);
 closedir(THISDIR);
 local($entry);
 foreach $entry (@filenames) {
    if (-d $cwd.":".$entry) {
       &listdir($cwd.":".$entry, $level+1);
    } else {
       # match only names ending in .htm or .html, not any name containing "htm"
       if ($entry =~ /\.html?$/i) {
          &gettitle($cwd.":".$entry);
       }
    }

 }
 return;
}

# ---------------------------------------------------------------

sub gettitle {
  local($filename) = @_;
   open(HTFILE, $filename) || die "could not open $filename\n";
   $notitle = 1;
   while (<HTFILE>) {
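     s/[\r\n]+$//;   # strip the line ending (chop would eat a real character on an unterminated last line)
     if (/<TITLE>/i) {
        # assumes the opening and closing TITLE tags are on the same line
        s/<TITLE>//i;
        s/<\/TITLE>//i;
        s/<HEAD>//i;
        s/<HTML>//i;
        # key is title, then "}" as a separator, then the file path
        $titlename{"$_}$filename"} = "Y";
        $notitle = 0;
        last;        # one title per file is enough
     }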
   }
   if ($notitle == 1) { print "$filename\n  *** No title found\n" };
   close(HTFILE);
   return;
}

# -----------------------------------------------------------

sub outputindex {
 open (NDX, ">".$basedir.":cindex.htm") || die "could not create cindex.htm in $basedir\n";
 $prevchar ="";
 foreach $key (sort keys(%titlename)) {
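    # split first so the path edits below never touch colons in the title text
    ($textval,$url) = split(/}/,$key);
    $url =~ s/^\Q$basedir\E:?//;   # make the path relative; \Q..\E (Perl 5) quotes regex metacharacters
    $url =~ s/:/\//g;   # convert colons to fwd slashes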
    $firstchar = substr($textval,0,1);
    if ($firstchar ne $prevchar) {
       $prevchar = $firstchar;
       print NDX "\n<H2>$firstchar</H2>\n\n";
    }
    print NDX "<A HREF=\"$url\">$textval</A><BR>\n";
    }
 close(NDX);
 return;
}
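
The gettitle routine above reads one line at a time, so it relies on <TITLE> and </TITLE> sitting on the same line. A possible alternative, sketched here in portable Perl 5 and not part of the program as posted, is to match against the whole file contents at once. The helper name extract_title is hypothetical, and the sketch assumes the caller has already read the file into a string.

```perl
use strict;
use warnings;

# Return the text between <TITLE> and </TITLE>, or undef if there is none.
# extract_title is a hypothetical helper, not part of the posted program.
sub extract_title {
    my ($html) = @_;
    # /i ignores case; /s lets "." match newlines so the title may span lines
    return undef unless $html =~ m{<TITLE[^>]*>(.*?)</TITLE>}is;
    my $title = $1;
    $title =~ s/\s+/ /g;     # fold line breaks and runs of whitespace into one space
    $title =~ s/^ | $//g;    # trim a leading or trailing space
    return $title;
}

print extract_title("<HTML><HEAD><TITLE>Albert Einstein -\nQuotations</TITLE></HEAD>"), "\n";
```

This is still a pattern match rather than a real HTML parser, so unusual markup (for example a commented-out title) can fool it.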


------------------------------------------------------
Charles Cave
Sydney, Australia
Email: charles@jolt.mpx.com.au
URL:   http://www.ozemail.com.au/~caveman
------------------------------------------------------