On Sun, 5 May 1996 17:27:51 +0200 (MET DST) Bart Lateur <color@pophost.eunet.be> wrote: > Personally, I detest the use of fixed length records, unless the length of > the fields is garanteed to fall between some narrow limits. If you provide, > say, 40 characters for a name, this will be far to much in most cases, while > sometimes it still isn't enough. >... > Why don't you just *ask* Perl where the current line is > (before reading it) using tell(), and store *this* value instead. If you insist on indexing by line numbers there is a simple scheme that has a fixed auxiliary storage requirement and degrades gracefully with increasing file size. It is called "Accordion Index Sequential" and was kicked around quite a bit here at Maryland in the mainframe days. The basic idea is you keep a table of seek positions. Initially it is every line, but when you fill the table you throw away every odd pointer, keeping only 2, 4, 6, 8 etc and make a note that you now only have half the pointers. When you seek you might have to read ahead one record (that is, when you seek record 9 you follow the pointer to record 8 then read one more record). When you fill the table again, again you throw away every other pointer, now keeping pointers to records 4, 8, 12, 16 etc. Now you might have to read up to three more records on each seek (but they're probably in the same block anyway so you might not have to do any real IO at all). It only requires one fixed-length pointer array and a scalar variable to keep track of how many lines are currently represented by each current pointer entry. Unfortunately I am up to my ears in computational linguistics and don't have time to code this up. Maybe somebody will do a good job and add it to the Perl toolkit... > --- Embracing the KIS principle: Keep It Simple Albert Einstein was supposed to have said: "Everything should be made as simple as possible AND NO SIMPLER!" ---------------------- Charles B. Cranston <zben@ni.umd.edu> http:"//www.wam.umd.edu/~zben"