
[MacPerl] Searching a VERY large text file



According to Shyam Hegde:
> 
> Is there a fast way in Perl/MacPerl to search these files for a particular
> string??  The Camel book seemed to suggest using Hashes, but doesn't go
> into detail and using the standard 'Open' and <shove into array> requires
> MacPerl to have 20Mbytes of RAM assigned to it and me to wait a very long
> time.
> 

The method I use in cases like this is to build a database
with an index.  I believe the DB_File module can also help
here.

With the database/index approach you create your database
and keep in the index just the search words, a number
telling you where each record starts, and how long the
record is.  Since the index is a lot smaller than the
database, you can load the entire index into memory, do a
fast search on it to find the entry you want, and then read
only that record from the file.
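
The index idea above can be sketched in Perl roughly as
follows.  This is only an illustration under assumptions of
my own: one record per line, the key in the first
tab-separated field, and a made-up file name and sample data
(records.txt).  It also assumes a reasonably recent Perl
(lexical filehandles, three-argument open).

```perl
use strict;
use warnings;

# Assumption: one record per line, key = first tab-separated field.
# The file name and the sample data are invented for this sketch.
my $db = "records.txt";
open(my $out, '>', $db) or die "Cannot write $db: $!";
binmode($out);
print $out "apple\tred fruit\n",
           "banana\tyellow fruit\n",
           "cherry\tsmall red fruit\n";
close($out);

# Pass 1: read the file once, noting where each record starts
# and how long it is.  Only this small index stays in memory.
my %index;
open(my $in, '<', $db) or die "Cannot read $db: $!";
binmode($in);
my $start = tell($in);
while (my $line = <$in>) {
    my ($key) = split /\t/, $line;
    $index{$key} = [ $start, length $line ];
    $start = tell($in);
}

# Lookup: seek straight to the record and read just those bytes.
sub fetch_record {
    my ($fh, $idx, $key) = @_;
    my $entry = $idx->{$key};
    return undef unless $entry;          # key not in the index
    my ($offset, $len) = @$entry;
    seek($fh, $offset, 0) or die "seek failed: $!";
    read($fh, my $record, $len);
    return $record;
}

my $record  = fetch_record($in, \%index, "banana");
my $missing = fetch_record($in, \%index, "durian");
print $record if defined $record;
close($in);
unlink $db;
```

For a file that rarely changes, the index could be written
out to its own file once and reloaded on later runs, so the
big file is only scanned in full a single time.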

A fast method of searching the index can be either a hash
(associative array) or simply the divide and conquer method
(a binary search) on a sorted index.  (i.e.: check the first
record, check the last record, then check the middle record.
If the middle record is not the correct one, compare it with
your key: if your key sorts after it, change the first
pointer to the middle pointer, otherwise change the last
pointer.  Then check the new middle record.  Continue until
you find the record or the pointers meet.)
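
A minimal sketch of the divide and conquer search, assuming
the index is held as an array of [key, offset, length]
entries sorted by key.  The sample entries and the name
find_entry are made up for illustration.  It moves the
pointers one past the middle rather than onto it, which is a
common variant of the same idea.

```perl
use strict;
use warnings;

# Hypothetical in-memory index: [key, offset, length] entries.
# Binary search only works if the entries are sorted by key.
my @index = sort { $a->[0] cmp $b->[0] } (
    [ "cherry", 36, 23 ],
    [ "apple",   0, 16 ],
    [ "banana", 16, 20 ],
);

# Returns the matching index entry, or undef if the key is absent.
sub find_entry {
    my ($idx, $key) = @_;
    my ($low, $high) = (0, $#$idx);
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        my $cmp = $key cmp $idx->[$mid][0];
        return $idx->[$mid] if $cmp == 0;
        if ($cmp < 0) { $high = $mid - 1 }   # key sorts before the middle
        else          { $low  = $mid + 1 }   # key sorts after the middle
    }
    return undef;
}

my $hit = find_entry(\@index, "banana");
print "banana: offset $hit->[1], length $hit->[2]\n" if $hit;
```

With a hash the lookup is a single step, but the sorted
array also lets you find neighbouring keys or do prefix
matches, which a hash cannot do directly.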

