Hi folks,

A question was asked about efficient searches of word lookup tables:

>The program is searching two very large text files. The first file is a
>word lookup which has a list of all words and their associated key value.
>The words are not arranged in any kind of alphabetical order.
>  100001740,'entity'
>  100001740,'something'
>  100002086,'life_form'
>  100002086,'organism'
>
>The second file takes the same format, but contains a list of all word
>descriptions and their associated key values. Here is an example:
>  100001740,'(anything having existence (living or nonliving))'
>  100002086,'(any living entity)'
>  100002880,'(living things collectively; "the oceans are teeming

What is unclear is what you want to accomplish with these lookups. From
the look of it, this is a huge dictionary translating words into
definitions. If so, I think the most direct solution would be a one-time
translation of the two files into a single Un*x DBM file, where the unique
key values are the words 'entity', etc. Then your "search" would become a
quick lookup, which is probably all it really is.

There is a little section on the DBM routines in the on-line MacPerl book,
to which I've been referring as I learn MacPerl myself! Thanks to all
involved... it's been a real help.

Is there some odd reason to keep the numerical keys?

Re: writing your own hash. I agree that hashing is the best solution, but
the Un*x DBM database routines are a pretty good implementation of hashing
that hides the internal workings very nicely. I'd be inclined to use the
built-in hashing and not reinvent the wheel.

Re: binary search. I love binary search as much as anyone, but the
numerical fields are not unique in the first file, so a search on the ID
alone does not pick out a single word. Can the built-in binary search work
on ASCII-sorted text? That could work as well if you swapped the numerical
ID and the words in file #1, and alphabetized.

Eric Hsu
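
For what it's worth, here is a minimal sketch of the one-time conversion
described above. It is an illustration under stated assumptions, not a tested
MacPerl recipe: the sub names (parse_line, build_dbm, lookup) and all file
names are hypothetical, it assumes every line looks like  100001740,'entity',
it assumes the definitions file fits in memory for the join, and it uses
tie() with SDBM_File rather than the dbmopen() routines the MacPerl book
covers. Whether a given MacPerl build ships SDBM_File is something to check.

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;    # assumption: available; another DBM module would also do

# Parse one line of the form  100001740,'entity'  into (id, text).
# Returns an empty list for lines that do not match (e.g. truncated ones).
sub parse_line {
    my ($line) = @_;
    return $line =~ /^\s*(\d+),'(.*)'\s*$/ ? ($1, $2) : ();
}

# One-time conversion: join the two files on the numeric ID and store
# word => definition in a DBM file. All file names are hypothetical.
sub build_dbm {
    my ($words_file, $defs_file, $dbm_file) = @_;

    # Pass 1: hold the definitions in memory, keyed by numeric ID.
    my %def_for_id;
    open(my $defs, '<', $defs_file) or die "Can't open $defs_file: $!";
    while (my $line = <$defs>) {
        my ($id, $text) = parse_line($line) or next;
        $def_for_id{$id} = $text;
    }
    close($defs);

    # Pass 2: store each word with the definition that shares its ID.
    tie(my %dict, 'SDBM_File', $dbm_file, O_RDWR | O_CREAT, 0666)
        or die "Can't tie $dbm_file: $!";
    open(my $words, '<', $words_file) or die "Can't open $words_file: $!";
    while (my $line = <$words>) {
        my ($id, $word) = parse_line($line) or next;
        next unless exists $def_for_id{$id};
        $dict{$word} = $def_for_id{$id};   # a repeated word is overwritten
    }
    close($words);
    untie %dict;
}

# After the conversion, the "search" is a single hash lookup.
sub lookup {
    my ($dbm_file, $word) = @_;
    tie(my %dict, 'SDBM_File', $dbm_file, O_RDONLY, 0666)
        or die "Can't tie $dbm_file: $!";
    my $def = $dict{$word};
    untie %dict;
    return $def;
}
```

Usage would be something like build_dbm('words.txt', 'defs.txt', 'dict')
once, then lookup('dict', 'entity') as often as needed. Two caveats: SDBM
limits each key/value pair to roughly 1 KB, so very long definitions may need
a different DBM module, and a word listed under more than one ID keeps only
the last definition stored.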