At 8:45 AM +0000 9/5/99, Bart Lateur wrote:
>On Sat, 4 Sep 1999 22:51:28 -0400, Randy M. Zeitman wrote:
>
>>I'm matching a string to text file of records (rows). Each entry (column)
>>in each record is tab separated (a la spreadsheet).
>>
>>But sometimes I want to only match against columns 2,3,and 4, of all
>>records and other times I want to match only against column 7. Any quicker
>>way to do this than to parse each record into an array toss the unwanted
>>entries and re-string the thing? (a guess...)
>
>You could.
>
>>Don't actually need the answer, but a kick in the right direction...how
>>might I do this with one elegant match?
>
>
>    local $" = "\t";    # why not?
>    while(<FILE>) {
>        chomp;
>        my @field = split /\t/;
>        if("@field[2,4,7]" =~ /searchterm/) {
>            print "Got a match in $_\n";
>            # prints record
>        }
>    }

[snip]

That's zippy, but I'd prefer to use Perl's grep, although it means
slurping in the whole file, which can cost you performance-wise if the
file is large.

The logic of the above statement

>        if("@field[2,4,7]" =~ /searchterm/) {

seems less useful once you know what your data structure is, because the
search term can match in any of the selected columns.

For example, let's say your file has records like this (header line
first, but not needed in the actual file; \t stands for a tab
character):

FIRSTNAME\tLASTNAME\tPHONE\tSTREET\tCITY\tSTATE\tZIP
Fred\tJones\t213-555-1111\t123 Elm St\tLos Angeles\tCA\t90024
Martha\tWashington\t408-555-2222\t123 Front Street\tLos Altos\tCA\t95021
Olive\tOyle\t831-555-9999\t35 Wharf Road\tSanta Cruz\tCA\t95060
Mudhen\tRainbow\t831-555-7777\tTown Clock\tSanta Clara\tCA\t94567

If I didn't mis-type, we have a header and 4 records with seven fields
(columns).

Open the file and slurp all the lines into an array:

    open (DATA, $datafile) || die("Can't open data file $datafile\n $! \n");
    @all_records = (<DATA>);
    close DATA;

Now, if you were searching your database for records with a certain zip,
it would only appear in the 7th column. You'd get bogus returns if your
method allows the search term to match data in any other column.

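To make the column point concrete, here is a minimal sketch, not
something from Bart's post or mine above. The helper name match_column
is made up for illustration, and the records are the sample ones typed
in with real tabs. It compares a match anywhere in the line with a match
restricted to one column:

```perl
use strict;
use warnings;

# Sample records from the example above, with real tab characters.
my @all_records = (
    "Fred\tJones\t213-555-1111\t123 Elm St\tLos Angeles\tCA\t90024\n",
    "Martha\tWashington\t408-555-2222\t123 Front Street\tLos Altos\tCA\t95021\n",
    "Olive\tOyle\t831-555-9999\t35 Wharf Road\tSanta Cruz\tCA\t95060\n",
    "Mudhen\tRainbow\t831-555-7777\tTown Clock\tSanta Clara\tCA\t94567\n",
);

# Hypothetical helper (not from the original posts): return the records
# whose field at 0-based index $col matches $pattern.
sub match_column {
    my ($col, $pattern, @records) = @_;
    return grep {
        my $line = $_;
        chomp $line;
        my @field = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        defined $field[$col] && $field[$col] =~ /$pattern/;
    } @records;
}

# '123' anywhere in the line hits the two street addresses ...
my @anywhere = grep { /123/ } @all_records;

# ... but no zip code (column index 6) contains '123'.
my @zip_only = match_column(6, '123', @all_records);

# An exact zip match finds only Olive's record.
my @by_zip = match_column(6, '^95060$', @all_records);

print "anywhere: ", scalar(@anywhere), "\n";   # 2
print "zip only: ", scalar(@zip_only), "\n";   # 0
print @by_zip;
```

The split costs a little per record, but it keeps the search term from
wandering into the wrong column.
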
Besides that, maybe you want to find records that match search criteria
in more than one column, say all the people with last name 'Jones' and
zip code '94567'. I haven't tried it, but maybe /searchterm/ in the
above example could be written in a form that corresponds to the
multi-field array slice @field[2,4,7] in the if statement. That seems
like a bother, though -- for example, a comma-separated list wouldn't be
good enough because there might be commas _in_ the data.

So instead, build a regular expression to use with grep. To find all the
records with last name 'Jones' in the above sample database, try:

    $grep_pat = "^[^\t]*\tJones\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*";
    @found_records = grep(/$grep_pat/i, @all_records);

Note that the pattern string has no slashes of its own; the slashes
belong to the match operator in the grep call. Now you have a list of
records, @found_records, in the same form they exist in your database,
to process as you want. The '^' at the beginning makes sure your fields
are registered with the start of the record (line). Each [^\t]* matches
one field's worth of non-tab characters, so the pattern can't slide over
into a neighboring column. The 'i' allows case-insensitive matches.

To find all the records with last name 'Jones' and zip '94567', it would
be:

    $grep_pat = "^[^\t]*\tJones\t[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\t94567";

Using this approach also allows you to find records based on your choice
of partial/exact/starts-with/ends-with criteria for the data in
different fields. Sticking with 'Jones' in the last name field, that
portion of the regex could be any of:

    \tJones\t               # Matches on: last name eq 'Jones'
    \tJones[^\t]*\t         # Matches on: last name starts with 'Jones'
    \t[^\t]*Jones[^\t]*\t   # Matches on: last name has 'Jones' anywhere in it
    \t[^\t]*Jones\t         # Matches on: last name ends with 'Jones'
    \tJo[^\t]*\t            # Matches on: last name starts with 'Jo'

Also, if you have to deal with variations like 'Bob'/'Robert', try:

    \t(Bob|Robert)\t        # Exact match to 'Bob' OR 'Robert'

Remember that these are *portions* of the regex, just showing varieties
of matching in a particular field/column.

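If typing out a placeholder for every column gets tedious, the pattern
can be assembled from just the columns you care about. This is only a
sketch under my own assumptions: build_record_pattern is a made-up name,
columns are counted from 0, and it uses qr//, which needs Perl 5.005 or
later.

```perl
use strict;
use warnings;

# Sample records from the example above, with real tab characters.
my @all_records = (
    "Fred\tJones\t213-555-1111\t123 Elm St\tLos Angeles\tCA\t90024\n",
    "Martha\tWashington\t408-555-2222\t123 Front Street\tLos Altos\tCA\t95021\n",
    "Olive\tOyle\t831-555-9999\t35 Wharf Road\tSanta Cruz\tCA\t95060\n",
    "Mudhen\tRainbow\t831-555-7777\tTown Clock\tSanta Clara\tCA\t94567\n",
);

# Hypothetical helper (not from the original post): build an anchored,
# case-insensitive regex for a record of $ncols tab-separated fields.
# %want maps a 0-based column index to a regex fragment; every other
# column is allowed to hold any run of non-tab characters.
sub build_record_pattern {
    my ($ncols, %want) = @_;
    my @parts;
    for my $col (0 .. $ncols - 1) {
        push @parts, exists $want{$col} ? "(?:$want{$col})" : '[^\t]*';
    }
    my $pat = '^' . join('\t', @parts) . '$';
    return qr/$pat/i;
}

# Last name is exactly 'Jones' (column index 1).
my $jones = build_record_pattern(7, 1 => 'Jones');
my @jones = grep { $_ =~ $jones } @all_records;

# City starts with 'Santa' AND state is 'CA'.
my $santa = build_record_pattern(7, 4 => 'Santa[^\t]*', 5 => 'CA');
my @santa = grep { $_ =~ $santa } @all_records;

# Last name 'Jones' AND zip '94567': no such record in the sample data.
my $both = build_record_pattern(7, 1 => 'Jones', 6 => '94567');
my @both = grep { $_ =~ $both } @all_records;

print "Jones: ", scalar(@jones), "\n";          # 1
print "Santa*/CA: ", scalar(@santa), "\n";      # 2
print "Jones+94567: ", scalar(@both), "\n";     # 0
```

Wrapping each fragment in (?:...) keeps an alternation such as
'Bob|Robert' confined to its own column.
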
Also, IMPORTANT NOTE: learn about 'greedy' matching, and don't use .*
without some care. It matches as much as it can, tabs included, so a
pattern can end up matching in a column you didn't intend. When you mean
"the rest of this field", [^\t]* is the safer choice.

You can build the grep pattern dynamically or hard-code it; this all
depends on what your search needs are.

You can do successive greps if you want more alternatives: the first
grep might match all the records with last name 'Jones' and the second
might match all records with state 'New Jersey'. Put 'em together in one
list for a group of records for which last name is 'Jones' OR state is
'New Jersey' (and weed out the records that matched both, or they'll
show up twice).

Hope this helps. One caveat: Bart and others who responded to his
approach might know way more than I do. I've been programming
professionally in Perl since 1995, but my knowledge tends to expand in
response to the projects I do (or with thanks to folks on this list who
correct me), so I don't claim to know the whole territory by any
means...

Good luck!

- Bruce

__Bruce_Van_Allen___bva@cruzio.com__831_429_1688_V__
__PO_Box_839__Santa_Cruz_CA__95061__831_426_2474_W__