At 11:18 -0700 07/31/1999, Brian "L." Matthews wrote:
>|> ($id) = split(/\s+/, $record);
>
>However, if a single space is the column separator and columns can be
>empty, then the + isn't just superfluous, it's wrong.  Vicki didn't give
>us enough information to say which is correct.  Of course, she wasn't asking
>about the split either...

True... the split was only there to provide you all with the original
record and the first field, more like the actual code. People get unhappy
if they don't know where the variables get set :)

For those who care, the columns don't matter in this case. For parsing
these records, the first field (from the >) to the first whitespace is
the identifier; everything after that up to the first newline is
commentary, aka description. Everything after that first newline is data,
up to the next >. Rinse, repeat.

The format is known as FASTA format. It's used for DNA sequence data, and
the actual form of a record is

    >identifier descriptive information with possible whitespace\n
    sequence data\n
    sequence data\n
    ...
    >identifier descriptive information with possible whitespace\n
    sequence data\n
    sequence data\n
    ...

It's a little weird to parse because the records contain newlines, but
it's very regular. You'll probably see it occasionally from me; it's been
in several other puzzles I've posted :-)

- V.
--
     |\      _,,,---,,_       Vicki Brown <vlb@cfcl.com>
ZZZzz /,`.-'`'    -.  ;-;;,_  Journeyman Sourceror: Scripts & Philtres
     |,4-  ) )-,_. ,\ (  `'-' P.O. Box 1269 San Bruno CA 94066
    '---''(_/--'  `-'\_)      http://www.cfcl.com/~vlb  http://www.macperl.com
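
For anyone who wants to play along, here is a minimal sketch of the parsing
rules described in this message: identifier up to the first whitespace,
description up to the first newline, data up to the next >. The name
parse_fasta and the hash keys are hypothetical, not taken from the actual
code, and joining the sequence lines into one string is an assumption about
what you would want to do with the data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split FASTA text
# into a list of { id, desc, seq } hash references.
sub parse_fasta {
    my ($text) = @_;
    my @records;

    # A new record starts wherever '>' begins a line.
    for my $chunk (split /^>/m, $text) {
        next unless $chunk =~ /\S/;    # skip the empty piece before the first '>'

        # The header is everything up to the first newline; the rest is data.
        my ($header, $data) = split /\n/, $chunk, 2;
        $data = '' unless defined $data;

        # The identifier runs to the first whitespace; the remainder,
        # internal whitespace and all, is the description.
        my ($id, $desc) = split ' ', $header, 2;
        $id   = '' unless defined $id;
        $desc = '' unless defined $desc;

        # Assumption: the sequence lines are joined into a single string.
        $data =~ s/\s+//g;

        push @records, { id => $id, desc => $desc, seq => $data };
    }
    return @records;
}

# Usage with made-up sample data:
my $sample = ">seq1 first example record\nACGT\nTTGA\n>seq2\nGGCC\n";
for my $r (parse_fasta($sample)) {
    print "$r->{id} | $r->{desc} | $r->{seq}\n";
}
```

The limit of 2 on both splits is what keeps the description and the data in
one piece: only the first newline and the first run of whitespace act as
separators, so whether the pattern is /\s/ or /\s+/ stops mattering for
everything after the identifier.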