Re: [FWP] Interpreting a columnar text file.



>>>>> "Chaim" == Chaim Frenkel <chaimf@pobox.com> writes:

>>>>> "RLS" == Randal L Schwartz <merlyn@stonehenge.com> writes:
>>>>> "Chaim" == Chaim Frenkel <chaimf@pobox.com> writes:
Chaim> I found this piece of code useful to generate an unpack format when I
Chaim> needed to translate a columnar text file.

Chaim> $fmt = "";
Chaim> while(<>) {chomp; $fmt |= $_};

Chaim> $fmt =~ y/ /X/c;
Chaim> $fmt =~ s/(X+\s+)/"A".(length($1)-1)."x"/ge;

RLS> It seems that "foo     bar" would be incorrectly translated to "A3xA3"
RLS> with this?  I don't think you want to collapse \s+ there.

Chaim> Why? Doesn't (X+\s+) include the whitespace in the length?

Oh, gah.  Never post before 10am. :)

If that's the case, I see what you're doing, and you don't even need
the x. :)

$fmt =~ s/\G(\s*)(\S+\s*)/"x".length($1)."A".length($2)/ge;

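Every 'x' after the first will have a count of 0, because the trailing
\s* of each field has already consumed the whitespace before the next one.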
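
As a rough illustration of that substitution, here is a minimal sketch.
The helper name build_format and the sample lines are assumptions made
up for this example; they are not from the thread.  It ORs the lines
together and then turns each field into an "x<skip>A<width>" pair:

```perl
use strict;
use warnings;

# build_format is a hypothetical helper name, not from the thread.
sub build_format {
    my @lines = @_;
    my $fmt = "";
    # Bitwise string OR: a position stays a space only if every line
    # that reaches it has a space there.
    $fmt |= $_ for @lines;
    # Each run of non-blanks plus its trailing blanks becomes one field.
    $fmt =~ s/\G(\s*)(\S+\s*)/"x".length($1)."A".length($2)/ge;
    return $fmt;
}

print build_format("foo    bar", "quux   xy"), "\n";   # x0A7x0A3
print build_format("  a b"), "\n";                     # x2A2x0A1
```

Only a line set that starts with blanks in every row gets a non-zero
count on the first 'x'.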

Chaim> Why the \G ?

Defensive programming, to ensure that I haven't coded the regex wrong
and skipped over data.  It forces each new round to begin exactly where
the previous round left off.
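
To see the effect in isolation, here is a small sketch with a pattern
that deliberately cannot match one character.  The sub names and the
sample string are invented for the illustration.  With \G the
substitution stops at the first gap; without it, the gap is skipped
silently:

```perl
use strict;
use warnings;

# Both sub names are hypothetical; the pattern cannot match a digit.
sub upcase_anchored {
    my ($s) = @_;
    $s =~ s/\G([a-z])/uc $1/ge;   # each match must start where the last ended
    return $s;
}

sub upcase_floating {
    my ($s) = @_;
    $s =~ s/([a-z])/uc $1/ge;     # free to jump past unmatched characters
    return $s;
}

print upcase_anchored("abc1de"), "\n";   # ABC1de
print upcase_floating("abc1de"), "\n";   # ABC1DE
```

In the anchored version, a pattern mistake shows up as untouched text
left in the result instead of being hidden.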

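    ## @data = "foo  bar";       # fixed test input, in place of `w`
    @data = `w`; shift @data;    # output of w(1), minus its first line
    $fmt = "";
    for (@data) { chomp; $fmt |= $_ }   # OR the lines: a space survives only in blank columns
    $nul = "\0" x length($fmt); # for padding
    $fmt =~ s/\G(\s*)(\S+\s*)/"x".length($1)."A".length($2)/ge;
    print "fmt = <<$fmt>>\n";
    # parenthesize join's arguments so "\n" is not joined in as an extra field
    for (@data) { print join("|", unpack($fmt, $nul | $_)), "\n" }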
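
Since the output of `w` differs from machine to machine, here is a
reproducible sketch of the same steps on fixed input.  The helper name
split_columns and the sample rows are assumptions for illustration only.
It shows the NUL padding at work on a line shorter than the format:

```perl
use strict;
use warnings;

# split_columns is a hypothetical helper name; the sample rows are made up.
sub split_columns {
    my @lines = @_;
    my $fmt = "";
    $fmt |= $_ for @lines;
    my $nul = "\0" x length($fmt);   # pads short lines to full width
    $fmt =~ s/\G(\s*)(\S+\s*)/"x".length($1)."A".length($2)/ge;
    # "A" fields drop trailing spaces and NULs, so the padding vanishes.
    return map { [ unpack($fmt, $nul | $_) ] } @lines;
}

print join("|", @$_), "\n" for split_columns("NAME   TTY", "alice  p0", "bob");
# NAME|TTY
# alice|p0
# bob|
```

The short "bob" row still yields two fields, the second one empty.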

-- 
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <merlyn@stonehenge.com> Snail: (Call) PGP-Key: (finger merlyn@teleport.com)
Web: <A HREF="http://www.stonehenge.com/merlyn/">My Home Page!</A>
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me
