
Re: [MacPerl] Text::CSV_XS revisited



At 22:32 -0500 1999.09.14, Matthew Langford wrote:
>If anybody is interested, I've been digging through Text::CSV_XS.  It
>specifically looks for \012 and \015, and has different recognition states
>for each.  I am looking at trying to generate a Mac patch for it.  Am I
>right in thinking that the last line of a DOS file has the \015 but not a
>\012?  Are there other places where solitary \012s or \015s might occur in
>a DOS text file?
>
>I don't understand the recognition engine (FSM) quite well enough to
>decide whether it could handle a Unix file (\012 only, right?).  I don't
>think it can.
>
>I'm not sure why passing a DOS-formatted text file to it fails...It
>apparently uses getline() to grab chunks of the file, but the recognition
>engine goes byte by byte and should see every character regardless of how
>the buffer is filled.
>
>I think it is possible that Text::CSV_XS might work under Unix on a DOS
>text file, since getline() under Unix will still have \015\012 together,
>while on the Mac \012 is in the next line.  But I don't know, yet.

I suspect that you are not setting the input record separator in Perl.
That is, Text::CSV_XS has its getline method, which then calls IO::File's
getline method, which relies on $/, the input record separator.

  #!perl -w
  use IO::File;
  use Text::CSV_XS;
  chdir "Bourque:Desktop Folder:" or die $!;

  my $csv = new Text::CSV_XS { eol => "\012", binary => 1 };
  my $fh = new IO::File "csvtest" or die $!;   # a failed open sets $!, not $@

  $/ = "\012";
  until ($fh->eof) {
      my $col = $csv->getline($fh);
      print join("|", @$col), "\n";
  }

That works with Unix files (\012) and DOS files (\015\012).  Strangely, it
fails with Mac files (\015).  Hm.  So maybe there is a problem.
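
One possible contributor, offered as an assumption and not checked against
the Text::CSV_XS source: a Mac file contains no \012 at all, so with
$/ = "\012" the underlying readline hands back the entire file as a single
record. The sketch below only shows how $/ chunks the three formats. The
helper name read_records and the sample data are made up for illustration,
and the in-memory filehandle needs Perl 5.8 or later.

```perl
#!perl -w
use strict;

# Hypothetical helper: read $data through an in-memory filehandle
# (Perl 5.8 or later) using $sep as the input record separator.
sub read_records {
    my ($data, $sep) = @_;
    open my $fh, '<', \$data or die $!;
    local $/ = $sep;          # restored when the sub returns
    my @records = <$fh>;      # list context: read every record
    close $fh;
    return @records;
}

# Made-up two-line samples, one per line-ending convention.
my %sample = (
    Unix => "a,b\012c,d\012",
    DOS  => "a,b\015\012c,d\015\012",
    Mac  => "a,b\015c,d\015",
);

for my $kind (sort keys %sample) {
    my @records = read_records($sample{$kind}, "\012");
    print "$kind: ", scalar(@records), " record(s)\n";
}
```

The Unix and DOS samples each come back as two records (the DOS ones still
carry a trailing \015 before the \012), while the Mac sample comes back as
one record with the \015s embedded in it.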

I say forget the whole getline and eol things and do it yourself:

  #!perl -w
  use Text::CSV_XS;
  chdir "Bourque:Desktop Folder:" or die $!;

  my $csv = new Text::CSV_XS;   # eol is undef
  open F, ":csvtest" or die $!;

  while (<F>) {  # get the line
      chomp;     # remove eol
      $csv->parse($_) or die $csv->error_input;
      print join("|", $csv->fields), "\n";
  }


If you have a non-native eol to deal with, such as a DOS file on a Mac or
Unix, wrap the while() in a block with local $/:

  #!perl -w
  use Text::CSV_XS;
  chdir "Bourque:Desktop Folder:" or die $!;

  my $csv = new Text::CSV_XS;
  open F, ":csvtest" or die $!;

  {
      local $/ = "\015\012";
      while (<F>) {
          chomp;
          $csv->parse($_) or die $csv->error_input;
          print join("|", $csv->fields), "\n";
      }
  }

Then YOU are in control of the eols, and it costs very little extra code.
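
If the eol style is not known in advance, one way to stay in control is to
read the whole file and split it on any of the three conventions. This is a
minimal sketch under assumptions: split_any_eol is a hypothetical helper,
the sample strings are made up, and the approach breaks on quoted fields
that contain embedded line breaks. Each resulting line could then go to
$csv->parse as in the loops above.

```perl
#!perl -w
use strict;

# Hypothetical helper: split a file's contents into lines, accepting
# DOS (\015\012), Mac (\015), or Unix (\012) line endings.
# \015\012 is tried first so a DOS ending is not counted as two breaks.
# Note: split drops trailing empty fields, so blank lines at the end vanish.
sub split_any_eol {
    my ($text) = @_;
    return split /\015\012|\015|\012/, $text;
}

# Made-up samples: the same two lines in Unix, DOS, and Mac form.
for my $text ("a,b\012c,d\012", "a,b\015\012c,d\015\012", "a,b\015c,d\015") {
    print join("|", split_any_eol($text)), "\n";
}
```

All three samples yield the same two lines, "a,b" and "c,d".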

-- 
Chris Nandor          mailto:pudge@pobox.com         http://pudge.net/
%PGPKey = ('B76E72AD', [1024, '0824090B CE73CA10  1FF77F13 8180B6B6'])
