
Re: [MacPerl] Select body of a mailbox



Ken Williams wrote:

> g.valoti@magneticmedia.com (Giorgio Valoti) wrote:
> >
> >Yep, it works... here's just a test script:
> >
> >#! /usr/bin/perl -w
> >
> >use strict;
> >use Mail::Util;
> >use Mail::Header;
> >
> >my $mailbox = "path:to:mailbox";
> >open MAILBOX, $mailbox
> > or die "Cannot open MAILBOX: $!\n";
> >open HEADER, ">Macintosh HD:Desktop Folder:Header"
> > or die "Cannot open HEADER: $!";
> >open BODY, ">Macintosh HD:Desktop Folder:Body"
> > or die "Cannot open BODY: $!";
> >my $fh = \*MAILBOX;
> >my @msgs = Mail::Util::read_mbox($mailbox);
> >
> >
> >
> >foreach my $msg (@msgs)
> >{
> > my $head = new Mail::Header $msg;
> > my $header = $head -> header ($msg);
> > print HEADER @$header;
> > print BODY @$msg;
> >}
> >
> >close MAILBOX;
> >close HEADER;
> >close BODY;
> >
> >Now, with the test file (>13MB) I had to increase the Perl memory partition
> >to 45MB.... Am I missing something obvious, or is that the only way to deal
> >with this problem?
>
> It looks like the only way to deal with it using Mail::Util, but that's
> because it has to read the whole file into memory.  Here's some code I
> wrote several whiles ago (but not tested a whole lot) that avoids that:
> it simply records the byte location of each message and grabs some
> important headers.  You could modify it to get whatever portions you're
> interested in.
>
>    sub get_mailbox_list {
>      # This routine scans the mbox file and fills up the @messages array.
>      seek FILE, 0, 0;
>      my @messages = ();
>
>      # The goals here: don't read huge portions of the file into memory at
>      # once. We go line-by-line in the body of the message, but read the
>      # headers in one big slurp.  The assumption is that the headers are never
>      # unreasonably long, which is pretty much true.  The body might contain
>      # large attachments, though, so we have to read it line-by-line unless we
>      # want the wrath of the memory manager.
>
>      my $i = 0;
>      while (<FILE>) {
>        if (/^From /) {
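>          my $start = tell(FILE); # Headers begin right after the "From " line
>          local $/ = ''; # Paragraph mode - gobble up the headers
>          $_ = <FILE>;
>          # Continuation lines may start with a space or a tab
>          while (/^(From|Received|Subject|Date): (.*?)\n(?![ \t])/smg) {
>            $messages[$i]{$1} = $2 unless exists $messages[$i]{$1};
>            # If we cared, we could also unfold continuation lines into one
>          }
>          # Paragraph mode swallows extra blank lines, so note the offset
>          # before the read rather than computing tell() - length().
>          $messages[$i]{'location'} = $start;
>          $i++;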
>        }
>      }
>      return @messages;
>    }
>
> # ===== Want to unsubscribe from this list?
> # ===== Send mail with body "unsubscribe" to macperl-request@macperl.org

Oh yeah! That's perfect. Fast and efficient. I'll take a look at the other
suggestion too... a quick test got me nowhere, probably a regex problem... we'll see.
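
For the record, here is a minimal sketch of how byte offsets like the ones
get_mailbox_list collects might be used to pull out a single body without
loading the whole mailbox. The names index_mbox and copy_body and the sample
messages are made up for illustration; they are not part of Ken's code or of
Mail::Util. It assumes a conventional mbox file in which body lines that
start with "From " have been escaped, and it has only been tried on toy data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: record the byte offset at which each message's
# header block starts (the line after the "From " separator).
# Only one line is held in memory at a time.
sub index_mbox {
    my ($fh) = @_;
    my @offsets;
    seek $fh, 0, 0 or die "Cannot seek: $!";
    while (my $line = <$fh>) {
        push @offsets, tell($fh) if $line =~ /^From /;
    }
    return @offsets;
}

# Hypothetical helper: copy the body of the message whose headers start
# at $offset to the handle $out, one line at a time.
sub copy_body {
    my ($fh, $offset, $out) = @_;
    seek $fh, $offset, 0 or die "Cannot seek: $!";
    {
        local $/ = '';         # paragraph mode: read up to the blank line
        my $headers = <$fh>;   # header block, simply discarded here
    }
    while (my $line = <$fh>) {
        last if $line =~ /^From /;   # the next message begins
        print {$out} $line;
    }
}

# Demo on a throwaway two-message mailbox (sample data only).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} <<'MBOX';
From alice@example.com Mon Jan  1 00:00:00 2001
From: alice@example.com
Subject: first

Hello one
line two

From bob@example.com Tue Jan  2 00:00:00 2001
From: bob@example.com
Subject: second

Hello two
MBOX
close $tmp or die "Cannot close $path: $!";

open my $mbox, '<', $path or die "Cannot open $path: $!";
my @offsets = index_mbox($mbox);
printf "%d messages\n", scalar @offsets;
copy_body($mbox, $offsets[1], \*STDOUT);   # body of the second message
close $mbox;
```

Memory use stays at roughly one line plus one header block, whatever the
size of the mailbox. The price is a second pass over the file for every
body that gets extracted.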

You guys rock! (and Perl too :-)
Thanks again

--
Giorgio Valoti

MagneticMedia Network


