
[MacPerl] Got it...



OK, I've got the program working. Two problems left: 1) It takes about 17 MB
of memory just after the initial read, for a 4.7 MB file. Tolerable, but a
pain. 2) The search runs at about 3 seconds per line of the initial file.
I'd expect this, since every line gets compared against all the remaining
ones, but is there a faster way?

#!perl
$ARGV[0] = "seraphX:Desktop Folder:summary.tab"; # path to the tab-delimited file
select(STDOUT);
$| = 1; # make sure the printing works immediately, not at the end. =)

print "Reading data... ";
open (IN, $ARGV[0]) or die "Can't open $ARGV[0]: $!"; # for droplet use later
@data = <IN>; # read file into @data (uses about 17 MB for a 4.7 MB file)
close IN;
print "done.\n";

print "Parsing... ";
open (OUT, ">nodup.tab") or die "Can't write nodup.tab: $!";
select(OUT);
$| = 1; # make sure I get a good idea of how much has been done so far
$i = 0;
until ($#data < 0) {
	$line = pop( @data ); # take last record from @data, put into $line
	next if $line eq ""; # skip the rest if it's empty, go to next
	for $i (0 .. $#data) { $data[$i] = "" if $data[$i] eq $line; }
		# read through @data, make empty lines that are duplicates
		# of the current one
	print OUT $line; # dumps current line into output file
}

close OUT;
select(STDOUT); # switch back, otherwise "done." goes to the closed OUT handle

print "done.\n";
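
On the "faster way" question, one common approach is to remember each line in a hash and skip any line already seen, so every line costs one lookup instead of a pass over the rest of @data. The following is only a sketch under assumptions: dedup_stream is a made-up name, the lexical filehandles and three-argument open need a newer Perl than this script may be running on, and the file names come from the command line instead of being hard-coded.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): copy lines from
# $in to $out, skipping any line that has already been written.
# Returns the number of lines kept.
sub dedup_stream {
    my ($in, $out) = @_;
    my %seen;        # line text => number of times seen so far
    my $kept = 0;
    while (my $line = <$in>) {
        next if $seen{$line}++;   # true on the second and later sightings
        print {$out} $line;
        $kept++;
    }
    return $kept;
}

# Runs only when two file names are given: input, then output.
if (@ARGV == 2) {
    open(my $in,  '<', $ARGV[0]) or die "Can't read $ARGV[0]: $!";
    open(my $out, '>', $ARGV[1]) or die "Can't write $ARGV[1]: $!";
    my $kept = dedup_stream($in, $out);
    close $out or die "Can't close $ARGV[1]: $!";
    close $in;
    print "Kept $kept unique lines.\n";
}
```

Reading one line at a time means the whole file never sits in @data, though the hash still holds one copy of every distinct line, so the memory saving depends on how many duplicates there are. This version also keeps the first occurrence of each line in the original order, whereas the pop loop above writes the file out back to front.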

Later on, I need to edit this file to consolidate records based on the
first and second fields. That might be easy in this layout, but perhaps not
in something faster.
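
A hash keyed on the first two fields might handle that step as well. The sketch below guesses at what "consolidate" should mean: here it assumes that records sharing the first and second fields are merged into one record, with their remaining fields appended in order of appearance. The name consolidate and that merge rule are both assumptions, so the push line would need changing for any other rule.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: takes tab-delimited lines, returns one line per
# distinct (field 1, field 2) pair.
# ASSUMPTION: "consolidating" means appending the remaining fields of
# every matching record; the real rule may differ.
sub consolidate {
    my @lines = @_;
    my %rest;     # "field1\tfield2" => array ref of the remaining fields
    my @order;    # keys in order of first appearance
    for my $line (@lines) {
        chomp(my $rec = $line);
        my ($f1, $f2, @others) = split /\t/, $rec, -1;  # -1 keeps trailing empty fields
        next unless defined $f2;    # skip blank or one-field lines
        my $key = join "\t", $f1, $f2;
        unless (exists $rest{$key}) {
            $rest{$key} = [];
            push @order, $key;
        }
        push @{ $rest{$key} }, @others;
    }
    my @merged;
    for my $key (@order) {
        push @merged, join("\t", $key, @{ $rest{$key} }) . "\n";
    }
    return @merged;
}
```

Called as `print OUT consolidate(@data);` it would write the merged records in the order their keys first appear.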

- Strider


