
[MacPerl] Got it...



According to Strider:
> 
> #!perl
> $ARGV[0] = "seraphX:Desktop Folder:summary.tab"; # open tab-delimited file
> select(STDOUT);
> $| = 1; # make sure the printing works immediately, not at the end. =)
> 
<snip>

Ok, I still do not understand why you are reading
everything in first before doing anything else.  Why not do
the following:

#!perl
#
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#
	$ARGV[0] = "seraphX:Desktop Folder:summary.tab"; # open tab-delimited file
#
#	Bad pookie here - no die command was originally given
#
	@data = ();
	open( IN, "$ARGV[0]") || die $!;
	while( <IN> ){
#
#	You need the chomp because otherwise the line can contain just "\n".
#
		chomp;
		next if length($_) < 1; # skip the rest if it's empty, go to next

		$theFlag = 0;
		for( $i=0; $i<=$#data; $i++ ){
			if( $data[$i] eq $_ ){
				$theFlag = 1;
				last;	# "last" leaves the for loop; Perl has no "break"
				}
			}

		next if $theFlag > 0;
		push( @data, $_ );
		}

	close( IN );
#
#	Bad pookie here - no die command was originally given
#
	open( OUT, ">nodup.dat" ) || die $!;
	foreach( @data ){
		print OUT $_, "\n";
		}
	close( OUT );
	exit( 0 );
#
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#

The problem with this is that every new line gets compared
against every record kept so far, so the more data you get
in, the longer it will take to verify that a record is
unique.  However, I just grepped the Perl POD and found in
PerlFAQ4.pod the following:

=head2 How can I extract just the unique elements of an array?

There are several possible ways, depending on whether the array is
ordered and whether you wish to preserve the ordering.

=over 4

=item a) If @in is sorted, and you want @out to be sorted:

    $prev = 'nonesuch';
    @out = grep($_ ne $prev && ($prev = $_, 1), @in);

This is nice in that it doesn't use much extra memory,
simulating uniq(1)'s behavior of removing only adjacent
duplicates.

=item b) If you don't know whether @in is sorted:

    undef %saw;
    @out = grep(!$saw{$_}++, @in);

=item c) Like (b), but @in contains only small integers:

    @out = grep(!$saw[$_]++, @in);

=item d) A way to do (b) without any loops or greps:

    undef %saw;
    @saw{@in} = ();
    @out = sort keys %saw;  # remove sort if undesired

=item e) Like (d), but @in contains only small positive integers:

    undef @ary;
    @ary[@in] = @in;
    @out = grep {defined} @ary;

=back
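
Putting (b) together with the line-at-a-time loop above, something
like the following ought to work.  Treat it as a rough sketch, not
tested against the real summary.tab: the sub name unique_lines and
the sample data are made up for illustration, and the in-memory
filehandle used for the demonstration assumes Perl 5.8 or later
(with an older MacPerl you would open the real file instead).

```perl
#!perl
use strict;
use warnings;

# Return the unique, non-empty lines read from a filehandle,
# in the order they were first seen.
# unique_lines() is a hypothetical helper name for this sketch.
sub unique_lines {
    my ($fh) = @_;
    my ( %saw, @out );
    while ( my $line = <$fh> ) {
        chomp $line;
        next if length($line) < 1;    # skip empty lines
        next if $saw{$line}++;        # one hash lookup instead of a scan of @data
        push @out, $line;
    }
    return @out;
}

# Demonstration on made-up tab-delimited data held in a string.
my $sample = "a\tb\n\nc\td\na\tb\n";
open( my $in, '<', \$sample ) || die $!;
my @data = unique_lines($in);
close($in);

print "$_\n" foreach @data;
```

The hash lookup costs about the same no matter how many records
have already been kept, so the per-line check no longer slows down
as @data grows.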

