
Re: [MacPerl-Forum] Complexed perl problem.



On Sat, May 13, 2000 at 11:21:15AM +0200, Jimmy Lantz wrote:
> Complexed perl problem.
> 
> Hi, I have the following problem:
> I need to do the following operation on a (huge) data file see sample
> below :
> Strip the first row of the < & > and print it into a output file 
> then I have to read the the rest of the data and to be able to 
> and put it into a hash ??(if that's the best way to go) and associate
> the <PRON(pers,sing)> with the value I and <V(montr,pres)> with think
> and so forth so that I can analyse the values depending on if it's PRON
> or V or something else.

Which should be the keys and which should be the values?  How are you
analyzing the values?
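
The reason I ask: in your sample both the words and the tags repeat ("I"
and <PRON(pers,sing)> each appear twice), so a plain hash in either
direction would overwrite entries. Here is a rough sketch of one possible
arrangement, assuming you want the words grouped by tag class. The name
parse_tagged and the returned structures are only my guesses at what you
might need, not something taken from your program.

```perl
use strict;
use warnings;

# Hypothetical helper: collect word/tag pairs from one stretch of tagged text.
# Returns (\@pairs, \%by_class), where each pair is [word, class, features]
# and %by_class maps a tag class such as PRON to the words carrying it.
sub parse_tagged {
    my ($text) = @_;
    my (@pairs, %by_class);
    # A word (not starting with <, and preceded by whitespace or the start
    # of the text), followed by a tag of the form <CLASS(features)>.
    while ($text =~ /(?<!\S)([^\s<]\S*)\s+<(\w+)\(([^)]*)\)>/g) {
        my ($word, $class, $features) = ($1, $2, $3);
        $word =~ s/^\*\*\[//;    # drop a leading **[ marker, if present
        push @pairs, [$word, $class, $features];
        push @{ $by_class{$class} }, $word;
    }
    return (\@pairs, \%by_class);
}
```

Tags with no word in front of them (<,> <PAUSE(short)>) and tags without
parentheses (<UNTAG>) are simply skipped here; whether that is right
depends on your answer.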


> The following:
> **[main <ADJ(ge)>]**
> is a match and need's to be stripped of the 
> **[ & ]**  and printed to the output file. 
> 
> Everything printed (I need to print more on each row but that I can
> handle myself) has to be delimited by 
> || (double pipe)
> 
> Further below is a perlprog that I started to make until I realized that
> I needed some help.
> 
> <ICE-GB:S2A-016 #2:1:B>
> I <PRON(pers,sing)> think <V(montr,pres)> the <ART(def)> **[main
> <ADJ(ge)>]** things <N(com,plu)> that <PRON(rel)> I <PRON(pers,sing)>
> saw <V(cxtr,past)> as <PREP(ge)> <,> <PAUSE(short)> as <PREP(ge)>
> **[absent <ADJ(ge)>]** from <PREP(ge)> disa <UNTAG> from <PREP(ge)> work
> <N(com,sing)> with <PREP(ge)> with <PREP(ge)> disabled <ADJ(edp)> people
> <N(com,plu)> was <V(cop,past)>

How long will each of these blocks between the <ICE-GB... header lines be?
It will probably be easier to work on a whole block at once, rather than
line by line.
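
To show what I mean by working on a whole block: in your sample a
**[...]** group is broken across two lines (that may just be mail
wrapping), which a line-by-line loop would miss. A minimal sketch,
assuming the whole file fits in memory and every header sits on a line of
its own; split_blocks is a name I made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text into one record per <ICE-GB:...> header.
# Returns a list of [header, body] pairs, with the body's wrapped lines
# joined into a single line.
sub split_blocks {
    my ($text) = @_;
    my @blocks;
    # A header line, then everything up to the next header line or the end.
    while ($text =~ /^<ICE-GB:([^>\n]*)>[ \t]*\n(.*?)(?=^<ICE-GB:|\z)/msg) {
        my ($header, $body) = ($1, $2);
        $body =~ s/\s+/ /g;         # collapse newlines and runs of spaces
        $body =~ s/^\s+|\s+$//g;    # trim both ends
        push @blocks, [$header, $body];
    }
    return @blocks;
}
```

If the blocks turn out to be very large, reading them one at a time would
be the better choice, which is why the length matters.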


> <ICE-GB:S3A-051 #8:1:A>
> **[medium <ADJ(ge)>]** speed <N(com,sing)>
> 
> NB! Data filerows  above can vary in length (see row 2 & 4)
> 
> #!/usr/bin/perl
> $faktor1 = "<ICE-GB:";
> $icedata = "icedata.data";
> 
> open(FILE, "$icedata");

Don't forget to check the return values of system calls!

open(FILE, $icedata) or die "Can't open $icedata: $!\n";


> while(<FILE>) {
> 	$file = $_;
> 	chomp $file;
> 	if ($file =~ /$faktor1/) {
> 		$file =~ s/$faktor1/ /;

That part is redundant; you don't need to match twice.  s/// returns true
when it makes a substitution, so it can serve as the condition itself.

    if ($file =~ s/$faktor1/ /) {


> 		$file =~ s/>/ /;
> 		print "$file\n";
> 	}
> 	elsif ($file !~ /$faktor1/	){

That's redundant too; either /$faktor1/ matches or it doesn't.

    } else {



> 	@db_fields = split (/>/, $file); 
> 	foreach $field (@db_fields) { print "$field\n"; } 
> 	}
> 	else{
> 		print "No match\n";

This block will never be entered: the if and elsif conditions are exact
opposites, so one of them always matches.


> 	}
> }
> close(FILE);
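
Putting the comments above together, your loop could be tidied up roughly
like this. It keeps what your script does (including the split on >) and
only removes the redundant tests. The process sub, the lexical filehandle,
and the \Q...\E quoting are my own additions; the quoting is just a
precaution in case $faktor1 ever contains regex metacharacters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $faktor1 = "<ICE-GB:";

# Read lines from a filehandle and return the lines the script would print.
sub process {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ s/\Q$faktor1\E/ /) {   # header line: strip the prefix
            $line =~ s/>/ /;                # and the closing >
            push @out, $line;
        } else {                            # every other line
            push @out, split />/, $line;
        }
    }
    return @out;
}

# Usage with the file name from your script (not run here):
# my $icedata = "icedata.data";
# open(my $fh, '<', $icedata) or die "Can't open $icedata: $!\n";
# print "$_\n" for process($fh);
# close($fh);
```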



I have some ideas on ways to do this, but I'd like to know the details
I asked about before I write a full solution.

Ronald
