
re:[MacPerl] Columns



> I have a (text) file with the following format:
> 
> 123420010110010010101  (and so on this way => )
> 234531111101100101010
> 678940000010010111010
> 340151010101110101010
> and so on...
>
> Now let's say the first four characters are a PIN number,
> the following two characters the subject's age, and so on.
> How can I read such a file split in (fixed) columns ?
> I would like to insert tabs at the end of each field, creating
> a tab-delimited text file.

Try this (untested) code:
##---------------------------- start sample
open(INPUT, "filename") || die "can't open filename: $!";
open(OUTPUT, ">filewithtabs") || die "can't open filewithtabs: $!";
while (<INPUT>) {
	## Try to replace 4 digits and 2 digits at the beginning of a line
	## with the matched 4 digits, tab, the matched 2 digits, tab.
	if (s/^(\d{4})(\d\d)/$1\t$2\t/) {	## line 6 (see below)
		print OUTPUT;
	} else {
		## The match failed, there were not 6 digits at the 
		## beginning of the input line
		chomp;	## drop the newline so the message stays on one line
		print "warning: line of unexpected format ($_)\n";
	}
}
close(INPUT);
close(OUTPUT);
##---------------------------- end sample

If you are certain your input data are correct, more minimal code could 
replace the whole "if/else" block starting at line 6 with:

	s/^(.{4})(..)/$1\t$2\t/;
	print OUTPUT;
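
To make the trade-off concrete, here is a small sketch that runs both 
patterns on in-memory strings instead of files, using the field widths 
from the question (4-character PIN, 2-character age).  The sub names 
tabs_strict and tabs_lenient are made up for this illustration, and the 
second test string is an invented bad record.

```perl
use strict;
use warnings;

# Hypothetical helper: strict version, digits only.  Returns the
# tab-delimited line, or undef if the line does not start with 6 digits.
sub tabs_strict {
    my ($line) = @_;
    if ($line =~ s/^(\d{4})(\d\d)/$1\t$2\t/) {
        return $line;
    }
    return undef;
}

# Hypothetical helper: lenient version, any 6 characters.  A line that
# is too short to match is returned unchanged.
sub tabs_lenient {
    my ($line) = @_;
    $line =~ s/^(.{4})(..)/$1\t$2\t/;
    return $line;
}

# First string is from the question; the second has a stray "x" in the PIN.
for my $line ("123420010110010010101", "12x420010110010010101") {
    my $strict = tabs_strict($line);
    print "strict:  ", (defined $strict ? $strict : "(rejected)"), "\n";
    print "lenient: ", tabs_lenient($line), "\n";
}
```

The strict pattern rejects the record with the stray "x", while the 
lenient one inserts the tabs anyway, which is why the shorter form is 
only safe when the input is known to be clean.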

If you actually want to get values out to manipulate in your program, 
change the "if" code starting at line 6 to:

	if ( ($pin, $age, $the_rest) = /^(\d{4})(\d\d)(.*)/ ) {
		## Now you can work with $age, $pin, or $the_rest
		## Example:  $max_age = $age if ($age > $max_age);
	} else {
		## The match failed.
		print "warning: line of unexpected format ($_)\n";
		## If you prefer, print bad lines to some error file
		## instead of the screen--good for large data sets.
	}
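
As a rough, self-contained illustration of the extraction form, the 
sketch below finds the largest age in the four sample lines from the 
question.  The sub name max_age and the idea of collecting bad lines in 
an array (rather than printing a warning) are assumptions made for this 
example only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the largest age seen, plus a reference
# to the list of lines that did not start with 6 digits.
sub max_age {
    my @lines = @_;
    my $max_age = 0;
    my @bad;
    for my $line (@lines) {
        if (my ($pin, $age, $the_rest) = $line =~ /^(\d{4})(\d\d)(.*)/) {
            $max_age = $age if $age > $max_age;
        } else {
            push @bad, $line;    # keep bad lines for later inspection
        }
    }
    return ($max_age, \@bad);
}

# The four sample records from the question (ages 20, 31, 40, 15)
my ($max, $bad) = max_age(qw(123420010110010010101 234531111101100101010
                             678940000010010111010 340151010101110101010));
print "max age: $max, bad lines: ", scalar(@$bad), "\n";
```

Note that $max_age has to start at some known value (0 here) before the 
comparison, otherwise the first test compares against an undefined value.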

This question could also have been posted to the newsgroup 
comp.lang.perl.misc since it's a general Perl question.  If you have 
email, but no newsgroup access, the Perl-Users Digest is a retransmission 
of the USENET newsgroup comp.lang.perl.misc.  For subscription or 
unsubscription requests, send the single line:
	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

-Ken
Posting to this newsgroup via email is done by sending email to 
Perl-Users@ruby.OCE.ORST.EDU
-=#=-
Ken Tanaka                      Phone: (602) 988-9773 ext 413
Hughes Training, Inc.           Fax:   (602) 988-3556

tanaka@hrlban1.aircrew.asu.edu
tanaka@alhra.af.mil
-=#=-