According to Ken Cartner: > > I'm trying to change words that are in all upercase to upper and lower. > And i can'rt seem to come up with a logical way to achieve this. How about this: 1. Create a file called SUBS.DAT. 2. Write a program to scan the information to be changed. 2a. Each time you do, you read in SUBS.DAT and use it to do your substitutions. 2b. Each time you do, you also write out a new SUBS.DAT file which contains the old list plus any new words you've found in the current document. 3. Then you just edit the SUBS.DAT file to put in the new changes. Example: SUBS.DAT L.A. LAKERS:L.A. Lakers SAN ANTONIO:San Antonio etc.... myProg.pl #!perl # #%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% # # Get the list # @myList = (); open( THEFILE, "SUBS.DAT" ) || die $!; while( <THEFILE> ){ chomp; $myList[++$#myList] = $_; } close( THEFILE ); # # Get the document # @myDoc = (); open( THEFILE, "myDoc.dat" ) || die $!; while( <THEFILE> ){ chomp; $myDoc[++$#myDoc] = $_; } close( THEFILE ); # # Scan it # for( $i=0; $i<=$#myDoc; $i++ ){ for( $j=0; $j<=$#myList; $j++ ){ @theLine = split( /:/, $myList[$j] ); $myDoc[$i] =~ s/$theLine[0]/$theLine[1]/g; } # # Begin looking for new UPPERCASE words to add to the list. # @theLine = split( /\s/, $myDoc[$i] ); for( $j=0; $j<=$#theLine; $j++ ){ # # If there is a trailing non-alphanumeric character - get rid of it. # if( $theLine[$j] =~ /[\,\;\:\-\.\!\?]$/ ){ chop( $theLine[$j] ); } # # Is this word all UPPERCASE characters? # # If so, then check to see if we need to add it to the list as WORD:WORD. # if( $theLine[$j] =~ /^[A-Z]+$/ ){ $theFlag = 0; for( $k=0; $k<=$#myList; $k++ ){ if( $myList[$k] =~ /$theLine[$j]/ ){ $theFlag = 1; last; } } if( $theFlag < 1 ){ $myList[++$#myList] = "$_:$_"; } } } } # # Write out the new document. # open( THEFILE, ">myDoc.new" ) || die $!; for( $i=0; $i<=$#myDoc; $i++ ){ print THEFILE $myDoc[$i], "\n"; } close( THEFILE ); # # Update the substitution list. # open( THEFILE, ">SUBS.DAT" ) || die $!; for( $i=0; $i<=$#myList; $i++ ){ print THEFILE $myList[$i], "\n"; } close( THEFILE ); exit( 0 ); # #%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% # The problem with the above is that things like L.A. LAKERS will wind up in the file as: L:L A:A LAKERS:LAKERS or L.A:L.A LAKERS:LAKERS But seeing as how you only have to correct this information once and then the program does the corrections from then on it's not all that bad. Also, you can work on a way of flagging multiple UPPERCASE words. A simple way to do this is to just have a secondary flag which is checked each time a new word is found to be all in uppercase. The flag is unset each time a non-all uppercase word is located too. Then all you have to do is IF the secondary flag is set, then pop off the last entry on the @myList array, split it up, tack on the new word (with a space between the words), and then put it back together again. In this case, your array would contain something like: L:L then L A:L A then L A LAKERS:L A LAKERS (And hopefully they don't send something like: THE L A LAKERS WIN THE GAME! - or some such.) ***** Want to unsubscribe from this list? ***** Send mail with body "unsubscribe" to mac-perl-request@iis.ee.ethz.ch