
Re: [MacPerl] escape oddities...



Scott Prince wrote:
> 
> Hello all...
> 
> I thought that the sequence below would clean up any nasties from form 
> data submitted through my cgi's. 
> 
> $form_data{$user_entry} =~ s/<(.|\n)*>//g;        # html

s/<.*>//gs;    #  /s on an m// or s/// makes . match newlines.

But...

I <B>don't</B> think this <I>substitution</I> will work <U>very</U> well...


s/<.*?>//gs;

would be closer.  But to parse HTML safely, you really should use the
HTML::Parse module.
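
To see the difference between the greedy and non-greedy forms, here is a
small sketch. The sample string and variable names are made up for
illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample input, for illustration only.
my $text = "I <B>don't</B> think this <I>substitution</I> will work";

# Greedy: .* runs from the first < to the LAST >, taking the text in between.
(my $greedy = $text) =~ s/<.*>//gs;

# Non-greedy: .*? stops at the first > it can, so each tag is removed alone.
(my $lazy = $text) =~ s/<.*?>//gs;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
```

The greedy version throws away everything between the first tag and the
last one. The non-greedy version keeps the words, but it can still be
fooled by a ">" inside an attribute value or a comment, which is why a
real parser is the safer choice.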


> $form_data{$user_entry} =~ s/\t|\n|\r|\|/ /g;     # ht, nl, cr, pipe

s/[\t\n\r|]/ /g;

Don't use alternation where a character class will do.

tr/\t\n\r|/ /;

Don't use substitution where a translation will do.


> $form_data{$user_entry} =~ s/ +/ /g;              # multi spaces

tr/\t\n\r |/ /s;

/s on a translation squashes each run of characters that translate to the
same character down to a single instance of that character.
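
A quick sketch of the two translations side by side, assuming a made-up
form value (the variable names are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical form value: tabs, a pipe, CR LF, and a double space.
my $value = "name\t\tJohn|Smith\015\012  end";

# Without /s: every listed character becomes a space, one for one.
(my $plain = $value) =~ tr/\t\n\r|/ /;

# With /s (and a space added to the list): runs collapse to one space.
(my $squashed = $value) =~ tr/\t\n\r |/ /s;

print "[$plain]\n";
print "[$squashed]\n";
```

The second form does the work of both the character replacement and the
multi-space cleanup in one pass.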


> $form_data{$user_entry} =~ s/^ +| +$//g;          # starting & ending spaces

This is covered in the FAQ.  It is more efficient to do two substitutions:

s/^ +//;
s/ +$//;
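
Wrapped up as a helper, the two substitutions might look like the sketch
below. The name trim_spaces is my own invention for illustration, not
something from the FAQ or the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing spaces from a copy.
sub trim_spaces {
    my ($value) = @_;
    $value =~ s/^ +//;    # leading spaces, anchored at the start
    $value =~ s/ +$//;    # trailing spaces, anchored at the end
    return $value;
}

my $trimmed = trim_spaces("   some entry   ");
print "[$trimmed]\n";
```

Each pattern is anchored at one end, so no /g is needed and the regex
engine does not have to try the alternation at every position.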


> One concern being corruption of database files with \n or | characters. 
> But after retrieving a db file via ftp (mac), I noticed what seemed to be 
> newlines( /r's after fetch ingests them for my mac) breaking my records. 
> The odd thing is that the unix server is ignoring the character and not 
> seeing /n. - which is a good thing :)

Are you retrieving the db file in TEXT/ASCII mode or in BINARY mode?  If
it's a plaintext database file, you should transfer it in TEXT mode.  If
it's a binary database file, you should transfer it in BINARY mode.


> The book, "Perl5 by Example" has a table listing all the usual escape 
> char's, but, further into the book there is a code example using the 
> escape /cM. A quick test verifies that MacPerl recognizes this as a /r.

\r and \n in Perl are platform dependent.  \cM (also \x0D and \015) and
\cJ (also \x0A and \012) are platform independent.
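
A small sketch that prints the character codes, assuming an ASCII-based
platform. The first two lines of output should be the same everywhere;
the last line is the one that can vary with the perl you run it under.

```perl
use strict;
use warnings;

# Carriage return: control-M, hex 0D, octal 015 (decimal 13 in ASCII).
my @cr = map { ord($_) } ("\cM", "\x0D", "\015");

# Line feed: control-J, hex 0A, octal 012 (decimal 10 in ASCII).
my @lf = map { ord($_) } ("\cJ", "\x0A", "\012");

print "CR: @cr\n";
print "LF: @lf\n";

# What \r and \n map to depends on the platform this runs on.
printf "\\r is %d, \\n is %d here\n", ord("\r"), ord("\n");
```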


> The obvious solution for cgi's is to replace any whitespace char with a 
> space. But is there any way to predict the way these things bounce from 
> platform to platform?

Over FTP, using TEXT mode will translate line endings for the system
receiving the file.

The standard line-ending for text being sent between systems is \015\012.
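
If you want to normalize whatever line endings arrive into that form
yourself, one possible sketch follows. The name to_network_eol is
hypothetical, and the explicit octal escapes are used so the result does
not depend on what \r and \n mean locally.

```perl
use strict;
use warnings;

# Hypothetical helper: turn CR LF, lone CR, or lone LF into \015\012.
sub to_network_eol {
    my ($text) = @_;
    # \015\012 is tried first so an existing CR LF pair is matched as a
    # unit and not doubled.
    $text =~ s/\015\012|\015|\012/\015\012/g;
    return $text;
}

my $mixed = "unix\012mac\015dos\015\012end";
my $fixed = to_network_eol($mixed);

# Show the result with the control characters made visible.
(my $shown = $fixed) =~ s/\015\012/<CRLF>/g;
print "$shown\n";
```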


Ronald
