Re: [FWP] Regex to remove whitespace and to capitalize

To: Tim Ayers <tayers@bridge.com>
Subject: Re: [FWP] Regex to remove whitespace and to capitalize
From: Rick <rklement@pacbell.net>
Date: Fri, 07 Jan 2000 11:15:25 -0800
Cc: fwp@technofile.org
References: <200001071529.KAA22400@mnmailhost>

Tim Ayers wrote:
> 
> Hi,
> 
> I'm not even just another Perl hacker so bear with me please.
> 
> I have a bunch of English phrases that I want to normalize by removing
> spaces and underlines and capitalizing each word, e.g.
> 
>   'Hi there'           => 'HiThere'
>   'Top Of The Morning' => 'TopOfTheMorning'
>   '25 or 6 to 4'       => '25Or6To4'
> 
> I started with two regexes
> 
>   $x =~ s/(\b.)/\U\1/g;
>   $x =~ s/[\s_]+//g;
> 
> How unimaginative.
> 
> So I tried
> 
>   $x =~ s/(^.)|[\s_]+(\S)/\U\1/g;
> 
> but that didn't work. For example, 'Hi there' goes to 'Hihere'. I was
> surprised by the loss of the 't'. I would have more expected
> 'HiHhere'. What's going on?
> 
> Finally I tried
> 
>   $x =~ s/(?:^|[\s_]+)(\S)/\U\1/g;
> 
> which seems to do the trick. Does anyone have an improvement or any
> caveats about this regex?
> 

This regex does not delete trailing spaces or underscores, which
your first regex pair does. I don't know if this is a problem for
your data or not...
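
A quick way to check the trailing-space difference is to run both
versions on the same padded string. This is only a sketch: the sub
names two_pass and one_pass and the test string are made up for
illustration, and I have written $1 where the original post used \1
in the replacement.

```perl
use strict;
use warnings;

# Two-regex version from the original post.
sub two_pass {
    my ($x) = @_;
    $x =~ s/(\b.)/\U$1/g;   # uppercase the character after each word boundary
    $x =~ s/[\s_]+//g;      # then delete every run of whitespace/underscores
    return $x;
}

# Single-regex version from the original post.
sub one_pass {
    my ($x) = @_;
    # A separator run is only consumed when a non-space character follows it,
    # so a run at the very end of the string is left alone.
    $x =~ s/(?:^|[\s_]+)(\S)/\U$1/g;
    return $x;
}

my $input = 'Hi there  ';   # note the two trailing spaces
printf "two_pass: '%s'\n", two_pass($input);
printf "one_pass: '%s'\n", one_pass($input);
```

The first line printed should show 'HiThere' and the second
'HiThere  ', with the trailing spaces still in place.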

Also, there are slight differences in what gets capitalized, maybe best
pointed out by asking what the correct answer should be for the
following two cases:

	'good-bye' => 'Good-bye' or 'Good-Bye'
	'one_two'  => 'OneTwo' or 'Onetwo'

Which answer you want for each of these two cases changes the code
considerably...
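
To see what each approach currently does with those two inputs, here
is a small sketch. The %result hash and its key names are just
scaffolding for the comparison, and again $1 stands in for \1.

```perl
use strict;
use warnings;

my %result;   # hypothetical container: input => { two_pass => ..., one_pass => ... }

for my $input ('good-bye', 'one_two') {
    # Two-regex approach: \b sees '-' as a word break, but '_' is a word
    # character, so there is no boundary on either side of the underscore.
    (my $two = $input) =~ s/(\b.)/\U$1/g;
    $two =~ s/[\s_]+//g;

    # Single-regex approach: only the start of the string, whitespace and
    # '_' count as separators, so '-' does not start a new word.
    (my $one = $input) =~ s/(?:^|[\s_]+)(\S)/\U$1/g;

    $result{$input} = { two_pass => $two, one_pass => $one };
    printf "%-10s  two regexes: %-9s  one regex: %s\n", "'$input'", $two, $one;
}
```

The two-regex pair gives 'Good-Bye' and 'Onetwo', while the single
regex gives 'Good-bye' and 'OneTwo', so neither version produces both
of the second-column answers or both of the first-column ones.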

Rick
