> -----Original Message-----
> From: Tushar Samant [mailto:tushar@i-works.com]
> Sent: Saturday, June 19, 1999 14:46
> To: fwp@technofile.org
> Subject: Re: [FWP] More Simplification
>
>
> > Problem: Convert all characters not in the set [AaTtCcGg] to [Nn],
> > preserving case:
...
> > Anyone have a nicer solution?
>
> How about uglier?
>
>     s/([^ATCG])/'N'^' '&$1/egi;
>
> And non-portable...

It's as portable as the ASCII character set, which means damned near
ubiquitous these days.

But, as I said before, it and its friends are S-L-O-O-O-W!

The following benchmark measures the correct suggestions so far.  It was
run on perl 5.005_03.

    #!/usr/local/bin/perl -w
    use strict;
    use Benchmark;

    my $seq = join "" => '0' .. '9', 'a' .. 'z', 'A' .. 'Z';

    timethese(1 << (shift || 0), {
        Ord   => sub { (my $x = $seq) =~ s/([^ACTGactg])/
            chr(ord('N') - ord(uc($1)) + ord($1))/eg },
        Perl4 => sub { (my $x = $seq) =~ s/([^ACTGactg])/
            pack('C', ord('N') - ord("\U$1") + ord($1))/eg },
        Tr    => sub { (my $x = $seq) =~ tr/a-zATCG/N/c;
            $x =~ tr/A-Zatcg/n/c },
        Xor   => sub { (my $x = $seq) =~ s/([^ACTGactg])/'N' ^ $1 ^ "\U$1"/eg },
    });
    __END__

    Benchmark: timing 16384 iterations of Ord, Perl4, Tr, Xor...
           Ord: 16 wallclock secs (15.98 usr +  0.00 sys = 15.98 CPU)
         Perl4: 19 wallclock secs (18.45 usr +  0.00 sys = 18.45 CPU)
            Tr:  0 wallclock secs ( 0.39 usr +  0.00 sys =  0.39 CPU)
                (warning: too few iterations for a reliable count)
           Xor: 17 wallclock secs (16.47 usr +  0.00 sys = 16.47 CPU)

'Nuff said?

-- 
Larry Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com
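
For anyone puzzling over the Xor entry, here is a minimal sketch of why it works, assuming ASCII, where an upper-case letter and its lower-case partner differ only in the 0x20 bit. The names mask_xor and mask_tr and the sample string are made up for illustration; the s/// and tr/// statements themselves are the ones from the benchmark.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the Xor variant from the benchmark.
sub mask_xor {
    my ($s) = @_;
    # $1 ^ "\U$1" is "\x20" for an ASCII lower-case letter and "\0" for
    # anything uc() leaves unchanged, so 'N' ^ that is 'n' or 'N'.
    $s =~ s/([^ACTGactg])/'N' ^ $1 ^ "\U$1"/eg;
    return $s;
}

# Hypothetical wrapper around the Tr variant from the benchmark.
sub mask_tr {
    my ($s) = @_;
    $s =~ tr/a-zATCG/N/c;    # everything except lower case and ATCG -> N
    $s =~ tr/A-Zatcg/n/c;    # everything except upper case and atcg -> n
    return $s;
}

# The case bit: 'x' (0x78) xor 'X' (0x58) leaves only 0x20.
printf "%s ^ %s = 0x%02x\n", 'x', 'X', ord('x' ^ 'X');

my $seq = 'GATtaca-xyz-XYZ-0129';    # made-up sample input
print mask_xor($seq), "\n";
print mask_tr($seq),  "\n";
```

Both calls should print the same masked string. The tr/// version does two table-driven passes and never evaluates Perl code per match, which is presumably where the gap in the timings comes from.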