
Re: [MacPerl-WebCGI] making reg ex pattern searches 'back up' inorder to find overlapping matches



>
>$seq = "CCCAAACCCTTTCCC";
>
>And need to find occurrences of
>
>
>(C|G)\1{2,7}(.){5,10}(C|G)\1{3,8}
>
>so I get
>CCCAAACCC as the first match.

You need to tell Perl what you want to match.
As I read it, your pattern asks for 2 to 7 C, then 5 to 10 of anything, then 3 to 8 C again.
Written that way, as /C{2,7}.{5,10}C{3,8}/, it matches the whole of $seq="CCCAAACCCTTTCCC".

I don't know why you write \1 in the match string.
Please explain!
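
My guess is that \1 is meant as a backreference, i.e. "the same base that group 1 just captured". If so, (C|G)\1{2} accepts CCC or GGG but not a mixed run, which a plain character class would also accept. A small sketch of the difference (the three test strings are made up for illustration):

```perl
use strict;
use warnings;

my %backref;   # result of the backreference pattern, per test string
my %class;     # result of the plain character class, per test string
foreach my $s ("CCC", "GGG", "CGC") {
    # (C|G) captures one base; \1{2} demands two more copies of that same base
    $backref{$s} = ($s =~ /^(C|G)\1{2}$/) ? 1 : 0;
    # [CG]{3} accepts any mix of C and G
    $class{$s}   = ($s =~ /^[CG]{3}$/)    ? 1 : 0;
    print "$s  backreference: $backref{$s}  class: $class{$s}\n";
}
```

Only the mixed string CGC should differ between the two columns.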

If you want to find CCCAAACCC rather than CCCTTTCCC, you first
have to state exactly what you want:
3 C, then 3 of anything, then 3 C again.
In Perl this is
$seq =~ /(C{3}.{3}C{3})/;
print $1; # CCCAAACCC

In order to find the last occurrence you normally loop through
several matches, like
while( /(C)/g ) { print $1 }
See the Perl Cookbook for examples.
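
Applied to your sequence, such a loop might look like the sketch below. It uses the simplified 3-3-3 pattern from above, not your original one, and the array name @found is just my choice:

```perl
use strict;
use warnings;

my $seq = "CCCAAACCCTTTCCC";
my @found;
# With /g in scalar context, each iteration resumes where the last match ended
while ($seq =~ /(C{3}.{3}C{3})/g) {
    push @found, $1;
}
print scalar(@found), " match(es): @found\n";   # 1 match(es): CCCAAACCC
```

Note that it reports one match only; CCCTTTCCC is never found.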

In your case the match is terminated by "CCC", which is already
part of the next match. This makes things difficult: a /g loop
continues after the end of the previous match, so it never sees the
overlapping one. You would need to work with positions within a loop.
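
A minimal sketch of such a position loop, again assuming the simplified 3-3-3 pattern: after every match, pos() is moved back to one character past the start of that match, so the next search can overlap it. The names @hits and $start are only for illustration.

```perl
use strict;
use warnings;

my $seq = "CCCAAACCCTTTCCC";
my @hits;
while ($seq =~ /(C{3}.{3}C{3})/g) {
    # pos() is the offset just after the match; subtract its length for the start
    my $start = pos($seq) - length($1);
    push @hits, "$start:$1";
    # restart the search one character after the start, not after the end
    pos($seq) = $start + 1;
}
print join("\n", @hits), "\n";   # 0:CCCAAACCC and 6:CCCTTTCCC
```

Another common way, if your Perl supports lookahead, should be to wrap the pattern in a zero-width assertion, /(?=(C{3}.{3}C{3}))/g, so that the match itself consumes nothing.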

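But there is an easy solution for the last match, I think:
reverse the string and take the first match there.
$rseq = reverse $seq;        # scalar assignment, so the string is reversed
$rseq =~ /(C{3}.{3}C{3})/;   # this pattern reads the same in both directions
print scalar reverse $1;     # CCCTTTCCC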


Axel

----------------------------------------------------------------------
Axel Rose, Springer & Jacoby Digital GmbH & Co. KG, mailto:rose@sj.com
  pub PGP key 1024/A21CB825 E0E4 BC69 E001 96E9  2EFD 86CA 9CA1 AAC5
  "... for everything that comes into being deserves to perish ..."
