> >$seq = "CCCAAACCCTTTCCC";
> >And need to find occurrences of
> >
> >(C|G)\1{2,7}(.){5,10}(C|G)\1{3,8}
> >so I get
>CCCAAACCC as the first match.

You need to tell Perl what you want to match. As I read it, your code
says: find 2-7 C, then 5 to 10 of anything, then 3 to 8 C again. Using
this formula on $seq = "CCCAAACCCTTTCCC" gives me the whole string.
I don't know why you write \1 in the match string. Please explain!

If you want to find CCCAAACCC rather than CCCTTTCCC, you first have to
state what you want: 3 C, then 3 of anything, then 3 C again.
In Perl this is

    $seq =~ /(C{3}.{3}C{3})/;
    print $1;    # CCCAAACCC

To find the last occurrence you would normally loop through
the matches, like

    while ( /(C)/g ) { print $1 }

See the Perl Cookbook for examples.

In your case the match ends with "CCC", which is already part of the
next match. This makes things difficult, because a plain //g loop
resumes searching after the end of the previous match and so never sees
the overlapping one. You would need to work with positions within a
loop. But there is an easy solution, I think: reverse the string, so
that the last match becomes the first one.

    $rseq = reverse $seq;
    $rseq =~ /(C{3}.{3}C{3})/;
    print scalar reverse $1;    # CCCTTTCCC

(This pattern reads the same in both directions. A pattern that does
not would have to be reversed as well.)

Axel
----------------------------------------------------------------------
Axel Rose, Springer & Jacoby Digital GmbH & Co. KG, mailto:rose@sj.com
pub PGP key 1024/A21CB825 E0E4 BC69 E001 96E9 2EFD 86CA 9CA1 AAC5
"... for everything that comes into being deserves to perish ..."
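P.S. For the "positions within a loop" approach mentioned above, here is a minimal sketch, not a definitive solution. It assumes the simplified pattern C{3}.{3}C{3} from this reply rather than the original (C|G) expression, and the helper name overlapping_matches is made up for illustration. After each match it uses pos() to restart the search one character after the start of that match, so the trailing "CCC" can also begin the next match.

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, text] pairs for every match of
# C{3}.{3}C{3} in $seq, including matches that overlap each other.
sub overlapping_matches {
    my ($seq) = @_;
    my @found;
    while ( $seq =~ /(C{3}.{3}C{3})/g ) {
        # $-[1] is the offset at which capture group 1 started.
        push @found, [ $-[1], $1 ];
        # Resume one character after the START of this match instead of
        # after its end, so an overlapping match is not skipped.
        pos($seq) = $-[1] + 1;
    }
    return @found;
}

my $seq = "CCCAAACCCTTTCCC";
for my $m ( overlapping_matches($seq) ) {
    my ( $offset, $text ) = @$m;
    print "$offset $text\n";
}
```

For the sample string this should report CCCAAACCC at offset 0 and CCCTTTCCC at offset 6. The last element of the returned list is then the last occurrence.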