[Date Prev][Date Next][Thread Prev][Thread Next] [Search] [Date Index] [Thread Index]

Re: [FWP] Comparing strings the hard way



An entity claiming to be John Porter (jdporter@min.net) wrote:
: Jeff Pinyan wrote:
: > 
: > This is somewhat related to the greatest substring problem 
: > 
: >   # greatest_common_substring
: >   sub gcs {
: >     substr($_[0], 0, length(split /[^\0]/, $_[0]^$_[1], 2)[0]);
: >   }
: 
: Aka "longest common subsequence".  A wait.cpan.org search for "lcs"
: found Algorithm::Diff (originally by MJD), which contains an LCS
: function.  (Much to my dismay, none of the other CPAN search engines
: found it!)
: 

Actually, there is a difference between a string and a sequence.  A
sequence is a string that can be obtained by deleting zero or more symbols
from a string.  So, given the strings v and w:

v = "xyaxxbxyydycxdy"
w = "ahtnnbciibddd"

the longest common subsequence of v and w is "abcd", which we get by
deleting all but the marked symbols below:

v = "xyaxxbxyydycxdy"
       ^  ^     ^ ^
w = "ahtnnbciibddd"
     ^    ^^   ^

Mark

-- 
Mark Rogaski                  | "And this is what it said:
wendigo@pobox.com             |  ``You fool, Warren is DEAD!''"
http://www.pobox.com/~wendigo |      -- H. P. Lovecraft
__END__                       |         The Statement of Randolph Carter

PGP signature