[Date Prev][Date Next][Thread Prev][Thread Next] [Search] [Date Index] [Thread Index]

Re: [FWP] japhy had a silly idea...



On Mon, May 28, 2001 at 01:49:36PM -0400, Bernie Cosell wrote:
> On 28 May 2001, at 16:37, Abigail wrote:
> 
> > On Fri, May 25, 2001 at 10:57:46PM -0400, Mark Rogaski wrote:
> > > An entity claiming to be Abigail (abigail@foad.org) wrote:
> > > : 
> > > : BTW, an interesting exercise would be to prove that a certain regex
> > > : only matches a certain set of strings, and no other strings.
> > > 
> > > If you mean that the sets are finite, then the regex cannot include *, +,
> > > or any other unbounded repetition.
> > 
> > No, I don't mean finite. The absense of repetitors might make it 
> > easier to prove a regex matches a set, and nothing but the set,
> > but I don't think the set being finite makes it any easier.
> 
> The problem here is the vagueness of "a certain set".  First, let us 

No, there's nothing vague about that at all. The set you compare against
is part of the input: you get a regex, and you get a set.

> confine the discussion to "real" regular expressions [the matter is 
> difficult enough without Perl's extra stuff].  I am struggling to 
> remember some automata theory from a long time ago...  I believe the 
> answer to Abilgail's question is very simple, but I don't remember what 
> it is.  [sigh].  The choices:

As I said before, for CS regular expressions, the answer is known. But
Perl regular expressions are *NOT* CS regular expressions, and those
techniques you use there, do not apply here.

> 1) it is easy to verify.  There is a 'canonical form' for regular 
> grammars.

For CS regular expressions, yes. But Perl regular expressions? Even if
you can think of a canonical representation of a regex, it may not be
clear how to translate a given regex to such a cononical form.

>            I *think* it is unique up to isomorphism.  If this is true, 
> then comparing to REs consists of translating to the corresponding 
> grammars, putting the grammars into canonical form, and comparing the 
> grammars [if the canonical form for a particular regular language is 
> unique, then if two RLs are the same, they'll have the same unique-
> canonical form, so if the canonical forms differ, then the languages are 
> different].
> 
> 2) it is impossible to verify: It might be the case that the canonical 
> form is *not* unique.  If so, then comparing the two RLs will be 
> recusively undecidable [since about the only way to do it would be to 
> enumerate the resulting RLs and comparing]
> 
> 
> One reason why I characterize it the way I did is that there are troubles 
> with "a certain set" --- if the set is finite, then verify an RE against 
> it is trivial [you simply begin enumerating the sentences that the RE 
> accepts [which is no trouble]

Oh, really? Could you do that for all Perl regexes? It isn't clear to me
at all how to do that.

>                               and confirm whether every generated string 
> is one of the ones in the finite set].  If the set is _infinite_, the 
> before you can even ask the question about "does *this* RE correspond to 
> the set} you need to determine that it is a regular language *at*all* 
> [that is, does *ANY* RE correspond to this set].  The simplest way to be 
> sure that your set is regular is to describe it with an RE.. hence my 
> focusing on the related-question "can I tell if two REs recognize the 
> same language".
> 
> I can't *begin* to guess how you'd even start on the question with Perl's 
> 'enhanced' REs...


Exactly. And that's my question....



Abigail

==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe