Re: [FWP] DNA.pm



Mark Rogaski <wendigo@pobox.com> writes:

> An entity claiming to be Michael G Schwern (schwern@pobox.com) wrote:
> : 
> : > >     CCAA CCAA AAGT CAGT TCCT CGCT ATGT AACA CACA TCTT GGCT TTGT AACA GTGT
> : > 
> : > these would better be arranged in groups of three, as three
> : > bases make a codon, which codes for one amino acid.
> : 
> : True, but that means I could only encode 6 bits of information per
> : group.  God obviously never had to work with high ASCII.
> : 
> 
> Is it necessary to align on any particular boundaries?  If you treat the
> source as a bit string and break it into 6 bit units, codons work fine
> (unless there is some distribution requirement).
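
For anyone who wants to see that suggestion concretely, here is a rough
sketch of the bit-string approach.  It is in Python rather than Perl, and
the two-bit mapping (A=00, C=01, G=10, T=11) and the function names are
assumptions made up for illustration; nothing in this thread says which
mapping DNA.pm actually uses.

```python
# Hypothetical 2-bit mapping: A=00, C=01, G=10, T=11 (an assumption,
# not necessarily what DNA.pm does).
BASES = "ACGT"

def bytes_to_codons(data):
    """Pack bytes into 3-base codons, 6 bits per codon."""
    # One long bit string, 8 bits per input byte.
    bits = "".join(f"{b:08b}" for b in data)
    # Pad with zero bits up to a whole number of codons (multiple of 6).
    bits += "0" * (-len(bits) % 6)
    # Every pair of bits selects one base.
    bases = "".join(BASES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))
    return [bases[i:i + 3] for i in range(0, len(bases), 3)]

def codons_to_bytes(codons, length):
    """Inverse of bytes_to_codons; `length` is the original byte count."""
    bits = "".join(f"{BASES.index(c):02b}" for c in "".join(codons))
    # Ignore the padding bits beyond the original length.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))

codons = bytes_to_codons(b"Hi")
print(" ".join(codons))
print(codons_to_bytes(codons, 2))
```

Three bytes fill exactly four codons, so only inputs whose length is not
a multiple of three need padding, and high-bit bytes pass through like
any others.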

The 6-bit boundary (one codon: three bases at two bits each) is
generally considered the alignment requirement (bioinformaticians
routinely search for "open reading frames" -- long segments with no
stop codons).  But actually it's a lot worse than that.  In
eukaryotes like you, genes are composed of
exons and introns.  The introns drop out of the RNA, leaving only the
concatenation of exons to be translated to protein.  So the alignment
requirement applies only to "code", not to the "comments" that get
stripped during preprocessing.
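
A toy illustration of that point, again in Python and again only a
sketch: the intron coordinates and the intron bases below are invented,
and real splicing is recognised by sequence signals rather than handed
over as index pairs.  The exon sequence is the one from my signature.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def splice(gene, introns):
    """Drop introns, given as half-open (start, end) index pairs,
    and return the concatenated exons."""
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(gene[pos:start])
        pos = end
    exons.append(gene[pos:])
    return "".join(exons)

def first_stop(seq, frame=0):
    """Index of the first in-frame stop codon, or None if there is none."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None

# Two exons with a made-up 8-base intron between them.
gene = "GCAAGA" + "GTACCCAG" + "ATTGAACTGTAG"
mrna = splice(gene, [(6, 14)])

print(mrna)               # GCAAGAATTGAACTGTAG
print(first_stop(gene))   # None: the intron shifts the reading frame
print(first_stop(mrna))   # 15: the stop codon is back in frame
```

Read straight through, the intron pushes everything after it out of
frame and the stop codon disappears; only the spliced sequence has to
line up on codon boundaries.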

Only we don't really understand introns.  Oh, and there can be more
than one "parse" of a gene into exons and introns.

The relevance to Perl?  For a start, check out what Lincoln Stein
*really* does, when he's not coding CGI.pm...

-- 
Ariel Scolnicov        |"GCAAGAATTGAACTGTAG"            | ariels@compugen.co.il
Compugen Ltd.          |Tel: +972-2-5713025 (Jerusalem)	\ We recycle all our Hz
72 Pinhas Rosen St.    |Tel: +972-3-7658117 (Main office)`---------------------
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555    http://3w.compugen.co.il/~ariels
