Re: [MacPerl] HTML to Text Script ?



bklug@abo.rhein-zeitung.de
mac-perl@iis.ee.ethz.ch
Subj:	[MacPerl] HTML to Text Script ?

Boris Klug <bklug@abo.rhein-zeitung.de> writes 6-JAN-1997 

>Hi!
>
>OK, it's not a Mac-related question, but does somebody know a Perl script 
>which converts HTML to text? I set up a cron job on a Unix machine which 
>emails a page to a few accounts.
>
>So I need a script to convert HTML to text. This means:
>
>1) Remove html tags
>2) Convert entities
>3) Convert line breaks
>
>and maybe
>
>4) Convert some tags to text meanings (e.g. <B>bla</B> to _bla_)
>5) Convert tables
>
>  ... and more ...
>
>If nobody knows of such a script, I will write it on my own and post 
>it to the CPAN archive.

To achieve the first two tasks, look at Tom Christiansen's exceptional 
'striphtml' at his site (and maybe on CPAN?):

  http://www.perl.com/perl/scripts/html-hacking.html

  http://www.perl.com/perl/scripts/striphtml
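
In case it helps to see the shape of the problem, here is a rough sketch of 
tasks 1 to 3, plus the <B> example from task 4. It is not Tom's striphtml 
and is far less robust: the sub names html_to_text and decode_entity and the 
tiny entity table are made up for illustration, and the tag-stripping regex 
is naive (it breaks on a '>' inside an attribute value or a comment, for 
instance).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately incomplete entity table (an assumption for this sketch).
my %entity = (
    amp  => '&',
    lt   => '<',
    gt   => '>',
    quot => '"',
    nbsp => ' ',
);

# Decode one entity name (the part between '&' and ';').
# Unknown names are put back unchanged.
sub decode_entity {
    my ($name) = @_;
    return chr($1) if $name =~ /^#(\d+)$/;
    return exists $entity{$name} ? $entity{$name} : "&$name;";
}

sub html_to_text {
    my ($html) = @_;
    my $text = $html;

    # 3) Line breaks: collapse source whitespace, then turn <BR> and
    #    <P>...</P> tags into newlines.
    $text =~ s/\s+/ /g;
    $text =~ s/<br\s*\/?>/\n/gi;
    $text =~ s/<\/?p\b[^>]*>/\n\n/gi;

    # 4) One tag given a text meaning: <B>bla</B> becomes _bla_.
    $text =~ s/<b>(.*?)<\/b>/_$1_/gis;

    # 1) Remove all remaining tags (naive, see the caveats above).
    $text =~ s/<[^>]*>//g;

    # 2) Convert entities last, in a single pass, so that a decoded
    #    '<' is never mistaken for the start of a tag.
    $text =~ s/&(#\d+|\w+);/decode_entity($1)/ge;

    # Tidy up: trim spaces around newlines, squeeze runs of blank
    # lines, and strip leading and trailing whitespace.
    $text =~ s/ *\n */\n/g;
    $text =~ s/\n{3,}/\n\n/g;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

print html_to_text("<P>Hello <B>world</B> &amp; all<BR>bye &lt;now&gt;</P>"), "\n";
```

For the cron job, the page could be read from STDIN or a file and passed to 
html_to_text before mailing. For anything beyond simple pages, a real parser 
or striphtml itself is the safer choice.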

For text handling, perhaps the Text::* modules (Text::Wrap, for example) 
would be helpful.
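
As one example of what the Text::* modules offer, Text::Wrap can re-wrap the 
converted text to a fixed width before it is mailed. A small sketch follows; 
the sample text and the width of 40 are arbitrary choices for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

# Wrap width; Text::Wrap keeps each output line shorter than this.
$columns = 40;

my $text = "This is a long line of converted text that should be "
         . "folded into shorter lines before it goes out by email.";

# The first two arguments are the indent for the first line and the
# indent for all following lines (both empty here).
my $wrapped = wrap('', '', $text);

print $wrapped, "\n";
```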

Peter Prymmer