
Re: [MacPerl-AnyPerl] good idea or bad idea?



On Sun, 16 Jul 2000 21:16:21 -0500, Kevin van Haaren wrote:

>I'm looking at the value of a particular META 
>tag and they had reversed the name= and content= from what I 
>typically see (name first then content).

That shows that using regexes for processing HTML isn't such a great
idea.
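
To make that concrete, here is a small sketch. The pattern is a hypothetical one of the kind you describe (it assumes name= always comes before content=), and the sample tags and dates are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical pattern: assumes name= always precedes content=.
my $re = qr/<meta\s+name="originalpublicationdate"\s+content="([^"]*)"/i;

# Invented sample tags: the same information, attributes in either order.
my @samples = (
    '<meta name="OriginalPublicationDate" content="2000/07/16">',
    '<meta content="2000/07/16" name="OriginalPublicationDate">',
);

# Capture the content value where the pattern matches, undef otherwise.
my @results = map { /$re/ ? $1 : undef } @samples;

for my $i (0 .. $#samples) {
    print defined $results[$i] ? "matched: $results[$i]\n" : "no match\n";
}
```

The first tag matches and the second does not, although both are equally legal HTML. You can patch the pattern for that case, but then single quotes, unquoted values, extra attributes and line breaks inside the tag each need another patch.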

Use HTML::Parser (or HTML::TokeParser) instead. Poof, this problem
disappears: ANY legal HTML is accepted, whatever order the attributes
come in. It will even decode the HTML entities in attribute values for
you (although, er, maybe into ISO-Latin-1?). This is the COMPLETE code:

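	use strict;
	use HTML::TokeParser;
	my $file = shift;
	my $p = HTML::TokeParser->new($file)    # filename, or \*handle
	    or die "Can't open $file: $!";
	while (my $tag = $p->get_tag("meta")) {
	    # $tag->[1] is the hash of attributes, keyed by lowercased name
	    my $name = $tag->[1]{name};
	    if (defined $name and lc $name eq 'originalpublicationdate') {
	        print "Original publication date: $tag->[1]{content}\n";
	    }
	}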
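
If you want to convince yourself that attribute order no longer matters, here is a minimal sketch that parses a string instead of a file. The markup and dates in $html are invented for illustration, and the `//` operator assumes perl 5.10 or later:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Invented sample markup: the same META tag in both attribute orders.
my $html = <<'HTML';
<meta name="OriginalPublicationDate" content="2000/07/16">
<meta content="2000/07/17" name="originalpublicationdate">
HTML

# A reference to a scalar is treated as the literal document to parse.
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my @dates;
while (my $tag = $p->get_tag("meta")) {
    my $attr = $tag->[1];    # hash of attributes for this start tag
    next unless lc($attr->{name} // '') eq 'originalpublicationdate';
    push @dates, $attr->{content};
}

print "$_\n" for @dates;
```

Both tags are picked up, because the parser hands you the attributes as a hash and the order in the source never enters into it.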

-- 
	Bart.
