Tuesday, April 21, 2009

EXTRACTING DATA FROM AN HTML TABLE using Perl

Perl has a module that does this: HTML::TableExtract (http://kobesearch.cpan.org:/htdocs/HTML-TableExtract/HTML/TableExtract.html). The examples in its documentation are a good start. If you run them under "use strict", don't forget to declare each variable with "my" (for example, "my $te").

Also, to download the HTML data directly you could use:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;
use LWP::Simple;
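# get() returns undef on failure, so stop here if the download did not work.
my $html = get("http://ubuntuforums.org") or die "Could not fetch the page\n";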
my $table = HTML::TableExtract->new;
$table->parse($html);
# Table parsed, extract the data.
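
Once the table is parsed, the rows can be read back from the table objects. Here is a minimal sketch of that step. It assumes a small made-up HTML snippet instead of the downloaded page, so the names and numbers in it are only placeholders, and joining the cells with commas is just one way to show them:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical HTML standing in for the downloaded page.
my $html = <<'HTML';
<table>
  <tr><th>Name</th><th>Posts</th></tr>
  <tr><td>alice</td><td>12</td></tr>
  <tr><td>bob</td><td>7</td></tr>
</table>
HTML

my $te = HTML::TableExtract->new;
$te->parse($html);
$te->eof;    # tell the parser there is no more input

# Walk every table that was found, then every row in it.
my @rows;
foreach my $ts ($te->tables) {
    foreach my $row ($ts->rows) {
        # Empty cells come back undefined, so replace them with ''.
        my @cells = map { defined $_ ? $_ : '' } @$row;
        push @rows, join(',', @cells);
    }
}

print "$_\n" for @rows;
```

If you only want certain tables, the constructor also accepts a headers option, for example HTML::TableExtract->new(headers => ['Name', 'Posts']), which keeps only the tables whose header row matches.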
