The script I created uses the HTML::TreeBuilder module and
can be viewed below.
When I run this Perl script (finstat.pl), every cell in the array
prints as
"HTML::Element=HASH(0x2859d8)" instead of the cell's contents.
I would like the output to be arranged in the same order as it
appears on the web page (noted above).
I appreciate your help on this because it is driving me a bit
nuts...
Best Regards,
Roman
(perl version 5.8.3 / Mac OS 10.2.8)
******* below is my script finstat.pl *****************
#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use HTML::TreeBuilder;
my $url = 'http://moneycentral.msn.com/investor/research/sreport.asp?Symbol=xom&ISA=1&Type=Equity';
my $page = get($url) or die "Could not fetch $url\n";
my $p = HTML::TreeBuilder->new_from_content( $page );
my @links = $p->look_down(
_tag => 'td'
);
my @rows = map { $_->parent->parent } @links;
my @finstat;
for my $row (@rows) {
my %acct;
my @cells = $row->look_down( _tag => 'td' );
$acct{title} = $cells[0];
$acct{first} = $cells[1];
$acct{second} = $cells[2];
$acct{third} = $cells[3];
$acct{fourth} = $cells[4];
$acct{fifth} = $cells[5];
I am not familiar with the HTML::TreeBuilder package; when I need to do this kind of thing I write the parsing code myself. That may not be much help to you, and using the package may well be the proper way to go about it, but I can't help with the package itself.
Basically, if I wanted to do this, my approach might start like so:
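my $page = get($url);    # fetch the web page
die "Could not fetch $url\n" unless defined $page;
my $in_data = 0;
for my $line (split /\n/, $page)
{
    # skip everything before the heading that introduces the figures
    $in_data = 1 if $line =~ /Financial data in U\.S\. dollars/;
    next unless $in_data;
    for my $field (split ' ', $line)
    {
        # now start assigning data to a hash
    }
}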
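To make the idea a little more concrete, here is a minimal sketch that runs against a made-up sample instead of the live page. The sample text, the sub name parse_rows and the "title followed by numbers" layout are all assumptions for illustration; the real page is HTML and its lines will need more cleaning than this before they split so neatly.

```perl
#!/usr/bin/perl -w
use strict;

# Made-up sample standing in for the fetched page; the real page will differ.
my $page = join "\n",
    'Some header text',
    'Financial data in U.S. dollars',
    'Sales 100 200 300',
    'Income 10 20 30';

# Hypothetical helper: returns a hash of row title => array ref of values
# for the lines that follow the marker heading.
sub parse_rows {
    my ($text) = @_;
    my %rows;
    my $seen_marker = 0;
    for my $line (split /\n/, $text) {
        if (!$seen_marker) {
            $seen_marker = 1 if $line =~ /Financial data in U\.S\. dollars/;
            next;    # skip the marker line itself and everything before it
        }
        # split ' ' splits on runs of whitespace and drops leading blanks
        my ($title, @values) = split ' ', $line;
        next unless defined $title;    # ignore blank lines
        $rows{$title} = [@values];
    }
    return %rows;
}

my %rows = parse_rows($page);
for my $title (sort keys %rows) {
    print "$title: @{ $rows{$title} }\n";
}
```

Note that a plain hash does not remember insertion order, so if the rows must come out in page order you would push them onto an array instead.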
Sorry if this isn't specific enough. There is a limit to what I can do as a volunteer, but if you want to use an approach like this and write back with a more specific question, I'll have another look.
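One note on the output itself, which does not depend on knowing the package well: "HTML::Element=HASH(0x2859d8)" is simply what Perl prints when an object reference is used as a string, so the cells in your hash are holding element objects, not their text. HTML::Element provides an as_text method that returns the text inside an element. The sketch below is a guess at how that might be applied, not a tested fix for your page: the sample table and the sub name table_rows are made up, and it assumes a simple table with no tables nested inside its cells.

```perl
#!/usr/bin/perl -w
use strict;
use HTML::TreeBuilder;

# Made-up table standing in for the real page.
my $html = <<'END_HTML';
<table>
<tr><td>Sales</td><td>100</td><td>200</td></tr>
<tr><td>Income</td><td>10</td><td>20</td></tr>
</table>
END_HTML

# Hypothetical helper: returns one array ref of cell text per table row,
# in the order the rows appear in the document.
sub table_rows {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my @rows;
    for my $tr ($tree->look_down(_tag => 'tr')) {
        # as_text gives the text inside the element rather than the object
        my @cells = map { $_->as_text } $tr->look_down(_tag => 'td');
        push @rows, \@cells if @cells;    # skip rows with no td cells
    }
    $tree->delete;    # free the tree once the text has been copied out
    return @rows;
}

print join(' | ', @$_), "\n" for table_rows($html);
```

Walking the tr elements directly, rather than going from each td up to its parent's parent, also keeps each row from being visited once per cell.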