Perl/LWP

From Wsms


LWP is the "Library for WWW in Perl". You can use LWP to automate client-side operations for the web.


documentation

There are a few books on LWP, but the best place to get started is with the POD pages for the LWP cookbook (lwpcook) and the LWP tutorial (lwptut):

[ggeller@arthur ~]$ perldoc lwpcook
[ggeller@arthur ~]$ perldoc lwptut

One critical feature of LWP is that it handles cookies. This lets you fetch stuff from most password-protected websites.
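To make the cookie handling concrete, here is a minimal sketch of what a cookie jar does with a request. The host www.example.com and the cookie name and value are made up for illustration, and nothing is sent over the network. LWP::UserAgent does the equivalent of the add_cookie_header step for you on every request once a cookie jar is attached.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use HTTP::Cookies;
use HTTP::Request;

# An in-memory cookie jar (no file given, so nothing is written to disk).
my $jar = HTTP::Cookies->new;

# Store a made-up session cookie for a hypothetical host.
# Arguments: version, key, value, path, domain, port,
#            path_spec, secure, max-age in seconds, discard
$jar->set_cookie(0, 'session', 'abc123', '/', 'www.example.com',
                 undef, 0, 0, 3600, 0);

# Build a request for a page on that host; it is never sent.
my $request = HTTP::Request->new(GET => 'http://www.example.com/private/');

# Ask the jar to attach any cookies that match the request's host and path.
$jar->add_cookie_header($request);

my $cookie_header = $request->header('Cookie');
print "Cookie: ", (defined $cookie_header ? $cookie_header : '(none)'), "\n";
```

A request for a different host would come back without a Cookie header, since the jar only hands out cookies whose domain and path match.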

installation

On Fedora Core 5, LWP is provided by the perl-libwww-perl package.
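If you are not sure whether the package is present, a quick check like the following may help. This is only a sketch: it tries to load LWP and reports the version it finds.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Try to load LWP without dying if it is missing.
my $have_lwp = eval { require LWP; 1 };

if ($have_lwp) {
    print "LWP version ", LWP->VERSION, " is installed\n";
}
else {
    print "LWP does not seem to be installed: $@";
}
```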

howto steal Netscape cookies and use LWP to get data from password-protected sites

I wanted to use LWP to grab some stuff from the WebCT site for our class. The site is password-protected and uses a lot of JavaScript, so the usual LWP form handling routines probably don't work. Lynx can't navigate the login screen either.

So, I used Firefox to log into the site. I looked at the cookies and saw that the site had stored a cookie for the current session only. Firefox does not write cookies that expire at the end of the session to disk, but only keeps them in memory. I used the AnEC cookie editor to change the expiration to a date about a month hence, then I located the cookie in a file called cookies.txt under the .mozilla directory.

The URL for the cookie editor is: https://addons.mozilla.org/en-US/firefox/addon/573
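To check that a cookie really made it into a Netscape-format cookies.txt, you can load the file and list what LWP sees. The sketch below writes its own tiny cookie file to a temporary location so it can run anywhere. The domain, cookie name, and value are made up, and in practice you would point it at a copy of your real cookies.txt instead.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use HTTP::Cookies;
use HTTP::Cookies::Netscape;

# Hypothetical cookie that expires about a month from now.
my $expires = time() + 30 * 24 * 60 * 60;

# Write a one-cookie file in the Netscape format: a magic first line,
# then tab-separated fields (domain, flag, path, secure, expiry, name, value).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "# Netscape HTTP Cookie File\n";
print {$fh} join("\t", '.example.com', 'TRUE', '/', 'FALSE',
                 $expires, 'session', 'abc123'), "\n";
close $fh or die "Cannot close $filename: $!";

# Load the file (autosave is off by default, so it is not rewritten).
my $jar = HTTP::Cookies::Netscape->new(file => $filename);

# Walk every cookie in the jar and collect a one-line summary of each.
my @found;
$jar->scan(sub {
    my ($version, $key, $val, $path, $domain) = @_;
    push @found, "$domain$path $key=$val";
});

print "$_\n" for @found;
```

Entries whose expiry time has already passed are dropped when the file is loaded, so the expiry date has to be in the future for the cookie to show up here.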

I made a copy of cookies.txt, because perldoc lwptut warns that LWP might wipe out Netscape cookies. Next I made a little shell script to automate things a bit:

cp cookies.txt.hold cookies.txt
./lwp01.pl <<EOF
http://super-secret-password-protected-url.tld
EOF

Finally, here is the script that uses LWP and manages the cookie. The output goes to stdout. You could generalize this to pull down the whole site if you wanted to.

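#!/usr/bin/env perl
use strict;
use warnings;

$|++;    # unbuffered output

use LWP;
use HTTP::Cookies;    # also loads HTTP::Cookies::Netscape

my $browser = LWP::UserAgent->new;

# Use the browser's cookie file. With autosave on, the file is rewritten
# when the cookie jar is destroyed, so work on a copy of cookies.txt.
$browser->cookie_jar(HTTP::Cookies::Netscape->new('file' => 'cookies.txt', 'autosave' => 1));

# Headers that make the request look like it comes from a Netscape browser.
my @ns_headers = (
                  'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
                  'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*',
                  'Accept-Charset' => 'iso-8859-1,*,utf-8',
                  'Accept-Language' => 'en-US',
                 );

# Read the URL to fetch from stdin.
my $url = <>;
die "No URL given on stdin\n" unless defined $url;
chomp $url;
my $response = $browser->get($url, @ns_headers);

# Stop with the HTTP status line if the request failed.
die "Can't get $url: ", $response->status_line, "\n" unless $response->is_success;

print $response->decoded_content;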
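As a first step toward pulling down a whole site, you would need to find the links on each page you fetch. Here is a rough sketch using HTML::LinkExtor (part of the HTML::Parser distribution). The HTML and the base URL are made up so the example runs without a network connection. In the script above, the HTML would come from $response->decoded_content and the base URL from $response->base.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use HTML::LinkExtor;
use URI;

# Hypothetical page address and contents, standing in for a real response.
my $base = 'http://www.example.com/course/';
my $html = <<'HTML';
<html><body>
<a href="week1.html">Week 1</a>
<a href="/course/week2.html">Week 2</a>
<a href="http://elsewhere.example.org/">Off-site</a>
<img src="logo.png">
</body></html>
HTML

# With no callback, the extractor collects links for us; giving it a
# base URL makes it turn relative links into absolute ones.
my $extractor = HTML::LinkExtor->new(undef, $base);
$extractor->parse($html);
$extractor->eof;

my $site_host = URI->new($base)->host;

# Keep only <a href="..."> links that stay on the same host.
my @to_fetch;
for my $link ($extractor->links) {
    my ($tag, %attr) = @$link;
    next unless $tag eq 'a' && defined $attr{href};
    my $uri = URI->new("$attr{href}");
    next unless $uri->scheme && $uri->scheme =~ /^https?$/;
    next unless $uri->host eq $site_host;
    push @to_fetch, $uri->as_string;
}

print "$_\n" for @to_fetch;
```

Each URL in @to_fetch could then be passed to $browser->get in a loop, with a hash of already-visited URLs to avoid fetching the same page twice.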

You could probably use the same method in class if you steal a cookie by sniffing the network.

see also

Perl
perldoc lwpcook
perldoc lwptut
