# NAME Web::Query - Yet another scraping library like jQuery # SYNOPSIS use Web::Query; wq('http://google.com/search?q=foobar') ->find('h2') ->each(sub { my $i = shift; printf("%d) %s\n", $i+1, $_->text }); # DESCRIPTION Web::Query is a yet another scraping framework, have a jQuery like interface. Yes, I know Ingy's [pQuery](http://search.cpan.org/perldoc?pQuery). But it's just a alpha quality. It doesn't works. Web::Query built at top of the CPAN modules, [HTML::TreeBuilder::XPath](http://search.cpan.org/perldoc?HTML::TreeBuilder::XPath), [LWP::UserAgent](http://search.cpan.org/perldoc?LWP::UserAgent), and [HTML::Selector::XPath](http://search.cpan.org/perldoc?HTML::Selector::XPath). So, this module uses [HTML::Selector::XPath](http://search.cpan.org/perldoc?HTML::Selector::XPath) and only supports the CSS 3 selector supported by that module. Web::Query doesn't support jQuery's extended queries(yet?). __THIS LIBRARY IS UNDER DEVELOPMENT. ANY API MAY CHANGE WITHOUT NOTICE__. # FUNCTIONS - `wq($stuff)` This is a shortcut for `Web::Query->new($stuff)`. This function is exported by default. # METHODS - my $q = Web::Query->new($stuff, \\%options ) Create new instance of Web::Query. You can make the instance from URL(http, https, file scheme), HTML in string, URL in string, [URI](http://search.cpan.org/perldoc?URI) object, and instance of [HTML::Element](http://search.cpan.org/perldoc?HTML::Element). This method throw the exception on unknown $stuff. This method returns undefined value on non-successful response with URL. Currently, the only option valid option is _indent_, which will be used as the indentation string if the object is printed. - my $q = Web::Query->new\_from\_element($element: HTML::Element) Create new instance of Web::Query from instance of [HTML::Element](http://search.cpan.org/perldoc?HTML::Element). - `my $q = Web::Query->new_from_html($html: Str)` Create new instance of Web::Query from HTML. - my $q = Web::Query->new\_from\_url($url: Str) Create new instance of Web::Query from URL. If the response is not success(It means /^20\[0-9\]$/), this method returns undefined value. You can get a last result of response, use the `$Web::Query::RESPONSE`. Here is a best practical code: my $url = 'http://example.com/'; my $q = Web::Query->new_from_url($url) or die "Cannot get a resource from $url: " . Web::Query->last_response()->status_line; - my $q = Web::Query->new\_from\_file($file\_name: Str) Create new instance of Web::Query from file name. - my @html = $q->html(); - my $html = $q->html(); - $q->html('

foo

'); Get/Set the innerHTML. - $q->as\_html(); Return the elements associated with the object as strings. If called in a scalar context, only return the string representation of the first element. - my @text = $q->text(); - my $text = $q->text(); - $q->text('text'); Get/Set the inner text. - my $attr = $q->attr($name); - `$q->attr($name, $val);` Get/Set the attribute value in element. - $q = $q->find($selector) This method find nodes by $selector from $q. $selector is a CSS3 selector. - $q->each(sub { my ($i, $elem) = @\_; ... }) Visit each nodes. `$i` is a counter value, 0 origin. `$elem` is iteration item. `$_` is localized by `$elem`. - $q->map(sub { my ($i, $elem) = @\_; ... }) Creates a new array with the results of calling a provided function on every element. - $q->filter(sub { my ($i, $elem) = @\_; ... }) Reduce the elements to those that pass the function's test. - $q->end() Back to the before context like jQuery. - my $size = $q->size() : Int Return the number of DOM elements matched by the Web::Query object. - my $parent = $q->parent() : Web::Query Return the parent node from `$q`. - my $first = $q->first() Return the first matching element. This method constructs a new Web::Query object from the first matching element. - my $last = $q->last() Return the last matching element. This method constructs a new Web::Query object from the last matching element. - $q->remove() Delete the elements associated with the object from the DOM. # remove all tags from the document $q->find('blink')->remove; - $q->replace\_with( $replacement ); Replace the elements of the object with the provided replacement. The replacement can be a string, a `Web::Query` object or an anonymous function. The anonymous function is passed the index of the current node and the node itself (with is also localized as `$_`). my $q = wq( '

Abracadabra

' ); $q->find('b')->replace_with('Ocus); #

Ocuscadabra

$q->find('u')->replace_with($q->find('b')); #

cadaAbra

$q->find('i')->replace_with(sub{ my $name = $_->text; return "<$name>"; }); #

Abrabra

# HOW DO I CUSTOMIZE USER AGENT? You can specify your own instance of [LWP::UserAgent](http://search.cpan.org/perldoc?LWP::UserAgent). $Web::Query::UserAgent = LWP::UserAgent->new( agent => 'Mozilla/5.0' ); # INCOMPATIBLE CHANGES 0. 10 new\_from\_url() is no longer throws exception on bad response from HTTP server. # AUTHOR Tokuhiro Matsuno # SEE ALSO [pQuery](http://search.cpan.org/perldoc?pQuery) # LICENSE Copyright (C) Tokuhiro Matsuno This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.