# NAME

Web::Query - Yet another scraping library like jQuery

# SYNOPSIS

    use Web::Query;

    wq('http://www.w3.org/TR/html401/')
        ->find('div.head dt')
        ->each(sub {
            my $i = shift;
            printf("%d %s\n", $i+1, $_->text);
        });

# DESCRIPTION

Web::Query is yet another scraping framework, with a jQuery-like interface.

Yes, I know about Ingy's [pQuery](https://metacpan.org/pod/pQuery), but it is only alpha quality and it doesn't work.
Web::Query is built on top of the CPAN modules [HTML::TreeBuilder::XPath](https://metacpan.org/pod/HTML::TreeBuilder::XPath), [LWP::UserAgent](https://metacpan.org/pod/LWP::UserAgent), and [HTML::Selector::XPath](https://metacpan.org/pod/HTML::Selector::XPath).

Because this module uses [HTML::Selector::XPath](https://metacpan.org/pod/HTML::Selector::XPath), it supports only the CSS 3
selectors supported by that module.
Web::Query doesn't support jQuery's extended queries (yet?).

__THIS LIBRARY IS UNDER DEVELOPMENT. ANY API MAY CHANGE WITHOUT NOTICE__.

# FUNCTIONS

- `wq($stuff)`

    This is a shortcut for `Web::Query->new($stuff)`. This function is exported by default.

# METHODS

## CONSTRUCTORS

- `my $q = Web::Query->new($stuff, \%options)`

    Create a new instance of Web::Query. `$stuff` can be a URL string (http, https, or file scheme), a string of HTML, a [URI](https://metacpan.org/pod/URI) object, or an instance of [HTML::Element](https://metacpan.org/pod/HTML::Element).

    This method throws an exception if `$stuff` is of an unknown type.

    This method returns an undefined value if fetching a URL yields a non-successful response.

    Currently, the only valid option is _indent_, which will be used as
    the indentation string if the object is printed.

- `my $q = Web::Query->new_from_element($element: HTML::Element)`

    Create a new instance of Web::Query from an instance of [HTML::Element](https://metacpan.org/pod/HTML::Element).

- `my $q = Web::Query->new_from_html($html: Str)`

    Create a new instance of Web::Query from an HTML string.

- `my $q = Web::Query->new_from_url($url: Str)`

    Create a new instance of Web::Query from a URL.

    If the response is not successful (that is, its status code does not match `/^20[0-9]$/`), this method returns an undefined value.

    The most recent response is available in `$Web::Query::RESPONSE`.

    Here is the recommended pattern:

        my $url = 'http://example.com/';
        my $q = Web::Query->new_from_url($url)
            or die "Cannot get a resource from $url: " . Web::Query->last_response()->status_line;

- `my $q = Web::Query->new_from_file($file_name: Str)`

    Create a new instance of Web::Query from a file name.
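
As a rough sketch of the constructors that need no network access, the following builds one object from an HTML string and another from a hand-made [HTML::Element](https://metacpan.org/pod/HTML::Element). The markup and variable names are invented for illustration.

```perl
use strict;
use warnings;
use Web::Query;
use HTML::Element;

# Build from an HTML string (illustrative markup).
my $from_html = Web::Query->new_from_html('<div><p>foo</p></div>');

# Build from an HTML::Element created by hand.
my $element = HTML::Element->new('p');
$element->push_content('bar');
my $from_element = Web::Query->new_from_element($element);

# Both objects answer the same query methods.
my $paragraph_count = $from_html->find('p')->size;
my $element_text    = $from_element->text;   # scalar context: first element only

print "$paragraph_count $element_text\n";
```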

## TRAVERSING

### add

Add elements to the set of matched elements.

- add($html)

    An HTML fragment to add to the set of matched elements.

- add(@elements)

    One or more @elements to add to the set of matched elements.

- add($wq)

    An existing Web::Query object to add to the set of matched elements.

- add($selector, $context)

    $selector is a string representing a selector expression to find additional elements to add to the set of matched elements.

    $context is the point in the document at which the selector should begin matching.
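
A minimal sketch of the `add($wq)` form, assuming a small invented document: it starts from the `<p>` elements and adds the `<b>` elements found in the same document.

```perl
use strict;
use warnings;
use Web::Query;

# Keep the document object in a variable so the tree stays alive.
my $doc = wq('<div><p>foo</p><b>bar</b></div>');

# Start with the <p> elements, then add the <b> elements to the set.
my $set = $doc->find('p')->add($doc->find('b'));

my $count = $set->size;
print "$count\n";
```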

### contents

Get the immediate children of each element in the set of matched elements, including text and comment nodes.

### each

Visit each node. `$i` is a zero-based counter and `$elem` is the current item.
`$_` is localized to `$elem`.

    $q->each(sub { my ($i, $elem) = @_; ... })
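
To make the counter and the item concrete, here is a small sketch that collects the index and text of each list item. The markup and the `@seen` array are made up for this example.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<ul><li>a</li><li>b</li></ul>');

my @seen;
$doc->find('li')->each(sub {
    my ($i, $elem) = @_;
    # $elem wraps a single node, so text in scalar context is that node's text.
    push @seen, "$i:" . $elem->text;
});

print join(',', @seen), "\n";   # 0:a,1:b
```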

### end

Return to the previous set of matched elements, as in jQuery.
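
A short sketch of how this might be used, with illustrative markup: after narrowing the set with `find()`, `end` gives back the set that `find()` was called on.

```perl
use strict;
use warnings;
use Web::Query;

my $doc        = wq('<div><p>foo</p></div>');
my $paragraphs = $doc->find('p');

# Step back from the <p> set to the set it was derived from.
my $back = $paragraphs->end;

my $same_html = $back->as_html eq $doc->as_html ? 1 : 0;
print "$same_html\n";
```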

### filter

Reduce the elements to those that pass the function's test.

    $q->filter(sub { my ($i, $elem) = @_; ... })
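
For instance, the following sketch keeps only links whose `href` is absolute. The markup is invented, and the callback relies only on `attr`, which is documented under MANIPULATION.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div><a href="http://example.com/">ext</a><a href="/local">local</a></div>');

# Keep only the links whose href starts with http:// or https://.
my $external = $doc->find('a')->filter(sub {
    my ($i, $elem) = @_;
    return $elem->attr('href') =~ m{^https?://};
});

my $count = $external->size;
my $href  = $external->attr('href');
print "$count $href\n";
```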

### find

Get the descendants of each element in the current set of matched elements, filtered by a selector.

    my $q2 = $q->find($selector); # $selector is a CSS3 selector.
    

__NOTE__ If you want to match the element itself, use ["filter"](#filter).

__INCOMPATIBLE CHANGE__ 
From v0.14 to v0.19 (inclusive) find() also matched the element itself, which is not jQuery compatible.
You can achieve that result using `filter()`, `add()` and `find()`:

    my $wq = wq('<div class="foo"><p class="foo">bar</p></div>'); # needed because we don't have a global document like jQuery does
    print $wq->filter('.foo')->add($wq->find('.foo'))->as_html; # <div class="foo"><p class="foo">bar</p></div><p class="foo">bar</p>

### first

Return the first matching element.

This method constructs a new Web::Query object from the first matching element.

### last

Return the last matching element.

This method constructs a new Web::Query object from the last matching element.
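
A small sketch of `first` and `last` together, using made-up markup: each returns a new object holding a single element, so `text` in scalar context yields that element's text.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div><p>foo</p><p>bar</p></div>');
my $p   = $doc->find('p');

my $first_text = $p->first->text;
my $last_text  = $p->last->text;

print "$first_text $last_text\n";   # foo bar
```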

### map

Creates a new array with the results of calling a provided function on every element.

    $q->map(sub { my ($i, $elem) = @_; ... })
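
The sketch below collects the `href` of every link. The exact return shape is not spelled out above, so, as an assumption, the code accepts either an array reference or a flat list and normalizes it into `@hrefs`.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div><a href="/a">A</a><a href="/b">B</a></div>');

my @result = $doc->find('a')->map(sub {
    my ($i, $elem) = @_;
    return $elem->attr('href');
});

# Assumption: map may hand back one array reference; unwrap it if so.
my @hrefs = (@result == 1 && ref $result[0] eq 'ARRAY')
    ? @{ $result[0] }
    : @result;

print join(' ', @hrefs), "\n";
```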

### parent

Get the parent of each element in the current set of matched elements.
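
As an illustration with invented markup, this sketch moves from a `<p>` up to its enclosing `<div>` and reads the parent's `id`.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div id="box"><p>foo</p></div>');

# Move from the <p> to the element that contains it.
my $parent = $doc->find('p')->parent;
my $id     = $parent->attr('id');

print "$id\n";   # box
```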

### prev

Get the previous node of each element in the current set of matched elements.

    my $prev = $q->prev;

### next

Get the next node of each element in the current set of matched elements.

    my $next = $q->next;

## MANIPULATION

### add\_class

Adds the specified class(es) to each of the set of matched elements.

    # add class 'foo' to <p> elements
    wq('<div><p>foo</p><p>bar</p></div>')->find('p')->add_class('foo'); 

### after

Insert content, specified by the parameter, after each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                               ->after('<b>bar</b>')
                               ->end
                               ->as_html; # <div><p>foo</p><b>bar</b></div>
    

The content can be anything accepted by ["new"](#new).

### append

Insert content, specified by the parameter, to the end of each element in the set of matched elements.

    wq('<div></div>')->append('<p>foo</p>')->as_html; # <div><p>foo</p></div>
    

The content can be anything accepted by ["new"](#new).

### as\_html

Return the elements associated with the object as strings. 
If called in a scalar context, only return the string representation
of the first element.

### attr

Get or set an attribute value on the matched elements.

    my $attr = $q->attr($name);

    $q->attr($name, $val);
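
A brief sketch of both forms, assuming a small invented document. The getter is called in scalar context for the first element and in list context for all of them. The `rel` attribute is just an example.

```perl
use strict;
use warnings;
use Web::Query;

my $doc   = wq('<div><a href="/a">A</a><a href="/b">B</a></div>');
my $links = $doc->find('a');

# Getter in scalar context: the value from the first matched element.
my $first_href = $links->attr('href');

# Setter: applied to every matched element.
$links->attr('rel', 'nofollow');

# Getter in list context: one value per matched element.
my @rels = $links->attr('rel');

print "$first_href @rels\n";
```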

### before

Insert content, specified by the parameter, before each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                               ->before('<b>bar</b>')
                               ->end
                               ->as_html; # <div><b>bar</b><p>foo</p></div>
    

The content can be anything accepted by ["new"](#new).

### clone

Create a deep copy of the set of matched elements.
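
To illustrate what a deep copy means here, the following sketch (invented markup) changes the text inside the copy and then checks that the original is untouched.

```perl
use strict;
use warnings;
use Web::Query;

my $doc  = wq('<div><p>foo</p></div>');
my $copy = $doc->clone;

# Modify only the copy.
$copy->find('p')->text('changed');

my $original_text = $doc->find('p')->text;
my $copied_text   = $copy->find('p')->text;

print "$original_text $copied_text\n";
```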

### detach

Remove the set of matched elements from the DOM.

### has\_class

Determine whether any of the matched elements are assigned the given class.

### html

Get/Set the innerHTML.

    my @html = $q->html();

    my $html = $q->html(); # 1st matching element only

    $q->html('<p>foo</p>');

### insert\_before

Insert every element in the set of matched elements before the target.

### insert\_after

Insert every element in the set of matched elements after the target.

### prepend

Insert content, specified by the parameter, to the beginning of each element in the set of matched elements. 
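
A minimal sketch mirroring the `append` example, with invented markup: the new `<b>` should end up ahead of the existing `<p>` inside the `<div>`.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div><p>foo</p></div>');

# Insert the new content as the first child of the <div>.
$doc->prepend('<b>bar</b>');

my $html = $doc->as_html;   # scalar context: first element only
print "$html\n";
```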

### remove

Delete the elements associated with the object from the DOM.

    # remove all <blink> tags from the document
    $q->find('blink')->remove;

### remove\_class

Remove a single class, multiple classes, or all classes from each element in the set of matched elements.
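
The class helpers can be combined as in this sketch. The markup and the class name `note` are made up for illustration.

```perl
use strict;
use warnings;
use Web::Query;

my $doc = wq('<div><p>foo</p><p>bar</p></div>');
my $p   = $doc->find('p');

# Add a class, check for it, then remove it again.
$p->add_class('note');
my $has_after_add = $p->has_class('note') ? 1 : 0;

$p->remove_class('note');
my $has_after_remove = $p->has_class('note') ? 1 : 0;

print "$has_after_add $has_after_remove\n";   # 1 0
```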

### replace\_with

Replace the elements of the object with the provided replacement.
The replacement can be a string, a `Web::Query` object or an
anonymous function. The anonymous function is passed the index of the current
node and the node itself (which is also localized as `$_`).

    my $q = wq( '<p><b>Abra</b><i>cada</i><u>bra</u></p>' );

    $q->find('b')->replace_with('<a>Ocus</a>');
        # <p><a>Ocus</a><i>cada</i><u>bra</u></p>

    $q->find('u')->replace_with($q->find('b'));
        # <p><i>cada</i><b>Abra</b></p>

    $q->find('i')->replace_with(sub{ 
        my $name = $_->text;
        return "<$name></$name>";
    });
        # <p><b>Abra</b><cada></cada><u>bra</u></p>

### size

Return the number of elements in the Web::Query object.

    wq('<div><p>foo</p><p>bar</p></div>')->find('p')->size; # 2

### text

Get/Set the text.

    my @text = $q->text();

    my $text = $q->text(); # 1st matching element only

    $q->text('text');
    

If called in a scalar context, only return the string representation
of the first element.

# HOW DO I CUSTOMIZE THE USER AGENT?

You can specify your own instance of [LWP::UserAgent](https://metacpan.org/pod/LWP::UserAgent).

    $Web::Query::UserAgent = LWP::UserAgent->new( agent => 'Mozilla/5.0' );

# INCOMPATIBLE CHANGES

- 0.10

    new\_from\_url() no longer throws an exception on a bad response from the HTTP server.

# AUTHOR

Tokuhiro Matsuno <tokuhirom AAJKLFJEF@ GMAIL COM>

# SEE ALSO

[pQuery](https://metacpan.org/pod/pQuery)

# LICENSE

Copyright (C) Tokuhiro Matsuno

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.