Proposal for "HTTP_Browser"

» Metadata » Status
  • Category: HTTP
  • Proposer: Dylan Doxey 
  • License: PHP License 3.01
  • Status: Draft
» Description
This package extends HTTP_Client and provides an API that models a web browser.
For example, the following methods correspond to the basic browser buttons:
  1. go
  2. back
  3. goHome
  4. reload
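
The button-style methods imply that the browser keeps some navigation state. The following is a minimal sketch of that idea only, not the package's implementation: the class name BrowserHistorySketch and its current() method are hypothetical, and no HTTP requests are made.

```php
<?php
// Hypothetical sketch: tracks navigation state only, performs no HTTP requests.
class BrowserHistorySketch
{
    private $homePage;
    private $history  = array(); // visited URLs, oldest first
    private $position = -1;      // index of the current page in $history

    public function __construct($homePage)
    {
        $this->homePage = $homePage;
    }

    // Visit a URL, discarding any entries beyond the current position.
    public function go($url)
    {
        $this->history   = array_slice($this->history, 0, $this->position + 1);
        $this->history[] = $url;
        $this->position++;
        return $url;
    }

    // Step back one page; stays on the first page if there is nothing earlier.
    public function back()
    {
        if ($this->position > 0) {
            $this->position--;
        }
        return $this->current();
    }

    // Navigate to the configured home page.
    public function goHome()
    {
        return $this->go($this->homePage);
    }

    // Return the current URL again; a real browser would re-request it.
    public function reload()
    {
        return $this->current();
    }

    // URL of the current page, or null before the first navigation.
    public function current()
    {
        return $this->position >= 0 ? $this->history[$this->position] : null;
    }
}
```

In this sketch, calling go() after back() drops the pages that were ahead of the current one, which is how most browsers treat their history.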

The package also provides methods inspired by the Perl WWW::Mechanize module, giving an intuitive interface for writing programs that interact with websites:
  1. followLink
  2. selectForm
  3. submitForm
  4. getContent
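
To illustrate what following a link by its text could involve, here is a minimal sketch that uses PHP's DOM extension to look up the href of a link in an HTML string. The function findLinkHref is a hypothetical helper, not part of HTTP_Browser, and the proposal does not say how followLink locates links.

```php
<?php
// Hypothetical helper: returns the href of the first <a> element whose text
// contains $text, or null when no link matches.
function findLinkHref($html, $text)
{
    // Collect parser warnings internally; real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if (strpos($anchor->textContent, $text) !== false) {
            return $anchor->getAttribute('href');
        }
    }
    return null;
}
```

A browser object would then resolve the returned href against the current URL and request it.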

There are also methods for conveniently analyzing and manipulating the content of web pages:
  1. contentContains
  2. contentMatches
  3. contentTags
  4. saveContent
  5. updateContent
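
As a rough illustration of the content checks, the sketch below runs the same kind of tests against a plain string. The functions sketchContentContains and sketchContentMatches are hypothetical stand-ins, and the sample HTML is made up. The assumption that matches come back in preg_match_all() order is inferred from the $developer_match[1][0] access in the example program.

```php
<?php
// Hypothetical stand-in: true when $needle occurs anywhere in $content.
function sketchContentContains($content, $needle)
{
    return strpos($content, $needle) !== false;
}

// Hypothetical stand-in: returns preg_match_all()'s match array, so
// $matches[1][0] is the first value captured by the first group.
function sketchContentMatches($content, $regex)
{
    $matches = array();
    preg_match_all($regex, $content, $matches);
    return $matches;
}

// Made-up sample markup, used only for this illustration.
$content = '<li><a href="/user/jdoe">Jane Doe</a>&nbsp; (lead)</li>';

// The x modifier ignores whitespace in the pattern, so it can be spaced out.
$matches = sketchContentMatches($content, '/<a \s+ href="[^"]+">( [^<]+ )<\/a>/xms');

echo $matches[1][0], "\n";
```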

Example program using HTTP_Browser:

<?php

require_once 'HTTP/Browser.php';

// Package name to search for on pear.php.net
$q = 'HTTP_Request';

$config = array(
    'agent_alias' => 'Windows IE 6',
    'home_page'   => 'http://pear.php.net/',
);
$browser = new HTTP_Browser($config);

// Describe the search form and submit it
$form = array(
    'method' => 'GET',
    'action' => '/search.php',
    'fields' => array(
        'q'  => $q,
        'in' => 'packages',
    ),
);
$browser->submit($form);

if ($browser->success()) {

    // Follow the link identified by the package name
    $browser->followLink($q);

    // Extended (x) regex: capture the name of the developer marked "(lead)"
    $regex
        = '/'
        . '<li>'
        . '<a \s+ href=" [^"]+ ">'
        . '( [^<]+ )'
        . '<\/a>&nbsp; \s+ [\(] lead [\)]'
        . '<\/li>'
        . '/xms';

    $developer_match = $browser->contentMatches($regex);

    echo "$q -- Lead Developer: ";
    echo $developer_match[1][0];
    echo "\n";
} else {
    // On failure, print the response body for inspection
    echo $browser->getContent();
}
?>

Known shortcomings:
  1. back and reload do not resubmit a form
  2. clickButton is unimplemented
  3. ... surely more to come
» Dependencies » Links
  • HTTP_Client
» Timeline » Changelog
  • First Draft: 2009-09-26