<?xml version="1.0"?>
<?xml-stylesheet 
 href="http://www.w3.org/2000/08/w3c-synd/style.css" type="text/css"
?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel rdf:about="http://pear.php.net/bugs/9829/bug">
    <title>PEAR Bug #9829</title>
    <link>http://pear.php.net/bugs/9829</link>
    <description>[Open] Support for limiting the size of the response body</description>
    <dc:language>en-us</dc:language>
    <dc:creator>pear-webmaster@lists.php.net</dc:creator>
    <dc:publisher>pear-webmaster@lists.php.net</dc:publisher>
    <admin:generatorAgent rdf:resource="http://pear.php.net/bugs"/>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
    <items>
     <rdf:Seq>
      <rdf:li rdf:resource="http://pear.php.net/bugs/9829"/>
      <rdf:li rdf:resource="http://pear.php.net/bugs/9829/2007-01-15+12%3A54%3A49#2007-01-15+12%3A54%3A49"/>
      <rdf:li rdf:resource="http://pear.php.net/bugs/9829/2007-01-15+08%3A53%3A43#2007-01-15+08%3A53%3A43"/>
     </rdf:Seq>
    </items>
  </channel>
    <item rdf:about="http://pear.php.net/bugs/9829">
      <title>webmaster@... [2007-01-14 16:15:36]</title>
      <link>http://pear.php.net/bugs/9829</link>
      <description><![CDATA[<pre>HTTP_Request Feature/Change Request
Reported by webmaster@...
2007-01-14T21:15:36-00:00
PHP: 5.1.6 OS: Any Package Version: 1.4.0

Description:
------------
It would be nice if there was a mechanism in the class to limit the size of the request body. If a remote website is requested, not limiting the maximum size of the request body could either trigger the memory limit or - if that is disabled - cause the script to use more and more memory until it is exhausted (HTTP does not impose a general limit on the size of request bodies). Since it is not clear how much somebody using this class would want to download with it, the limit should be configurable.

Furthermore, it should be discussed how the class should behave if the limit is exceeded. I could imagine three different actions:

1) Immediately close the connection to the remote server and throw an error

2) Immediately close the connection to the server but save the first $limit bytes so that they may be used by the application.

3) Save the first $limit bytes, leave the connection open and continue reading (while discarding the data) until the resource was completely transferred (this would only apply if Keep-Alive connections were used; otherwise, action 2 would suffice)

One could either settle on one of these three actions or create a setting to let the programmer decide (which I would prefer, since it may depend on your situation which of these methods is the best; but even then one should look for a sane default).

Implementing this feature would mean:

a) Checking the Content-Length Header field if set, so it would be possible to close the connection immediately after the request header was transferred - this would only apply to case 1 above. Or - if one would choose case 2, this could cause the class to read exactly $limit bytes and then close the connection. Or - in case 3 - one could read $limit bytes, save them and then read the rest without saving them.

b) Keeping track of the total size of the request body sent until now when using chunked encoding and either closing the connection if the limit is exceeded (cases 1 and 2) or continuing to read but discarding data until the last chunk was transferred.

c) Passing the limit on to the _decodeGzip function so it may cancel if the length it determines in the header of the stream ($dataSize) is larger than the limit. Since according to the PHP manual gzinflate() will always return false if the data stream is larger than the specified limit, it will not be possible to just pass on $limit bytes of the uncompressed content if it is too large. I see no way around that limitation.

Also, it has to be discussed what shall happen if listeners are attached. In my eyes the best possibility would be to ignore them while implementing this and leave it to the programmer how to react to this; in case 3 the listeners would get all the data and would be responsible themselves for managing the maximum size, in case 2 the listeners would get only $limit bytes of the data and in case 1 the listeners will mostly get none of the data unless no content length header was specified - in which case this would be as in case 2. For cases 1 and 2 an additional event 'canceled' would be necessary.</pre>]]></description>
      <content:encoded><![CDATA[<pre>HTTP_Request Feature/Change Request
Reported by webmaster@...
2007-01-14T21:15:36-00:00
PHP: 5.1.6 OS: Any Package Version: 1.4.0

Description:
------------
It would be nice if there was a mechanism in the class to limit the size of the request body. If a remote website is requested, not limiting the maximum size of the request body could either trigger the memory limit or - if that is disabled - cause the script to use more and more memory until it is exhausted (HTTP does not impose a general limit on the size of request bodies). Since it is not clear how much somebody using this class would want to download with it, the limit should be configurable.

Furthermore, it should be discussed how the class should behave if the limit is exceeded. I could imagine three different actions:

1) Immediately close the connection to the remote server and throw an error

2) Immediately close the connection to the server but save the first $limit bytes so that they may be used by the application.

3) Save the first $limit bytes, leave the connection open and continue reading (while discarding the data) until the resource was completely transferred (this would only apply if Keep-Alive connections were used; otherwise, action 2 would suffice)

One could either settle on one of these three actions or create a setting to let the programmer decide (which I would prefer, since it may depend on your situation which of these methods is the best; but even then one should look for a sane default).
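
To make the three actions concrete, here is a minimal sketch of such a setting. It is written in Python purely for brevity and is not a proposal for the package's code; OverLimit, BodyTooLarge and read_limited are hypothetical names, and the list of chunks only stands in for the reads from the socket.

```python
from enum import Enum


class OverLimit(Enum):
    ABORT = 1      # action 1: close the connection and raise an error
    TRUNCATE = 2   # action 2: close the connection, keep the first `limit` bytes
    DRAIN = 3      # action 3: keep the first `limit` bytes, read and discard the rest


class BodyTooLarge(Exception):
    """Raised for OverLimit.ABORT when the body does not fit into the limit."""


def read_limited(chunks, limit, action):
    """Collect at most `limit` bytes from an iterable of byte chunks.

    Returns (body, chunks_consumed). Leaving the loop early models
    closing the connection before the transfer has finished.
    """
    body = bytearray()
    consumed = 0
    for chunk in chunks:
        consumed += 1
        room = limit - len(body)
        if len(chunk) > room:
            if action is OverLimit.ABORT:
                raise BodyTooLarge("response body exceeds %d bytes" % limit)
            body += chunk[:room]  # keep only what still fits
            if action is OverLimit.TRUNCATE:
                break
            # DRAIN: keep consuming the remaining chunks without saving them
        else:
            body += chunk
    return bytes(body), consumed


chunks = [b"abcd", b"efgh", b"ijkl"]
print(read_limited(chunks, 6, OverLimit.TRUNCATE))  # (b'abcdef', 2)
print(read_limited(chunks, 6, OverLimit.DRAIN))     # (b'abcdef', 3)
```

With TRUNCATE the third chunk is never read, while DRAIN reads all three but keeps the same six bytes; which of the two is preferable depends on whether the connection is to be reused.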

Implementing this feature would mean:

a) Checking the Content-Length Header field if set, so it would be possible to close the connection immediately after the request header was transferred - this would only apply to case 1 above. Or - if one would choose case 2, this could cause the class to read exactly $limit bytes and then close the connection. Or - in case 3 - one could read $limit bytes, save them and then read the rest without saving them.

b) Keeping track of the total size of the request body sent until now when using chunked encoding and either closing the connection if the limit is exceeded (cases 1 and 2) or continuing to read but discarding data until the last chunk was transferred.

c) Passing the limit on to the _decodeGzip function so it may cancel if the length it determines in the header of the stream ($dataSize) is larger than the limit. Since according to the PHP manual gzinflate() will always return false if the data stream is larger than the specified limit, it will not be possible to just pass on $limit bytes of the uncompressed content if it is too large. I see no way around that limitation.
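
Points a) and c) boil down to two cheap checks that can run before any body data is kept. The following is only a sketch, written in Python for brevity and with hypothetical names (exceeds_by_content_length, gzip_declared_size); it assumes the headers are available as a dictionary with lower-cased keys. Per RFC 1952 the uncompressed size (ISIZE) sits in the last four bytes of a gzip stream, is stored modulo 2**32 and is supplied by the sender, so it can serve as a hint for cancelling early but not as a guarantee.

```python
import gzip


def exceeds_by_content_length(headers, limit):
    """Point a): True if a Content-Length header is present and above the limit."""
    value = headers.get("content-length")
    return value is not None and int(value) > limit


def gzip_declared_size(data):
    """Point c): uncompressed size as declared in the gzip trailer.

    Reads the ISIZE field, i.e. the last 4 bytes of the stream,
    interpreted as a little-endian unsigned integer.
    """
    return int.from_bytes(data[-4:], "little")


limit = 1024
packed = gzip.compress(b"x" * 5000)
print(exceeds_by_content_length({"content-length": "2048"}, limit))  # True
print(gzip_declared_size(packed) > limit)                            # True
```

If no Content-Length header was sent (chunked encoding), the first check simply returns False and the running total from point b) has to do the work.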

Also, it has to be discussed what shall happen if listeners are attached. In my eyes the best possibility would be to ignore them while implementing this and leave it to the programmer how to react to this; in case 3 the listeners would get all the data and would be responsible themselves for managing the maximum size, in case 2 the listeners would get only $limit bytes of the data and in case 1 the listeners will mostly get none of the data unless no content length header was specified - in which case this would be as in case 2. For cases 1 and 2 an additional event 'canceled' would be necessary.</pre>]]></content:encoded>
      <dc:date>2007-01-14T21:15:36-00:00</dc:date>
    </item>
    <item rdf:about="http://pear.php.net/bugs/9829/2007-01-15+12%3A54%3A49#2007-01-15+12%3A54%3A49">
      <title>webmaster@... [2007-01-15 17:54]</title>
      <link>http://pear.php.net/bugs/9829#1168883689</link>
      <description><![CDATA[<pre>Err... yes, of course I mean response body, sorry for the glitch.

I know of the possibility to use listeners (also before having submitted this feature request) - but in my eyes there are three major drawbacks with this method:

1) Having to implement a listener just in order to do a simple request (just having - in addition to the timeout - a limitation of the response body size) is overkill - at least from my point of view.

2) If the remote server is about to send a _really_ large resource, this will consume a LOT of traffic AND a LOT of time - possibly even causing PHP's time limit to be hit (since on every tick a little bit of code WILL be executed, the resource just has to be large enough). If one had the possibility to close the connection before the resource was completely transferred, this problem would not occur.

3) If gzip compression is used, the listener will have to completely re-implement the decompression code your package provides, because that code is only used if $saveBody is true. The only workaround for that would be to disable gzip compression, which I don't like either.</pre>]]></description>
      <content:encoded><![CDATA[<pre>Err... yes, of course I mean response body, sorry for the glitch.

I know of the possibility to use listeners (also before having submitted this feature request) - but in my eyes there are three major drawbacks with this method:

1) Having to implement a listener just in order to do a simple request (just having - in addition to the timeout - a limitation of the response body size) is overkill - at least from my point of view.

2) If the remote server is about to send a _really_ large resource, this will consume a LOT of traffic AND a LOT of time - possibly even causing PHP's time limit to be hit (since on every tick a little bit of code WILL be executed, the resource just has to be large enough). If one had the possibility to close the connection before the resource was completely transferred, this problem would not occur.

3) If gzip compression is used, the listener will have to completely re-implement the decompression code your package provides, because that code is only used if $saveBody is true. The only workaround for that would be to disable gzip compression, which I don't like either.</pre>]]></content:encoded>
      <dc:date>2007-01-15T17:54:49-00:00</dc:date>
    </item>
    <item rdf:about="http://pear.php.net/bugs/9829/2007-01-15+08%3A53%3A43#2007-01-15+08%3A53%3A43">
      <title>avb [2007-01-15 13:53]</title>
      <link>http://pear.php.net/bugs/9829#1168869223</link>
      <description><![CDATA[<pre>There is a means to prevent saving the complete *response* body (I assume here that you actually meant *response* instead of *request*). The response chunks will be processed by a Listener attached to the Request object in this case.

Please consult the download-progress.php example which shows this technique.</pre>]]></description>
      <content:encoded><![CDATA[<pre>There is a means to prevent saving the complete *response* body (I assume here that you actually meant *response* instead of *request*). The response chunks will be processed by a Listener attached to the Request object in this case.

Please consult the download-progress.php example which shows this technique.</pre>]]></content:encoded>
      <dc:date>2007-01-15T13:53:43-00:00</dc:date>
    </item>
</rdf:RDF>