Chunked encoding problem
The data my PHP scraper receives uses chunked transfer encoding and is gzip-compressed.
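As far as I can tell, chunked encoding works like this: the data is sent in blocks, and each block has a header and a body. The header gives the length of the body in hexadecimal, the header and the body are each terminated by a CRLF, and the last block is a single line containing 0, which marks the end of the chunks.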
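The layout described above can be sketched as a decoder that works on a body already held in a string. This is only an illustration under stated assumptions: `decode_chunked` is a hypothetical helper name, the sample input is made up, and trailer headers after the final chunk are ignored.

```php
<?php
/**
 * Decode a chunked HTTP body that is already fully in memory.
 * Hypothetical helper for illustration; not part of the original class.
 */
function decode_chunked(string $raw): string
{
    $body = '';
    $pos  = 0;
    $len  = strlen($raw);

    while ($pos < $len) {
        // The size line ends at the next CRLF.
        $eol = strpos($raw, "\r\n", $pos);
        if ($eol === false) {
            break; // malformed or truncated input
        }
        // Drop any chunk extension after ';' before parsing the hex size.
        $sizeLine = substr($raw, $pos, $eol - $pos);
        $size = (int) hexdec(trim(explode(';', $sizeLine)[0]));
        if ($size === 0) {
            break; // a zero-size chunk marks the end of the body
        }
        $body .= substr($raw, $eol + 2, $size);
        // Skip the chunk data plus the CRLF that follows it.
        $pos = $eol + 2 + $size + 2;
    }
    return $body;
}

// Made-up sample: "Wiki" (4 bytes), "pedia" (5 bytes), then the final 0 chunk.
$raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n";
echo decode_chunked($raw), "\n"; // prints "Wikipedia"
```

Note that the chunk data is followed by its own CRLF, which is why the position advances by two extra bytes after each chunk.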
Response headers:

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Server: Dict/34002
    [2] => Date: Wed, 17 Dec 2014 06:49:22 GMT
    [3] => Content-Type: text/html; charset=utf-8
    [4] => Transfer-Encoding: chunked
    [5] => Connection: keep-alive
    [6] => Keep-Alive: timeout=60
    [7] => Cache-Control: private
    [8] => Last-Modified: Wed, 17 Dec 2014 04:57:49 GMT
    [9] => Expires: Wed, 17 Dec 2014 06:49:22 GMT
    [10] => Set-Cookie: uvid=VJEncoTSVYJC; expires=Thu, 31-Dec-37 23:55:55 GMT; domain=.dict.cn; path=/
    [11] => Content-Encoding: gzip
)
My code:

    if($this->response_num==200)
    {
        if($this->is_chunked)
        {
            // Read the chunk header to get the length of the chunk body
            $chunk_size = (int)hexdec(fgets($this->conn));
            while(!feof($this->conn) && $chunk_size > 0)
            {
                // Read the number of bytes given in the chunk header
                $this->response_body .= fread( $this->conn, $chunk_size );
                fseek($this->conn, 2, SEEK_CUR);
                $chunk_size = (int)hexdec(fgets( $this->conn,4096));
            }
        }
        else
        {
            $len=0;
            // Read the body of the response
            while($items = fread($this->conn, $this->response_body_length))
            {
                $len = $len+strlen($items);
                $this->response_body = $items;

                // Break out once the whole body has been read; otherwise the read seems to block!
                if($len >= $this->response_body_length)
                {
                    break;
                }
            }
        }

        if($this->is_gzip)
        {
            $this->response_body = gzinflate(substr($this->response_body,10));
        }

        $this->getTrans($this->response_body);
    }

This warning appears almost every time:
warning: gzinflate(): data error in e:\codeedit\php\http\dict.php on line 384
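Occasionally the response parses correctly, so I suspect the chunked decoding is at fault. I have read up on this and tried several decoding approaches, but none of them has worked.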
------Solution----------------------
You can decode it with gzdecode.
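To expand on that with a rough sketch: two things in the posted code look like probable causes, though this is a reading of the code and not a confirmed diagnosis. First, `fread()` on a network stream may return fewer bytes than requested, so a chunk can be cut short and the next `fgets()` then lands in the middle of binary data. Second, `gzinflate(substr($body, 10))` assumes a fixed 10-byte gzip header with no optional fields, whereas `gzdecode()` parses the gzip header and trailer itself. The helper names `read_exact` and `read_chunked_body` below are hypothetical, and a `php://memory` stream stands in for the socket.

```php
<?php
/** Read exactly $length bytes, looping because fread() may return fewer. */
function read_exact($conn, int $length): string
{
    $data = '';
    while (strlen($data) < $length && !feof($conn)) {
        $part = fread($conn, $length - strlen($data));
        if ($part === false || $part === '') {
            break; // read error, timeout, or end of stream
        }
        $data .= $part;
    }
    return $data;
}

/** Read a chunked body from a stream positioned at the first size line. */
function read_chunked_body($conn): string
{
    $body = '';
    while (($line = fgets($conn)) !== false) {
        // Ignore any chunk extension after ';' and parse the hex size.
        $size = (int) hexdec(trim(explode(';', $line)[0]));
        if ($size === 0) {
            break; // final chunk
        }
        $body .= read_exact($conn, $size);
        read_exact($conn, 2); // consume the CRLF after the chunk data by reading, not seeking
    }
    return $body;
}

// Demo: gzip a sample string, split it into 7-byte chunks, and read it back.
$gz   = gzencode('hello chunked world');
$conn = fopen('php://memory', 'w+');
foreach (str_split($gz, 7) as $piece) {
    fwrite($conn, dechex(strlen($piece)) . "\r\n" . $piece . "\r\n");
}
fwrite($conn, "0\r\n\r\n");
rewind($conn);

echo gzdecode(read_chunked_body($conn)), "\n"; // prints "hello chunked world"
```

In the original class the same idea would mean replacing the single `fread()` and the `fseek()` call with reads that loop until the full chunk and its trailing CRLF have arrived, then passing the assembled body to `gzdecode()` unmodified.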










