As the title says. Here is the code:
<?php
include "Snoopy.class.php";
$snoopy = new Snoopy;
$sss="http://gggudf.b2b.hc360.com/shop/company.html";
$snoopy->fetch($sss);
$comp_file_content = $snoopy->results;
echo $comp_file_content;
?>
First, let me be clear: http://gggudf.b2b.hc360.com/shop/company.html
works fine when entered in the browser's address bar.
I've already tried file_get_contents(): it works for fetching a single page, but as soon as I loop over multiple URLs it always times out; I suspect it has something to do with that site. So I thought of Snoopy (word has it Snoopy is a great scraping tool that can simulate browser behavior).
But now I've run into this snag.
……
So I'm counting on you all.
Below are the elements of the $comp_name array:
nichuanyin
a188a188
cgm6551987
dongdonghuazhen
fjzm814178
nz0808

$http_adr = 'http://' . $comp_name[$i] . '.b2b.hc360.com/shop/company.html';
Could someone post code that loops over the array, fetches each page's content, and writes it all into the file d:\comp_files.txt? All the points go to whoever does, whether with Snoopy, cURL, or file_get_contents. Note: spam replies will be ignored.
<?php
// cookie jar file; can be ignored
$cookie_jar = '/tmp/cookie.tmp';

function request($url, $cookie_jar, $referer) {
$ch = curl_init();
$options = array(CURLOPT_URL => $url,
CURLOPT_HEADER => 0,
CURLOPT_NOBODY => 0,
CURLOPT_PORT => 80,
CURLOPT_POST => 0, // plain GET; the original POST => 1 sends an empty POST, which the page does not expect
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_USERAGENT => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)',
CURLOPT_COOKIEJAR => $cookie_jar,
CURLOPT_COOKIEFILE => $cookie_jar,
CURLOPT_REFERER => $referer
);
curl_setopt_array($ch, $options);
$code = curl_exec($ch);
curl_close($ch);
return $code;
}
$comp_name = array('nichuanyin','a188a188','cgm6551987','dongdonghuazhen','fjzm814178','nz0808');
$max = count($comp_name);
$fp = fopen("./comp_files.txt", "a");

for ($i = 0; $i < $max; $i++) {
$http_adr = 'http://'.$comp_name[$i].'.b2b.hc360.com/shop/company.html';
$response = request($http_adr, $cookie_jar, '');

// output to the browser
echo $comp_name[$i], ":\n";
echo "====================================================================================================================================================================\n";
echo $response;
echo "\n\n";

// write to file
fwrite($fp, $comp_name[$i] . ":\r\n");
fwrite($fp, "=============================================================================================================================\r\n");
fwrite($fp, $response . "\r\n");
}
echo "\n\nDone!";
fclose($fp);
?>
Fatal error: Maximum execution time of 30 seconds exceeded in C:\wamp\www\test_test.php on line 29
That is, this line: $code = curl_exec($ch);
What's going on?? Could it be related to my network speed??
To the poster above: did you get this to run on your own machine??
Isn't the poster in #8 using cURL too??
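The 30-second fatal error comes from PHP's max_execution_time limit, not from cURL itself. A minimal sketch of one way to deal with it, assuming the goal is to lift the script-wide limit and cap each request individually instead; fetch_with_timeout is a made-up helper name, while the option constants are standard cURL:

```php
<?php
// lift the script-wide limit so one slow host can't kill the whole run
set_time_limit(0);

// hypothetical helper: fetch one URL, but give up after a per-request timeout
function fetch_with_timeout($url, $timeout = 15) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_CONNECTTIMEOUT => 5,        // max seconds to establish the connection
        CURLOPT_TIMEOUT        => $timeout, // max seconds for the whole transfer
    ));
    $html = curl_exec($ch);
    curl_close($ch);
    return $html; // false on failure, including a timeout
}
```

With CURLOPT_TIMEOUT set, a stalled host makes curl_exec return false after the cap instead of hanging the loop, so the remaining URLs still get fetched.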
file_get_contents() will do the job.
// prevent the PHP script from timing out
set_time_limit(0);

$comp_name = array('nichuanyin','a188a188','cgm6551987','dongdonghuazhen','fjzm814178','nz0808');
foreach ($comp_name as $key => $value)
{
$http_adr = 'http://'.$value.'.b2b.hc360.com/shop/company.html';
$response = file_get_contents($http_adr);
// output to the browser
echo $value . ":<br/>";
echo "====================================================================================================================================================================\n";
echo $response;
echo "<br/><br/>";
// write to file
file_put_contents("./test.txt", $response ."\n\n\n", FILE_APPEND);
}
With the limit set to 120 it only fetched 4 of them before timing out again.
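If file_get_contents() is preferred, the timeout can also be capped per request through a stream context, so one slow page fails on its own instead of eating the whole script's execution time. A sketch; the 10-second value and the user-agent string are arbitrary choices:

```php
<?php
// stream context with a per-request timeout: a slow host gives up after 10s
$ctx = stream_context_create(array(
    'http' => array(
        'timeout'    => 10, // seconds per request
        'user_agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    ),
));

// usage (URL pattern from the thread): @ turns the warning on failure into
// a plain false return instead of noise in the output
// $html = @file_get_contents('http://nichuanyin.b2b.hc360.com/shop/company.html', false, $ctx);
```

Checking the return value for false before writing to the file keeps a timed-out page from appending garbage.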
Closing the thread. shadowsinper, come collect your points.
<?php
$url='http://www.163.com';
ob_start(); // start output buffering
$ch = curl_init(); // initialize a cURL session
curl_setopt( $ch, CURLOPT_URL, $url ); // set the target URL
curl_exec( $ch ); // send the request; the output goes into the buffer
//$retrievedhtml = ob_get_contents(); // grab the buffer contents
//ob_end_clean(); // discard the buffer contents and close the buffer
curl_close( $ch ); // end the session
?>
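The output-buffering dance above is only needed because curl_exec() echoes the response by default. A sketch of the same fetch with CURLOPT_RETURNTRANSFER, which makes curl_exec() return the HTML as a string directly (the 15-second timeout is an added assumption, not part of the original snippet):

```php
<?php
$url = 'http://www.163.com';

$ch = curl_init($url);
// return the body as a string instead of echoing it,
// so no ob_start()/ob_get_contents() is needed
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15); // don't hang forever on a slow host
$retrievedhtml = curl_exec($ch);       // string on success, false on failure
curl_close($ch);
```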