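I'm a beginner trying to scrape the free chapters of a novel from a fiction site, but the page I get back doesn't contain the novel text, and I can't tell which option I've set incorrectly.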
<?php
$curlobj=curl_init();
//$fp = fopen("a1.html", "w");
//$header = array("Host:chuangshi.qq.com",""Referer);
// URL of the chapter page to fetch
curl_setopt($curlobj,CURLOPT_URL,"http://chuangshi.qq.com/bk/xh/AGkEM11hVjcAPVRnATgBZg-r-4.html");
// return the response as a string instead of printing it directly
curl_setopt($curlobj,CURLOPT_RETURNTRANSFER,1);
//curl_setopt ($curlobj, CURLOPT_REFERER, "http://chuangshi.qq.com/bk/xh/AGkEM11hVjcAPVRnATgBZg-r-4.html");
//curl_setopt($curlobj,CURLOPT_HTTPHEADER,$header);
//curl_setopt($curlobj, CURLOPT_FILE,$fp);
$output=curl_exec($curlobj);
curl_close($curlobj);
//fclose($fp);
echo $output;
?>
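That's roughly it. The URL above points to one specific chapter of a novel. Along the way I tried adding a lot of options, such as Referer, custom headers, and Host, because I noticed that my Host and Referer were local ones, different from what is sent when I open the page directly in a browser. Also, once the page is fetched, I'd like to save its contents to a file.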
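For the "save it to a file" part, here is a minimal sketch, not a definitive implementation. It assumes the page body is fetched into a string first and then written out with `file_put_contents`, rather than going through `CURLOPT_FILE`. The helper names `fetch_page` and `save_page` are hypothetical, as are the timeout and redirect settings.

```php
<?php
// Hypothetical helper: fetch a URL and return the response body as a string.
function fetch_page(string $url, ?string $referer = null): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects (assumed to be wanted)
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // arbitrary timeout in seconds
    if ($referer !== null) {
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException("curl failed: " . curl_error($ch));
    }
    return $body; // the handle is released when $ch goes out of scope
}

// Hypothetical helper: write the fetched page to disk and return the byte count.
function save_page(string $html, string $path): int
{
    $bytes = file_put_contents($path, $html);
    if ($bytes === false) {
        throw new RuntimeException("could not write " . $path);
    }
    return $bytes;
}
```

With these, `save_page(fetch_page($url), "a1.html")` would write the fetched page to `a1.html`, the same file name as in the commented-out `fopen` line. Note that this only saves whatever the server returns for that URL; it does not make missing content appear.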
<?php
$curlobj=curl_init();
//$fp = fopen("a1.html", "w");
//$header = array("Host:chuangshi.qq.com",""Referer);
curl_setopt($curlobj,CURLOPT_URL,"http://chuangshi.qq.com/bk/xh/AGkEPV1lVjUAP1RkATgBZw-r-3.html?data-user-action=rd024");
curl_setopt($curlobj,CURLOPT_RETURNTRANSFER,1);
curl_setopt ($curlobj, CURLOPT_REFERER, "http://chuangshi.qq.com/bk/xh/AGkEPV1lVjUAP1RkATgBZw-r-2.html?data-user-action=rd024");
//curl_setopt($curlobj,CURLOPT_HTTPHEADER,$header);
//curl_setopt($curlobj, CURLOPT_FILE,$fp);
$output=curl_exec($curlobj);
curl_close($curlobj);
//fclose($fp);
echo $output;?>
But the chapter text itself is actually served from a different address: http://chuangshi.qq.com/index.php/Bookreader/14831299/2
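If the chapter text really does come from that second address, one thing to try is requesting it directly while sending the chapter page as the Referer. The following is only a sketch under assumptions: it assumes the address accepts a plain GET, and whether the server checks the Referer or an `X-Requested-With` header at all is a guess, not something confirmed here. The helper names `reader_options` and `fetch_reader` are hypothetical. The response format is not known either, so the sketch just returns the raw body.

```php
<?php
// Hypothetical helper: build the cURL options for requesting the content address.
function reader_options(string $contentUrl, string $chapterUrl): array
{
    return [
        CURLOPT_URL            => $contentUrl,
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_REFERER        => $chapterUrl, // pretend the request came from the chapter page
        // Assumption: the page loads its text via an AJAX-style request.
        CURLOPT_HTTPHEADER     => ['X-Requested-With: XMLHttpRequest'],
        CURLOPT_TIMEOUT        => 15,          // arbitrary timeout in seconds
    ];
}

// Hypothetical helper: perform the request and return the raw response body.
function fetch_reader(string $contentUrl, string $chapterUrl): string
{
    $ch = curl_init();
    curl_setopt_array($ch, reader_options($contentUrl, $chapterUrl));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    return $body;
}
```

Here `$contentUrl` would be the Bookreader address above and `$chapterUrl` the chapter page that goes with it. If the body comes back as JSON rather than HTML, it would still need decoding before the text can be used.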