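This page URL opens fine in IE.
I am trying to fetch its page source remotely with the code below, but it fails and I can't tell why. Any help from the experts would be appreciated.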
<?php
$url = "http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0";
echo file_get_contents($url);
?>
Here is my test code and its results; I hope it is of some use to the OP, even though it doesn't actually solve anything.
cURL version:
<?php
$url = "http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$a = curl_exec($ch); // the response body is stored in $a
$info = curl_getinfo($ch); // transfer info, including the HTTP status code
curl_close($ch);
var_dump($a); // shows string(603) "<html><body style="background-color:transparent"></body></html> "
echo '<pre>';
print_r($info); // shows [http_code] => 400
echo '</pre>';
?>
file_get_contents version:
<?php
$url = "http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0";
echo file_get_contents($url); // fails with a warning
echo '<pre>';
print_r($http_response_header); // response headers: 400 Bad Request
echo '</pre>';
?>
Output of the cURL version:
string(603) "<html><body style="background-color:transparent"></body></html> "
<pre>Array
(
[url] => http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0
[content_type] => text/html; charset=ISO-8859-1
[http_code] => 400
[header_size] => 471
[request_size] => 420
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.157
[namelookup_time] => 0
[connect_time] => 0
[pretransfer_time] => 0
[size_upload] => 0
[size_download] => 603
[speed_download] => 3840
[speed_upload] => 0
[download_content_length] => -1
[upload_content_length] => -1
[starttransfer_time] => 0.157
[redirect_time] => 0
)</pre>
Output of the file_get_contents version:
<br />
<b>Warning</b>: file_get_contents(http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0) [<a href='function.file-get-contents'>function.file-get-contents</a>]: failed to open stream: HTTP request failed! HTTP/1.0 400 Bad Request
in <b>D:\wamp\www\test\test.php</b> on line <b>5</b><br /><pre>Array
(
[0] => HTTP/1.0 400 Bad Request
[1] => P3P: policyref="http://googleads.g.doubleclick.net/pagead/gcn_p3p_.xml", CP="CURa ADMa DEVa TAIo PSAo PSDo OUR IND UNI PUR INT DEM STA PRE COM NAV OTC NOI DSP COR"
[2] => Content-Type: text/html; charset=ISO-8859-1
[3] => Set-Cookie: test_cookie=CheckForPermission; expires=Tue, 07-Jul-2009 15:15:32 GMT; path=/; domain=.doubleclick.net
[4] => Date: Tue, 07 Jul 2009 15:00:32 GMT
[5] => Server: cafe
[6] => Cache-Control: private, x-gzip-ok=""
)
</pre>
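The status line in the header array above can also be checked in code instead of read by eye. The following is only a minimal sketch: `status_code_from_headers` is a hypothetical helper name that does not appear in the thread, and the sample array is a shortened copy of the output above.

```php
<?php
// Hypothetical helper (not from the thread): return the numeric status code
// found in a header array shaped like $http_response_header.
function status_code_from_headers(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // A status line looks like "HTTP/1.0 400 Bad Request".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            // Keep the last status line, since redirects add earlier ones.
            $code = (int) $m[1];
        }
    }
    return $code; // null when no status line was found
}

// Shortened sample taken from the output above.
$sample = array(
    'HTTP/1.0 400 Bad Request',
    'Content-Type: text/html; charset=ISO-8859-1',
    'Server: cafe',
);
echo status_code_from_headers($sample), "\n"; // prints 400
```

In a real script the argument would be `$http_response_header` right after the `file_get_contents` call, which lets the code branch on 400 versus 200 rather than relying on the warning text.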
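<?php
$url = "http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
"User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11"
));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$a = curl_exec($ch); // the response body is stored in $a
$info = curl_getinfo($ch); // transfer info, including the HTTP status code
curl_close($ch);
var_dump($a); // dump the response body
echo '<pre>';
print_r($info); // dump the transfer info
echo '</pre>';
?>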
I can fetch it with no trouble at all; the request just has to pose as a browser.
<pre>Array
(
[url] => http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0
[content_type] => text/html; charset=UTF-8
[http_code] => 200
[header_size] => 497
[request_size] => 526
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 4.269
[namelookup_time] => 1.005
[connect_time] => 4.046
[pretransfer_time] => 4.046
[size_upload] => 0
[size_download] => 2112
[speed_download] => 494
[speed_upload] => 0
[download_content_length] => 0
[upload_content_length] => 0
[starttransfer_time] => 4.268
[redirect_time] => 0
)
</pre>
<?php
$url = "http://googleads.g.doubleclick.net/pagead/sdo?client=dist-aff-pub-8581564299250417&output=html&dt=1246957659171&format=js_sdo&h=35&w=500&correlator=1246957659171&same_win=2&logo=left&rl_pos=right&cts_mode=rs&num_cts=2&box_h=26&box_w=215&u_h=768&u_w=1024&u_ah=738&u_aw=1024&u_cd=32&u_tz=480&u_his=0&u_java=true&u_nplug=0&u_nmime=0&frm=0&lmt=1246957658&url=http://127.0.0.1/search.html&dtd=0";
$ch = curl_init();
// make the target server believe the request comes from a browser
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
"User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11"
));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$a = curl_exec($ch); // the response body is stored in $a
curl_close($ch);
echo $a; // print the result
?>
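Since the original question used `file_get_contents`, the same browser User-Agent idea can probably be applied there through an HTTP stream context. This is a sketch under assumptions, not something tested in the thread: `build_browser_context` and `fetch_as_browser` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: build an HTTP stream context that sends a
// browser-like User-Agent, mirroring the cURL fix above.
function build_browser_context(string $userAgent)
{
    return stream_context_create(array(
        'http' => array(
            'method'        => 'GET',
            'user_agent'    => $userAgent, // sent as the User-Agent header
            'ignore_errors' => true,       // return the body even on 4xx/5xx
            'timeout'       => 10,         // seconds; arbitrary choice
        ),
    ));
}

// Hypothetical helper: fetch a URL with that context.
// Returns the response body, or false if the request could not be made.
function fetch_as_browser(string $url, string $userAgent)
{
    return file_get_contents($url, false, build_browser_context($userAgent));
}

// Same User-Agent string as in the cURL script above.
$ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11';

// Inspect the options without touching the network.
$options = stream_context_get_options(build_browser_context($ua));
echo $options['http']['user_agent'], "\n";
```

Calling `echo fetch_as_browser($url, $ua);` with the `$url` from the question would be the `file_get_contents` counterpart of the cURL script. Whether the server answers 200 for it is not confirmed anywhere in this thread.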