Can PHP scrape data from a forum and then store it in a local database?

Sure. Read the page with readfile(), then filter it with regular expressions.

Yes. It comes down to file_get_contents plus regex parsing. If the site requires a login, though, I'd recommend curl.

Sorry, that should have been file_get_contents, not readfile() (readfile() writes the page straight to the output instead of returning it as a string). The first snippet below extracts only the opening post.

A complete example, though not very well written: the second snippet below also lists the replies.

A question for iasky: could you write a cURL example?

When I ran willko's complete example on Friday it worked fine, but today it no longer works. Strange!
Using the cURL library to go through a proxy server (see the last snippet below).

A side question: can search results from Baidu or Google be saved to a local database too?

You don't need curl for that. file_get_contents plus preg_match_all is enough.
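<?php
// Fetch the topic XML; file_get_contents() returns false on failure.
$page = file_get_contents('http://community.csdn.net/Expert/topic/5147/5147294.xml?temp=.922558');
if (!$page) {
    exit();
}
// ereg() was removed in PHP 7, so use preg_match(); the "s" modifier lets "." match newlines.
if (!preg_match('|<PostUserNickName>(.*?)</PostUserNickName>.*?<ranknum>(.+?)</ranknum>.*?<TopicName>(.+?)</TopicName>.*?<PostUserName>(.+?)</PostUserName>.*?<Content>(.+?)</Content>.*?</Issue>|s', $page, $post)) {
    exit('Opening post not found');
}
echo 'Topic: '.$post[3].'<br/>';
echo 'Original poster: '.$post[4].' ('.$post[1].')<img src="http://community.csdn.net/expert/images/rank/'.$post[2].'.gif" /> <br />';
echo 'Content: '.$post[5].'<hr />';
?>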
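Regular expressions over XML are brittle: a changed line break or an extra attribute can make the match fail without any error. As an alternative sketch (not what the thread's authors used), the same fields can be read with PHP's DOM extension. The sample document is hypothetical; only the element names come from the pattern above, and the real feed's nesting is an assumption.

```php
<?php
// Hypothetical sample: element names are taken from the regex above,
// but the nesting of the real CSDN feed is an assumption.
$sampleXml = <<<XML
<Topic>
  <Issue>
    <PostUserNickName>Nick</PostUserNickName>
    <ranknum>user2</ranknum>
    <TopicName>Scraping a forum with PHP</TopicName>
    <PostUserName>poster01</PostUserName>
    <Content>Can PHP do this?</Content>
  </Issue>
</Topic>
XML;

/**
 * Return the text of the first element with the given tag name, or null.
 */
function first_text(DOMDocument $doc, string $tag): ?string
{
    $nodes = $doc->getElementsByTagName($tag);
    return $nodes->length > 0 ? $nodes->item(0)->textContent : null;
}

/**
 * Parse the opening post out of an XML string; returns null if the XML is malformed.
 * Assumes the opening post comes first in document order.
 */
function parse_issue(string $xml): ?array
{
    $doc = new DOMDocument();
    // Suppress parser warnings; loadXML() returns false on malformed input.
    if (!@$doc->loadXML($xml)) {
        return null;
    }
    return [
        'topic'    => first_text($doc, 'TopicName'),
        'user'     => first_text($doc, 'PostUserName'),
        'nickname' => first_text($doc, 'PostUserNickName'),
        'rank'     => first_text($doc, 'ranknum'),
        'content'  => first_text($doc, 'Content'),
    ];
}

$post = parse_issue($sampleXml);
echo 'Topic: ' . $post['topic'] . "\n";
echo 'Original poster: ' . $post['user'] . ' (' . $post['nickname'] . ")\n";
```

With a live page, the string returned by file_get_contents() would be passed to parse_issue() in place of $sampleXml.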
<?php
// Fetch the topic XML; file_get_contents() returns false on failure.
$page = file_get_contents('http://community.csdn.net/Expert/topic/5147/5147294.xml?temp=.922558');
if (!$page) {
    exit;
}

// Opening post. ereg() was removed in PHP 7, so use preg_match(); "s" lets "." match newlines.
if (preg_match('|<PostUserNickName>(.*?)</PostUserNickName>.*?<ranknum>(.+?)</ranknum>.*?<TopicName>(.+?)</TopicName>.*?<PostUserName>(.+?)</PostUserName>.*?<Content>(.+?)</Content>.*?</Issue>|s', $page, $post)) {
    echo 'Topic: '.$post[3].'<br/>';
    echo 'Original poster: '.$post[4].' ('.$post[1].')<img src="http://community.csdn.net/expert/images/rank/'.$post[2].'.gif" /> <br />';
    echo 'Content: '.$post[5];
    echo '<hr />';
}

// Replies: PREG_SET_ORDER yields one array per reply (1 = nickname, 2 = rank, 3 = user name, 4 = content).
preg_match_all('|<Reply>.*?<PostUserNickName>([^<]*?)</PostUserNickName>.*?<ranknum>([^<]+?)</ranknum>.*?<PostUserName>([^<]+?)</PostUserName>.*?<Content>([^<]+?)</Content>|s', $page, $reply, PREG_SET_ORDER);
$count = 1;
foreach ($reply as $row) {
    echo 'Reply #'.$count++.': ';
    echo $row[3].' (';
    echo $row[1].')';
    echo '<img src="http://community.csdn.net/expert/images/rank/'.$row[2].'.gif" /> <br />';
    echo 'Content: '.nl2br($row[4]);
    echo '<hr />';
}
?>
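The original question also asks about storing the scraped data in a local database, which none of the posted code does. A minimal sketch of that step follows, assuming PDO with the SQLite driver is available. The table name, column names, and sample rows are made up for illustration; the rows are shaped like the PREG_SET_ORDER matches above.

```php
<?php
/**
 * Create the (hypothetical) replies table if it does not exist yet.
 */
function create_schema(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS replies (
        id INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL,
        nickname TEXT,
        rank_code TEXT,
        content TEXT
    )');
}

/**
 * Insert rows shaped like the PREG_SET_ORDER matches:
 * [1] nickname, [2] rank, [3] user name, [4] content.
 * Returns the number of rows written.
 */
function save_replies(PDO $db, array $replies): int
{
    // A prepared statement keeps scraped text from being interpreted as SQL.
    $stmt = $db->prepare(
        'INSERT INTO replies (user_name, nickname, rank_code, content) VALUES (?, ?, ?, ?)'
    );
    $saved = 0;
    $db->beginTransaction();
    foreach ($replies as $row) {
        $stmt->execute([$row[3], $row[1], $row[2], $row[4]]);
        $saved++;
    }
    $db->commit();
    return $saved;
}

// Assumption: an in-memory SQLite database stands in for the "local database".
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
create_schema($db);

// Made-up sample rows; in practice pass the $reply array from preg_match_all().
$sample = [
    ['', 'nick1', 'user1', 'alice', 'Use file_get_contents.'],
    ['', 'nick2', 'user2', 'bob', "It's fine; use curl for logins."],
];
echo save_replies($db, $sample) . " rows saved\n";
```

For MySQL, only the DSN passed to the PDO constructor and the column types would need to change; the prepared-statement loop stays the same.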
<?php
// Fetch a URL through a proxy with cURL and return the raw response (headers included).
function curl_string($url, $user_agent, $proxy)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    // Single quotes keep the backslash literal; received cookies are written to this file.
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'c:\cookie.txt');
    curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);          // give up after 120 seconds
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$url_page = "http://www.google.com";
$user_agent = "Mozilla/4.0";
$proxy = "http://192.11.222.124:8000";
$string = curl_string($url_page, $user_agent, $proxy);
echo $string;
?>
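Earlier replies recommend cURL when the forum requires a login, but the proxy example above does not show that case. The sketch below illustrates the usual pattern: post the login form once while saving cookies, then reuse the cookie file for later requests. The URLs and form field names are placeholders, not those of any real forum, and the helper names are hypothetical.

```php
<?php
/**
 * Build cURL options for posting a login form.
 * CURLOPT_COOKIEJAR writes cookies received from the server to $cookieFile.
 */
function login_options(string $loginUrl, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ];
}

/**
 * Build cURL options for fetching a page with the saved session cookies.
 * CURLOPT_COOKIEFILE reads cookies from the file written during login.
 */
function fetch_options(string $url, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ];
}

/**
 * Run one request with the given options; returns the body, or false on failure.
 */
function curl_run(array $options)
{
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    // Needed before PHP 8.0 to flush the cookie jar; a no-op from 8.0 on,
    // where the handle is freed when it goes out of scope.
    curl_close($ch);
    return $body;
}

// Hypothetical usage (placeholder URL and field names):
// $jar = sys_get_temp_dir() . '/forum_cookies.txt';
// curl_run(login_options('http://forum.example/login.php',
//     ['username' => 'me', 'password' => 'secret'], $jar));
// $page = curl_run(fetch_options('http://forum.example/topic.php?id=1', $jar));
```

Real login forms often carry extra hidden fields (tokens, redirect targets), so the field list usually has to be copied from the form's HTML.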