Can PHP be used to scrape data from a forum and store it in a local database?

Sure. Read the page with readfile(), then filter it with regular expressions.


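Yes, it comes down to file_get_contents plus regular-expression parsing. If the site requires a login, though, I'd suggest using curl.
Sorry, that should have been file_get_contents, not readfile().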
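<?php
// Fetch the topic's XML; stop if nothing comes back.
$page = file_get_contents('http://community.csdn.net/Expert/topic/5147/5147294.xml?temp=.922558');
if (!$page) {
    exit('Could not fetch the page.');
}
// ereg() was removed in PHP 7, so use preg_match(); the "s" flag lets "." match newlines.
$pattern = '|<PostUserNickName>(.*?)</PostUserNickName>.*?<ranknum>(.+?)</ranknum>.*?<TopicName>(.+?)</TopicName>.*?<PostUserName>(.+?)</PostUserName>.*?<Content>(.+?)</Content>.*?</Issue>|s';
if (!preg_match($pattern, $page, $post)) {
    exit('Could not parse the post.');
}
echo 'Topic: '.$post[3].' ';
echo 'Original poster: '.$post[4].'('.$post[1].')<img src="http://community.csdn.net/expert/images/rank/'.$post[2].'.gif" /> ';
echo 'Content: '.$post[5];
echo '<hr />';
?>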
A complete example, though it isn't written very well.
<?php
// Fetch the topic's XML; stop if nothing comes back.
$page = file_get_contents('http://community.csdn.net/Expert/topic/5147/5147294.xml?temp=.922558');
if (!$page) {
    exit('Could not fetch the page.');
}

// First post. ereg() was removed in PHP 7, so use preg_match(); "s" lets "." match newlines.
$pattern = '|<PostUserNickName>(.*?)</PostUserNickName>.*?<ranknum>(.+?)</ranknum>.*?<TopicName>(.+?)</TopicName>.*?<PostUserName>(.+?)</PostUserName>.*?<Content>(.+?)</Content>.*?</Issue>|s';
if (!preg_match($pattern, $page, $post)) {
    exit('Could not parse the post.');
}
echo 'Topic: '.$post[3].' ';
echo 'Original poster: '.$post[4].'('.$post[1].')<img src="http://community.csdn.net/expert/images/rank/'.$post[2].'.gif" /> ';
echo 'Content: '.$post[5];
echo '<hr />';

// Replies: each match holds [1] nickname, [2] rank, [3] user name, [4] content.
preg_match_all('|<Reply>.*?<PostUserNickName>([^<]*?)</PostUserNickName>.*?<ranknum>([^<]+?)</ranknum>.*?<PostUserName>([^<]+?)</PostUserName>.*?<Content>([^<]+?)</Content>|s', $page, $reply, PREG_SET_ORDER);
$count = 1;
foreach ($reply as $row) {
    echo 'Reply '.$count++.': ';
    echo $row[3].'(';
    echo $row[1].')';
    echo '<img src="http://community.csdn.net/expert/images/rank/'.$row[2].'.gif" /> ';
    echo 'Content: '.nl2br($row[4]);
    echo '<hr />';
}
?>
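The original question also asked about saving the data to a local database, which the example above stops short of. A minimal sketch of that step, assuming PDO with the SQLite driver is available; the function name `save_replies` and the `replies` table with its columns are made up for illustration and are not part of the example above:

```php
<?php
// Hypothetical helper: saves rows shaped like the $reply matches above
// ([1] nickname, [2] rank, [3] user name, [4] content) into a "replies" table.
function save_replies(PDO $db, array $replies): int
{
    // Table and column names are assumptions made for this sketch (SQLite syntax).
    $db->exec('CREATE TABLE IF NOT EXISTS replies (
        id INTEGER PRIMARY KEY,
        user_name TEXT,
        nick_name TEXT,
        rank_num TEXT,
        content TEXT
    )');

    // A prepared statement keeps scraped text from being interpreted as SQL.
    $stmt = $db->prepare(
        'INSERT INTO replies (user_name, nick_name, rank_num, content) VALUES (?, ?, ?, ?)'
    );

    $saved = 0;
    foreach ($replies as $row) {
        $stmt->execute([$row[3], $row[1], $row[2], $row[4]]);
        $saved++;
    }
    return $saved; // number of rows inserted
}
```

Usage would look something like `save_replies(new PDO('sqlite:forum.db'), $reply);`. For MySQL, the DSN and the CREATE TABLE statement would need adjusting.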
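A question for iasky: could you write a cURL example?
When I ran willko's complete example on Friday it worked fine, but today it no longer works. How strange!
Accessing a page through a proxy server with the cURL library: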
<?php
// Fetch $url through $proxy; returns the response (headers included) or false on failure.
function curl_string($url, $user_agent, $proxy)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'c:\cookie.txt'); // received cookies are saved to this file
    curl_setopt($ch, CURLOPT_HEADER, 1);                  // include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);          // return the response instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);          // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);               // give up after 120 seconds
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$url_page = 'http://www.google.com';
$user_agent = 'Mozilla/4.0';
$proxy = 'http://192.11.222.124:8000';
$string = curl_string($url_page, $user_agent, $proxy);
echo $string;
?>
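The proxy example does not cover the login case mentioned earlier in the thread. A rough sketch of how a form login might look with cURL, assuming the curl extension is loaded; the function names `login_options` and `fetch_after_login` are hypothetical, and the login URL and form field names are placeholders, since every forum defines its own:

```php
<?php
// Builds the cURL options for a form login. The URL and field names passed in
// are placeholders; a real forum defines its own.
function login_options(string $loginUrl, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieFile, // turns on cookie handling and reads saved cookies
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ];
}

// Logs in, then fetches a members-only page on the same handle so the
// session cookie from the login response is sent along.
function fetch_after_login(string $loginUrl, array $fields, string $pageUrl, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, login_options($loginUrl, $fields, $cookieFile));
    if (curl_exec($ch) === false) {
        curl_close($ch);
        return false;
    }
    // Switch the same handle back to a plain GET of the target page.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    $page = curl_exec($ch);
    curl_close($ch);
    return $page; // page body as a string, or false on failure
}
```

A call might look like `fetch_after_login('http://example.com/login.php', ['username' => '...', 'password' => '...'], 'http://example.com/topic.php', 'cookie.txt')`, with all of those values replaced by whatever the target site actually uses.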
While we're at it: can Baidu or Google search results be saved to a local database too?
You don't need curl for that. file_get_contents + preg_match_all is enough.
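As a sketch of the file_get_contents + preg_match_all approach: the function below pulls every link and its text out of an HTML string. The name `extract_links` is made up here, and the pattern assumes simple double-quoted `href` attributes, so it would need adapting to the markup of whichever results page is being read:

```php
<?php
// Extracts url/title pairs from every <a href="...">...</a> in an HTML string.
function extract_links(string $html): array
{
    // "i" ignores case, "s" lets "." span newlines inside the link text.
    preg_match_all('|<a\s[^>]*href="([^"]+)"[^>]*>(.*?)</a>|is', $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        $links[] = [
            'url'   => html_entity_decode($m[1]),  // turn &amp; back into &
            'title' => trim(strip_tags($m[2])),    // drop inner tags such as <b>
        ];
    }
    return $links;
}
```

Used together with the fetch step, that would be roughly `$links = extract_links(file_get_contents($url));`, after which each pair can be inserted into a table.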