How to extract all the links from a web page

<?php
$URL = "http://www.zwon.net/"; // page to scan
// start the HTML page
print("<HTML>\n");
print("<HEAD>\n");
print("<TITLE>Get the links on a page</TITLE>\n");
print("</HEAD>\n");

print("<BODY>\n");
$page = fopen($URL, "r"); // open the URL
print("Link $URL \n");
print("<UL>\n");
while(!feof($page)) // loop over the page
{
$line = fgets($page, 255);
// NOTE: $line is never changed inside this inner loop, so a line that
// matches once keeps matching forever. eregi() was also removed in PHP 7.
while(eregi("HREF=\"[^\"]*\"", $line, $match))
{
// print the URL link
print("<Li>");
print($match[0]);
print(" \n");
}
}
print("</UL>\n");
fclose($page); // close the page
print("</BODY>\n");
print("</HTML>\n");
?>

Solutions


However, I also want to extract the link text!
Given <a href=XXX>YYYY</a>, I want to get both XXX and YYYY.
<?php
 // read the whole page into one string
 $str=file_get_contents('http://www.sohu.com/index.html');
 if($str===false)
 {
  die("could not read the page");
 }
 $count=0;
 // groups: 1 = text before the link, 4 = href value, 7 = link text, 8 = text after the link
 $ptn="@(.*?)<a\s([^>]*?)href=([\'\"\s]?)([^>\'\"\s]+)([\'\"\s]?)([^>]*?)>(.+?)</a>(.*)@is";
while(preg_match($ptn,$str,$reg))
{
 echo "-----------------------------------------------------------------------------------";
 $count++;
 echo "〖".$reg[4]."〗 ";
 echo $count."【".$reg[7]."】 ";
 // drop the matched link so the next pass finds the following one
 $str=$reg[1].$reg[8];
}
?>
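程序呆子, your code is fantastic, I love you to death (I'm a guy, and "love" has many meanings, so don't get the wrong idea). I changed your code a little: I first replace <a and </a> with placeholders, then use strip_tags to remove all the other HTML tags, and then swap <a and </a> back in. I found that this gives a higher success rate!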
/**
* Extract all the links from a piece of HTML
*
* @access public
* @param string $str HTML source
* @return array array(link texts, link URLs)
*/
function RexFindLink($str) {
 // protect <a ...> and </a> with placeholders so strip_tags() leaves them alone
 $str = str_replace ("<a ", "{#mya}", $str);
 $str = str_replace ("</a>", "{#@mya}", $str);
 $str = str_replace ("<A ", "{#mya}", $str);
 $str = str_replace ("</A>", "{#@mya}", $str);
 $str = strip_tags($str);
 // put the anchor tags back
 $str = str_replace ( "{#mya}", "<a ",$str);
 $str = str_replace ( "{#@mya}", "</a>",$str);
 $ptn="@(.*?)<a\s([^>]*?)href=([\'\"\s]?)([^>\'\"\s]+)([\'\"\s]?)([^>]*?)>(.+?)</a>(.*)@is";
 $result1 = array(); // link texts
 $result2 = array(); // link URLs
 while(preg_match($ptn,$str,$reg)){
  // drop the matched link so the next pass finds the following one
  $str=$reg[1].$reg[8];
  $result1[] = $reg[7];
  $result2[] = $reg[4];
 }
 return array($result1,$result2);
}
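As an alternative to regular expressions, PHP's bundled DOM extension can parse the markup and hand back each <a> element directly. This is only a sketch and not the method used in this thread: the function name findLinksWithDom and the sample HTML are hypothetical, and it assumes the dom extension is enabled.

```php
<?php
// Hypothetical alternative to RexFindLink(): returns array(link texts, link URLs)
function findLinksWithDom($html) {
    $texts = array();
    $urls  = array();
    // loadHTML() rejects an empty string, so return early in that case
    if (trim($html) === '') {
        return array($texts, $urls);
    }
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser lowercases tag names, so <A> and <a> are both found here
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $texts[] = trim($a->textContent);
            $urls[]  = $a->getAttribute('href');
        }
    }
    return array($texts, $urls);
}

// Sample input (hypothetical)
$html = '<p><A href="http://mail.sohu.net/">Mail</A> and <a href="/news">News</a></p>';
list($texts, $urls) = findLinksWithDom($html);
foreach ($urls as $i => $url) {
    echo $url, ' => ', $texts[$i], "\n";
}
```

The return value has the same shape as RexFindLink(), so the two can be compared on the same page. Pages that are not in ASCII may need a charset declaration in the markup for loadHTML() to decode the link text correctly.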
wyx726 的输出结果：
链接 http://yep/temp/top.phphref="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"href="style.css"
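The endlessly repeated href="style.css" is what one would expect from the first script in this thread: its inner while loop tests the same unchanged $line again and again, so a single match is printed until the script is stopped. Below is a minimal sketch of one possible fix, assuming PHP 7 or later (where eregi() no longer exists) and using preg_match_all() on each line. The function name extractHrefs and the sample lines are hypothetical.

```php
<?php
// Hypothetical helper: return every href="..." value found in a list of lines
function extractHrefs(array $lines) {
    $hrefs = array();
    foreach ($lines as $line) {
        // preg_match_all() finds all matches in the line in one call,
        // so there is no inner loop that can re-match the same text
        if (preg_match_all('/href="([^"]*)"/i', $line, $m)) {
            foreach ($m[1] as $href) {
                $hrefs[] = $href;
            }
        }
    }
    return $hrefs;
}

// Sample input standing in for the lines of a fetched page
$lines = array(
    '<link rel="stylesheet" HREF="style.css">',
    '<a href="a.html">A</a> <a href="b.html">B</a>',
);
foreach (extractHrefs($lines) as $href) {
    print("<li>" . htmlspecialchars($href) . "</li>\n");
}
```

Reading with fgets($page, 255) can also split an attribute across two reads, so fetching whole lines with file() or the whole page with file_get_contents() is probably safer before calling a helper like this.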
programdolt's output (displays correctly):
-----------------------------------------------------------------------------------〖http://register.mail.sohu.com/reg/Reg1.jsp〗
1【注册免费邮件】
-----------------------------------------------------------------------------------〖http://mail.sohu.net/〗
2【企业邮箱】
-----------------------------------------------------------------------------------〖http://host.sohu.net/〗
3【域名虚机】
-----------------------------------------------------------------------------------〖http://sh.sohu.com/〗
4【上海站】
-----------------------------------------------------------------------------------〖http://gd.sohu.com/〗
5【广东站】
-----------------------------------------------------------------------------------〖http://www.sohu.com/〗
6【首页】
-----------------------------------------------------------------------------------〖http://sms.sohu.com/〗
7【短信】
-----------------------------------------------------------------------------------〖http://mms.sohu.com〗
8【彩信】
-----------------------------------------------------------------------------------〖http://login.mail.sohu.com/〗
9【邮件】
-----------------------------------------------------------------------------------〖http://alumni.sohu.com/〗
10【校友录】
-----------------------------------------------------------------------------------〖http://dir.sohu.com/〗
11【搜索】
-----------------------------------------------------------------------------------〖http://store.sohu.com/〗
<?php
$song='01.Because You\'re
 Good To Me </TD>
 <TD class=unnamed2 align=middle width="9%" height=12><A
 href="ftp://ftp.jl.cninfo.net/pub/mp3/eason/0401.mp3">下载</A></TD>
 <TD class=unnamed2 align=middle width="9%" height=12><A
 href="http://music.jl.cninfo.net/newzj/eason/04/01.rm">试听</A></TD></TR>
 <TR>';
$patt="@<a\s([^>]*?)href=([\'\"\s]?)([^>\'\"\s]+)([\'\"\s]?)([^>]*?)>(.+?)</a>@is";
while(preg_match($patt,$song,$reg)){
echo $reg[3];
}
?>
Why does this go into an infinite loop??
Could an expert please answer.
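One likely explanation, offered as a sketch rather than a definitive answer: $song is never modified inside the while loop, so preg_match() finds the same first link on every pass. The earlier scripts avoid this by rewriting the string with $str=$reg[1].$reg[8], but the leading (.*?) and trailing (.*) groups that made that possible were dropped from $patt here. A simple alternative is to collect every match in one call with preg_match_all(). The sample HTML below is abbreviated from the snippet in the question, and the names $matches and $links are illustrative assumptions.

```php
<?php
// Abbreviated sample taken from the snippet in the question
$song = '<TD><A
 href="ftp://ftp.jl.cninfo.net/pub/mp3/eason/0401.mp3">下载</A></TD>
 <TD><A
 href="http://music.jl.cninfo.net/newzj/eason/04/01.rm">试听</A></TD>';

// Same pattern as in the question: group 3 = href value, group 6 = link text
$patt = "@<a\s([^>]*?)href=([\'\"\s]?)([^>\'\"\s]+)([\'\"\s]?)([^>]*?)>(.+?)</a>@is";

// preg_match_all() scans the whole string once, so no loop condition can get stuck
$links = array();
if (preg_match_all($patt, $song, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $links[] = array('url' => $m[3], 'text' => $m[6]);
    }
}

foreach ($links as $link) {
    echo $link['url'], ' ', $link['text'], "\n";
}
```

If the while loop has to stay, the other option is to shrink $song on each pass, as the earlier scripts do, so that the condition eventually becomes false.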