Is it just these few lines??
<?php
$txt = <<<eee
<a href="1.htm" class='name'>asdasddd</a>
<a href="1.htm" class='numb'>1233</a>
<div class="address">zhejiang china</div>
eee;
$txt = strip_tags( $txt );
// split() was removed in PHP 7; explode() does the same job for a literal delimiter
print_r(explode("\n", $txt));
?>
Array
(
[0] => asdasddd
[1] => 1233
[2] => zhejiang china
)
Also, there are a lot of blocks like this one:
<a href="1.htm" class='name'>asdasddd</a>
<a href="1.htm" class='numb'>1233</a>
<div class="address">zhejiang china</div>
They all follow a regular pattern, though!
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head><body>
<table width="200" border="1">
<tr>
<td>
<a href="1.htm" class='name'>asdasddd</a><br>
<a href="1.htm" class='numb'>1233</a><br>
<div class="address">zhejiang china</div> </td>
</tr>
<tr>
<td><a href="1.htm" class='name'>asdasddd</a><br />
<a href="1.htm" class='numb'>1233</a><br />
<div class="address">zhejiang china</div></td>
</tr>
</table>
<table width="200" border="1">
<tr>
<td><a href="1.htm" class='name'>asdasddd</a><br />
<a href="1.htm" class='numb'>1233</a><br />
<div class="address">zhejiang china</div></td>
</tr>
<tr>
<td><a href="1.htm" class='name'>asdasddd</a><br />
<a href="1.htm" class='numb'>1233</a><br />
<div class="address">zhejiang china</div></td>
</tr>
</table>
</body>
</html>
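Because every cell repeats the same three-element pattern, one option (not something proposed in this thread) is to parse the page with PHP's DOM extension instead of regular expressions. A minimal sketch, assuming the DOM extension is enabled and that each <td> holds exactly one record; `$html` is a shortened stand-in for the page above and `$records` is a name made up for illustration:

```php
<?php
// Shortened stand-in for the page above (assumption: one record per <td>).
$html = <<<EOT
<table>
<tr><td><a href="1.htm" class='name'>asdasddd</a><br />
<a href="1.htm" class='numb'>1233</a><br />
<div class="address">zhejiang china</div></td></tr>
<tr><td><a href="1.htm" class='name'>asdasddd</a><br />
<a href="1.htm" class='numb'>1233</a><br />
<div class="address">zhejiang china</div></td></tr>
</table>
EOT;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$records = [];
foreach ($xpath->query('//td') as $td) {
    $record = [];
    // Every descendant element with a class attribute becomes class => text.
    foreach ($xpath->query('.//*[@class]', $td) as $node) {
        $record[$node->getAttribute('class')] = trim($node->textContent);
    }
    $records[] = $record;
}
print_r($records);
```

Each entry of `$records` then keeps name, numb and address together, so the fields of different rows cannot get mixed up.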
$str = <<<strings
<a href="1.htm" class='numb'>1233</a>
<div class="address">zhejiang china</div>
strings;
preg_match_all('/class\=[\"|\'](.+?)[\"|\']\>(.+?)\</is',$str,$ar);
// $ar[1][0] = numb
// $ar[1][1] = address
// $ar[2][0] = 1233
// $ar[2][1] = zhejiang china
$mrStr=<<<EOT
<a href= "1.htm" class= 'numb'>1233</a>
<div class="address">zhejiang china</div>
EOT;
// \s already covers tabs; the matches go into $matches so the input string is not overwritten
print preg_match_all('/class=\s*["\'](.*?)["\']\s*>(.+?)<\//is',$mrStr,$matches)."<br />";
echo $matches[1][0]."<br />";
echo $matches[1][1]."<br />";
echo $matches[2][0]."<br />";
echo $matches[2][1]."<br />";
?>
This is the right way to do it.
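The snippet above reads class names from index 1 and contents from index 2 of the result array, pairing them by position. As a hedged variation on the same pattern, passing the `PREG_SET_ORDER` flag keeps each match together, which makes it easy to build a class => content map. `$pairs` and `$pattern` are illustrative names, not from the thread:

```php
<?php
$mrStr = <<<EOT
<a href= "1.htm" class= 'numb'>1233</a>
<div class="address">zhejiang china</div>
EOT;

// Same idea as the pattern above: optional whitespace around the quoted class name.
$pattern = '/class=\s*["\'](.*?)["\']\s*>(.+?)<\//is';
$count = preg_match_all($pattern, $mrStr, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    // $m[1] is the class name, $m[2] is the text between the tags.
    $pairs[$m[1]] = $m[2];
}
print_r($pairs);
```

Note that a map like this keeps only the last value per class name, so on a page with many rows it would have to be built once per row.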
To: zeroleonhart(Strong Point:Algorithm)
You didn't account for spaces, tabs and the like.
Is there anything else that needs improving?
$mrStr=$HTML;// $HTML is the data you want to convert
preg_match_all('/class=[\s|\t]*[\"|\']?([^\'\"<>]*?)[\"|\']?[\s|\t]*>(.*?)</i',$mrStr,$mrStr);
var_dump($mrStr);
?>
Updated it a bit. Even if the markup looks like
class=classname >content<
class=classname" >content<
it can still cope with the errors :)
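To check that claim, the updated pattern can be run against a well-formed attribute and the two malformed ones. This is only a small sketch; the `$samples` strings are made up for illustration, and the pattern is copied unchanged from the post:

```php
<?php
// Pattern copied from the post above.
$pattern = '/class=[\s|\t]*[\"|\']?([^\'\"<>]*?)[\"|\']?[\s|\t]*>(.*?)</i';

// Made-up inputs: one well-formed, one unquoted, one with a stray closing quote.
$samples = [
    '<div class="address">zhejiang china</div>',
    '<div class=classname >content</div>',
    '<div class=classname" >content</div>',
];

$results = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample, $m)) {
        // $m[1] is the class name, $m[2] is the text up to the next "<".
        $results[] = [$m[1], $m[2]];
    }
}
print_r($results);
```

Because the pattern has no `s` modifier, content that spans several lines would not be captured in full.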
strip_tags is rather hard to control in this case.
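One way to see the control problem: once the surrounding cell markup is included, `strip_tags` leaves blank and padded lines behind, and the class names that tell the fields apart are gone. A small sketch, where `$cell` is a fragment taken from the page earlier in the thread and `$raw` and `$clean` are illustrative names:

```php
<?php
$cell = <<<EOT
<td>
<a href="1.htm" class='name'>asdasddd</a><br>
<a href="1.htm" class='numb'>1233</a><br>
<div class="address">zhejiang china</div> </td>
EOT;

// Splitting the stripped text directly keeps the empty line left by <td>
// and the trailing space before </td>.
$raw = explode("\n", strip_tags($cell));

// Trim every line and drop the empty ones; the values survive, but nothing
// says which one was the name, the number or the address.
$clean = array_values(array_filter(array_map('trim', $raw), 'strlen'));

print_r($raw);
print_r($clean);
```

This kind of cleanup recovers the values, but telling the fields apart still depends on their order, which is why the class-based patterns above are easier to rely on.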