I have a set of HTML like this:
<a href="show.php?hash=asdfasfe23edsadf" target="_blank"><span style="color:red;font-weight:bold;">asdasdfasdfsad</span></a>
<a href="show.php?hash=223rwefdsv" target="_blank"><span style="color:red;font-weight:bold;">asdaew32</span></a>
<a href="show.php?hash=sdefqwtg5423" target="_blank"><span style="color:red;font-weight:bold;">sdfasv534</span></a>
<a href="show.php?hash=23123rfsadfd" target="_blank">asfasdc312</a>
<a href="show.php?hash=sagdbh54537564" target="_blank">asdfasv23312</a>
<a href="show.php?hash=asgdhgnetr4231" target="_blank"><span style="color:red;font-weight:bold;">agfeh43</span></a>
<a href="show.php?hash=awdfsagr424354" target="_blank"><span style="color:red;font-weight:bold;">ujk6454</span></a>
<a href="show.php?hash=23123rswefds" target="_blank">3435gdsds</a>
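...As shown above, some of the A tags contain a SPAN and some do not. I want to use a regex to extract the text inside the tags. How should the regex be written? Thanks.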

Solutions »

  1.   

    (?is)(?<=<a)[^<>]+(?=>)  for the ones that do not contain a span
      

  2.   


      <tr class="alt1 text_center" onmouseover="highlight(this, 'alt1');">
        <td nowrap="nowrap">09/02 17:29</td>
        <td><a href="index.aspx?sort_id=36">流行</a></td>
        <td class="text_left">
    <img src="images/put_top.gif" alt="置顶" title="置顶" align="absmiddle" />&nbsp;
    <a href="show.aspx?hash=758dc94e56956b6e4cc78d12d1bb07a7af38217f" target="_blank">
    <span style="color:red;font-weight:bold;">东风破</span></a>
    </td>
        <td nowrap="nowrap"><span class="bts_1">12</span></td>
        <td nowrap="nowrap"><span class="btl_1">75</span></td>
        <td nowrap="nowrap"><span class="btc_1">7010</span></td>
      </tr>
        <tr class="alt1 text_center" onmouseover="highlight(this, 'alt1');">
        <td nowrap="nowrap">09/02 17:29</td>
        <td><a href="index.aspx?sort_id=37">流行</a></td>
        <td class="text_left">
    <img src="images/put_top.gif" alt="置顶" title="置顶" align="absmiddle" />&nbsp;
    <a href="show.aspx?hash=315dc94e56956b6e4cc78d12d1bb07a7af38217f" target="_blank">菊花台</a>
    </td>
        <td nowrap="nowrap"><span class="bts_1">12</span></td>
        <td nowrap="nowrap"><span class="btl_1">75</span></td>
        <td nowrap="nowrap"><span class="btc_1">7010</span></td>
      </tr>
    For example, given HTML like the above, I want to extract the two entries 东风破 and 菊花台. What regex should I use?
      

  3.   

    (?i)<a\b[^>]*?>(?:<span[^>]*?>)?([^<>]+)(?:</span>)?</a>
    This matches whether or not there is a span.
    Read the text from m.Groups[1].Value.
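
For anyone trying this out, here is a minimal C# sketch of how the pattern above might be applied. The class name `LinkTextDemo` and the helper `ExtractLinkTexts` are made up for illustration, and the sample input is just two of the lines from the question. Note that it assumes there is no whitespace between the `<a ...>` tag and the `<span>`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkTextDemo
{
    // Pattern from the reply above; group 1 holds the link text.
    private const string Pattern =
        @"(?i)<a\b[^>]*?>(?:<span[^>]*?>)?([^<>]+)(?:</span>)?</a>";

    // Hypothetical helper: returns the text of every <a> element matched.
    public static List<string> ExtractLinkTexts(string html)
    {
        var texts = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern))
        {
            texts.Add(m.Groups[1].Value);
        }
        return texts;
    }

    public static void Main()
    {
        // Two sample lines from the question: one with a span, one without.
        string html =
            "<a href=\"show.php?hash=223rwefdsv\" target=\"_blank\"><span style=\"color:red;font-weight:bold;\">asdaew32</span></a>\n" +
            "<a href=\"show.php?hash=23123rfsadfd\" target=\"_blank\">asfasdc312</a>";

        foreach (string text in ExtractLinkTexts(html))
        {
            Console.WriteLine(text); // prints asdaew32, then asfasdc312
        }
    }
}
```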
      

  4.   

    string pattern=@"(?is)(?<=<td\s*class=""text_left"">.*?<a[^>]*?href=[""'])[^""']+(?=[""'])";
      

  5.   

    (?i)<a\b[^>]*?>(?:\s*<span[^>]*?>)?([^<>]+)(?:</span>\s*)?</a>
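
A rough C# sketch of this whitespace-tolerant pattern run against a trimmed copy of the table HTML from the second post. The `\s*` is what lets it cross the line break between the `<a ...>` tag and the `<span>`. Keep in mind that it matches every link, so the category link 流行 comes back as well. `ShowOnlyPattern` is my own assumed variant, not from the thread, that keeps only links whose href starts with show.aspx. The names `WhitespaceTolerantDemo` and `Extract` are likewise made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhitespaceTolerantDemo
{
    // Pattern from the reply above: allows whitespace before <span> and after </span>.
    public const string Pattern =
        @"(?i)<a\b[^>]*?>(?:\s*<span[^>]*?>)?([^<>]+)(?:</span>\s*)?</a>";

    // Assumed variant (not from the thread): only links whose href starts with show.aspx.
    public const string ShowOnlyPattern =
        @"(?i)<a\b[^>]*?href=[""']show\.aspx[^>]*?>(?:\s*<span[^>]*?>)?([^<>]+)(?:</span>\s*)?</a>";

    // Hypothetical helper: collects group 1 of every match.
    public static List<string> Extract(string html, string pattern)
    {
        var texts = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern))
        {
            texts.Add(m.Groups[1].Value);
        }
        return texts;
    }

    public static void Main()
    {
        // Trimmed sample based on the table rows in the second post.
        string html =
            "<td><a href=\"index.aspx?sort_id=36\">流行</a></td>\n" +
            "<td class=\"text_left\">\n" +
            "<a href=\"show.aspx?hash=758dc94e56956b6e4cc78d12d1bb07a7af38217f\" target=\"_blank\">\n" +
            "<span style=\"color:red;font-weight:bold;\">东风破</span></a>\n" +
            "</td>\n" +
            "<a href=\"show.aspx?hash=315dc94e56956b6e4cc78d12d1bb07a7af38217f\" target=\"_blank\">菊花台</a>";

        // Original pattern: 流行, 东风破, 菊花台
        Console.WriteLine(string.Join(", ", Extract(html, Pattern)));
        // Assumed variant: 东风破, 菊花台
        Console.WriteLine(string.Join(", ", Extract(html, ShowOnlyPattern)));
    }
}
```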
      

  6.   


    I tried it: the links without a SPAN are matched, but the ones with a SPAN are not.