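I already have the HTML source. Now I need a regular expression that extracts the following from a large chunk of HTML:
every jpg image URL that lies between the opening <div class="tpc_content" id="read_tpc"> and the nearest </th> that follows it. For the sample HTML below, the URLs that qualify are:
http://abc.com/download/BoJa_zgeE79Yz38V7dukjw/28482/284811795/99502909933561762654.jpg
http://abc/download/dt25AwN3f7aaes1AtfZCtA/28482/284811788/96380353037821193833.jpg
I do not want a regex that pulls every jpg image out of the whole HTML, only the ones that meet the condition above.
==========================
The HTML is as follows: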
htmlxxxxxxxxxxx<div class="tpc_content" id="read_tpc"><br>【资源名】:21253 時間】:00:59:55<br><br>【影片格式】:MP4<br><br>【影片大小】:940 MB (986,146,433 字节)<br><br>【aabb】:<br><img src="http://abc.com/download/BoJa_zgeE79Yz38V7dukjw/28482/284811795/99502909933561762654.jpg" border="0" onclick="if(this.width>screen.width-461) window.open('http://abc.com/download/BoJa_zgeE79Yz38V7dukjw/28482/284811795/99502909933561762654.jpg');" ><br><img src="http://abc/download/dt25AwN3f7aaes1AtfZCtA/28482/284811788/96380353037821193833.jpg" border="0" onclick="if(this.width>screen.width-461) window.open('http://abc.com/download/dt25AwN3f7aaes1AtfZCtA/28482/284811788/96380353037821193833.jpg');" rtyu><br><br><br><br></div>   </th>htmlxxxxxxxxxxx
=================
The regex goes here:
string strM = @"";
foreach (Match m in Regex.Matches(strHtmlBody, strM))
{
}

Solutions »

  1.   


    var s = "htmlxxxxxxxxxxx<div class=\"tpc_content\" id=\"read_tpc\"><br>【资源名】:21253 時間】:00:59:55<br><br>【影片格式】:MP4<br><br>【影片大小】:940 MB (986,146,433 字节)<br><br>【aabb】:<br><img src=\"http://abc.com/download/BoJa_zgeE79Yz38V7dukjw/28482/284811795/99502909933561762654.jpg\" border=\"0\" onclick=\"if(this.width>screen.width-461) window.open('http://abc.com/download/BoJa_zgeE79Yz38V7dukjw/28482/284811795/99502909933561762654.jpg');\" ><br><img src=\"http://abc/download/dt25AwN3f7aaes1AtfZCtA/28482/284811788/96380353037821193833.jpg\" border=\"0\" onclick=\"if(this.width>screen.width-461) window.open('http://abc.com/download/dt25AwN3f7aaes1AtfZCtA/28482/284811788/96380353037821193833.jpg');\" rtyu><br><br><br><br></div>   </th>htmlxxxxxxxxxxx";
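    // Captures the src value of every <img> tag in s. Note that this does not
    // limit matches to the read_tpc block or to .jpg files.
    var matches = Regex.Matches(s, @"<img\s+src=""(?<imgfilename>[^""]+)""");
    foreach (Match match in matches)
    {
        // Print the captured image URL.
        Console.WriteLine(match.Groups["imgfilename"].Value);
    }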
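
If the matches really have to be limited to the read_tpc block, one possible approach is to do it in two passes: first cut out the text between the opening div and the nearest </th>, then look for .jpg src values only inside that text. The following is a minimal sketch, not a definitive solution. The names TpcImageExtractor, ExtractJpgUrls, block and url are made up for illustration, and it assumes the div tag is written exactly as in the sample (same attribute order and spacing) and that src values are double-quoted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TpcImageExtractor
{
    // Pass 1: everything from the opening div up to the nearest following </th>.
    // The lazy .*? stops at the first </th>; Singleline lets '.' cross line breaks.
    private static readonly Regex BlockRegex = new Regex(
        @"<div class=""tpc_content"" id=""read_tpc"">(?<block>.*?)</th>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Pass 2: double-quoted src values of <img> tags that end in .jpg.
    // URLs inside onclick are skipped because they are not preceded by src=".
    private static readonly Regex ImgRegex = new Regex(
        @"<img\s[^>]*?src=""(?<url>[^""]+\.jpg)""",
        RegexOptions.IgnoreCase);

    public static List<string> ExtractJpgUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match block in BlockRegex.Matches(html))
        {
            // Search only inside the captured block, never the whole page.
            foreach (Match img in ImgRegex.Matches(block.Groups["block"].Value))
            {
                urls.Add(img.Groups["url"].Value);
            }
        }
        return urls;
    }
}
```

Calling TpcImageExtractor.ExtractJpgUrls(s) on the sample string should give the two URLs listed in the question. Images outside the block, or inside it but not ending in .jpg, are left out. Regular expressions are fragile against real-world HTML, so if the page markup varies an HTML parser may be the safer choice.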