<div class="ry_box">
<dl class="ry_boxleft">
<p class="ry_picbd"><a href="http://www.m1905.com/yx/film/c1f509553.html"><img src="http://image11.m1905.cn/uploadfile/2012/0406/thumb_1_98_137_20120406050041947.jpg" alt="我是中国人" title="我是中国人" /></a></p>
</dl>
<dl class="line-h24 ry_boxright">
<p class="f14 color_blue h24px">
<span class="fl"><a href="http://www.m1905.com/yx/film/c1f509553.html" title="我是中国人">我是中国人</a></span>
<span class="sm_star1 fl ml05 mt02"><span class="sm_star2" style="width:58%;"></span><span class="star_cont" style="display:none">5.8分</span></span>
</p>
<p class="mt02"><span class="color_gray1">主演:</span><a href="http://www.m1905.com/mdb/film/list/starring-2994182/" target="_blank" title="查看该演员参加的影片">李乾铭</a> / <a href="http://www.m1905.com/mdb/film/list/starring-1014/" target="_blank" title="查看该演员参加的影片">颜丹晨</a> / <a href="http://www.m1905.com/mdb/film/list/starring-2991308/" target="_blank" title="查看该演员参加的影片">张岩</a></p>
<p><span class="color_gray1">类型:</span><a href="http://www.m1905.com/mdb/film/list/mtype-15/" target="_blank">剧情</a> <a href="http://www.m1905.com/mdb/film/list/mtype-30/" target="_blank">战争</a> </p>
<p><span class="color_gray1">上映时间:</span>2012-04-19</p>
<p class="time_net mt10 color_blue"><a href="http://www.m1905.com/yx/film/c1f509553.html">放映时间表</a></p>
</dl>
</div>
How can I use a regular expression to get, from the a tag inside <dl class="line-h24 ry_boxright">, the href "http://www.m1905.com/yx/film/c1f509553.html" and the title?
Any advice would be much appreciated.
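using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// `input` holds the HTML text. Group 1 captures the href and group 2 the title
// of the first <a> inside <dl class="line-h24 ry_boxright">.
Dictionary<string, string> dic = new Dictionary<string, string>();
MatchCollection mc = Regex.Matches(input, @"(?is)<dl\s*class=""line-h24 ry_boxright"">\s*<p [^>]*>\s*<span[^>]*><a\s*href=""([^""]*)""\s*title=""([^""]*)"">.*?</a></span>.*?\s*</p>");
foreach (Match mx in mc)
{
    Console.WriteLine(mx.Groups[1].Value); // http://www.m1905.com/yx/film/c1f509553.html
    Console.WriteLine(mx.Groups[2].Value); // 我是中国人
    dic[mx.Groups[1].Value] = mx.Groups[2].Value; // indexer avoids an exception if the same href appears twice
}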
The HTML I pasted is live data fetched from a web page, so it changes over time; it is not a fixed text file. How should I handle that?
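One possible way to handle changing content is to download the page each time it is needed and run the same regular expression on the freshly downloaded string. The sketch below assumes .NET's `HttpClient` is available and that the page decodes correctly with its default charset handling (not verified for this site). `FilmScraper`, `ParseFilms`, and `FetchFilmsAsync` are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class FilmScraper
{
    // One shared client for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Same idea as the pattern in the answer above: group 1 = href, group 2 = title.
    private const string Pattern =
        @"(?is)<dl\s*class=""line-h24 ry_boxright"">\s*<p [^>]*>\s*<span[^>]*>" +
        @"<a\s*href=""([^""]*)""\s*title=""([^""]*)"">.*?</a></span>";

    // Pure parsing step: works on any HTML string, wherever it came from.
    public static Dictionary<string, string> ParseFilms(string html)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(html, Pattern))
        {
            result[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return result;
    }

    // Download step (assumption: the caller supplies the listing page URL).
    public static async Task<Dictionary<string, string>> FetchFilmsAsync(string url)
    {
        string html = await Client.GetStringAsync(url);
        return ParseFilms(html);
    }
}
```

Keeping the download and the parsing in separate methods means the regex can be checked against a saved copy of the page without touching the network.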
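Thank you very much, this regular expression extracted the data I wanted. Follow-up question: if I want to get all of the data, that is, also read the total count (总数), how should I set that up? The full markup is as follows: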
<div class="ry_box">
<dl class="ry_boxleft">
<p class="ry_picbd"><a href="http://www.m1905.com/yx/film/c1f497330.html"><img src="http://image11.m1905.cn/uploadfile/2012/0323/thumb_1_98_137_20120323112327980.jpg" alt="关于爱情和那些魔鬼" title="关于爱情和那些魔鬼" /></a></p>
</dl>
<dl class="line-h24 ry_boxright">
<p class="f14 color_blue h24px">
<span class="fl"><a href="http://www.m1905.com/yx/film/c1f497330.html" title="关于爱情和那些魔鬼">关于爱情和那些魔鬼</a></span>
<span class="sm_star1 fl ml05 mt02"><span class="sm_star2" style="width:68%;"></span><span class="star_cont" style="display:none">6.8分</span></span>
</p>
<p class="mt02"><span class="color_gray1">主演:</span><a href="http://www.m1905.com/mdb/film/list/starring-2340/" target="_blank" title="查看该演员参加的影片">杨童舒</a> / <a href="http://www.m1905.com/mdb/film/list/starring-1567/" target="_blank" title="查看该演员参加的影片">莫少聪</a> / <a href="http://www.m1905.com/mdb/film/list/starring-15997/" target="_blank" title="查看该演员参加的影片">洪剑涛</a></p>
<p><span class="color_gray1">类型:</span><a href="http://www.m1905.com/mdb/film/list/mtype-25/" target="_blank">喜剧</a> <a href="http://www.m1905.com/mdb/film/list/mtype-1/" target="_blank">爱情</a> <a href="http://www.m1905.com/mdb/film/list/mtype-15/" target="_blank">剧情</a> </p>
<p><span class="color_gray1">上映时间:</span>2012-04-01</p>
<p class="time_net mt10 color_blue"><a href="http://www.m1905.com/yx/film/c1f497330.html">放映时间表</a></p>
</dl>
</div>
<div class="blue14fontbold" id="new_page">
<span><img height="11" width="6" src="http://www.m1905.com/m_images/images/pageleft.jpg"> </span><span>总数:<b>23</b></span> <a class="pre" href="http://www.m1905.com/yx/film/c1p0.html">上一页</a><u><b>1</b></u> <a href="http://www.m1905.com/yx/film/c1p2.html">2</a> <a href="http://www.m1905.com/yx/film/c1p3.html">3</a> <a class="next" href="http://www.m1905.com/yx/film/c1p2.html">下一页</a><span> <img height="11" width="6" src="http://www.m1905.com/m_images/images/pageright.jpg"></span>
</div>
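If I only want the ID part, c1f509553, from the a tag's href "http://www.m1905.com/yx/film/c1f509553.html", together with the title, how should this regular expression be modified?
@"(?is)<dl\s*class=""line-h24 ry_boxright"">\s*<p [^>]*>\s*<span[^>]*><a\s*href=""([^""]*)""\s*title=""([^""]*)"">.*?</a></span>.*?\s*</p>"
I am completely lost when it comes to regex!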
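using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// `input` holds the HTML text. Group 1 captures the last path segment of the href
// without ".html" (the dot is escaped so it only matches a literal "."); group 2 is the title.
Dictionary<string, string> dic = new Dictionary<string, string>();
MatchCollection mc = Regex.Matches(input, @"(?is)<dl\s*class=""line-h24 ry_boxright"">\s*<p [^>]*>\s*<span[^>]*><a\s*href=""[^""]*/([^/""]*)\.html""\s*title=""([^""]*)"">.*?</a></span>.*?\s*</p>");
foreach (Match mx in mc)
{
    Console.WriteLine(mx.Groups[1].Value); // c1f509553
    Console.WriteLine(mx.Groups[2].Value); // 我是中国人
    dic[mx.Groups[1].Value] = mx.Groups[2].Value; // indexer avoids an exception if the same ID appears twice
}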
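For the earlier follow-up about the total count, a separate small pattern on the `总数:<b>23</b>` fragment of the pager is probably enough. This is a minimal sketch, assuming the pager markup stays as pasted; `TotalParser` and `ExtractTotal` are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class TotalParser
{
    // Returns the number shown after "总数:" in the pager, or -1 if it is not found.
    // Both the ASCII colon and the full-width colon are accepted.
    public static int ExtractTotal(string html)
    {
        Match m = Regex.Match(html, @"总数[::]\s*<b>\s*([0-9]+)\s*</b>");
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }
}
```

With the pager markup pasted above, `TotalParser.ExtractTotal(input)` would return 23.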
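After extracting "c1f509553" and "我是中国人" as above, how do I write a regex that then gets the href values of the a tags in this div (the one with id="new_page"), for example http://www.m1905.com/yx/film/c1p3.html? Can the two regexes be merged?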
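One way to approach the pager links is in two steps: first isolate the `new_page` div, then collect every href inside it. The sketch below assumes the pager contains no nested `<div>` (true for the markup pasted above); `PagerParser` and `ExtractPageLinks` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PagerParser
{
    // Returns the distinct hrefs found inside <div ... id="new_page">, in document order.
    public static List<string> ExtractPageLinks(string html)
    {
        var links = new List<string>();

        // Step 1: isolate the pager block so film links elsewhere are not picked up.
        Match block = Regex.Match(html, @"(?is)<div[^>]*\bid=""new_page""[^>]*>(.*?)</div>");
        if (!block.Success)
        {
            return links;
        }

        // Step 2: collect every href inside that block, skipping repeats
        // (the "next page" link duplicates one of the numbered links).
        foreach (Match m in Regex.Matches(block.Groups[1].Value, @"(?is)<a\b[^>]*\bhref=""([^""]*)"""))
        {
            string href = m.Groups[1].Value;
            if (!links.Contains(href))
            {
                links.Add(href);
            }
        }
        return links;
    }
}
```

The film pattern and the pager pattern could in principle be merged into one expression with alternation, but running them as two separate passes over the same `input` string is likely easier to read and to adjust when the page layout changes.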