Regex matching question

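Given the content of an HTML page, I want to match the URL inside <li><a href="http://news.sina.com.cn/c/2013-06-26/152627504374.shtml" target="_blank">原重庆大学校长林建华任浙江大学校长</a></li>. The URL has two characteristics:
1. It ends with .shtml;
2. The name right before the final .shtml is all digits, e.g. 152627504374 above.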

Solutions »

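(?i)<li>\s*?<a[^>]*?href=(['""]?)(?<Url>[^'""]*?/\d+?\.shtml)\1[^>]*?>(?<Text>[^<>]*?)</a>\s*?</li>
The whole match is in Value, the URL is in Groups["Url"], and the link text is in Groups["Text"]. (The doubled quotes are the escaping needed inside a C# verbatim string.)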
(?is)<li>\s*<a\s*href=(["']?)(?<href>[^"']*?/(?<num>\d+)\.shtml)\1[^>]*?>.*?</a>\s*</li>
string pattern=@"(?is)<li>\s*<a\s*href=([""']?)(?<href>[^""']*?/(?<num>\d+)\.shtml)\1[^>]*?>.*?</a>\s*</li>";
Then read Groups["href"].Value and Groups["num"].Value.
It doesn't match anything on http://news.sina.com.cn/china/.
@"(?<=href=")[^\s"]+?(?=")"
Does this fail to match?
Er, a correction: the one above is what Console.WriteLine printed. The one below is what the program code actually uses:
"(?<=href=\")[^\s\"]+?(?=\")"
Correcting it again...
           string ass ="(?<=href=\")[^\\s\"]+?(?=\")";
The one above doesn't match anything.
Regex detailrg = new Regex("(?<=href=\")[^\\s\"]+?(?=\")", RegexOptions.Multiline | RegexOptions.Singleline);
string a = detailrg.Match(htmldata).Value;
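One possible explanation, and it is only an assumption since the thread never confirms it: Regex.Match returns just the first hit, and on a full page the first href is usually a stylesheet or navigation link rather than a news article. A minimal sketch of the difference, using the same lookaround pattern (the class and method names HrefScan, FirstHref and AllHrefs are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class HrefScan
{
    // Same lookaround pattern as in the post above:
    // the text between href=" and the next double quote.
    const string HrefPattern = "(?<=href=\")[^\\s\"]+?(?=\")";

    // Regex.Match stops at the first hit, i.e. whichever href comes first in the page.
    public static string FirstHref(string html)
    {
        return Regex.Match(html, HrefPattern).Value;
    }

    // Regex.Matches keeps scanning and returns every hit, in document order.
    public static List<string> AllHrefs(string html)
    {
        return Regex.Matches(html, HrefPattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Calling HrefScan.AllHrefs(htmldata) would give every double-quoted href on the page. This pattern does not check for the digits or the .shtml ending, so the list still has to be narrowed down afterwards.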
What exactly do you want?
string htmldata = @"<li><a href=""http://news.sina.com.cn/c/2013-06-26/152627504374.shtml"" target=""_blank"">原重庆大学校长林建华任浙江大学校长</a></li>";
Regex detailrg = new Regex(@"(?i)<a[^>]*?href=(['""]?)(?<Url>[^'""]*?/(\d+?)\.shtml)\1[^>]*?>(?<Text>[^<>]*?)</a>", RegexOptions.Multiline | RegexOptions.Singleline);
Match _m = detailrg.Match(htmldata);
string url = _m.Groups["Url"].Value;   // http://news.sina.com.cn/c/2013-06-26/152627504374.shtml
string Num = _m.Groups[2].Value;       // 152627504374
string text = _m.Groups["Text"].Value; // 原重庆大学校长林建华任浙江大学校长
I want to match all the URLs on http://news.sina.com.cn/china/ ...
// Requires: using System.IO; using System.Linq; using System.Net; using System.Text.RegularExpressions;
string url = "http://news.sina.com.cn/china/"; // address of the page to fetch
WebRequest req = WebRequest.Create(url);
using (WebResponse res = req.GetResponse())                  // GetResponse blocks until the response arrives
using (Stream receiveStream = res.GetResponseStream())
using (StreamReader sr = new StreamReader(receiveStream))    // read the stream into a string
{
    string resultstring = sr.ReadToEnd();
    var list = Regex.Matches(resultstring, @"(?i)<a[^>]*?href=(['""]?)(?<Url>[^'""]*?/(\d+?)\.shtml)\1[^>]*?>(?<Text>[^<>]*?)</a>")
        .OfType<Match>()
        .Select(a => a.Groups["Url"].Value)
        .ToList();
    /*
      [0] "http://news.sina.com.cn/c/2013-06-26/150027504392.shtml"
      [1] "http://news.sina.com.cn/c/2013-06-26/150027504392.shtml"
      [2] "http://news.sina.com.cn/c/2013-06-26/180027505390.shtml"
      [3] "http://news.sina.com.cn/c/2013-06-27/023927507832.shtml"
      ...
    */
}
<li><a\s*.*href="(?<url>[^"]*/\d*\.shtml)".*?</a></li>
That works now, but how do I add one more condition? For example, also accept URLs that end in .html.
var list = Regex.Matches(resultstring, @"(?i)<a[^>]*?href=(['""]?)(?<Url>[^'""]*?/(\d+?)\.(shtml|html))\1[^>]*?>(?<Text>[^<>]*?)</a>").OfType<Match>().Select(a=>a.Groups["Url"].Value).ToList();
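For what it's worth, the alternation (shtml|html) can also be written as s?html, which accepts both endings without adding another capture group. The sketch below is only an illustration under a few assumptions: the class name NewsLinks and the method Extract are invented, the Text group is dropped because only the URL is collected, and the Distinct() call is an optional extra not requested in the thread.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class NewsLinks
{
    // Group 1 captures the opening quote (if any); \1 requires the same quote to close.
    // Url must end in "/<digits>.shtml" or "/<digits>.html".
    static readonly Regex LinkPattern = new Regex(
        @"<a[^>]*?href=(['""]?)(?<Url>[^'""]*?/\d+\.s?html)\1[^>]*?>",
        RegexOptions.IgnoreCase);

    // Returns each matching URL once, in the order it first appears.
    public static List<string> Extract(string html)
    {
        return LinkPattern.Matches(html)
                          .Cast<Match>()
                          .Select(m => m.Groups["Url"].Value)
                          .Distinct()
                          .ToList();
    }
}
```

NewsLinks.Extract(resultstring) would then take the place of the one-liner above. Remove Distinct() if repeated links should stay in the list.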
Why make it so complicated? In HTML, whatever sits between the two quotes after "href=" is the URL.
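That simpler idea could look roughly like the sketch below: grab every double-quoted href with a very loose pattern, then apply the "digits plus .shtml or .html" rule with ordinary string checks. The names SimpleHrefFilter, IsNumericPage and Extract are hypothetical, and the sketch assumes the URLs carry no query string or fragment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class SimpleHrefFilter
{
    // True when the last path segment is "<digits>.shtml" or "<digits>.html".
    public static bool IsNumericPage(string url)
    {
        string file = url.Substring(url.LastIndexOf('/') + 1); // text after the last '/'
        int dot = file.LastIndexOf('.');
        if (dot <= 0) return false;                            // no extension or empty name

        string name = file.Substring(0, dot);
        string ext = file.Substring(dot);
        bool extOk = ext.Equals(".shtml", StringComparison.OrdinalIgnoreCase)
                  || ext.Equals(".html", StringComparison.OrdinalIgnoreCase);
        return extOk && name.All(c => c >= '0' && c <= '9');
    }

    // Loose extraction: anything between href=" and the next double quote, then filter.
    public static List<string> Extract(string html)
    {
        return Regex.Matches(html, "href=\"([^\"]*)\"")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .Where(IsNumericPage)
                    .ToList();
    }
}
```

The trade-off is that the regex stays trivial and the rule lives in plain code, but single-quoted or unquoted href values are not picked up by this version.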