I have the following HTML code:
<html>
<head>
<title>
</title></head>
<body>
<form id="form1">
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 1</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 2</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 3</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 4</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 5</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 6</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 7</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 8</a>
<a href="http://www.165doc.com/view/asdfasd2ldfgpo.html">Link 9</a>
</form>
</body>
</html>
How can I extract all the links (for example http://www.165doc.com/view/asdfasd2ldfgpo.html) from this source???

Solutions »

  1.   

    // Verbatim string, so each " is written as ""; group 1 captures the URL.
    Regex re = new Regex(@"href\s*=\s*['""]?([^'""\s>]+)['""]?", RegexOptions.None);
      

  2.   

                string str = @"<html>
    <head>
    <title>
    </title></head>
    <body>
    <form id=""form1"">
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 1</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 2</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 3</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 4</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 5</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 6</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 7</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 8</a>
    <a href=""http://www.165doc.com/view/asdfasd2ldfgpo.html"">Link 9</a>
    </form>
    </body>
    </html>";
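                // Requires: using System.Text.RegularExpressions;
                // Group 1 is the optional opening quote (matched again by \1); group 2 is the URL.
                // '>' is excluded from group 2 so an unquoted href cannot run past the end of the tag.
                Regex reg = new Regex(@"(?is)<a[^>]*?href=(['""\s]?)([^'""\s>]+)\1[^>]*?>");
                foreach (Match m in reg.Matches(str))
                {
                    // Response.Write assumes an ASP.NET page; use Console.WriteLine in a console app.
                    Response.Write(m.Groups[2].Value + "<br/>");
                }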
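
If you want this outside of an ASP.NET page, the same idea can be wrapped in a small helper. This is only a minimal sketch: the class name `LinkExtractor`, the method `ExtractLinks` and the `distinct` parameter are made up for illustration and do not come from the answers above. It assumes the `href` values are quoted, as they are in the question, and it is still a regex, so it will not cope with every kind of malformed HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the thread): collects href values from <a> tags.
public static class LinkExtractor
{
    // Handles quoted href values only.
    // Group "q" is the opening quote; group "url" is everything up to the same quote.
    private static readonly Regex AnchorHref = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*(?<q>['""])(?<url>.*?)\k<q>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the URLs in document order; with distinct = true, repeats are dropped.
    public static List<string> ExtractLinks(string html, bool distinct = false)
    {
        var result = new List<string>();
        var seen = new HashSet<string>();
        foreach (Match m in AnchorHref.Matches(html))
        {
            string url = m.Groups["url"].Value;
            // seen.Add returns false when the URL was already recorded.
            if (!distinct || seen.Add(url))
                result.Add(url);
        }
        return result;
    }
}
```

Calling `LinkExtractor.ExtractLinks(html)` on the HTML from the question should give the same URL nine times, since all nine anchors point to the same page; passing `distinct: true` keeps it once.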