I fetched an HTML page that contains the following:
<html><table>
<tr><td>dsfasdfasdfasdfsadfsadfas</td></tr>
</table><table>
<tr><td>
<a href="detial.asp?id=83&type=嘿嘿">详细</a>
<a href="detial.asp?id=84&type=嘿嘿">详细</a>
<a href="detial.asp?id=85&type=嘿嘿">详细</a>
<a href="detial.asp?id=86&type=嘿嘿">详细</a> </td>
</tr>
</table></html>
I only need the "detial.asp?id=<number>&type=嘿嘿" values, added to a ListBox. How do I iterate over the page, and how should the regular expression be written?
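// Assumed pattern: the original reply did not show how reg was built.
Regex reg = new Regex(@"<a\s+href=(['""])(?<url>[^'""]+)\1", RegexOptions.IgnoreCase);
MatchCollection mc = reg.Matches(html); // html holds the page source
foreach (Match m in mc)
{
Console.WriteLine(m.Groups["url"].Value);
}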
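For reference, here is a complete console sketch of the named-group approach. The pattern and the names `HrefExtractor` and `ExtractUrls` are illustrative assumptions, not something given in the thread. The unnamed quote group is numbered 1, so `\1` makes the closing quote match the opening one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefExtractor
{
    // Captures the href value of each matching <a> tag into the named group "url".
    // Group 1 is the opening quote; \1 requires the same quote to close the value.
    private static readonly Regex LinkRegex = new Regex(
        @"<a\s[^>]*?href\s*=\s*(['""])(?<url>detial\.asp\?id=\d+[^'""]*)\1",
        RegexOptions.IgnoreCase);

    public static List<string> ExtractUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match m in LinkRegex.Matches(html))
        {
            urls.Add(m.Groups["url"].Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Shortened sample of the page from the question.
        string html = @"<td><a href=""detial.asp?id=83&type=嘿嘿"">详细</a>
<a href=""detial.asp?id=84&type=嘿嘿"">详细</a></td>";
        foreach (string url in ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

In a WinForms application, the loop body would add each value to the list box instead of writing to the console.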
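Assuming the page has been loaded in a WebBrowser control, you can collect the links like this:
HtmlElementCollection hec = this.webBrowser1.Document.GetElementsByTagName("a");
foreach (HtmlElement he in hec)
{
string href = he.GetAttribute("href");
// The control may return the resolved absolute URL, so test with IndexOf rather than ==.
if (href.IndexOf("detial.asp?id=", StringComparison.OrdinalIgnoreCase) >= 0)
{
ListBox1.Items.Add(href); // add the link itself, not the link text; no break, so every link is collected
}
}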
private static void TestRegex03()
{
string html = @"<html><table>
<tr><td>dsfasdfasdfasdfsadfsadfas</td></tr>
</table><table>
<tr><td>
<a href=""detial.asp?id=83&type=嘿嘿"">详细</a>
<a href=""detial.asp?id=84&type=嘿嘿"">详细</a>
<a href=""detial.asp?id=85&type=嘿嘿"">详细</a>
<a href=""detial.asp?id=86&type=嘿嘿"">详细</a> </td>
</tr>
</table></html>";
MatchCollection mc = Regex.Matches(html, @"(?<=<a\s*href=(['""]?))[^ '""]+?(?=\1>)");
foreach (Match m in mc)
{
// m.Value is the URL you want
Console.WriteLine(m.Value + " [position: " + m.Index.ToString() + "]");
}
}
To fill the ListBox instead, replace
Console.WriteLine(m.Value + " [position: " + m.Index.ToString() + "]");
with
ListBox1.Items.Add(m.Value);
<tr bgcolor="#FFFFFF" onMouseOver="this.style.backgroundColor='#999999'" onMouseOut="this.style.backgroundColor='#ffffff'" style="CURSOR: hand" title="--查看详细内容--" onClick="javascript:see('Detial.asp?id=36&type=嘿嘿')" >
</tr> <tr bgcolor="#FFFFFF" onMouseOver="this.style.backgroundColor='#999999'" onMouseOut="this.style.backgroundColor='#ffffff'" style="CURSOR: hand" title="--查看详细内容--" onClick="javascript:see('Detial.asp?id=26&type=嘿嘿')" >
</tr>
<tr bgcolor="#FFFFFF" onMouseOver="this.style.backgroundColor='#999999'" onMouseOut="this.style.backgroundColor='#ffffff'" style="CURSOR: hand" title="--查看详细内容--" onClick="javascript:see('Detial.asp?id=48&type=嘿嘿')" >
</tr>
<tr bgcolor="#FFFFFF" onMouseOver="this.style.backgroundColor='#999999'" onMouseOut="this.style.backgroundColor='#ffffff'" style="CURSOR: hand" title="--查看详细内容--" onClick="javascript:see('Detial.asp?id=40&type=嘿嘿')" >
</tr>
I only need the four Detial.asp?id=<number>&type=嘿嘿 values shown above.
MatchCollection mc = Regex.Matches(yourHtml,@"(?i)(?<=<tr.+?javascript:see\(')[^']+");
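As an alternative to the lookbehind, the argument of `see('...')` can be captured with a named group, which avoids scanning backwards from every candidate position. This is a minimal sketch; the names `OnClickExtractor` and `ExtractSeeTargets` are hypothetical, and the sample rows are shortened from the ones posted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OnClickExtractor
{
    public static List<string> ExtractSeeTargets(string html)
    {
        var targets = new List<string>();
        // Capture the single-quoted argument of each javascript:see('...') call.
        MatchCollection mc = Regex.Matches(
            html,
            @"javascript:see\('(?<url>[^']+)'\)",
            RegexOptions.IgnoreCase);
        foreach (Match m in mc)
        {
            targets.Add(m.Groups["url"].Value);
        }
        return targets;
    }

    public static void Main()
    {
        // Two rows in the same shape as the posted markup, with other attributes omitted.
        string html = @"<tr onClick=""javascript:see('Detial.asp?id=36&type=嘿嘿')"" >
</tr> <tr onClick=""javascript:see('Detial.asp?id=26&type=嘿嘿')"" >
</tr>";
        foreach (string target in ExtractSeeTargets(html))
        {
            Console.WriteLine(target);
        }
    }
}
```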
<tr><td>dsfasdfasdfasdfsadfsadfas</td></tr>
</table><table>
<tr><td>
<a href="detial.asp?id=83&type=嘿嘿">详细</a>
<a href="detial.asp?id=84&type=嘿嘿">详细</a>
<a href="detial.asp?id=85&type=嘿嘿">详细</a>
<a href="detial.asp?id=86&type=嘿嘿">详细</a> </td>
</tr>
</table></html>
I only need the addresses on the HTML page that look like these:
detial.asp?id=83&type=嘿嘿
detial.asp?id=84&type=嘿嘿
detial.asp?id=85&type=嘿嘿
Nothing else is needed.
MatchCollection mc = Regex.Matches(yourHtml, @"(?is)detial\.asp\?id=[^""']+");
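Since the thread shows the address both in `href` attributes and inside `onClick` handlers, and with different capitalization (`detial.asp` and `Detial.asp`), one case-insensitive pattern on the address itself can cover both. The sketch below assumes the address always ends at a quote, whitespace, or `>`; the names `DetailLinkFinder` and `FindAll` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DetailLinkFinder
{
    // Matches the address itself wherever it appears (href or onClick)
    // and stops at a quote, whitespace, or the end of the tag.
    private static readonly Regex DetailLink = new Regex(
        @"detial\.asp\?id=\d+[^'""\s>]*",
        RegexOptions.IgnoreCase);

    public static List<string> FindAll(string html)
    {
        var links = new List<string>();
        foreach (Match m in DetailLink.Matches(html))
        {
            links.Add(m.Value);
        }
        return links;
    }

    public static void Main()
    {
        // One link of each kind, shortened from the posted pages.
        string html = @"<a href=""detial.asp?id=83&type=嘿嘿"">详细</a>
<tr onClick=""javascript:see('Detial.asp?id=36&type=嘿嘿')""></tr>";
        foreach (string link in FindAll(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

In the WinForms case, each result would go into the list box with `ListBox1.Items.Add(link)` rather than to the console.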