<div class="Pages">
<span class="PageSel">1</span>
<a href="/search/category/2/10/g10r2734d1p2/g10r14r1471r2734" class="PageLink" title="2">2</a>
<a href="/search/category/2/10/g10r2734d1p3/g10r14r1471r2734" class="PageLink" title="3">3</a>
<a href="/search/category/2/10/g10r2734d1p4/g10r14r1471r2734" class="PageLink" title="4">4</a>
<a href="/search/category/2/10/g10r2734d1p5/g10r14r1471r2734" class="PageLink" title="5">5</a>
<a href="/search/category/2/10/g10r2734d1p6/g10r14r1471r2734" class="PageLink" title="6">6</a>
<a href="/search/category/2/10/g10r2734d1p7/g10r14r1471r2734" class="PageLink" title="7">7</a>
<a href="/search/category/2/10/g10r2734d1p8/g10r14r1471r2734" class="PageLink" title="8">8</a>
<a href="/search/category/2/10/g10r2734d1p9/g10r14r1471r2734" class="PageLink" title="9">9</a>
<span class="PageMore">...</span>
<a href="/search/category/2/10/g10r2734d1p11/g10r14r1471r2734" class="PageLink" title="11">11</a>
<a href="/search/category/2/10/g10r2734d1p2/g10r14r1471r2734" class="NextPage" title="下一页">下一页</a>
</div>
I need to match the link in each href and the text inside each a tag.
For example, given <a href="/search/category/2/10/g10r2734d1p2/g10r14r1471r2734" class="PageLink" title="2">2</a>
the match result should be: /search/category/2/10/g10r2734d1p2/g10r14r1471r2734 and 2
Thanks!
Solutions »
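// "reg" is the anchor-matching Regex; its definition is not shown in this reply.
MatchCollection mc = reg.Matches(yourStr);
foreach (Match m in mc)
{
    // Append the href value and the link text, one per line.
    richTextBox2.Text += m.Groups["url"].Value + "\n";
    richTextBox2.Text += m.Groups["text"].Value + "\n";
}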
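Since the snippet above does not show how the regex is built, here is a minimal self-contained sketch of the single-pass approach. It assumes the same anchor pattern that appears elsewhere in this thread; the class name `AnchorExtractor` and the method `Extract` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorExtractor
{
    // Assumed pattern: captures the href value as "url" and the inner text as "text".
    // \1 refers to the optional opening quote so that the closing quote must match it.
    private static readonly Regex AnchorRegex = new Regex(
        @"(?is)<a[^>]*?href=(['""]?)(?<url>[^'""\s>]+)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");

    // Returns one (url, text) pair per <a> element found in the given HTML.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorRegex.Matches(html))
        {
            results.Add(new KeyValuePair<string, string>(
                m.Groups["url"].Value,
                m.Groups["text"].Value.Trim()));
        }
        return results;
    }
}
```

Calling `AnchorExtractor.Extract` on the page source gives a list in which each Key is an href value and each Value is the link text. Note that this matches every a tag in the input, not only those inside the Pages div.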
In that case you need to match twice: first the Pages div, then the links inside it.
// Pass 1: isolate each <div class="Pages"> block.
Regex regDiv = new Regex(@"(?is)<div\s*class=""Pages""[^>]*>.*?</div>");
// Pass 2: capture the href ("url") and inner text ("text") of each <a> in that block.
Regex regA = new Regex(@"(?is)<a[^>]*?href=(['""]?)(?<url>[^'""\s>]+)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");
MatchCollection mcDiv = regDiv.Matches(yourStr);
foreach (Match mDiv in mcDiv)
{
    MatchCollection mcA = regA.Matches(mDiv.Value);
    foreach (Match mA in mcA)
    {
        richTextBox2.Text += mA.Groups["url"].Value + "\n";
        richTextBox2.Text += mA.Groups["text"].Value + "\n";
    }
}
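I would like to see whether I can find just the last page number and its URL, that is, only the last link:
<a href="/search/category/2/10/g10r2734d1p11/g10r14r1471r2734" class="PageLink" title="11">11 </a>
Then I could generate the earlier links in a loop based on that last one. Would that be faster?
I don't know whether there is a good way to do this.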
This is the markup when there are no more than 10 pages:
<div class="Pages">
<span class="PageSel">1</span>
<a href="/search/category/2/10/g10r2783d1p2/g10r16r1482r2783" class="PageLink" title="2">2</a>
<a href="/search/category/2/10/g10r2783d1p3/g10r16r1482r2783" class="PageLink" title="3">3</a>
<a href="/search/category/2/10/g10r2783d1p4/g10r16r1482r2783" class="PageLink" title="4">4</a>
<a href="/search/category/2/10/g10r2783d1p5/g10r16r1482r2783" class="PageLink" title="5">5</a>
<a href="/search/category/2/10/g10r2783d1p6/g10r16r1482r2783" class="PageLink" title="6">6</a>
<a href="/search/category/2/10/g10r2783d1p2/g10r16r1482r2783" class="NextPage" title="下一页">下一页</a>
</div>
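// Reuses regDiv from the earlier reply; "html" holds the page source.
Regex regA = new Regex(@"(?is)<a[^>]*?href=(['""]?)(?<url>[^'""\s>]+)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");
MatchCollection mcDiv = regDiv.Matches(html);
foreach (Match mDiv in mcDiv)
{
    MatchCollection mcA = regA.Matches(mDiv.Value);
    // Assumes the last <a> is the NextPage link, so the last page number is second to last.
    if (mcA.Count < 2) continue;
    Match m = mcA[mcA.Count - 2];
    string url = m.Groups["url"].Value;
    string text = m.Groups["text"].Value.Trim();
}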
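Once the last page link is known, the earlier page URLs could be generated rather than matched. The following is only a sketch under an assumption read off the sample markup: the page number is the run of digits after "d1p" in the URL (p2, p3, ... p11). The names `PageUrlBuilder` and `BuildAll` are hypothetical. Page 1 is skipped because the sample shows it only as a span without a link, so its URL form is not known here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PageUrlBuilder
{
    // Assumption: the page number is the digits between "d1p" and the next "/",
    // as in ".../g10r2734d1p11/g10r14r1471r2734".
    private static readonly Regex PageSegment = new Regex(@"(?<=d1p)\d+(?=/)");

    // Builds the URLs for pages 2..last by replacing the page number in lastPageUrl.
    // Returns an empty list if the URL does not follow the assumed pattern.
    public static List<string> BuildAll(string lastPageUrl)
    {
        var urls = new List<string>();
        Match m = PageSegment.Match(lastPageUrl);
        if (!m.Success) return urls;

        int last = int.Parse(m.Value);
        string prefix = lastPageUrl.Substring(0, m.Index);
        string suffix = lastPageUrl.Substring(m.Index + m.Length);
        for (int page = 2; page <= last; page++)
        {
            urls.Add(prefix + page + suffix);
        }
        return urls;
    }
}
```

Whether this is noticeably faster than matching every link is not established in this thread; the Pages block is small, so the main gain is probably simpler handling of the "..." gap, where intermediate page links are not present in the markup at all.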