Two questions about regular expressions

Why use a regex at all? Wouldn't innerHTML be simpler?

<head>(.*?)</head><body>(.*?)</body>

    // Needs: using System.Text; using System.Text.RegularExpressions; using System.Windows.Forms;
    string inputString = "<html>IIII<head><title>塞北的雪</title></head>UUUU<body><table><tr><td><a class='m' href='http://www.csdn.net'>CSDN</a></td><td><a class='m' href='http://blog.csdn.net/precipitant'>塞北的雪</a></td><td><a class='m' href='http://blog.csdn.net/net_lover'>好人</a></td></tr></table>我市一个好人,你是不是好人呢?</body></html>";
    StringBuilder sb = new StringBuilder();
    Regex reg = null;
    Match mch = null;
    // Group 1 captures the contents of <head>, group 2 the contents of <body>.
    reg = new Regex(@"<\s*?head\s*?>(.*?)</head\s*?>.*?<\s*?body\s*?>(.*?)</body.*?>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    for (mch = reg.Match(inputString); mch.Success; mch = mch.NextMatch())
    {
        sb.AppendLine("head:" + mch.Groups[1]);
        sb.AppendLine("body:" + mch.Groups[2]);
    }
    MessageBox.Show(sb.ToString());

    // Same idea, but the head and body tags may carry attributes.
    string inputString = "<html>IIII<head style=''><title>塞北的雪</title></head>UUUU<body style=''ssssss><table><tr><td><a class='m' href='http://www.csdn.net'>CSDN</a></td><td><a class='m' href='http://blog.csdn.net/precipitant'>塞北的雪</a></td><td><a class='m' href='http://blog.csdn.net/net_lover'>好人</a></td></tr></table>我市一个好人,你是不是好人呢?</body></html>";
    StringBuilder sb = new StringBuilder();
    Regex reg = null;
    Match mch = null;
    // Groups 1 and 3 capture the attributes of head and body; groups 2 and 4 capture their contents.
    reg = new Regex(@"<\s*?head(\s+.*?)?>(.*?)</head\s*?>.*?<\s*?body(\s+.*?)?>(.*?)</body.*?>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    for (mch = reg.Match(inputString); mch.Success; mch = mch.NextMatch())
    {
        sb.AppendLine("attribute of head:" + mch.Groups[1]);
        sb.AppendLine("attribute of body:" + mch.Groups[3]);
        sb.AppendLine("head:" + mch.Groups[2]);
        sb.AppendLine("body:" + mch.Groups[4]);
    }
    MessageBox.Show(sb.ToString());

My HTML is not written on a single line, so nothing matches. The source may look like this:

    <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf8" />
    <title>AST</title>
    </head>
    <body>
    <div style="background-color:#8796BF">
    <img src="logo.gif" alt="ACCESS CHINA TEST SITE" width="212" height="37" />
    </div>
    <h2 style="font-family:verdana">Welcome to Access Test Repository</h2>
    <h5><a href="readme.htm">Read Me</a></h5>
    <hr />
    <ul>
    <li>
    <a href="Browser/index.htm">Browser</a></li>
    <li>
    <a href="email/index.htm">Email</a></li>
    <li>
    <a href="MMS/index.htm">MMS</a>
    </li>
    <li>
    <a href="Multimedia/index.htm">Multimedia</a></li>
    <li>
    <a href="DocViewer/index.htm">Doc Viewer</a></li>
    <li>
    <a href="RSS/index.htm">RSS</a></li>
    <li>
    <a href="SyncML/index.htm">SyncML</a></li>
    </ul>
    <hr />
    <p>
    <small>Copyright 2005-2006 AccessChina Corp. All Rights Reserved Nanjing QA Dept. </small>
    </p>
    </body>
    </html>

In other words, on some pages the head or body tag is followed by a line break, and on other pages it is not.

Comments, attributes, and content all have to be taken into account, which makes this quite a hassle. The simplest approach is to replace all comments, replace every HTML tag other than head and body (that is, replace the attributes), and replace all the JS. Then grab head and body with a greedy regex, and finally put back everything you replaced. There are plenty of earlier threads on this kind of problem.

It looks like I don't have much say here, ha, since my rank is the lowest. I'll chime in anyway, in case it is useful. You have already located the content inside <head> </head> and <body> </body>, so why not write some XML to pull the values out? As I recall, you would use <call-param name="itemXPath">//head/(the content you want goes here)</call-param> and <call-param name="itemXPath">//table/body/(same idea)</call-param>. Not sure whether that helps.
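For what it's worth, here is a rough sketch of the XPath idea in C#, assuming the page is well-formed XHTML like the sample above (ordinary tag-soup HTML will make the XML parser throw). The class and method names are made up for illustration, and the local-name() XPath is my own choice for sidestepping the xmlns declaration, not something taken from the reply.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XhtmlParts
{
    // Hypothetical helper: returns the inner markup of <head> and <body>.
    public static (string Head, string Body) Extract(string xhtml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE instead of fetching the DTD
            XmlResolver = null
        };

        var doc = new XmlDocument();
        doc.XmlResolver = null;
        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            doc.Load(reader);
        }

        // local-name() ignores the namespace, so this works with or without xmlns on <html>.
        XmlNode head = doc.SelectSingleNode("//*[local-name()='head']");
        XmlNode body = doc.SelectSingleNode("//*[local-name()='body']");
        return (head?.InnerXml ?? string.Empty, body?.InnerXml ?? string.Empty);
    }
}
```

Two caveats: with DTD processing ignored, named entities such as &nbsp; are not defined and will fail to parse, and when the document declares a default namespace the returned child markup may carry an xmlns attribute.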
If you go with a regex, just follow M2's reply in http://topic.csdn.net/t/20061108/11/5141765.html

    private void button3_Click(object sender, EventArgs e)
    {
        // Swap line breaks for a placeholder so that "." can match across the original lines.
        string inputString = this.textBox1.Text.Trim().Replace("\r\n", "#@$");
        StringBuilder sb = new StringBuilder();
        Regex reg = null;
        Match mch = null;
        reg = new Regex(@"<\s*?head(\s+.*?)?>(.*?)</head\s*?>.*?<\s*?body(\s+.*?)?>(.*?)</body.*?>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
        for (mch = reg.Match(inputString); mch.Success; mch = mch.NextMatch())
        {
            sb.AppendLine("attribute of head:" + mch.Groups[1]);
            sb.AppendLine("attribute of body:" + mch.Groups[3]);
            sb.AppendLine("head:" + mch.Groups[2]);
            sb.AppendLine("body:" + mch.Groups[4]);
        }
        // Put the original line breaks back before showing the result.
        MessageBox.Show(sb.ToString().Replace("#@$", "\r\n"));
    }
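An alternative to swapping line breaks for a placeholder, offered only as a sketch: RegexOptions.Singleline makes "." match newline characters too, so the same pattern from this thread can run directly on multi-line input. The HeadBodyRegex class and TryExtract method are names I made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeadBodyRegex
{
    // Same pattern as in the thread; Singleline lets "." cross line breaks.
    private static readonly Regex Pattern = new Regex(
        @"<\s*?head(\s+.*?)?>(.*?)</head\s*?>.*?<\s*?body(\s+.*?)?>(.*?)</body.*?>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: group 2 is the head content, group 4 the body content.
    public static bool TryExtract(string html, out string head, out string body)
    {
        Match m = Pattern.Match(html);
        head = m.Success ? m.Groups[2].Value : null;
        body = m.Success ? m.Groups[4].Value : null;
        return m.Success;
    }
}
```

The captured text keeps its original line breaks, so nothing has to be restored afterwards. It is still a regex over HTML, so comments or scripts that happen to contain head or body tags can confuse it, as noted earlier in the thread.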