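I want to parse some HTML. From the code string below, I need to get back the string “鲜花村现正为视障儿童招募义工,欢迎大家参与,谢谢”, i.e. the content of “我的公告栏” (my bulletin board). How should I write the regular expression?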
Thanks!
<div class="title">我的公告栏</div></td></tr><tr id="callboardBody46f37fb505000a0r"><td><table border="0" cellpadding="0" cellspacing="0"><tr><td class="mid"><table class="sysBr180" align="center" border="0" cellpadding="0" cellspacing="0"><tr><td><div align="center">公告</div><div><DIV><FONT FACE="黑体" COLOR="#660000"><FONT SIZE="3">鲜花村现正为视障儿童招募义工,欢迎大家参与,谢谢</FONT>
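// str holds the HTML source; reg is built from the pattern given below
Regex reg = new Regex(@"(?<=<FONT SIZE=""3"">).*(?=</FONT>)");
string res = "";
foreach (Match m in reg.Matches(str))
{
    res += m.Value + " ";
}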
Regex: (?<=<FONT SIZE="3">).*(?=</FONT>)
That is, if you don't post more of the code.
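For anyone who wants to try the lookaround pattern on its own, here is a minimal sketch, assuming the HTML is already in a string. The names `NoticeExtractor` and `ExtractNotice` are made up for illustration, and the closing `</FONT></DIV>` in the sample is my assumption, since the posted snippet is cut off. It differs from the pattern above in two ways: it uses the lazy `.*?` so the match stops at the first `</FONT>` rather than the last one on the line, and it ignores case in the tag name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NoticeExtractor
{
    // The lookbehind and lookahead keep the surrounding tags out of the match.
    // ".*?" is lazy, so the match ends at the first closing </FONT>.
    private static readonly Regex NoticeRegex = new Regex(
        "(?<=<FONT SIZE=\"3\">).*?(?=</FONT>)",
        RegexOptions.IgnoreCase);

    // Returns the text inside <FONT SIZE="3">...</FONT>, or null if there is none.
    public static string ExtractNotice(string html)
    {
        Match m = NoticeRegex.Match(html);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Shortened sample; the trailing </FONT></DIV> is assumed, not from the post.
        string html = "<div align=\"center\">公告</div><div><DIV><FONT FACE=\"黑体\" COLOR=\"#660000\">"
                    + "<FONT SIZE=\"3\">鲜花村现正为视障儿童招募义工,欢迎大家参与,谢谢</FONT></FONT></DIV>";
        Console.WriteLine(ExtractNotice(html));
    }
}
```

Note that `.` does not match a newline by default, so this only works while the notice text sits on a single line.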
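Then what about the following snippet:
<a target="_blank" href="/u/46f37fb5010005gk" title=" 围脖儿原来是一只超级小懒猫,熟悉了环境之后,除了吃就是睡,并且已经知道自己跳到沙">日记 [2006年10月08日]</a>
It should return “围脖儿原来是一只超级小懒猫,熟悉了环境之后,除了吃就是睡,并且已经知道自己跳到沙”.
The ">日记 [date]</a> part is fixed. As for the front, let's just say it is cut off at the ; character.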
title=".+"
or
title=".+">日记\s\[\d{4}年\d{2}月\d{2}日\]
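One thing to watch with both patterns: `.+` is greedy, so if a line contains several attributes or links it can run on to the last `"` instead of the first. Below is a rough sketch of a variant, under the assumption that the title itself never contains a double quote. It captures the title in a group with `[^"]*` and trims the leading space seen in the sample. `DiaryTitleExtractor` and `ExtractTitles` are hypothetical names, not anything from the posts above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DiaryTitleExtractor
{
    // Group 1 is the title text. [^"]* cannot cross the closing quote,
    // so each match stays inside one title attribute.
    private static readonly Regex TitleRegex = new Regex(
        @"title=""([^""]*)"">日记\s\[\d{4}年\d{2}月\d{2}日\]");

    // Returns the trimmed title of every diary link found in the HTML.
    public static List<string> ExtractTitles(string html)
    {
        var titles = new List<string>();
        foreach (Match m in TitleRegex.Matches(html))
        {
            titles.Add(m.Groups[1].Value.Trim());
        }
        return titles;
    }

    public static void Main()
    {
        string html = "<a target=\"_blank\" href=\"/u/46f37fb5010005gk\" "
                    + "title=\" 围脖儿原来是一只超级小懒猫\">日记 [2006年10月08日]</a>";
        foreach (string title in ExtractTitles(html))
        {
            Console.WriteLine(title);
        }
    }
}
```

Reading `m.Groups[1]` instead of `m.Value` is what drops the `title="` prefix and the date suffix from the result.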
using System;
using System.IO;
using System.Net;

static void Main()
{
    // Request the blog page
    WebRequest request = WebRequest.Create("http://blog.sina.com.cn/u/1190363061");
    request.Credentials = CredentialCache.DefaultCredentials;
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine(response.StatusDescription);
    // Read the whole response body into a string
    Stream dataStream = response.GetResponseStream();
    StreamReader reader = new StreamReader(dataStream);
    string responseFromServer = reader.ReadToEnd();
    /* use a regular expression here to extract the page content */
    //Console.WriteLine(responseFromServer); // printing the result to the console is enough
    reader.Close();
    dataStream.Close();
    response.Close();
}
Could you write it out more completely? I'll award the points as soon as it runs.
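A possible complete version, as a sketch only: it combines the download code from the question with the lookaround pattern suggested earlier in the thread. `BlogNoticeFetcher`, `Download` and `ExtractNotice` are names I made up. It assumes the page is served as UTF-8 (the `StreamReader` default); if the site uses another encoding, pass it to the `StreamReader` constructor. On recent .NET versions `WebRequest.Create` is marked obsolete and produces a compiler warning.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class BlogNoticeFetcher
{
    // Pulls the text between <FONT SIZE="3"> and the first </FONT>.
    // Returns an empty string when the page has no such element.
    public static string ExtractNotice(string html)
    {
        Match m = Regex.Match(
            html,
            "(?<=<FONT SIZE=\"3\">).*?(?=</FONT>)",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Value : "";
    }

    // Downloads the page body as a string; "using" closes everything for us.
    public static string Download(string url)
    {
        WebRequest request = WebRequest.Create(url);
        request.Credentials = CredentialCache.DefaultCredentials;
        using (WebResponse response = request.GetResponse())
        using (Stream dataStream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(dataStream))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        try
        {
            string html = Download("http://blog.sina.com.cn/u/1190363061");
            Console.WriteLine(ExtractNotice(html));
        }
        catch (WebException ex)
        {
            // Network problems end up here instead of crashing the program.
            Console.WriteLine("Request failed: " + ex.Message);
        }
    }
}
```

Keeping the regex in its own method means it can be tried against a saved copy of the page without touching the network.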
{
HtmlWindowCollection hwc = webBrowser1.Document.Window.Frames;
HtmlDocument hd = null;
HTMLDocumentClass hdc = null;
if (hwc != null && hwc.Count == 2)
{
hd = hwc[1].Document;
hdc = (HTMLDocumentClass)hd.DomDocument;
}
else
return;
// Debug.WriteLine(hd.Cookie);
//click the first link
HtmlElementCollection hec = hd.Links;
if (hec.Count >0)
{
for (int i = 0; i < hec.Count; i++)
{
HTMLAnchorElement hae = (HTMLAnchorElement)hec[i].DomElement;
if (hae.outerHTML.Contains("makeResZipRemote"))
{
/*
//"javascript:makeResZipRemote('thin.100406.180841.837(5)_38.txt','thin.100406.180841.837(5)','39219')"
string href = hae.href;
//"('thin.100406.180841.837(5)_38.txt','thin.100406.180841.837(5)','39219')"
href = href.Substring(
href.IndexOf("("));
String delim = "()";
href = href.Trim(delim.ToCharArray());
string[] parameters = href.Split(",".ToCharArray());
delim = "'";
string resdata=parameters[0].Trim(delim .ToCharArray());
string datafile=parameters[1].Trim(delim .ToCharArray());
string eidx=parameters[2].Trim(delim .ToCharArray());
//https://www.thineditest.com/online_core/common/displaySpokeFiles.asp
//function makeResZipRemote(resdata, datafile, eidx)
//{
// document.location.href='../results/standard/ResultsMain.asp?build=true&error_file=' + escape(resdata) + '&data_file=' + escape(datafile) + '&color_file=results%5Fcolors%5Ffsdeflt88%2Ecss' + '&eidx=' + eidx
//}
string DownloadFrameURL = string.Format(
"https://www.thineditest.com/online_core/results/standard/ResultsMain.asp?build=true&error_file={0}&data_file={1}&color_file=results_colors_fsdeflt88.css&eidx={2}"
, Microsoft.JScript.GlobalObject.escape(resdata)
, Microsoft.JScript.GlobalObject.escape(datafile)
, Microsoft.JScript.GlobalObject.escape(eidx));
*/
hae.click();
break;
}
}
}
}
[\u4E00-\u9FA5]*$
using System.Text.RegularExpressions;

Regex reg = new Regex(@"[\u4E00-\u9FA5]*$");
string res = "";
foreach (Match m in reg.Matches(str))
{
    res += m.Value + " ";
}
// string.Replace returns a new string, so the result has to be assigned back
res = res.Replace("我的公告栏", "");
res = res.Replace("公告", "");
res = res.Replace("黑体", "");
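Two caveats about this approach, with a sketch to show them. First, the trailing `$` ties the match to the end of the input, so on a string that ends in `</FONT>` the pattern can only produce an empty match; dropping `$` and using `+` picks up runs of Chinese characters anywhere. Second, punctuation such as the commas in the notice lies outside `\u4E00-\u9FA5`, so the sentence comes back as several separate runs. The names `ChineseRunExtractor` and `ExtractRuns` are hypothetical, and skipping whole runs is my substitute for the `Replace` calls above, not the same operation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ChineseRunExtractor
{
    // One or more consecutive CJK ideographs. There is no "$" here,
    // so runs anywhere in the input can match.
    private static readonly Regex HanRun = new Regex(@"[\u4E00-\u9FA5]+");

    // Returns every run of Chinese characters, except runs that are
    // exactly equal to one of the unwanted strings.
    public static List<string> ExtractRuns(string text, params string[] unwanted)
    {
        var skip = new HashSet<string>(unwanted);
        var runs = new List<string>();
        foreach (Match m in HanRun.Matches(text))
        {
            if (!skip.Contains(m.Value))
            {
                runs.Add(m.Value);
            }
        }
        return runs;
    }

    public static void Main()
    {
        // Shortened version of the HTML from the question.
        string html = "<div class=\"title\">我的公告栏</div><div align=\"center\">公告</div>"
                    + "<FONT FACE=\"黑体\"><FONT SIZE=\"3\">鲜花村现正为视障儿童招募义工,欢迎大家参与,谢谢</FONT>";
        List<string> runs = ExtractRuns(html, "我的公告栏", "公告", "黑体");
        // The notice is printed as separate runs because the commas are not matched.
        Console.WriteLine(string.Join(" ", runs));
    }
}
```

If the goal is the notice as one piece, anchoring on the surrounding tags is probably more dependable than filtering by character range.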
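I never "asked" you to cut it off, either; I'm just a beginner. Thanks all the same, even though you didn't really give me any pointers.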