The result is the same except for some missing newlines. Also, why (?<text>.[^\x01]*?)? Can you try:
string pattern = @"<table width=\""100%\"" border=\""0\"" cellspacing=\""8\"" cellpadding=\""8\"">(?<text>.*?)<TD COLSPAN=10><BR>";
Regex re = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
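To make the comparison concrete, here is a minimal sketch that runs both capture groups against a small inline string. The class name SinglelineDemo, the Capture helper, and the sample input are made up for illustration and are not taken from the real page; the quotes are written as "" rather than \"", which matches the same text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical sample standing in for the downloaded page (CRLF line endings).
    public const string Sample =
        "<table width=\"100%\" border=\"0\" cellspacing=\"8\" cellpadding=\"8\">\r\nline one\r\nline two\r\n<TD COLSPAN=10><BR>";

    const string Prefix = @"<table width=""100%"" border=""0"" cellspacing=""8"" cellpadding=""8"">";
    const string Suffix = @"<TD COLSPAN=10><BR>";

    // Builds prefix + body + suffix and returns the "text" group, or null if nothing matched.
    public static string Capture(string body, RegexOptions options, string input)
    {
        Regex re = new Regex(Prefix + body + Suffix, options);
        Match m = re.Match(input);
        return m.Success ? m.Groups["text"].Value : null;
    }

    public static void Main()
    {
        // Original group: one "." followed by a lazy run of anything except \x01.
        string a = Capture(@"(?<text>.[^\x01]*?)", RegexOptions.IgnoreCase, Sample);
        // Suggested group: ".*?" with Singleline, so "." also matches line feeds.
        string b = Capture(@"(?<text>.*?)", RegexOptions.Singleline | RegexOptions.IgnoreCase, Sample);
        Console.WriteLine(a == b);
    }
}
```

One difference worth keeping in mind: without RegexOptions.Singleline, the leading "." in the original group cannot match a line feed (\n) and must consume at least one character, so that pattern only matches when the character right after the table tag is something other than \n. The Singleline version has no such restriction.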
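public static string GetRemoteData(string Url)
{
    // Dispose the client, stream, and reader once the page has been read.
    using (WebClient client = new WebClient())
    using (Stream stream = client.OpenRead(Url))
    using (StreamReader sr = new StreamReader(stream, System.Text.Encoding.GetEncoding("GB2312")))
    {
        return sr.ReadToEnd();
    }
}
To download a file, I use this: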
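public static void GetRemoteFile(string Url, string LocalFile)
{
    WebClient client = new WebClient();
    // Map the virtual path to a physical path on the server before saving.
    client.DownloadFile(Url, System.Web.HttpContext.Current.Server.MapPath(LocalFile));
}
So far I have only seen the problem on the page http://www.xicn.net/joke/special/gxsy.php?page=1; other sites all work fine.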
<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System.Net" %>
<script language="C#" runat="server">
void Page_Load(Object sender, EventArgs e)
{
    string Url = "http://www.xicn.net/joke/special/gxsy.php?page=1";
    WebClient client = new WebClient();
    Stream stream = client.OpenRead(Url);
    StreamReader sr = new StreamReader(stream, System.Text.Encoding.GetEncoding("GB2312"));
    // Write the raw downloaded content to the response so it can be inspected.
    Response.Write(sr.ReadToEnd());
    sr.Close();
}
</script>
Check whether the content really is what you expect.
pattern = @" <table width=\""100%\"" border=\""0\"" cellspacing=\""8\"" cellpadding=\""8\"">(?<text>.[^\x01]*?)<TD COLSPAN=10><BR>";
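The above is my matching rule. You can download the content with the program and save it to a file, then manually copy the content and save it to a second file (note: it must be a newly created file), and then check whether the two give the same match result.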
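The comparison described above could be sketched roughly like this. MatchComparer and its methods are hypothetical helper names, and the pattern and the two inline strings in Main are only placeholders; in practice the two inputs would be the contents of the downloaded file and the manually saved file, read with the same GB2312 encoding.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchComparer
{
    // Returns the "text" group of the first match, or null if the pattern does not match.
    public static string Capture(Regex re, string content)
    {
        Match m = re.Match(content);
        return m.Success ? m.Groups["text"].Value : null;
    }

    // Index of the first differing character, or -1 if the strings are equal.
    public static int FirstDifference(string a, string b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : n;
    }

    // Summarizes how the captures from the two copies compare.
    public static string Describe(Regex re, string downloaded, string manual)
    {
        string x = Capture(re, downloaded);
        string y = Capture(re, manual);
        if (x == null || y == null)
            return "no match: downloaded=" + (x != null) + ", manual=" + (y != null);
        int i = FirstDifference(x, y);
        return i < 0 ? "identical" : "first difference at index " + i;
    }

    public static void Main()
    {
        // Placeholder pattern and inputs; the second copy has LF instead of CRLF.
        Regex re = new Regex(@"<t>(?<text>.*?)</t>", RegexOptions.Singleline | RegexOptions.IgnoreCase);
        Console.WriteLine(Describe(re, "<t>a\r\nb</t>", "<t>a\nb</t>"));
    }
}
```

Reporting the index of the first difference, rather than only equal or not equal, makes it easier to see whether the two copies differ only in line endings or in the actual content.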