A very strange WebClient problem

How are you calling it? Are you converting the byte[] into a string with System.Text.Encoding.GetEncoding("GB2312")?


This is what I use to fetch the content:

    public static string GetRemoteData(string Url)
    {
        WebClient client = new WebClient();
        Stream stream = client.OpenRead(Url);
        StreamReader sr = new StreamReader(stream, System.Text.Encoding.GetEncoding("GB2312"));
        return sr.ReadToEnd();
    }

And this is what I use to download a file:

    public static void GetRemoteFile(string Url, string LocalFile)
    {
        WebClient client = new WebClient();
        client.DownloadFile(Url, System.Web.HttpContext.Current.Server.MapPath(LocalFile));
    }

So far I have only seen the problem with the page http://www.xicn.net/joke/special/gxsy.php?page=1; every other site works fine.
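
As a side note to the fetch code above, here is a minimal sketch of the same read with the reader, stream, and client disposed deterministically. The class name RemoteText and the helper ReadAll are hypothetical names introduced only for illustration, and passing the encoding as a parameter is an assumption rather than something from the original post.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class RemoteText
{
    // Hypothetical helper: decode an entire stream with the given encoding.
    // Disposing the reader also closes the underlying stream.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Same idea as GetRemoteData above, but the client and the response
    // stream are released even if reading throws.
    public static string GetRemoteData(string url, Encoding encoding)
    {
        using (WebClient client = new WebClient())
        using (Stream stream = client.OpenRead(url))
        {
            return ReadAll(stream, encoding);
        }
    }
}
```

A call would look like RemoteText.GetRemoteData(url, Encoding.GetEncoding("GB2312")). On newer .NET runtimes the GB2312 code page may need to be registered first, so treat that call as an assumption about the environment.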

What if you use a page like this:

    <%@ Import Namespace="System.IO" %>
    <%@ Import Namespace="System.Net" %>
    <script language="C#" runat="server">
    void Page_Load(Object sender, EventArgs e)
    {
        string Url = "http://www.xicn.net/joke/special/gxsy.php?page=1";
        WebClient client = new WebClient();
        Stream stream = client.OpenRead(Url);
        StreamReader sr = new StreamReader(stream, System.Text.Encoding.GetEncoding("GB2312"));
        Response.Write(sr.ReadToEnd());
        sr.Close();
    }
    </script>

and check whether the content really is what you expect?
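
The content is indeed the same as what I captured. Try using a regular expression to match the image list section in the middle:

    pattern = @"      <table width=\""100%\"" border=\""0\"" cellspacing=\""8\"" cellpadding=\""8\"">(?<text>.[^\x01]*?)<TD COLSPAN=10><BR>";

That is my matching pattern. Download the content with the program and save it to a file, then copy the content by hand and save it to a second file (note: it has to be a newly created file), and check whether the two give the same match result.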
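
The file comparison described above could be sketched roughly as follows. ContentComparer and its methods are hypothetical names, and treating CRLF, CR, and LF as equivalent is an assumption about which differences are worth ignoring; nothing here comes from the original poster's code.

```csharp
using System.IO;
using System.Text;

public static class ContentComparer
{
    // Collapse Windows (CRLF) and old Mac (CR) line endings to LF.
    public static string NormalizeNewlines(string s)
    {
        return s.Replace("\r\n", "\n").Replace("\r", "\n");
    }

    // True when the two strings differ at most in line-ending style.
    public static bool SameIgnoringNewlineStyle(string a, string b)
    {
        return NormalizeNewlines(a) == NormalizeNewlines(b);
    }

    // Compare the downloaded file with the hand-copied one,
    // decoding both with the same encoding.
    public static bool FilesMatch(string pathA, string pathB, Encoding encoding)
    {
        string a = File.ReadAllText(pathA, encoding);
        string b = File.ReadAllText(pathB, encoding);
        return SameIgnoringNewlineStyle(a, b);
    }
}
```

If FilesMatch returns true while the regex still matches only one of the files, the pattern is probably sensitive to line endings or surrounding whitespace rather than to the actual content.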

The result is the same except for some missing newlines. Also, why (?<text>.[^\x01]*?)? Can you try this instead:

    string pattern = @"<table width=\""100%\"" border=\""0\"" cellspacing=\""8\"" cellpadding=\""8\"">(?<text>.*?)<TD COLSPAN=10><BR>";
    Regex re = new Regex(pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
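
To show how the suggested pattern behaves, here is a small sketch that wraps it in a method and reads the named group. With RegexOptions.Singleline the dot also matches newline characters, so .*? can span several lines, and RegexOptions.IgnoreCase lets the upper-case tags in the pattern match lower-case markup. TableExtractor and ExtractText are hypothetical names used only for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class TableExtractor
{
    // Same pattern as in the reply above. In a verbatim string, "" is one
    // quote character, so \"" becomes the regex escape \" (a literal quote).
    private const string Pattern =
        @"<table width=\""100%\"" border=\""0\"" cellspacing=\""8\"" cellpadding=\""8\"">(?<text>.*?)<TD COLSPAN=10><BR>";

    // Returns the text captured between the opening table tag and the
    // closing marker, or null when the page does not match.
    public static string ExtractText(string html)
    {
        Regex re = new Regex(Pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);
        Match m = re.Match(html);
        return m.Success ? m.Groups["text"].Value : null;
    }
}
```

Calling TableExtractor.ExtractText on the downloaded page text should then return the same block whether the lines end in CRLF or LF, because the captured group no longer depends on where the newlines fall.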