My method getPageContent():

public String getPageContent(String urlStr)
{
    URL url;
    StringBuffer srb = new StringBuffer();
    String s = "";
    try {
        url = new URL(urlStr);
        BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
        while ((s = br.readLine()) != null)
        {
            srb.append(s);
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e)
    {
        e.printStackTrace();
    }
    return srb.toString();
}

The page I am fetching is http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=296-16793-6-ND

I call getPageContent("http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=296-16793-6-ND"); to get the page content. However, the Chinese characters are missing from the fetched content, which is very strange. Can anyone point me in the right direction?
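Solution:
BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
Passing "UTF-8" makes InputStreamReader decode the bytes it reads as UTF-8 instead of the platform default charset.
OK, solved.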
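
For anyone who wants the whole method with the fix applied, here is a minimal sketch. It assumes the page really is served as UTF-8 (if the server uses another encoding, that charset would have to be passed instead). The class name PageFetcher and the helper readAll are made up for illustration and are not part of the original code; the sketch also closes the reader with try-with-resources and lets IOException propagate rather than printing the stack trace.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
class PageFetcher {

    // Hypothetical helper: reads every line from the stream, decoding the
    // bytes with the given charset instead of the platform default.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream).
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() strips the line terminator, so add one back.
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Same idea as the original getPageContent(), with an explicit charset.
    // Assumption: the target page is encoded as UTF-8.
    static String getPageContent(String urlStr) throws IOException {
        return readAll(URI.create(urlStr).toURL().openStream(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Offline check: UTF-8 bytes for two Chinese characters survive the round trip.
        byte[] utf8Bytes = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        System.out.println(decoded.equals("\u4e2d\u6587\n"));
    }
}
```

Using StandardCharsets.UTF_8 instead of the string "UTF-8" avoids a possible UnsupportedEncodingException and catches typos in the charset name at compile time.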