<table class="tblist" cellpadding="0" cellspacing="0"><tr><td class="t"><a href="http://sz.58.com/zufang/5758837487494x.shtml" target="_blank" class="t">心仪的房子 就要和心仪的人一起住</a><span class="ico area"><a href='/lianhuabeicun/zufang/'>莲花北村</a> - <a href="http://sz.58.com/xiaoqu/lianhua/lianhuabeicun/chuzu/" target="_blank">莲花北村</a></span><span class="ico tu"></span><a title="置顶信息" target="_blank" href="http://about.58.com/zhiding.html"><span class="ico ding"></span></a></td>
<td class="tc" ><b class="pri">3300</b></td>
<td class="tc" >2室2厅1卫</td>
<td class="tc" >今天</td>
</tr><tr><td class="t"><a href="http://sz.58.com/zufang/5760595473410x.shtml" target="_blank" class="t">福田中心区Coco Park附近信托花园带独立阳台</a><span class="ico area"><a href='/zhongxinqu/zufang/'>中心区</a> - <a href="http://sz.58.com/xiaoqu/shixia/xintuohuayuan/chuzu/" target="_blank">信托花园</a></span><span class='ico biz'>(个人)</span><span class="ico tu"></span><a title="置顶信息" target="_blank" href="http://about.58.com/zhiding.html"><span class="ico ding"></span></a></td>
<td class="tc" ><b class="pri">1100</b></td>
<td class="tc" >1室0厅0卫</td>
<td class="tc" >今天</td>
</tr>
<tr>
<td class="t"><a href="http://sz.58.com/zufang/5420888250119x.shtml" target="_blank" class="t">()深圳大学生,上班族之家()</a><span class="ico area"><a href='/kejiyuan/zufang/'>科技园</a> - 限学生、白领...</span><span class='ico biz'>(个人)</span><span class="ico tu"></span><a title="置顶信息" target="_blank" href="http://about.58.com/zhiding.html"><span class="ico ding"></span></a></td>
<td class="tc" ><b class="pri">600</b></td>
<td class="tc" >4室2厅2卫</td>
<td class="tc" >今天</td>
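</tr></table>How can I get the detail-page link from each of these 3 tr rows, i.e. the 3 links below? Any pointers from the experts here would be appreciated; I need this urgently.
http://sz.58.com/zufang/5758837487494x.shtml
http://sz.58.com/zufang/5760595473410x.shtml
http://sz.58.com/zufang/5420888250119x.shtml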
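import java.io.File;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Extract the text content of an HTML file
private static String getDocument(File html) {
    StringBuilder text = new StringBuilder();
    try {
        // Set the character encoding (use "UTF-8" instead if the page is UTF-8)
        Document doc = Jsoup.parse(html, "GBK");
        // Extract the title text
        Elements title = doc.select("title");
        for (Element element : title) {
            text.append(element.text()).append(" ");
        }
        // Extract the text inside each table
        Elements tables = doc.select("table");
        for (Element element : tables) {
            text.append(element.text()).append(" ");
        }
        // Extract the text inside div elements whose class attribute is "post"
        Elements divs = doc.select("div[class=post]");
        for (Element element : divs) {
            text.append(element.text()).append(" ");
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return text.toString();
}
Here's an example that extracts text; adapt it to your needs. For the links, select the a elements instead and read each one's href with attr("href").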
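If you're interested in regular expressions, you could also use a regex directly to pull the value out of the a tags.
Personally, I find jsoup less hassle.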
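For the regex route mentioned above, here is a rough sketch using only java.util.regex. The class name DetailLinkExtractor and the method extractDetailLinks are made up for illustration, and the pattern assumes every detail link has the same shape as the three in the question (http://sz.58.com/zufang/ followed by digits and x.shtml). A regex like this is brittle against markup changes, so treat it as a starting point rather than a robust parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or any library.
class DetailLinkExtractor {
    // Assumption: detail pages look like http://sz.58.com/zufang/<digits>x.shtml,
    // as in the three sample rows. Single or double quotes around href are accepted.
    private static final Pattern DETAIL_LINK = Pattern.compile(
            "href=[\"'](http://sz\\.58\\.com/zufang/\\d+x\\.shtml)[\"']");

    // Returns every matching detail-page URL in document order.
    static List<String> extractDetailLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = DETAIL_LINK.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1)); // group 1 is the URL without the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        // Trimmed-down version of the rows from the question.
        String html =
                "<tr><td class=\"t\"><a href=\"http://sz.58.com/zufang/5758837487494x.shtml\" class=\"t\">row 1</a>"
                + "<a href='/lianhuabeicun/zufang/'>area</a></td></tr>"
                + "<tr><td class=\"t\"><a href=\"http://sz.58.com/zufang/5760595473410x.shtml\" class=\"t\">row 2</a>"
                + "<a href=\"http://about.58.com/zhiding.html\">pinned</a></td></tr>"
                + "<tr><td class=\"t\"><a href=\"http://sz.58.com/zufang/5420888250119x.shtml\" class=\"t\">row 3</a></td></tr>";
        for (String link : extractDetailLinks(html)) {
            System.out.println(link);
        }
    }
}
```

The area links (/lianhuabeicun/zufang/ and so on) and the about.58.com link are skipped because they do not fit the pattern. If the site uses other URL shapes for detail pages, the pattern would need adjusting.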