Extracting substrings with Java regular expressions



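import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class H3Extractor {
    public static void main(String[] args) {
        String str = "<div><h3 ..>dsijiswer*dfhjgf</h3></div><table><h3>sdsd</h3></table>";
        // The reluctant .*? stops at the first "/h3>", so each <h3> element is matched separately.
        Pattern p = Pattern.compile("<h3.*?/h3>");
        Matcher m = p.matcher(str);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}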
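Just use HTMLParser; there is no need to go to the trouble of regular expressions:

// content is the HTML string and ENCODE its charset name (both defined elsewhere)
Parser parser = Parser.createParser(content, ENCODE);
NodeFilter h3Filter = new TagNameFilter("h3");
NodeList nodes = parser.extractAllNodesThatMatch(h3Filter);
if (nodes != null) {
    for (int i = 0; i < nodes.size(); i++) {
        Node textnode = (Node) nodes.elementAt(i);
        String temp1 = textnode.toHtml();             // the full <h3>...</h3> markup
        String temp2 = textnode.toPlainTextString();  // the text content with tags removed
    }
}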
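Thanks, I just didn't know how to use regular expressions. At first I used

Pattern p = Pattern.compile("(<h3(.*)\\</h3>)");
Matcher m = p.matcher(line);
while (m.find()) {
    System.out.println(m.group(1));
}

but that way only a single <h3> result comes out. May I ask a follow-up question? For example, suppose the text contains a string like this:

<h3 class="t"><a href="http://www.baidu.com/s?tn=baidurt&rtt=1&bsst=1&wd=%B3%C9%B6%BC%C3%C0%C5%AE" target="_blank"><em>成都美女</em>的最新相关信息</a><span class="tsuf tsuf-op" data="{title : '', link : ''}"></span></h3>

How can I separately extract the address (the http:// URL) and the text inside the <a> tag? In any case, thank you very much.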
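A likely reason the first attempt printed only one result is that `.*` is greedy: it runs to the end of the line and backtracks only as far as the last `</h3>`, so several elements collapse into one match. The sketch below compares the greedy and reluctant forms on the sample string from the first answer. The class name `GreedyVsReluctant` and the helper `findAll` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class GreedyVsReluctant {
    // Collects the full text of every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "<div><h3 ..>dsijiswer*dfhjgf</h3></div><table><h3>sdsd</h3></table>";
        // Greedy: one match spanning from the first <h3 to the last </h3>.
        System.out.println(findAll("<h3(.*)</h3>", line));
        // Reluctant: one match per <h3> element.
        System.out.println(findAll("<h3(.*?)</h3>", line));
    }
}
```

Adding the `?` after `.*` is the only change needed to get one match per element on a single line of input.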
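import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    public static void main(String[] args) {
        String str = "<h3 class=\"t\"><a href=\"http://www.baidu.com/s?tn=baidurt&rtt=1&bsst=1&wd=%B3%C9%B6%BC%C3%C0%C5%AE\" target=\"_blank\"><em>成都美女</em>的最新相关信息</a><span class=\"tsuf tsuf-op\" data=\"{title : '', link : ''}\"></span></h3>";
        // [^>]* keeps the match inside a single <a ...> tag, so several links on one line are not merged.
        Pattern p = Pattern.compile("<a\\s[^>]*?href=\"(.*?)\"[^>]*>(.*?)</a>");
        Matcher m = p.matcher(str);
        while (m.find()) {
            System.out.println(m.group(1)); // the URL from the href attribute
            System.out.println(m.group(2)); // the inner HTML of the link, including the <em> tags
        }
    }
}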
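Note that the second group still contains the nested `<em>` markup. If only the visible text is wanted, one option is to strip the remaining tags from that group, roughly as sketched below. The class `LinkTextDemo`, the helper `firstLink`, and the shortened ASCII sample string are assumptions for illustration. The tag-stripping regex is a simplification that suits small, well-formed snippets like this one, not arbitrary HTML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class LinkTextDemo {
    private static final Pattern LINK =
            Pattern.compile("<a\\s[^>]*?href=\"(.*?)\"[^>]*>(.*?)</a>");

    // Returns {href, plainText} for the first link in html, or null if there is none.
    static String[] firstLink(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Drop any nested tags such as <em>...</em> from the link body.
        String text = m.group(2).replaceAll("<[^>]+>", "");
        return new String[] { m.group(1), text };
    }

    public static void main(String[] args) {
        // Simplified sample with the same structure as the string in the question.
        String str = "<h3 class=\"t\"><a href=\"http://www.baidu.com/s?tn=baidurt\" target=\"_blank\">"
                + "<em>keyword</em> latest news</a></h3>";
        String[] link = firstLink(str);
        System.out.println(link[0]); // the URL
        System.out.println(link[1]); // the link text without tags
    }
}
```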