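The whitespace handling is driving me up the wall, so here is a half-finished version for now.
import java.util.regex.*;

public class test {
    public static void main(String[] args) {
        String s = "http://item.slide.com/r/1/157/.jpg http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/.jpg ";
        // Lazily consume non-whitespace, reject a ".jpg" suffix, and stop before the next URL, whitespace, or end of input.
        Pattern p = Pattern.compile("http://[^\\s]+?(?<!\\.jpg)(?=\\s*http://|$|\\s+)");
        Matcher m = p.matcher(s);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}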
http://item.slide.com/r/1/217/i/
http://item.slide.com/r/1/157/
If it is just these three, then:
/^http:\/\/item\.slide\.com\/r\/1\/(123|217|157)\/(i\/)?$/
Please spell out your requirements more clearly. If it is only the cases you wrote, then
http://item\.slide\.com/r/\d/\d{3}/(i/)?$
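would do. The pattern from the poster above works for these three URLs, but it is far too specific.
As you said, "every URL is different", so is what you want simply a pattern for the general form (http://***.***.***)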
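To make the "too specific" point concrete, here is a minimal sketch comparing a pattern that hard-codes the three IDs with one that accepts any digits. Both patterns are my own reading of the two replies above, the optional `i/` segment is an assumption, and the fourth URL is hypothetical.

```java
import java.util.regex.Pattern;

class PatternScope {
    // Specific: only the three IDs that appear in the thread.
    static final Pattern SPECIFIC =
        Pattern.compile("^http://item\\.slide\\.com/r/1/(123|217|157)/(i/)?$");
    // General: any single digit, then any three digits (assumed generalization).
    static final Pattern GENERAL =
        Pattern.compile("^http://item\\.slide\\.com/r/\\d/\\d{3}/(i/)?$");

    static boolean specific(String url) {
        return SPECIFIC.matcher(url).matches();
    }

    static boolean general(String url) {
        return GENERAL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://item.slide.com/r/1/123/i/",
            "http://item.slide.com/r/1/217/i/",
            "http://item.slide.com/r/1/157/",
            "http://item.slide.com/r/2/999/"   // hypothetical URL, not from the thread
        };
        for (String u : urls) {
            System.out.println(u + " specific=" + specific(u) + " general=" + general(u));
        }
    }
}
```

The specific pattern rejects the fourth URL while the general one accepts it. Neither accepts a URL ending in `.jpg`, because both anchor the end right after the numeric segment.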
but the last URL does not end that way, so you want the last URL picked out?
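If you already know the last URL does not end in .jpg, there should be no need to search for it!!!
If the URLs may contain lots of special characters, just say so.
Why do I get the feeling that the more you explain, the more confusing it gets? If you just want the last one: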
import java.util.regex.*;

public class Main {
    public static void main(String[] args) {
        try {
            String str = "http://item.slide.com/r/1/123/i/ http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/";
            // Match the "http://" that has no further "http://" after it, through to the end of the input.
            Pattern pat = Pattern.compile("http://(?!.*http://.*).+?$");
            Matcher mat = pat.matcher(str);
            while (mat.find()) {
                System.out.println(mat.group());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
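If only the last URL is wanted, a regex is not strictly necessary. The sketch below is an alternative, not what the poster above used: it assumes every URL starts with `http://` and is delimited by whitespace, and the class and method names are made up for illustration.

```java
class LastUrl {
    // Returns the last whitespace-delimited token starting with "http://", or null if there is none.
    static String lastUrl(String text) {
        int start = text.lastIndexOf("http://");
        if (start < 0) {
            return null;
        }
        int end = start;
        // Advance to the first whitespace character or the end of the text.
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String str = "http://item.slide.com/r/1/123/i/ http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/";
        System.out.println(lastUrl(str)); // http://item.slide.com/r/1/157/
    }
}
```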
{
    String s = "http://item.slide.com/r/1/157/.jpg http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/.jpg ";
    // Capture a URL that contains no ".jpg" and is followed by whitespace.
    Pattern p = Pattern.compile("(http://(?:(?!\\.jpg).)*?)\\s");
    Matcher m = p.matcher(s);
    while (m.find()) {
        System.out.print(" " + m.group(1));
    }
}
String s = "http://item.slide.com/r/1/157/.jpg http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/.jpg ";
If I remove the \\s, why does it stop matching? I also removed the trailing space from the string.
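A likely explanation, sketched below: `*?` is a lazy quantifier, so it consumes as few characters as it can. With `\\s` after it, the engine has to keep extending the match until it reaches whitespace. With nothing after it, the match can stop right after `http://`. The patterns here are a cleaned-up form of the one in question, with the dot escaped. The third pattern is just one possible alternative that does not depend on trailing whitespace. The `groups` helper is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Collects group(1) of every match of regex in input.
    static List<String> groups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Same three URLs as in the thread, without the trailing space.
        String s = "http://item.slide.com/r/1/157/.jpg http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/.jpg";

        // With \\s: the lazy part must grow until whitespace follows.
        System.out.println(groups("(http://(?:(?!\\.jpg).)*?)\\s", s));
        // prints [http://item.slide.com/r/1/217/i/]

        // Without \\s: the lazy part matches nothing, leaving only the prefix.
        System.out.println(groups("(http://(?:(?!\\.jpg).)*?)", s));
        // prints [http://, http://, http://]

        // Alternative: take the whole non-whitespace token, then reject a ".jpg" suffix.
        System.out.println(groups("(http://\\S++)(?<!\\.jpg)", s));
        // prints [http://item.slide.com/r/1/217/i/]
    }
}
```

The possessive `\\S++` keeps the engine from backtracking into a shorter token after the lookbehind fails, so a `.jpg` URL is rejected outright rather than partially matched.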
import java.util.regex.*;

public class test {
    public static void main(String[] args) {
        String s = "http://item.slide.com/r/1/157/.jpg http://item.slide.com/r/1/217/i/ http://item.slide.com/r/1/157/.jpg ";
        // Lazily consume non-whitespace, reject a ".jpg" suffix, and stop before the next URL, whitespace, or end of input.
        Pattern p = Pattern.compile("http://[^\\s]+?(?<!\\.jpg)(?=\\s*http://|$|\\s+)");
        Matcher m = p.matcher(s);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test1 {
    public static void main(String[] args) throws Exception {
        cc();
    }

    public static void cc() throws IOException {
        String content = getString("pattem.html"); // read the page content
        // Starts with "http://", contains no whitespace, does not end in ".jpg", and is followed by whitespace.
        // The ".jpg" suffix is rejected with a negative lookbehind.
        String pattern = "(http://\\S*(?<!\\.jpg))\\s";
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(content);
        String temp = null;
        while (m.find()) {
            temp = m.group(1); // keep overwriting so the last match wins
        }
        if (temp != null) {
            System.out.println(temp.replaceAll("[\"']", "")); // strip " and ', then print the last match
        }
    }

    private static String getString(String path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader(new File(path)))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append("\n");
            }
        }
        return sb.toString();
    }
}
Tested, it works.
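One thing worth checking when excluding a `.jpg` suffix: a bracket expression such as `[^\\.jpg]` is a character class, so it excludes the single characters `.`, `j`, `p` and `g` rather than the four-character suffix. A negative lookbehind `(?<!\\.jpg)` tests the suffix itself. The sketch below compares the two. The class and method names and the `/img` and `/top` URLs are hypothetical.

```java
import java.util.regex.Pattern;

class SuffixCheck {
    // Character class: the final character must not be '.', 'j', 'p' or 'g'.
    static final Pattern CHAR_CLASS = Pattern.compile("http://[^\\s]*[^\\.jpg]");
    // Lookbehind: the text must not end with the literal suffix ".jpg".
    static final Pattern LOOKBEHIND = Pattern.compile("http://\\S+(?<!\\.jpg)");

    static boolean byCharClass(String url) {
        return CHAR_CLASS.matcher(url).matches();
    }

    static boolean byLookbehind(String url) {
        return LOOKBEHIND.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://item.slide.com/r/1/157/.jpg",
            "http://item.slide.com/r/1/217/i/",
            "http://item.slide.com/r/1/217/img",  // hypothetical: ends in 'g' but not in ".jpg"
            "http://item.slide.com/r/1/217/top"   // hypothetical: ends in 'p' but not in ".jpg"
        };
        for (String u : urls) {
            System.out.println(u + " charClass=" + byCharClass(u) + " lookbehind=" + byLookbehind(u));
        }
    }
}
```

Both variants reject the `.jpg` URL and accept the `/i/` URL, which may be why the character class appears to work on the sample data. Only the lookbehind accepts the `/img` and `/top` URLs.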
try
{
    String str = "<a>sdfa<b>s<br/>df<a><form><a/></form>xxxx<br/><b>";
    // Remove every tag except those ending in "br" or "br/".
    str = str.replaceAll("</?[^>]+?(?<!br/?)>", "");
    System.out.println(str);
}
catch (Exception e)
{
    e.printStackTrace();
}
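For anyone who wants to run the snippet above as is, here is a self-contained sketch that wraps the same `replaceAll` call in a method. The class and method names are made up for the example.

```java
class StripTags {
    // Removes every tag whose content does not end in "br" or "br/".
    static String stripExceptBr(String html) {
        return html.replaceAll("</?[^>]+?(?<!br/?)>", "");
    }

    public static void main(String[] args) {
        String str = "<a>sdfa<b>s<br/>df<a><form><a/></form>xxxx<br/><b>";
        System.out.println(stripExceptBr(str)); // sdfas<br/>dfxxxx<br/>
    }
}
```

Note that the lookbehind only inspects the characters just before `>`, so any tag whose name happens to end in "br", such as `<abbr>`, is kept as well.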