I have a chunk of text like this:
<ul>
...
<OBJECT type="text/sitemap">
<param name="Name" value="a">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
...
<OBJECT type="text/sitemap">
<param name="Name" value="b">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
...
<OBJECT type="text/sitemap">
<param name="Name" value="c">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
...
</ul>
1. Find each <object> tag's type (here text/sitemap) and the content it contains, that is:
<param name="Name" value="a">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
and
<param name="Name" value="b">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
and
<param name="Name" value="c">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
2. Once the content has been found, extract the attribute values from the tags inside it.
For example: <param name="ImageNumber" value="11">
should give: ImageNumber and 11
Thanks!!!
If the input really is in that format, with one tag per line, couldn't you just read it line by line with a BufferedReader and take whatever is inside the quotes?
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Thin wrapper around BufferedReader that hands back one line at a time.
class readFile {
    BufferedReader buff;

    public void read(String sFileName) {
        try {
            buff = new BufferedReader(new FileReader(sFileName));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Returns the next line, or null at end of file, on a read error,
    // or if the file could not be opened.
    public String result() {
        if (buff == null) {
            return null;
        }
        try {
            return buff.readLine();
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }
}

public class reg {
    public static void main(String[] args) {
        // Group 1 is the first attribute value (name), group 2 the second (value).
        String patPara = "<\\s*param\\s*[a-z]+=\"([^\"]*)\"\\s*[a-z]+=\"([^\"]*)\">";
        Pattern pat = Pattern.compile(patPara);
        readFile file = new readFile();
        file.read("test.txt");
        String str;
        while ((str = file.result()) != null) {
            Matcher mat = pat.matcher(str);
            while (mat.find()) {
                System.out.print(mat.group(1) + " " + mat.group(2));
            }
            System.out.println();
        }
    }
}
File contents (test.txt):
<OBJECT type="text/sitemap">
<param name="Name" value="a">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
<OBJECT type="text/sitemap">
<param name="Name" value="b">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
<OBJECT type="text/sitemap">
<param name="Name" value="c">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
<ul>
<OBJECT type="text/sitemap">
<param name="Name" value="a">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
<ul>
<OBJECT type="text/sitemap">
<param name="Name" value="b">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
<OBJECT type="text/sitemap">
<param name="Name" value="C">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
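</OBJECT>
</ul>
</ul>
As shown above, b and C sit at the same level. I need to know that this level of the tree contains two entries, and each entry is represented by an <OBJECT type="text/sitemap"></OBJECT> block. So first I have to match the <param> tags inside each <OBJECT type="text/sitemap"></OBJECT>, cut out that substring, and then match again to pull out each name and value. Thanks to the poster above for the answer.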
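To make the "which level is each entry on" part concrete, here is a minimal sketch, not a definitive implementation. It assumes the markup is as regular as in the sample (lists written exactly as <ul> and </ul>, OBJECT blocks never nested inside each other). The class name SitemapLevels and the method entries are made up for illustration. It walks the text once and records the <ul> nesting depth of every OBJECT block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SitemapLevels {
    // One alternation: an opening <ul>, a closing </ul>, or a whole OBJECT block.
    private static final Pattern TOKEN = Pattern.compile(
            "<ul>|</ul>|<OBJECT type=\"text/sitemap\">(.*?)</OBJECT>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Picks the value of the param whose name attribute is "Name".
    private static final Pattern NAME = Pattern.compile(
            "<param name=\"Name\" value=\"([^\"]*)\">", Pattern.CASE_INSENSITIVE);

    // Returns one "depth:name" string per OBJECT block, in document order.
    static List<String> entries(String source) {
        List<String> result = new ArrayList<>();
        int depth = 0;
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String token = m.group();
            if (token.equalsIgnoreCase("<ul>")) {
                depth++;
            } else if (token.equalsIgnoreCase("</ul>")) {
                depth--;
            } else {
                // Group 1 holds everything between <OBJECT ...> and </OBJECT>.
                Matcher n = NAME.matcher(m.group(1));
                String name = n.find() ? n.group(1) : "?";
                result.add(depth + ":" + name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "<ul>\n"
                + "<OBJECT type=\"text/sitemap\">\n<param name=\"Name\" value=\"a\">\n</OBJECT>\n"
                + "<ul>\n"
                + "<OBJECT type=\"text/sitemap\">\n<param name=\"Name\" value=\"b\">\n</OBJECT>\n"
                + "<OBJECT type=\"text/sitemap\">\n<param name=\"Name\" value=\"C\">\n</OBJECT>\n"
                + "</ul>\n</ul>";
        for (String entry : entries(source)) {
            System.out.println(entry);
        }
    }
}
```

For the nested sample this prints 1:a, 2:b and 2:C, one per line, so b and C come out on the same level. Depth alone cannot tell two separate sibling lists at the same depth apart; if that matters, a stack of list ids would be needed instead of a single counter.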
<OBJECT type="text/sitemap">
<param name="Name" value="b">
<param name="Local" value="988/16.htm">
<param name="ImageNumber" value="11">
</OBJECT>
Then use the approach from post #3 to pull out the name and value inside it.
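For input whose format is this fixed there are plenty of ways to do it; a single loop that does everything in one pass would work too.
Here is how to grab the OBJECT blocks: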
Pattern p = Pattern.compile("<OBJECT type=\"text/sitemap\">[\\d\\D]*?</OBJECT>");
Matcher m = p.matcher(sourcestr);
while (m.find()) {
String oneobject = m.group();
// group() returns the whole matched <OBJECT ...>...</OBJECT> block
}
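For what the "single loop, one pass" idea might look like, here is a rough sketch under the same assumptions as the sample data (attributes always in the order name then value, double quotes, no nested OBJECT blocks). The class name OnePassSitemap and the map key "OBJECT.type" are invented for this example; one regex recognises the three tag kinds and a single loop assembles one map per OBJECT.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OnePassSitemap {
    // Alternatives: OBJECT start (group 1 = type), param (groups 2 and 3), OBJECT end.
    private static final Pattern TAG = Pattern.compile(
            "<OBJECT type=\"([^\"]*)\">"
            + "|<param name=\"([^\"]*)\" value=\"([^\"]*)\">"
            + "|</OBJECT>",
            Pattern.CASE_INSENSITIVE);

    // Returns one map per OBJECT block; insertion order of the params is kept.
    static List<Map<String, String>> parse(String sourcestr) {
        List<Map<String, String>> objects = new ArrayList<>();
        Map<String, String> current = null; // non-null only while inside an OBJECT block
        Matcher m = TAG.matcher(sourcestr);
        while (m.find()) {
            if (m.group(1) != null) {
                // Opening tag: start a new entry and remember its type.
                current = new LinkedHashMap<>();
                current.put("OBJECT.type", m.group(1));
            } else if (m.group(2) != null) {
                // param tag: only meaningful inside an OBJECT block.
                if (current != null) {
                    current.put(m.group(2), m.group(3));
                }
            } else if (current != null) {
                // Closing tag: the entry is complete.
                objects.add(current);
                current = null;
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        String sourcestr = "<OBJECT type=\"text/sitemap\">\n"
                + "<param name=\"Name\" value=\"a\">\n"
                + "<param name=\"Local\" value=\"988/16.htm\">\n"
                + "<param name=\"ImageNumber\" value=\"11\">\n"
                + "</OBJECT>";
        for (Map<String, String> object : parse(sourcestr)) {
            System.out.println(object);
        }
    }
}
```

Because every tag is matched on its own, line breaks inside a block need no special handling. The price is that a param named OBJECT.type would overwrite the stored type, which seems unlikely for this kind of file but is worth knowing.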
Maybe I still haven't understood what the OP wants.
package com.tony.regex;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test_002 {

    // Finds every <OBJECT type="...">...</OBJECT> block and prints its type and content.
    public static void regex_Object(String s) {
        // Strip real line breaks so that '.' can match across a whole block.
        // (The original used "\\r" and "\\n", which only remove a literal backslash plus letter.)
        s = s.replace("\r", "");
        s = s.replace("\n", "");
        Pattern p = Pattern.compile("<OBJECT type=\"(.+?)\">(.*?)</OBJECT>");
        Matcher m = p.matcher(s);
        while (m.find()) {
            String type = m.group(1);
            String content = m.group(2);
            System.out.println("=========================================");
            System.out.println("OBJECT.TYPE=" + type + ";\nCONTENT=" + content);
            regex_param(content);
        }
    }

    // Prints the name and value of every <param> tag inside one OBJECT block.
    public static void regex_param(String s) {
        Pattern p = Pattern.compile("<param name=\"(.+?)\" value=\"(.+?)\">");
        Matcher m = p.matcher(s);
        while (m.find()) {
            String name = m.group(1);
            String value = m.group(2);
            System.out.println("==========");
            System.out.println("param.name=" + name + ";\nparam.value=" + value);
        }
    }

    public static void main(String[] args) {
        String s = ""
                + "<ul>"
                + " <OBJECT type=\"text/sitemap\">"
                + "<param name=\"Name\" value=\"a\">"
                + "<param name=\"Local\" value=\"988/16.htm\">"
                + "<param name=\"ImageNumber\" value=\"11\">"
                + "</OBJECT>"
                + " <ul>"
                + " <OBJECT type=\"text/sitemap\">"
                + " <param name=\"Name\" value=\"b\">"
                + " <param name=\"Local\" value=\"988/16.htm\">"
                + " <param name=\"ImageNumber\" value=\"11\">"
                + " </OBJECT>"
                + " <OBJECT type=\"text/sitemap\">"
                + " <param name=\"Name\" value=\"C\">"
                + " <param name=\"Local\" value=\"988/16.htm\">"
                + " <param name=\"ImageNumber\" value=\"11\">"
                + " </OBJECT>"
                + " </ul>"
                + "</ul>";
        Test_002.regex_Object(s);
    }
}
Output:
=========================================
OBJECT.TYPE=text/sitemap;
CONTENT=<param name="Name" value="a"><param name="Local" value="988/16.htm"><param name="ImageNumber" value="11">
==========
param.name=Name;
param.value=a
==========
param.name=Local;
param.value=988/16.htm
==========
param.name=ImageNumber;
param.value=11
=========================================
OBJECT.TYPE=text/sitemap;
CONTENT= <param name="Name" value="b"> <param name="Local" value="988/16.htm"> <param name="ImageNumber" value="11">
==========
param.name=Name;
param.value=b
==========
param.name=Local;
param.value=988/16.htm
==========
param.name=ImageNumber;
param.value=11
=========================================
OBJECT.TYPE=text/sitemap;
CONTENT= <param name="Name" value="C"> <param name="Local" value="988/16.htm"> <param name="ImageNumber" value="11">
==========
param.name=Name;
param.value=C
==========
param.name=Local;
param.value=988/16.htm
==========
param.name=ImageNumber;
param.value=11