I want to convert a text file into an XML file.
For example, the text file contains
title:aaa
item:
date:bbb
item:
date:ccc
and I want to turn it into
<title>aaa</title>
<item>
<date>bbb</date>
</item>
<item>
<date>ccc</date>
</item>
How can I do this with replaceAll?
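public class ConvertToXml {
    public static void main(String[] args) {
        String s = "title:aaa\nitem:\ndate:bbb\nitem:\ndate:ccc";
        // Join each bare "item:" line with the line after it: "item:date:bbb".
        String s1 = s.replaceAll(":\n", ":");
        String[] as = s1.split("\n");
        // convert() pads each element with "\n"; drop the leading one before printing.
        for (String s2 : as)
            System.out.print(convert(s2).replaceFirst("\n", ""));
    }

    // Wrap the text after the first ':' in a tag named by the text before it,
    // recursing so that "item:date:bbb" nests <date> inside <item>.
    static String convert(String s) {
        String[] as = s.split(":", 2);
        if (as.length == 1)
            return as[0];
        return "\n<" + as[0] + ">" + convert(as[1]) + "</" + as[0] + ">\n";
    }
}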
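import java.io.FileInputStream;
import java.io.InputStreamReader;

// The class name is a placeholder; the original post did not show the class header.
public class TextToXml {
    public static void main(String[] args) {
        TextToXml tf = new TextToXml();
        String str = tf.read("E:/test.txt");
        str = str.replaceAll("title:(.+)", "<title>$1</title>");
        // Each item holds exactly one date here, so the item is closed right after it.
        str = str.replaceAll("date:(.+)", "<date>$1</date>\n</item>");
        str = str.replaceAll("item:", "<item>");
        System.out.println(str);
    }

    // Read the whole file into a string
    public String read(String path) {
        char[] ch = new char[1024];
        StringBuffer cb = new StringBuffer();
        // try-with-resources closes the reader even if reading fails
        try (InputStreamReader in = new InputStreamReader(new FileInputStream(path), "UTF8")) {
            int len = 0;
            while ((len = in.read(ch)) != -1) {
                cb.append(ch, 0, len);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return cb.toString();
    }
}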
//----------------Result:
<title>aaa</title>
<item>
<date>bbb</date>
</item>
<item>
<date>ccc</date>
</item>
"title:(.+)"
title:aaa
item:
date:bbb
data:ccc
date:ddd
item:
data:eee
If each item only ever had one data, I don't think item would be needed at all.
// Placeholder class name; the original post omitted the class header.
public class ItemsToXml {
    public static void main(String[] args) {
        String s = "title:aaa\n" +
                "item:\n" +
                "date:bbb\n" +
                "date:ddd\n" +
                "item:\n" +
                "date:ccc\n" +
                "date:ddd\n" +
                "item:\n" +
                "date:ccc";
        String s1 = s.replaceAll("title:(.+)", "<title>$1</title>");
        s1 = s1.replaceAll("date:(.+)", "<date>$1</date>");
        // Wrap everything from one "item:" up to the next "item" (or end of input).
        s1 = s1.replaceAll("(item):((?s).+?)((?=[^.]\\1)|\\z)", "<$1>$2\n</$1>");
        System.out.println(s1);
    }
}
Could you explain what (item):((?s).+?)((?=[^.]\\1)|\\z) means, though? Thanks again.
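Since the pattern was asked about, here is one way to read it, as a small sketch and not the original poster's own explanation. `(item)` captures the literal word as group 1. `((?s).+?)` switches on DOTALL so that `.` also matches line breaks, then takes as few characters as possible. `(?=[^.]\\1)` is a lookahead that stops the match just before any single character that is not a literal dot (inside `[...]` the dot is not a wildcard) followed by group 1 again; with this input that character is the newline before the next `item`. `\\z` handles the last block by matching at the end of the input. The class name `ItemRegexDemo` and the helper `wrapItems` are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ItemRegexDemo {
    // Same pattern as in the post above, compiled once.
    static final Pattern ITEM =
            Pattern.compile("(item):((?s).+?)((?=[^.]\\1)|\\z)");

    // Wrap each "item:" block in <item>...</item>.
    static String wrapItems(String s) {
        return ITEM.matcher(s).replaceAll("<$1>$2\n</$1>");
    }

    public static void main(String[] args) {
        String s = "item:\n<date>bbb</date>\n<date>ddd</date>\nitem:\n<date>ccc</date>";

        // Show what group 2 captures for each block (newlines made visible).
        Matcher m = ITEM.matcher(s);
        while (m.find()) {
            System.out.println("group 2 = [" + m.group(2).replace("\n", "\\n") + "]");
        }

        System.out.println(wrapItems(s));
    }
}
```

Because the lookahead accepts any non-dot character in front of `item`, a value such as `my item` would also end the block early. Using `\n` in place of `[^.]` would be stricter.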
"item:\n" +
"date:bbb\n"+
"item:\n"+
"date:ccc\n";
s.replaceAll("(.+):(.*)\n", "<$1>$2</$1>\n")
<title>aaa</title>
<item></item>
<date>bbb</date>
<item></item>
<date>ccc</date>
"item:\n" +
"date:aaa\n"+
"item:\n" +
"date:bbb\n"+
"date:bbbBB\n"+
"date:bbbBBBBB\n"+
"item:\n"+
"date:ccc\n" +
"date:cccCCC\n";
System.out.println(
s.replaceAll(
"(.+):\n((.|\n)+?)((?=(\\1:\n))|\\z)",
"<$1>\n$2</$1>\n").
replaceAll(
"(.+):((.+)|())\n",
"<$1>$2</$1>\n")
);
public class ConvertToXml {
public static void main(String[] args) {
String s = "title:aaa\nitem:\ndate:bbb\nitem:\ndate:ccc";
String s1 = s.replaceAll(":\n", ":");
String[] as = s1.split("\n");
for (String s2 : as)
System.out.print(convert(s2).replaceFirst("\n", ""));
}

static String convert(String s) {
String[] as = s.split(":", 2);
if (as.length == 1)
return as[0];
return "\n<" + as[0] + ">" + convert(as[1]) + "</" + as[0] + ">\n";
}
}
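((.|\n)+?) matches everything that follows, using a non-greedy match (that is, it matches as little as possible).
((?=(\\1:\n))|\\z)
Here \\z matches the end of the string.
\\1 refers to the first group of this pattern, i.e. the string matched by (.+), so \\1:\n matches item:\n. In the first alternative, ?= is a lookahead that checks what comes next without consuming it: it finds the position of the next item:\n, but that text is not included in the match (a bit of a mouthful, I'm not great at explaining this).
<$1>\n$2</$1>\n
$1 is the first group of the pattern, i.e. (.+), which is item.
$2 is everything matched by ((.|\n)+?).
After replaceAll this produces a string like <item>\ndate:ccc...</item>.
The second pattern, (.+):((.+)|())\n, simply matches lines such as date:ccc and title:aaa, which is easier to follow.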
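To make the explanation above concrete, here is a small sketch that runs the two replacements separately so the intermediate string is visible. The patterns are the ones from the post; the class name `TwoPassDemo`, the helper names `wrapBlocks` and `wrapLines`, and the shortened input are assumptions for the illustration.

```java
class TwoPassDemo {
    // Pass 1: wrap each "name:\n ..." block, up to the next "name:\n" or the end of input.
    static String wrapBlocks(String s) {
        return s.replaceAll("(.+):\n((.|\n)+?)((?=(\\1:\n))|\\z)", "<$1>\n$2</$1>\n");
    }

    // Pass 2: wrap each remaining "name:value" line.
    static String wrapLines(String s) {
        return s.replaceAll("(.+):((.+)|())\n", "<$1>$2</$1>\n");
    }

    public static void main(String[] args) {
        String s = "item:\n" +
                "date:bbb\n" +
                "date:bbbBB\n" +
                "item:\n" +
                "date:ccc\n";

        String pass1 = wrapBlocks(s);
        System.out.println("After pass 1:");
        System.out.print(pass1);   // date lines are still raw, but sit inside <item>...</item>

        System.out.println("After pass 2:");
        System.out.print(wrapLines(pass1));
    }
}
```

After pass 1 the `date:` lines are untouched and only the `<item>` wrappers exist; pass 2 then converts each `date:` line. On very long inputs `(.|\n)+?` can reportedly run into a StackOverflowError in Java, so `(?s).+?` may be the safer spelling of the same idea.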
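public class ConvertToXml {
    public static void main(String[] args) {
        String s = "title:aaa\nitem:\ndate:bbb\nitem:\ndate:ccc";
        String s1 = s.replaceAll(":\n", ":");
        // Split on the newline character "\n"
        String[] as = s1.split("\n");
        // The loop variable needs its own name (s2), since s1 is already taken
        for (String s2 : as) {
            // print, not println: convert() already ends each element with "\n"
            System.out.print(convert(s2).replaceFirst("\n", ""));
        }
    }

    static String convert(String s) {
        // Split at the first ':' only
        String[] as = s.split(":", 2);
        if (as.length == 1)
            return as[0];
        return "\n<" + as[0] + ">" + convert(as[1]) + "</" + as[0] + ">\n";
    }
}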
str=str.replaceAll("title:(.+)","<title>$1</title>");
str=str.replaceAll("item:(?<=item:)((\ndate:.*)+)","<item>$1\n</item>");
str=str.replaceAll("date:(.+)","<date>$1</date>");
System.out.println(str);
title:aaatitle:bbb
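How do I make this become <title>aaatitle:bbb</title>
and not <title>aaa</title><title>bbb</title>?
In other words, only the first : on each line should be matched; any : after that should be treated as content.
My real goal is to parse a text file into XML.
Thanks, everyone, for your help.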
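For the last requirement (only the first : on a line acts as the separator), one possible sketch is to anchor the pattern at the start of each line and forbid colons in the tag name. `(?m)` makes `^` and `$` match at line boundaries, `[^:\\n]+` takes everything up to the first colon, and `(.+)` takes the rest of the line, colons included. The class name `FirstColonDemo` and the helper `wrapLines` are made up for the example, and it assumes the values contain no characters such as `<` or `&` that would need escaping in real XML.

```java
class FirstColonDemo {
    // Wrap each "name:value" line; only the first ':' on the line is the separator.
    static String wrapLines(String s) {
        return s.replaceAll("(?m)^([^:\\n]+):(.+)$", "<$1>$2</$1>");
    }

    public static void main(String[] args) {
        // prints <title>aaatitle:bbb</title>
        System.out.println(wrapLines("title:aaatitle:bbb"));
    }
}
```

Lines with nothing after the colon, such as `item:`, are left unchanged by this pattern, so the item wrapping from the earlier replies would still be needed as a separate step.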