I have been working on this for a long time without getting it right. I want to filter the following content out of a string:
<style>
<!--
/* Font Definitions */
@font-face
{font-family:宋体;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:宋体;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@宋体";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
text-align:justify;
text-justify:inter-ideograph;
font-size:10.5pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;}
@page Section1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
{page:Section1;}
-->
</style>
I would like to do it with String.replaceAll(pattern, ""). Thanks.
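import java.util.ArrayList;
import java.util.List;

// The class name is a placeholder; the original post did not show it.
public class StyleFilter {
    // Characters that are escaped before a literal string is placed in a pattern.
    private static final List<String> EXCEPTION_WORD = new ArrayList<String>();
    static {
        // The backslash must come first, otherwise the backslashes added
        // for the other characters would be escaped a second time.
        EXCEPTION_WORD.add("\\");
        EXCEPTION_WORD.add("*");
        EXCEPTION_WORD.add("+");
        EXCEPTION_WORD.add(".");
        EXCEPTION_WORD.add("?");
        EXCEPTION_WORD.add("{");
        EXCEPTION_WORD.add("}");
        EXCEPTION_WORD.add("(");
        EXCEPTION_WORD.add(")");
        EXCEPTION_WORD.add("[");
        EXCEPTION_WORD.add("]");
        EXCEPTION_WORD.add("^");
        EXCEPTION_WORD.add("$");
        EXCEPTION_WORD.add("-");
        EXCEPTION_WORD.add("|");
        EXCEPTION_WORD.add("\"");
        EXCEPTION_WORD.add("&");
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        String text = "<style><!-- /* Font Definitions */ @font-face {font-family:宋体;--></style>aaa";
        // Escape only the literal parts of the pattern, never the input text:
        // escaping the input would leave stray backslashes in the result.
        // (?s) lets '.' match line breaks; the reluctant +? stops at the first </style>.
        String pattern = "(?s)" + turnToRegularString("<style>") + ".+?" + turnToRegularString("</style>");
        System.out.println(text);
        text = text.replaceAll(pattern, "");
        System.out.println(text);
    }

    /** Escapes regex metacharacters so that a literal string can be embedded in a pattern. */
    public static String turnToRegularString(String old) {
        if (old == null || old.trim().length() == 0) {
            return old;
        }
        String regularStr = old;
        for (String word : EXCEPTION_WORD) {
            regularStr = regularStr.replace(word, "\\" + word);
        }
        return regularStr;
    }
}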
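As an aside to the hand-written escaping above: the JDK already ships java.util.regex.Pattern.quote, which turns a literal string into a pattern fragment. Here is a minimal sketch of how it might be used; the class QuoteDemo and the method removeBetween are names made up for illustration and do not come from this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class QuoteDemo {
    // Removes everything from the literal 'open' marker up to the nearest
    // literal 'close' marker. Pattern.quote makes both markers literal, and
    // (?s) lets '.' match line breaks so multi-line blocks are covered too.
    static String removeBetween(String text, String open, String close) {
        String regex = "(?s)" + Pattern.quote(open) + ".*?" + Pattern.quote(close);
        return text.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String text = "<style><!-- /* Font Definitions */ {font-family:Calibri;}--></style>aaa";
        System.out.println(removeBetween(text, "<style>", "</style>")); // aaa
        // Markers made of regex metacharacters are handled as plain text.
        System.out.println(removeBetween("x(*)y[+]z", "(*)", "[+]"));   // xz
    }
}
```

Only the markers are quoted here; the input text itself is passed to replaceAll unchanged.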
String result = test.replaceAll(pattern, "");
System.out.println(result);
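The two lines above do not show how pattern is defined. A minimal sketch, assuming it was the expression explained later in this thread with the (?i) and (?s) flags placed in front; the exact original definition is not known, and the class and method names below are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class, for illustration only.
class StyleStripper {
    // Assumed reconstruction of the pattern; the original post does not show it.
    // (?i) ignores case and (?s) lets '.' match line breaks.
    // (?:(?!</?style\b).)* consumes one character at a time, as long as no
    // "<style" or "</style" tag starts at the current position.
    static final Pattern STYLE_BLOCK =
            Pattern.compile("(?is)<style>(?:(?!</?style\\b).)*</style>");

    static String stripStyle(String html) {
        return STYLE_BLOCK.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String test = "before<STYLE>\n<!-- p.MsoNormal {margin:0cm;} -->\n</style>after";
        System.out.println(stripStyle(test)); // beforeafter
    }
}
```

Note that this pattern only matches a bare <style> tag; an opening tag with attributes, such as <style type="text/css">, would need the opening part widened.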
Could you explain your pattern?
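It works the same way as
<[^>]*>
which matches a "<", then any characters that are not ">", up to the closing ">".
<style>(?:(?!</?style\\b).)*</style>
matches "<style>", then any characters at which neither "<style" nor "</style" begins, up to "</style>".
[^>] excludes single characters from an unordered set, whereas (?!</?style\\b). excludes an ordered substring.
(?i) makes the match case-insensitive.
(?s) turns on single-line mode, so that the dot matches any character, including line breaks.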
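To make the effect of each piece easier to see, here is a small sketch that applies the pattern with and without the flags and compares it with <[^>]*>. The class name, the sample string, and the strip helper are all made up for this illustration.

```java
// Hypothetical demo class, for illustration only.
class PatternPieces {
    // The core expression from the explanation, without any flags.
    static final String CORE = "<style>(?:(?!</?style\\b).)*</style>";
    // Mixed-case tags and line breaks, so that both flags matter.
    static final String SAMPLE = "<Style>\np {color:blue;}\n</STYLE>text";

    static String strip(String regex, String input) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // No flags: the case differs and '.' stops at line breaks, so nothing is removed.
        System.out.println(strip(CORE, SAMPLE).equals(SAMPLE));          // true
        // (?s) alone is not enough, because <Style> is not all lowercase.
        System.out.println(strip("(?s)" + CORE, SAMPLE).equals(SAMPLE)); // true
        // With both flags the whole block is removed.
        System.out.println(strip("(?is)" + CORE, SAMPLE));               // text
        // <[^>]*> stops at the first '>', so it removes the two tags
        // but leaves the CSS between them in place.
        System.out.println(strip("<[^>]*>", SAMPLE));
    }
}
```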
I still don't quite understand this part! Hehe, could I trouble you to explain it once more?