Another Java regex question
Last edited by sunxingtao on 2011-10-15 10:38:00

Couldn't you just parse this with JS?
Or the DOM would work too.
Why insist on treating it as a string? That is very slow.
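This content is copied from Word, pasted into an HTML editor, and then written to the page. I want to fix the problem at the point where it is saved to the database.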
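With a regex, processing can be slow when the text is large, because it involves replacement as well as match checking, so handling it yourself may well work better. If you must use a regex, you could try the following.
The rough idea is to treat the string as several segments, check each one in turn against the replacement condition, and then do the replacement (which is really an insertion):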
String s = "<P style=\"TEXT-ALIGN: left; TEXT-INDENT: 32.25pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto\" class=MsoNormal align=left>\n <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">针对以上情况,</SPAN>\n</P>";
Pattern p = Pattern.compile("(?i)(<p\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*class=)(\\w*)(\\s*.*?\\s*>\\s*.*?\\s*<span\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*>\\s*.*?\\s*</span>\\s*.*?\\s*</p>)");
Matcher m = p.matcher(s);
while (m.find()) {
    if (!m.group(2).matches("(?i).*?line-height.*?") && m.group(4).equalsIgnoreCase("MsoNormal")) {
        int pt = 0;
        String insert = " LINE-HEIGHT: 15.5pt; ";
        if (m.group(6).matches("(?i).*?font-size\\s*:\\s*(\\d+)pt.*")) {
            pt = Integer.valueOf(m.group(6).replaceAll("(?i).*?font-size\\s*:\\s*(\\d+)pt.*", "$1"));
        }
        if (pt > 5 && pt <= 12) {
            insert = " LINE-HEIGHT: 15.5pt; ";
        } else if (pt > 12 && pt <= 24) {
            insert = " LINE-HEIGHT: 30pt; ";
        } else if (pt > 24 && pt <= 36) {
            insert = " LINE-HEIGHT: 46pt; ";
        } else if (pt > 36) {
            insert = " LINE-HEIGHT: 62pt; ";
        }
        String result = m.replaceAll("$1"+insert+"$2$3$4$5$6$7");
        System.out.println(result);
    }
}
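The font-size lookup and the size thresholds in the snippet above could also be pulled into two small helpers. This is only a sketch: the class name LineHeightRule and the method names fontSizePt and lineHeightFor are made up for illustration, and it reads the number with a capturing group rather than the matches-then-replaceAll pair used in the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineHeightRule {
    // Same font-size expression as in the post, compiled once.
    private static final Pattern FONT_SIZE =
            Pattern.compile("(?i)font-size\\s*:\\s*(\\d+)pt");

    // Returns the first whole-number font size found in a style string, or 0 if none.
    static int fontSizePt(String style) {
        Matcher m = FONT_SIZE.matcher(style);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    // Thresholds copied from the if/else chain in the post; anything at or
    // below 12pt (including a missing size, pt == 0) gets the default.
    static String lineHeightFor(int pt) {
        if (pt > 36) return "62pt";
        if (pt > 24) return "46pt";
        if (pt > 12) return "30pt";
        return "15.5pt";
    }

    public static void main(String[] args) {
        String style = "FONT-FAMILY: SimSun; FONT-SIZE: 16pt; mso-font-kerning: 0pt";
        System.out.println(lineHeightFor(fontSizePt(style))); // prints 30pt
    }
}
```

Keeping the mapping in one place means the three copies of the if/else chain in this thread would only need to change once if the thresholds change.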
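First of all, thank you very much! But this does not work: paragraphs that already had a LINE-HEIGHT property got another one added. Could you take another look?
That is, some <P> elements end up with two LINE-HEIGHT entries.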
String s="<P style=\"TEXT-ALIGN: left; TEXT-INDENT: 32.25pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto\" class=MsoNormal align=left><SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">针对以上情况,</SPAN></P><P >test</P><P style=\"TEXT-ALIGN: left; LINE-HEIGHT: 30pt; TEXT-INDENT: 32pt; mso-pagination: widow-orphan; mso-line-height-rule: exactly; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0\"class=MsoNormal align=left> <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">共同协作,快速联动,积极组织“二次抓捕”,成功抓获 <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\"> 2 </SPAN> 名网上在逃人员。 </SPAN></P>";
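A small illustration of why this happens, on made-up input rather than the Word markup: Matcher.replaceAll resets the matcher and rewrites every match in the input, so a condition checked against the current match does not limit what gets replaced. The class name ReplaceAllDemo and the method markAll are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceAllDemo {
    // Positions the matcher on the first match, then calls replaceAll.
    // replaceAll resets the matcher first, so all matches are rewritten.
    static String markAll(String input) {
        Matcher m = Pattern.compile("a(\\d)").matcher(input);
        m.find(); // current match is "a1" only
        return m.replaceAll("<$1>");
    }

    public static void main(String[] args) {
        System.out.println(markAll("a1 a2 a3")); // prints <1> <2> <3>
    }
}
```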
Just change replaceAll to replaceFirst, for example:
String s="<P style=\"TEXT-ALIGN: left; TEXT-INDENT: 32.25pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto\" class=MsoNormal align=left><SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">针对以上情况,</SPAN></P><P >test</P><P style=\"TEXT-ALIGN: left; LINE-HEIGHT: 30pt; TEXT-INDENT: 32pt; mso-pagination: widow-orphan; mso-line-height-rule: exactly; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0\"class=MsoNormal align=left> <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">共同协作,快速联动,积极组织“二次抓捕”,成功抓获 <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\"> 2 </SPAN> 名网上在逃人员。 </SPAN></P>";
Pattern p = Pattern.compile("(?i)(<p\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*class=)(\\w*)(\\s*.*?\\s*>\\s*.*?\\s*<span\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*>\\s*.*?\\s*</span>\\s*.*?\\s*</p>)");
Matcher m = p.matcher(s);
while (m.find()) {
    if (!m.group(2).matches("(?i).*?line-height.*?") && m.group(4).equalsIgnoreCase("MsoNormal")) {
        int pt = 0;
        String insert = " LINE-HEIGHT: 15.5pt; ";
        if (m.group(6).matches("(?i).*?font-size\\s*:\\s*(\\d+)pt.*")) {
            pt = Integer.valueOf(m.group(6).replaceAll("(?i).*?font-size\\s*:\\s*(\\d+)pt.*", "$1"));
        }
        if (pt > 5 && pt <= 12) {
            insert = " LINE-HEIGHT: 15.5pt; ";
        } else if (pt > 12 && pt <= 24) {
            insert = " LINE-HEIGHT: 30pt; ";
        } else if (pt > 24 && pt <= 36) {
            insert = " LINE-HEIGHT: 46pt; ";
        } else if (pt > 36) {
            insert = " LINE-HEIGHT: 62pt; ";
        }
        String result = m.replaceFirst("$1"+insert+"$2$3$4$5$6$7"); // changed here
        System.out.println(result);
    }
}

Sorry, the version above caused an infinite loop. Here is a revised version:
String s="<P style=\"TEXT-ALIGN: left; TEXT-INDENT: 32.25pt; mso-pagination: widow-orphan; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto\" class=MsoNormal align=left><SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">针对以上情况,</SPAN></P><P >test</P><P style=\"TEXT-ALIGN: left; LINE-HEIGHT: 30pt; TEXT-INDENT: 32pt; mso-pagination: widow-orphan; mso-line-height-rule: exactly; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto; mso-char-indent-count: 2.0\"class=MsoNormal align=left> <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\">共同协作,快速联动,积极组织“二次抓捕”,成功抓获 <SPAN style=\"FONT-FAMILY: 仿宋_GB2312; FONT-SIZE: 16pt; mso-hansi-font-family: 宋体; mso-bidi-font-family: 宋体; mso-font-kerning: 0pt\"> 2 </SPAN> 名网上在逃人员。 </SPAN></P>";
s = s + "\n\n" + s; // duplicate the data a few times for testing
Pattern p = Pattern.compile("(?i)(<p\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*class=)(\\w*)(\\s*.*?\\s*>\\s*.*?\\s*<span\\s*.*?\\s*style=\\s*[\"])(\\s*.*?)([\"]\\s*>\\s*.*?\\s*</span>\\s*.*?\\s*</p>)");
Matcher m = p.matcher(s);
StringBuffer buf = new StringBuffer();
while (m.find()) {
    //System.out.printf("s=%d, e=%d\n", m.start(), m.end());
    if (!m.group(2).matches("(?i).*?line-height.*?") && m.group(4).equalsIgnoreCase("MsoNormal")) {
        int pt = 0;
        String insert = " LINE-HEIGHT: 15.5pt; ";
        if (m.group(6).matches("(?i).*?font-size\\s*:\\s*(\\d+)pt.*")) {
            pt = Integer.valueOf(m.group(6).replaceAll("(?i).*?font-size\\s*:\\s*(\\d+)pt.*", "$1"));
        }
        if (pt > 5 && pt <= 12) {
            insert = " LINE-HEIGHT: 15.5pt; ";
        } else if (pt > 12 && pt <= 24) {
            insert = " LINE-HEIGHT: 30pt; ";
        } else if (pt > 24 && pt <= 36) {
            insert = " LINE-HEIGHT: 46pt; ";
        } else if (pt > 36) {
            insert = " LINE-HEIGHT: 62pt; ";
        }
        // Rewrite only the current match; skipped matches are copied through
        // unchanged by the next appendReplacement or by appendTail.
        m.appendReplacement(buf, "$1"+insert+"$2$3$4$5$6$7");
    }
}
m.appendTail(buf);
System.out.println(buf);
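The appendReplacement/appendTail pattern used in this last version can be seen more easily on a reduced case. The sketch below is an assumption-laden simplification, not the thread's regex: it looks only at the style attribute of <p> tags, always inserts a fixed 30pt, and the names SelectiveInsert and addLineHeight are made up.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SelectiveInsert {
    // Group 1: tag up to the opening quote, group 2: style value, group 3: closing quote.
    private static final Pattern P_STYLE =
            Pattern.compile("(?i)(<p\\b[^>]*?style=\")([^\"]*)(\")");

    // Inserts a LINE-HEIGHT entry only into styles that do not mention one.
    static String addLineHeight(String html) {
        Matcher m = P_STYLE.matcher(html);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            if (!m.group(2).toLowerCase(Locale.ROOT).contains("line-height")) {
                m.appendReplacement(buf, "$1LINE-HEIGHT: 30pt; $2$3");
            }
            // Matches that are skipped here are copied verbatim later.
        }
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        String html = "<P style=\"COLOR: red\">a</P><P style=\"LINE-HEIGHT: 30pt\">b</P>";
        System.out.println(addLineHeight(html));
    }
}
```

Because each match is handled individually, a paragraph that already carries LINE-HEIGHT passes through untouched, which is the behaviour the original poster asked for.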