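I have an HTML document stored in a String. How can I use a regular expression to extract the value of every row and column? The expected result is:
北京 20 20 12 13 33 23 12 34 23 23 22
天津 20 20 12 13 33 23 12 34 23 23 22
上海 20 20 12 13 33 23 12 34 23 23 22
广东 20 20 12 13 33 23 12 34 23 23 22
The rows holding each city's scores use <tr bgcolor="#ffffee">, which differs from the title row's background <tr bgcolor="#eeeeee"> (the column-header rows use #eeeeff), so this attribute can be used to tell data rows from header rows. The HTML is as follows:
<HTML>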
<HEAD>
<title>月度考核成绩</title>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<BODY topmargin=0 leftmargin=0>
<TABLE WIDTH="95%" border="0" align="center" cellpadding="3" cellspacing="1" bgcolor=SteelBlue class="Navbar" ID="Table2">
<tr bgcolor="#eeeeee">
<td height=40 align=center colspan=16><font size=4><b>月度考核成绩</b></font></td>
</tr>
<tr bgcolor="#eeeeff">
<th align=center width="9%" rowspan="2">地区</th>
<th align=center width="27%" colspan="3">可用性(20分)</th>
<th align=center width="8%" rowspan="2">一致性<br>(20分)</th>
<th align=center width="8%" rowspan="2">准确性<br>(20分)</th>
<th align=center width="8%" rowspan="2">及时性<br>(20分)</th>
<th align=center width="8%" rowspan="2">准确性<br>(20分)</th>
<th align=center width="8%" rowspan="2">扣分<br>(-10分)</th>
<th align=center width="8%" rowspan="2">总分<br>(100分)</th>
<th align=center width="8%" rowspan="2">名次</th>
<th align=center width="8%" rowspan="2">级别</th>
</tr><tr bgcolor="#eeeeff">
<th align=center width="9%">指标(15分)</th>
<th align=center width="9%">成功率(5分)</th>
<th align=center width="9%">小计</th>
</tr>
<tr bgcolor="#ffffee">
<td>北京</td>
<td align="center">20</td>
<td align="center">20</td>
<td align="center">12</td>
<td align="center">13</td>
<td align="center">33</td>
<td align="center">23</td>
<td align="center">12</td>
<td align="center">34</td>
<td align="center">23</td>
<td align="center">23</td>
<td align="center">22</td>
</tr>
<tr bgcolor="#ffffee">
<td>天津</td>
<td align="center">20</td>
<td align="center">20</td>
<td align="center">12</td>
<td align="center">13</td>
<td align="center">33</td>
<td align="center">23</td>
<td align="center">12</td>
<td align="center">34</td>
<td align="center">23</td>
<td align="center">23</td>
<td align="center">22</td>
</tr> <tr bgcolor="#ffffee">
<td>上海</td>
<td align="center">20</td>
<td align="center">20</td>
<td align="center">12</td>
<td align="center">13</td>
<td align="center">33</td>
<td align="center">23</td>
<td align="center">12</td>
<td align="center">34</td>
<td align="center">23</td>
<td align="center">23</td>
<td align="center">22</td>
</tr> <tr bgcolor="#ffffee">
<td>广东</td>
<td align="center">20</td>
<td align="center">20</td>
<td align="center">12</td>
<td align="center">13</td>
<td align="center">33</td>
<td align="center">23</td>
<td align="center">12</td>
<td align="center">34</td>
<td align="center">23</td>
<td align="center">23</td>
<td align="center">22</td>
</tr>
</TABLE>
</BODY>
</HTML>
Solutions
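I am not processing this inside an HTML page. I saved the HTML file locally and read it into a String, so a regular expression is the only way I can parse it.

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String html = "<HTML>\r\n" +
"<HEAD>\r\n" +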
"<title>月度考核成绩</title>\r\n" +
"<meta http-equiv=\"Content-Type\" content=\"text/html; charset=gb2312\">\r\n" +
"<BODY topmargin=0 leftmargin=0>\r\n" +
"<TABLE WIDTH=\"95%\" border=\"0\" align=\"center\" cellpadding=\"3\" cellspacing=\"1\" bgcolor=SteelBlue class=\"Navbar\" ID=\"Table2\">\r\n" +
" \r\n" +
"<tr bgcolor=\"#eeeeee\">\r\n" +
"<td height=40 align=center colspan=16><font size=4><b>月度考核成绩</b></font></td>\r\n" +
"</tr>\r\n" +
"<tr bgcolor=\"#eeeeff\">\r\n" +
"<th align=center width=\"9%\" rowspan=\"2\">地区</th>\r\n" +
"<th align=center width=\"27%\" colspan=\"3\">可用性(20分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">一致性<br>(20分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">准确性<br>(20分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">及时性<br>(20分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">准确性<br>(20分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">扣分<br>(-10分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">总分<br>(100分)</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">名次</th>\r\n" +
"<th align=center width=\"8%\" rowspan=\"2\">级别</th> \r\n" +
"</tr>\r\n" +
" \r\n" +
"<tr bgcolor=\"#eeeeff\">\r\n" +
"<th align=center width=\"9%\">指标(15分)</th>\r\n" +
"<th align=center width=\"9%\">成功率(5分)</th>\r\n" +
"<th align=center width=\"9%\">小计</th>\r\n" +
"</tr>\r\n" +
" \r\n" +
"<tr bgcolor=\"#ffffee\">\r\n" +
"<td>北京</td>\r\n" +
"<td align=\"center\">20</td>\r\n" +
"<td align=\"center\">20</td> \r\n" +
"<td align=\"center\">12</td>\r\n" +
"<td align=\"center\">13</td>\r\n" +
"<td align=\"center\">33</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">12</td> \r\n" +
"<td align=\"center\">34</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">22</td>\r\n" +
"</tr> \r\n" +
" \r\n" +
"<tr bgcolor=\"#ffffee\">\r\n" +
"<td>天津</td>\r\n" +
"<td align=\"center\">20</td>\r\n" +
"<td align=\"center\">20</td> \r\n" +
"<td align=\"center\">12</td>\r\n" +
"<td align=\"center\">13</td>\r\n" +
"<td align=\"center\">33</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">12</td> \r\n" +
"<td align=\"center\">34</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">22</td>\r\n" +
"</tr> \r\n" +
" \r\n" +
"<tr bgcolor=\"#ffffee\">\r\n" +
"<td>上海</td>\r\n" +
"<td align=\"center\">20</td>\r\n" +
"<td align=\"center\">20</td> \r\n" +
"<td align=\"center\">12</td>\r\n" +
"<td align=\"center\">13</td>\r\n" +
"<td align=\"center\">33</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">12</td> \r\n" +
"<td align=\"center\">34</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">22</td>\r\n" +
"</tr> \r\n" +
" \r\n" +
"<tr bgcolor=\"#ffffee\">\r\n" +
"<td>广东</td>\r\n" +
"<td align=\"center\">20</td>\r\n" +
"<td align=\"center\">20</td> \r\n" +
"<td align=\"center\">12</td>\r\n" +
"<td align=\"center\">13</td>\r\n" +
"<td align=\"center\">33</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">12</td> \r\n" +
"<td align=\"center\">34</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">23</td>\r\n" +
"<td align=\"center\">22</td>\r\n" +
"</tr> \r\n" +
"</TABLE>\r\n" +
"</BODY>\r\n" +
"</HTML>";
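// Build a pattern for one data row: the row marker, the city name, then 11 numeric cells.
StringBuilder buff = new StringBuilder("(?s)<tr bgcolor=\"#ffffee\">.*?<td>(.*?)</td>");
for (int i = 0; i < 11; i++) {
    buff.append("(?:\\s*<td align=\"center\">(\\d+)</td>\\s*)");
}
buff.append("</tr>");
Pattern pattern = Pattern.compile(buff.toString());
Matcher matcher = pattern.matcher(html);
while (matcher.find()) {
    // Group 1 is the city name and groups 2..12 are the scores,
    // so loop up to groupCount() to include the last column.
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.printf("%s ", matcher.group(i));
    }
    System.out.println();
}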
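The pattern above hard-codes 11 numeric columns. If the column count might change, one alternative is to match in two passes: first each data row, then every cell inside that row. The following is only a sketch under that assumption; the class name TableRows, the method extractRows and the shortened ASCII sample in main are made up for illustration and are not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TableRows {
    // Data rows are recognised by their background colour, as described in the question.
    private static final Pattern ROW = Pattern.compile(
            "<tr\\s+bgcolor=\"#ffffee\"\\s*>(.*?)</tr>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Any <td>, with or without attributes; group 1 is the cell text.
    private static final Pattern CELL = Pattern.compile(
            "<td\\b[^>]*>(.*?)</td>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns one list of cell values per data row, in document order.
    public static List<List<String>> extractRows(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                cells.add(cell.group(1).trim());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened sample: one header row (skipped) and one data row.
        String html = "<tr bgcolor=\"#eeeeff\"><th>Region</th></tr>\r\n"
                + "<tr bgcolor=\"#ffffee\">\r\n"
                + "<td>Beijing</td>\r\n"
                + "<td align=\"center\">20</td> \r\n"
                + "<td align=\"center\">22</td>\r\n"
                + "</tr>";
        for (List<String> cells : extractRows(html)) {
            System.out.println(String.join(" ", cells));
        }
    }
}
```

Because each row is returned as a list, a row with more or fewer cells still comes back intact instead of failing the whole match. This still relies on the table being as regular as the one in the question; nested tables or a different bgcolor spelling would need a different pattern.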