In a Java source file, an if statement is compiled to a compare-and-branch instruction in bytecode, while a while loop is implemented with two instructions: a compare-and-branch and a goto.

Take the following Java program:

public class TestCycle {
    public static void main(String[] args) {
        int a = 1;
        int b = 2;
        if (a < b) {
            a = a - 3;
        }
        while (a < b) {
            a++;
        }
    }
}

Compiled to bytecode, it looks like this:

public static void main(java.lang.String[]);
  LineNumberTable:
   line 6: 0
   line 7: 2
   line 8: 4
   line 9: 9
   line 12: 12
   line 13: 15
   line 12: 18
   line 15: 23
  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      24      0     args   [Ljava/lang/String;
   2      22      1     a      I
   4      20      2     b      I
  Code:
   Stack=2, Locals=3, Args_size=1
   0: iconst_1
   1: istore_1
   2: iconst_2
   3: istore_2
   4: iload_1
   5: iload_2
   6: if_icmpge 18
   9: iinc 1, -3
   12: goto 18
   15: iinc 1, 1
   18: iload_1
   19: iload_2
   20: if_icmplt 15
   23: return
  LineNumberTable:
   line 6: 0
   line 7: 2
   line 8: 4
   line 9: 9
   line 12: 12
   line 13: 15
   line 12: 18
   line 15: 23
  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      24      0     args   [Ljava/lang/String;
   2      22      1     a      I
   4      20      2     b      I
  StackMapTable: number_of_entries = 3
   frame_type = 253 /* append */
     offset_delta = 12
     locals = [ int, int ]
   frame_type = 2 /* same */
   frame_type = 2 /* same */
}

While analyzing the bytecode, when I reach the if_icmplt instruction at offset 20, how can I tell that this branch instruction corresponds to a while loop?
One approach is pattern matching: check whether a later goto jumps to a point two or three instructions before this branch instruction. But this is not very reliable when the loop condition and the loop body are complex.
Is there a good way to tell that a branch instruction in bytecode corresponds to a while or for loop in the source code rather than to an if statement?
I found a solution, so I'm sharing it with you:

@Override
public void sawOpcode(int seen) {
    if (seen == IF_ICMPGE || seen == IF_ICMPGT || seen == IF_ICMPLT || seen == IF_ICMPLE || seen == IF_ICMPNE || seen == IF_ICMPEQ) {
        int branchTarget = getBranchTarget();
        int seenGOTO = getCodeByte(branchTarget - 3);
        int seenIINC = getCodeByte(getPC() - 6);
        // this shows it is a for loop
        if (seenGOTO == GOTO && seenIINC == IINC) {
        }
    }
    if (seen == IFEQ) {
        int branchTarget = getBranchTarget();
        int seenGOTO = getCodeByte(branchTarget - 3);
        int seenIINC = getCodeByte(getPC() - 4);
        // this shows it is a while loop
        if (seenGOTO == GOTO && seenIINC == IINC) {
        }
    }
}
// Yesterday's check for while loops was not thorough enough; it should be:
if (seen == IFEQ || seen == IFNE || seen == IFLT || seen == IFGE || seen == IFGT || seen == IFLE) {
    int branchTarget = getBranchTarget();
    int seenGOTO = getCodeByte(branchTarget - 3);
    int seenIINC = getCodeByte(getPC() - 4);
    // this shows it is a while loop
    if (seenGOTO == GOTO && seenIINC == IINC) {
    }
}
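One thing that may help with the fragile offset arithmetic above: in the TestCycle listing, the if branch at offset 6 jumps forward (to 18), while the loop branch at offset 20 jumps backward (to 15). A backward jump is what closes a loop, whatever the condition or body looks like. Below is a minimal, standalone sketch of that check. The Branch class and the backEdges helper are made up for illustration and are not part of FindBugs or any library; the three branches are copied by hand from the listing. Inside a detector like the one posted, the same test would presumably be a comparison of getBranchTarget() with getPC(). Note that this only separates loops from plain if statements. Whether the source said while, for, or do-while usually cannot be recovered with certainty, because compilers can emit the same instruction shape for all of them.

```java
import java.util.ArrayList;
import java.util.List;

public class BackEdgeSketch {
    /** A branch instruction: its own offset, its mnemonic, and the offset it jumps to. (Hypothetical helper type.) */
    static final class Branch {
        final int pc;
        final String mnemonic;
        final int target;

        Branch(int pc, String mnemonic, int target) {
            this.pc = pc;
            this.mnemonic = mnemonic;
            this.target = target;
        }
    }

    /** Returns the branches that jump backward, i.e. whose target is at or before the branch itself. */
    static List<Branch> backEdges(List<Branch> branches) {
        List<Branch> result = new ArrayList<>();
        for (Branch b : branches) {
            if (b.target <= b.pc) {
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Branch instructions copied by hand from the TestCycle listing.
        List<Branch> branches = new ArrayList<>();
        branches.add(new Branch(6, "if_icmpge", 18));  // the if statement: forward jump
        branches.add(new Branch(12, "goto", 18));      // jump to the loop condition: forward
        branches.add(new Branch(20, "if_icmplt", 15)); // loop condition: backward jump

        for (Branch b : backEdges(branches)) {
            System.out.println(b.pc + ": " + b.mnemonic + " " + b.target + " closes a loop");
        }
        // prints: 20: if_icmplt 15 closes a loop
    }
}
```

For nested loops or loops with break/continue, a backward jump still marks each loop, but working out which instructions belong to which loop needs a proper control-flow analysis rather than a single comparison.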
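You will have to inspect the bytecode yourself with javap -c list and compare it against the source in list.java.
Even if a tool for this exists, it probably won't be perfect: even the best decompilers do not necessarily produce code identical to the original source.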
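public class NumTest {
    public static void main(String args[]) {
        int a=1,b=2;
        System.out.println(a+b);
    }
}

Running javap -c -l NumTest prints the following. Compared with the output without the -l option, there is an additional LineNumberTable mapping:

Compiled from "NumTest.java"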
public class NumTest {
public NumTest();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0

public static void main(java.lang.String[]);
Code:
0: iconst_1
1: istore_1
2: iconst_2
3: istore_2
4: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
7: iload_1
8: iload_2
9: iadd
10: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
13: return
LineNumberTable:
line 3: 0
line 5: 4
line 6: 13
}
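I am processing the class files with a program, not reading them by eye. Of course I could script the approach above by invoking the command, but that is not really the proper way to do it.
Since javap knows this mapping, how does it find it? Where does this table come from?
Is there an existing class-file processing library that can do something similar?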
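On where the table comes from: the mapping is stored in the class file itself, as a LineNumberTable attribute nested inside each method's Code attribute, and javap just prints it. Its layout is a 2-byte entry count followed by pairs of 2-byte values (start_pc, line_number), all big-endian. The sketch below decodes only the body of that attribute. The class and method names are made up for illustration, and the input bytes are hand-encoded from the main() table in the NumTest output above rather than read from a real file. Finding the attribute inside a class file means walking the constant pool, methods, and attributes first, which is the part a library such as BCEL or ASM would normally handle.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

public class LineNumberTableSketch {
    /**
     * Decodes the body of a LineNumberTable attribute (the bytes that follow the
     * attribute's name index and length) into a map of start_pc -> source line.
     */
    static Map<Integer, Integer> decode(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body);       // ByteBuffer is big-endian by default
        int count = buf.getShort() & 0xFFFF;          // line_number_table_length
        Map<Integer, Integer> pcToLine = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            int startPc = buf.getShort() & 0xFFFF;    // bytecode offset where the line starts
            int line = buf.getShort() & 0xFFFF;       // line number in the source file
            pcToLine.put(startPc, line);
        }
        return pcToLine;
    }

    public static void main(String[] args) {
        // Hand-encoded from the NumTest output: line 3: 0, line 5: 4, line 6: 13.
        byte[] body = {
            0, 3,          // three entries
            0, 0, 0, 3,    // start_pc 0  -> line 3
            0, 4, 0, 5,    // start_pc 4  -> line 5
            0, 13, 0, 6    // start_pc 13 -> line 6
        };
        System.out.println(decode(body));
        // prints: {0=3, 4=5, 13=6}
    }
}
```

Keep in mind the attribute is optional debug information. If a class is compiled with -g:none it is absent, and no tool can reconstruct it.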