两个数组:
String[] A ;
String[] B ;两种循环方法:
1.
for (int i = 0; i < A.length; i++) {
for (int j = 0; j < B.length; j++) {
if (A[i].equals(B[j])) {
break;
}
}
}2.
Arrays.sort(A);
for (int k = 0; k < B.length; k++) {
if (Arrays.binarySearch(A, B[k]) >= 0) {
break;
}
}哪种会更快一些呢?
String[] A ;
String[] B ;两种循环方法:
1.
for (int i = 0; i < A.length; i++) {
for (int j = 0; j < B.length; j++) {
if (A[i].equals(B[j])) {
break;
}
}
}2.
Arrays.sort(A);
for (int k = 0; k < B.length; k++) {
if (Arrays.binarySearch(A, B[k]) >= 0) {
break;
}
}哪种会更快一些呢?
如果要自己写数据内容进行测试的话,怎样才能避免这样的极端现象呢?
我想考虑平均情况,应该怎么做呢?
可以知道 当数据越多 第2种将有一定优势
Arrays.binarySearch(A, B[k])是log(n)
所以,总体上还是n*log(n)
Arrays.sort(A);
for (int k = 0; k < B.length; k++) {
if (Arrays.binarySearch(A, B[k]) > = 0) {
break;
}
}
} else {
Arrays.sort(B);
for (int k = 0; k < A.length; k++) {
if (Arrays.binarySearch(B, A[k]) > = 0) {
break;
}
}
}
假设A数组长度为n,B数组长度为m,那么排序预处理的复杂度是n*log(n),m次二分查找的复杂度是m*log(n),所以整个的复杂度是(m+n)*log(n),所以应该预先排序较小的数组。
测试如下:
import java.util.Arrays;
import java.util.Random;public class Main {
static int NUM_OF_TESTS = 10; public static void main(String[] args) {
for (int testNo = 0; testNo < NUM_OF_TESTS; ++testNo) {
System.out.println("Test Case No." + (testNo+1) + ":");
// 随机产生长度不超过1000000的数组
Random random = new Random();
int n = Math.abs(random.nextInt()) % 1000000, m = Math.abs(random
.nextInt()) % 1000000; if (n < m) {
int tmp = n;
n = m;
m = tmp;
} int[] A = new int[n];
int[] B = new int[m];
int[] C = new int[n];
int[] D = new int[m];
for (int i = 0; i < n; ++i) {
A[i] = C[i] = random.nextInt();
}
for (int i = 0; i < m; ++i) {
B[i] = D[i] = random.nextInt();
} // 为了减少数据对结果的影响,这里不break
long startTime = System.currentTimeMillis();
Arrays.sort(A);
for (int k = 0; k < B.length; k++) {
if (Arrays.binarySearch(A, B[k]) >= 0) {
// break;
}
}
long endTime = System.currentTimeMillis();
long presortLargerCost = endTime - startTime; startTime = System.currentTimeMillis();
Arrays.sort(D);
for (int k = 0; k < D.length; k++) {
if (Arrays.binarySearch(C, D[k]) >= 0) {
// break;
}
}
endTime = System.currentTimeMillis();
long presortSmallerCost = endTime - startTime; System.out.println("Presort larger Array" + "(" + n + ")"
+ " cost: " + presortLargerCost + " ms");
System.out.println("Presort smaller Array" + "(" + m + ")"
+ " cost: " + presortSmallerCost + " ms");
}
}
}Sample Output:
Test Case No.1:
Presort larger Array(664728) cost: 312 ms
Presort smaller Array(585535) cost: 152 ms
Test Case No.2:
Presort larger Array(356144) cost: 102 ms
Presort smaller Array(181028) cost: 44 ms
Test Case No.3:
Presort larger Array(887920) cost: 424 ms
Presort smaller Array(745125) cost: 205 ms
Test Case No.4:
Presort larger Array(917560) cost: 423 ms
Presort smaller Array(707575) cost: 203 ms
Test Case No.5:
Presort larger Array(740511) cost: 272 ms
Presort smaller Array(442009) cost: 115 ms
Test Case No.6:
Presort larger Array(479413) cost: 155 ms
Presort smaller Array(248410) cost: 61 ms
Test Case No.7:
Presort larger Array(766345) cost: 312 ms
Presort smaller Array(533508) cost: 140 ms
Test Case No.8:
Presort larger Array(997986) cost: 418 ms
Presort smaller Array(643405) cost: 177 ms
Test Case No.9:
Presort larger Array(265796) cost: 70 ms
Presort smaller Array(113479) cost: 26 ms
Test Case No.10:
Presort larger Array(989236) cost: 227 ms
Presort smaller Array(145733) cost: 35 ms
第一种和第二种相比,除了当n,m很小的时候,会因为常数因子小有些许优势外,基本无优势可言。当然是第二种快。
忘记说了,lz说的第一种的复杂度是o(n*m),第二种是o((n+m)*log(n)),我前面说的是在第二种上的优化。