I am reading a log file in the following format (it may run to several hundred thousand lines):
20050928172857003218 222 1127899737796
20050928172858472769 333 1127899738203
20050928172858780754 444 1127899738875
20050928172859651455 1124 1127899739468
.
.
.
At the moment I read it all in one go; the code is below:
public static HashMap logmap = new HashMap(); // global variable
... // (instrea, buffer and firstLine are instance fields declared in the elided part)
public LogAnaly(String pathname) {
    FileInputStream finStrm = null;
    try {
        // remember the given file path
        this.setFilePath(pathname);
        if (!"".equals(pathname)) { // was: pathname != "", which compares references, not contents
            finStrm = new FileInputStream(pathname);
        }
        if (finStrm != null) {
            instrea = new InputStreamReader(finStrm);
        }
        if (instrea != null) {
            // wrap it in a buffering reader
            buffer = new BufferedReader(instrea);
            //logmap = new HashMap();
            int intRow = 0;
            // readLine() returns null at end of file; ready() only reports that a
            // read will not block, so it is not a reliable end-of-file test
            while ((firstLine = buffer.readLine()) != null) {
                // add to the hashmap
                if (firstLine.length() > 0) { // was: firstLine != "", which is always true
                    logmap.put(String.valueOf(intRow), firstLine);
                }
                intRow++;
            }
        }
    }
    catch (FileNotFoundException ex) {
        System.out.println(ex.toString());
    }
    catch (IOException IOEX) {
        System.out.println(IOEX.toString());
    }
    finally {
        try {
            if (buffer != null) {
                buffer.close(); // closing the outermost reader closes the ones it wraps
            } else if (instrea != null) {
                instrea.close();
            }
        }
        catch (IOException ex1) {
            // ignore failures while closing
        }
    }
}
Once the log file reaches about 60,000 lines, my whole log-analysis program stops working properly.
I want to figure out:
1. How can I keep memory usage under control?
2. If I read in batches, say a 200,000-line file read 40,000 lines at a time over 5 passes, is that more efficient than reading everything at once, and what would the code look like?
Please advise; sample code would be much appreciated.
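On question 2: reading in batches only saves memory if each batch is processed and then dropped before the next is read; if every batch still ends up in logmap, five passes hold exactly as much as one. A minimal sketch, assuming the analysis can run one chunk at a time (BatchLogReader and processBatch are hypothetical stand-ins for the real analysis step):

import java.io.*;
import java.util.*;

public class BatchLogReader {
    private static final int BATCH_SIZE = 40000; // 200,000 lines -> 5 batches

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        try {
            List batch = new ArrayList(BATCH_SIZE);
            String line;
            while ((line = in.readLine()) != null) {
                if (line.length() > 0) {
                    batch.add(line);
                }
                if (batch.size() == BATCH_SIZE) {
                    processBatch(batch); // analyze this chunk...
                    batch.clear();       // ...then release it before reading on
                }
            }
            if (!batch.isEmpty()) {
                processBatch(batch);     // final partial batch
            }
        } finally {
            in.close();
        }
    }

    // Hypothetical hook standing in for the real analysis.
    private static void processBatch(List lines) {
        System.out.println("processed " + lines.size() + " lines");
    }
}

The BufferedReader itself only ever buffers a few kilobytes, so the memory ceiling is set by BATCH_SIZE, not by the file length.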
Solution »
// {Clean: temp.tmp}
import java.io.*;
import java.nio.*;
import java.nio.channels.*;

public class MappedIO {
private static int numOfInts = 4000000;
private static int numOfUbuffInts = 200000;
private abstract static class Tester {
private String name;
public Tester(String name) { this.name = name; }
public long runTest() {
System.out.print(name + ": ");
try {
long startTime = System.currentTimeMillis();
test();
long endTime = System.currentTimeMillis();
return (endTime - startTime);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
public abstract void test() throws IOException;
}
private static Tester[] tests = {
new Tester("Stream Write") {
public void test() throws IOException {
DataOutputStream dos = new DataOutputStream(
new BufferedOutputStream(
new FileOutputStream(new File("temp.tmp"))));
for(int i = 0; i < numOfInts; i++)
dos.writeInt(i);
dos.close();
}
},
new Tester("Mapped Write") {
public void test() throws IOException {
FileChannel fc =
new RandomAccessFile("temp.tmp", "rw")
.getChannel();
IntBuffer ib = fc.map(
FileChannel.MapMode.READ_WRITE, 0, fc.size())
.asIntBuffer();
for(int i = 0; i < numOfInts; i++)
ib.put(i);
fc.close();
}
},
new Tester("Stream Read") {
public void test() throws IOException {
DataInputStream dis = new DataInputStream(
new BufferedInputStream(
new FileInputStream("temp.tmp")));
for(int i = 0; i < numOfInts; i++)
dis.readInt();
dis.close();
}
},
new Tester("Mapped Read") {
public void test() throws IOException {
FileChannel fc = new FileInputStream(
new File("temp.tmp")).getChannel();
IntBuffer ib = fc.map(
FileChannel.MapMode.READ_ONLY, 0, fc.size())
.asIntBuffer();
while(ib.hasRemaining())
ib.get();
fc.close();
}
},
new Tester("Stream Read/Write") {
public void test() throws IOException {
RandomAccessFile raf = new RandomAccessFile(
new File("temp.tmp"), "rw");
raf.writeInt(1);
for(int i = 0; i < numOfUbuffInts; i++) {
raf.seek(raf.length() - 4);
raf.writeInt(raf.readInt());
}
raf.close();
}
},
new Tester("Mapped Read/Write") {
public void test() throws IOException {
FileChannel fc = new RandomAccessFile(
new File("temp.tmp"), "rw").getChannel();
IntBuffer ib = fc.map(
FileChannel.MapMode.READ_WRITE, 0, fc.size())
.asIntBuffer();
ib.put(0);
for(int i = 1; i < numOfUbuffInts; i++)
ib.put(ib.get(i - 1));
fc.close();
}
}
};
public static void main(String[] args) {
for(int i = 0; i < tests.length; i++)
System.out.println(tests[i].runTest());
}
} ///:~
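A note on running the benchmark above: the testers run in array order, and the first one (Stream Write) is what creates temp.tmp (4,000,000 ints, 16,000,000 bytes), so the mapped tests that map fc.size() bytes find the file already in place. runTest() prints each tester's name and main() prints the elapsed milliseconds beside it.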
The log format is:
20050928172857003218 222 1127899737796...
20050928172858472769 333 1127899738203...
20050928172858780754 444 1127899738875...
...
So I need to read the file line by line.
I tried the approach majy suggested; the java.nio.* classes really do read large files much faster. Thanks for that.
3. How should I store a file this large? Right now I do logmap.put(String.valueOf(intRow), firstLine); and in practice it works very poorly.
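On question 3: logmap.put(String.valueOf(intRow), firstLine) allocates a key String, keeps the whole line as the value, and adds a map entry for every row, so memory grows quickly and hashing buys nothing when the key is just the row number. If the lines must be kept, an ArrayList preserves row order for free; better still, parse each line as it is read and keep only the numbers needed. A minimal sketch under that assumption (summing the middle field is only an example aggregate; LogScan is a hypothetical class name):

import java.io.*;
import java.util.*;

public class LogScan {
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        long rows = 0;
        long sum = 0; // example aggregate over the middle field
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.length() == 0) {
                    continue;
                }
                // Three whitespace-separated fields, as in the sample lines
                StringTokenizer st = new StringTokenizer(line);
                String stamp = st.nextToken();                // 20-digit stamp: too wide for long, keep as String
                long value = Long.parseLong(st.nextToken());  // e.g. 222
                long millis = Long.parseLong(st.nextToken()); // e.g. 1127899737796
                rows++;
                sum += value;
            }
        } finally {
            in.close();
        }
        System.out.println(rows + " rows, sum of field 2 = " + sum);
    }
}

With this shape the per-row cost is a few local variables instead of two Strings and a map entry, so the file size stops mattering to the heap.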