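I am writing a large volume of data into HBase. After my code has run for a while, the errors below appear, and about half of the DataNodes in the Hadoop cluster drop their connections and can no longer even be pinged. When the code first starts running, everything in the cluster is fine. I therefore suspect that connection timeouts during the run are the cause, but I am far from sure. I have pasted the errors shown in Eclipse below; please take a look. Many thanks.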
Error output shown in Eclipse:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:703)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:679)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:130)
at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:845)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:831)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1067)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:860)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:817)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1507)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1392)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:918)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:770)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:757)
at cn.edu.xidian.repace.xml2hbase.hbase.HbaseInsert.addRecord(HbaseInsert.java:117)
at cn.edu.xidian.repace.xml2hbase.test.main(test.java:57)
13/09/10 08:41:45 INFO zookeeper.ClientCnxn: Opening socket connection to server centos2.xdccl.com/192.168.1.121:2181. Will not attempt to authenticate using SASL (unknown error)
13/09/10 08:41:45 INFO zookeeper.ClientCnxn: Socket connection established to centos2.xdccl.com/192.168.1.121:2181, initiating session
13/09/10 08:41:45 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x41010022f40001, likely server has closed socket, closing socket connection and attempting reconnect
13/09/10 08:41:45 INFO zookeeper.ClientCnxn: Opening socket connection to server centos16.xdccl.com/192.168.1.132:2181. Will not attempt to authenticate using SASL (unknown error)
13/09/10 08:41:54 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 9476ms for sessionid 0x41010022f40001, closing socket connection and attempting reconnect
13/09/10 08:41:54 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
13/09/10 08:41:54 INFO util.RetryCounter: Sleeping 8000ms before retry #3...
13/09/10 08:42:05 WARN zookeeper.ZKUtil: hconnection-0x41010022f40001-0x41010022f40001 Unable to set watcher on znode (/hbase)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:444)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:214)
at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:77)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:845)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:860)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:817)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1507)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1392)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:918)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:770)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:757)
at cn.edu.xidian.repace.xml2hbase.hbase.HbaseInsert.addRecord(HbaseInsert.java:117)
at cn.edu.xidian.repace.xml2hbase.test.main(test.java:57)
13/09/10 08:46:53 INFO client.HConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, will automatically reconnect when needed.
13/09/10 08:46:53 INFO zookeeper.ClientCnxn: Opening socket connection to server centos2.xdccl.com/192.168.1.121:2181. Will not attempt to authenticate using SASL (unknown error)
13/09/10 08:46:53 INFO zookeeper.ClientCnxn: Socket connection established to centos2.xdccl.com/192.168.1.121:2181, initiating session
13/09/10 08:46:53 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x41010022f40001, likely server has closed socket, closing socket connection and attempting reconnect
Unable to find region for C2V-x64.0-4,2874673,99999999999999 after 10 tries.record insertion failed
13/09/10 08:46:54 INFO zookeeper.ClientCnxn: Opening socket connection to server centos16.xdccl.com/192.168.1.132:2181. Will not attempt to authenticate using SASL (unknown error)
13/09/10 08:47:04 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 10990ms for sessionid 0x41010022f40001, closing socket connection and attempting reconnect
13/09/10 08:48:30 WARN zookeeper.ClientCnxn: Session 0x41010022f40001 for server centos7.xdccl.com/192.168.1.123:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: An established connection was aborted by the software in your host machine.
at sun.nio.ch.SocketDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
13/09/10 08:48:35 INFO zookeeper.ClientCnxn: Opening socket connection to server centos1.xdccl.com/192.168.1.130:2181. Will not attempt to authenticate using SASL (unknown error)
13/09/10 08:48:49 INFO zookeeper.ClientCnxn: Socket connection established to centos1.xdccl.com/192.168.1.130:2181, initiating session
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.StringCoding.safeTrim(Unknown Source)
at java.lang.StringCoding.access$300(Unknown Source)
at java.lang.StringCoding$StringEncoder.encode(Unknown Source)
at java.lang.StringCoding.encode(Unknown Source)
at java.lang.String.getBytes(Unknown Source)
at org.apache.hadoop.hbase.util.Bytes.toBytes(Bytes.java:419)
at cn.edu.xidian.repace.xml2hbase.hbase.HbaseInsert.addRecord(HbaseInsert.java:106)
at cn.edu.xidian.repace.xml2hbase.test.main(test.java:57)
13/09/10 08:49:02 WARN zookeeper.ClientCnxn: Session 0x41010022f40001 for server centos1.xdccl.com/192.168.1.130:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: An established connection was aborted by the software in your host machine.
at sun.nio.ch.SocketDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

Solutions »

  1.   

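    1. Check the maximum number of connections configured on your server, and raise it to a fairly large value.
    2. Check your program: is the connection closed once data access is finished? Ideally, close it after every query.
    3. Java has a setting for the maximum memory (heap) a running program may use; increase it.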
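
As a rough illustration of the third suggestion: the JVM's maximum heap is normally set at launch with the `-Xmx` option (for example `-Xmx2g`, which in Eclipse goes into the run configuration's VM arguments). The sketch below only reports the current limits, which may help confirm whether a new setting took effect. The class name `HeapReport` is made up for this example.

```java
class HeapReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of the JVM heap limits. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound the heap may grow to (-Xmx)
        long total = rt.totalMemory();  // heap currently reserved by the JVM
        long free = rt.freeMemory();    // unused part of the reserved heap
        return "max=" + toMiB(max) + "MiB, used=" + toMiB(total - free) + "MiB";
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Running this once before the import starts, and again with a larger `-Xmx` value, shows whether the limit actually changed.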
      

  2.   

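    1. A maximum number of connections on the server? My program is single-threaded; does it need multiple connections?
    2. Right now I am only storing data into HBase, not querying it. I actually need to store a large number of XML documents in HBase, so I call the database store once for every few documents processed, using these two statements:
        C2Vtable.put(putList);
        C2Vtable.flushCommits();
    Do I still need to close the connection after that? What exactly does closing mean, and how do I do it? Please explain; I don't quite understand. Thanks.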
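
One thing that may be worth checking in a loop like this is whether `putList` is emptied after each flush; if it keeps growing, the client heap grows with it. The following is a minimal sketch of flushing in fixed-size batches. It does not use the HBase API: `RecordSink`, `BatchWriter`, and the batch size of 3 are hypothetical stand-ins, with `RecordSink.write` playing the role of the `put` plus `flushCommits` pair above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the table being written to (not an HBase type). */
interface RecordSink {
    void write(List<String> batch);
}

/** Buffers records and hands them to the sink in fixed-size batches. */
class BatchWriter {
    private final RecordSink sink;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchWriter(RecordSink sink, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Adds one record and flushes automatically once the batch is full. */
    void add(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered, then empties the buffer. */
    void flush() {
        if (buffer.isEmpty()) return;
        sink.write(new ArrayList<>(buffer)); // pass a copy so the sink owns its batch
        buffer.clear();                      // keeps memory bounded by batchSize
    }

    int pending() { return buffer.size(); }
}

class BatchWriterDemo {
    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        BatchWriter writer = new BatchWriter(batch -> sizes.add(batch.size()), 3);
        for (int i = 0; i < 7; i++) writer.add("doc-" + i);
        writer.flush();                      // send the final partial batch
        System.out.println(sizes);           // prints [3, 3, 1]
    }
}
```

The point of the design is that the buffer never holds more than one batch, regardless of how many documents are processed in total.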
      

  3.   

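    Closing:
    In .NET there is a connect.close() method, which closes the connection to the database once the query results have been retrieved. I don't know how this is written in Java.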
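
For reference, the usual Java counterpart is also a `close()` method: resources that implement `java.io.Closeable` can be released with try-with-resources, which calls `close()` even when an exception is thrown. The HBase client's `HTable` has a `close()` method as well, so the same shape should apply to `C2Vtable` once all writes are done, though the exact behaviour is worth checking against the client version in use. The sketch below uses a made-up `DemoTable` class rather than HBase itself.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Made-up stand-in for a table handle; not part of HBase. */
class DemoTable implements Closeable {
    final List<String> log = new ArrayList<>();
    private boolean closed = false;

    void put(String row) {
        if (closed) throw new IllegalStateException("table is closed");
        log.add("put " + row);
    }

    boolean isClosed() { return closed; }

    @Override
    public void close() {
        // A real client would flush pending writes and release connections here.
        closed = true;
        log.add("close");
    }
}

class CloseDemo {
    /** Writes the rows, then closes the table even if a put throws. */
    static DemoTable writeAll(List<String> rows) {
        DemoTable table = new DemoTable();
        try (DemoTable t = table) {
            for (String row : rows) t.put(row);
        }
        return table;
    }

    public static void main(String[] args) {
        DemoTable table = writeAll(Arrays.asList("r1", "r2"));
        System.out.println(table.log);       // prints [put r1, put r2, close]
    }
}
```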