Spark job fails when writing data to Elasticsearch
Calling

EsSpark.saveToEs(result, "userprofile/users", Map("es.mapping.id" -> "uid"))

fails with:

org.elasticsearch.hadoop.EsHadoopException: Could not write all entries [3/1024] (maybe ES was overloaded?). Bailing out...
	at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:250)
	at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:201)
	at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:163)
	at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:49)
	at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
	at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

The RDD being written contains about 50 million rows, and the ES cluster has two nodes.
Solution »
conf.set("es.nodes", elasticsearch_nodes)
conf.set("es.batch.write.retry.count", "10")  // default is 3 retries; -1 retries forever (use with caution)
conf.set("es.batch.write.retry.wait", "60")   // default retry wait is 10s; increase it as needed
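
The "[3/1024]" in the error means 3 of the 1024 documents in a bulk request were rejected, which usually indicates the two-node cluster cannot keep up with the bulk load. Putting the settings above together, here is a minimal sketch of the configuration (the index name and `result` RDD are from the question; the node addresses, app name, and batch-size values are illustrative assumptions, not part of the original answer):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark.rdd.EsSpark

// Hypothetical node list; replace with your cluster's addresses.
val conf = new SparkConf()
  .setAppName("userprofile-to-es")
  .set("es.nodes", "es-node1:9200,es-node2:9200")
  // Smaller bulk batches put less pressure on a small cluster.
  .set("es.batch.size.entries", "500")       // default 1000 docs per bulk request
  .set("es.batch.size.bytes", "1mb")         // default 1mb per bulk request
  .set("es.batch.write.retry.count", "10")   // default 3; -1 retries forever (use with caution)
  .set("es.batch.write.retry.wait", "60s")   // default 10s between retries

val sc = new SparkContext(conf)
// `result` is the RDD of user-profile documents built earlier in the job;
// with the retry settings above, transient bulk rejections are retried
// instead of immediately failing the task.
// EsSpark.saveToEs(result, "userprofile/users", Map("es.mapping.id" -> "uid"))
```

Reducing the number of concurrently writing tasks (e.g. `result.repartition(...)` to fewer partitions) can also lower the bulk-rejection rate, since each Spark partition opens its own bulk writer against the cluster.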