Cluster
    Three machines: one with 2 GB of RAM as the master node, and two with 1 GB each as worker nodes.
Could this error be caused by my machines having too little memory? Thanks.
The error output is as follows:

[hadoop@master spark-1.3.0-bin-hadoop2.4]$ ./bin/spark-submit --class SimpleApp --master spark://172.21.7.182:7077 ~/spark_wordcount/target/scala-2.10/simple-project_2.10-1.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/04/22 14:30:30 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/04/22 14:30:32 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@bananapi:38979/user/Executor#266790798] with ID 1
15/04/22 14:30:32 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:32 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:32 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@bananapi:43806/user/Executor#-850130035] with ID 0
15/04/22 14:30:33 INFO BlockManagerMasterActor: Registering block manager bananapi:60321 with 267.3 MB RAM, BlockManagerId(1, bananapi, 60321)
15/04/22 14:30:33 INFO BlockManagerMasterActor: Registering block manager bananapi:51018 with 267.3 MB RAM, BlockManagerId(0, bananapi, 51018)
15/04/22 14:30:34 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, bananapi): java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1155)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
        at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:166)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1152)
        ... 11 more
Caused by: java.lang.IllegalArgumentException
        at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)
        ... 20 more
15/04/22 14:30:34 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 1]
15/04/22 14:30:34 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:34 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:34 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 2) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 2]
15/04/22 14:30:34 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 4, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:35 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 3]
15/04/22 14:30:35 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 5, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:35 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 4) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 4]
15/04/22 14:30:35 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 6, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:35 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 5]
15/04/22 14:30:35 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, bananapi, PROCESS_LOCAL, 1389 bytes)
15/04/22 14:30:35 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6) on executor bananapi: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 6]
15/04/22 14:30:35 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
15/04/22 14:30:35 INFO TaskSchedulerImpl: Cancelling stage 0
15/04/22 14:30:35 INFO TaskSchedulerImpl: Stage 0 was cancelled
15/04/22 14:30:35 INFO DAGScheduler: Job 0 failed: count at SimpleApp.scala:11, took 20.358100 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, bananapi): java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1155)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
        at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:166)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1152)
        ... 11 more
Caused by: java.lang.IllegalArgumentException
        at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)
        ... 20 more
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Solutions »

  1.   

    1 GB is indeed on the small side.
    That said, problems like this are usually caused by missing packages, i.e. dependencies your environment needs. Running in the shell tends to work fine, but spark-submit is much stricter. When I submit jobs this way I hit many errors like yours: most were missing packages, and the rest were because spark-submit needs extra arguments or explicitly specified jars. The exact command also depends on how your cluster is deployed. I've only been learning for a short while myself, so take this as hints rather than a definitive answer.
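    To make the "specify some jars" point concrete, here is a minimal sketch of attaching extra dependency jars programmatically via the spark.jars property (the dependency paths below are hypothetical placeholders, not taken from this thread):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: list extra jars so the executors can load them.
    // The jar paths are placeholders for whatever your app depends on.
    val conf = new SparkConf()
      .setAppName("SimpleApp")
      .setMaster("spark://172.21.7.182:7077")
      .set("spark.jars", "/home/hadoop/libs/dep-a.jar,/home/hadoop/libs/dep-b.jar")
    val sc = new SparkContext(conf)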
      

  2.   

     java.io.IOException:
    An IO exception — could your script be failing to read or write a file?
      

  3.   

    For the following error:

    16/02/23 16:39:53 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, worker1): java.lang.ClassNotFoundException: com.spark.firstApp.HelloSpark$$anonfun$2

    the settings below resolved it for me:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("helloSpark").setMaster("spark://master:7077").set("spark.executor.memory", "2g")
    val sc = new SparkContext(conf)
    // Note: adding the line below is what solved my problem
    sc.addJar("/home/spark/IdeaProjects/FirstApp/out/artifacts/FirstAppjar1/FirstApp.jar")
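    (A note on this approach: sc.addJar ships the listed jar to every executor at runtime so classes like HelloSpark$$anonfun$2 can be loaded there. When launching with spark-submit, passing the application jar on the command line, as in the original command above, achieves the same thing, and additional dependencies can be listed with the --jars flag.)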
      

  4.   

    I think you should inspect your code: use the "Caused by" entries in the trace to work back to the offending place in your source.
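    Applying that advice to the trace above: the innermost "Caused by" is a java.lang.IllegalArgumentException thrown from the SnappyCompressionCodec constructor, which suggests the Snappy native library fails to initialize on the workers (plausible on ARM boards such as the Banana Pi appearing in this log). A minimal sketch of a workaround, assuming Snappy really is the culprit, is to switch Spark to the pure-JVM LZF codec:

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch under the assumption that snappy-java's native library cannot
    // load on the ARM executors: fall back to the JVM-only LZF codec so
    // broadcast blocks can be compressed and decompressed without native code.
    val conf = new SparkConf()
      .setAppName("SimpleApp")
      .set("spark.io.compression.codec", "lzf")
    val sc = new SparkContext(conf)

    The same property can also be passed at submit time with spark-submit --conf spark.io.compression.codec=lzf, without changing the application code.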