Dynamic Resource Allocation for Spark on YARN

Reference: http://spark.apache.org/docs/1.5.2/job-scheduling.html#configuration-and-setup

1. Configure hadoop/etc/hadoop/yarn-site.xml

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>

  <!-- Default port number -->
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7337</value>
  </property>

2. Copy spark-xxx-yarn-shuffle.jar to the target directory



Copy spark-xxx-yarn-shuffle.jar (shipped with the Spark distribution) into hadoop/share/hadoop/yarn/ so that each NodeManager can load the YarnShuffleService class from its classpath.

3. Configure spark-defaults.conf



spark.shuffle.service.enabled   true
spark.shuffle.service.port      7337
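
For reference, a minimal spark-defaults.conf sketch that also turns on dynamic allocation itself; the executor counts below are illustrative placeholders, not values from the original article:

  spark.shuffle.service.enabled            true
  spark.shuffle.service.port               7337
  spark.dynamicAllocation.enabled          true
  # Illustrative bounds; tune these for your cluster
  spark.dynamicAllocation.minExecutors     1
  spark.dynamicAllocation.maxExecutors     20
  spark.dynamicAllocation.initialExecutors 1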


Dynamic Allocation

| Property Name | Default | Meaning |
| --- | --- | --- |
| spark.dynamicAllocation.enabled | false | Whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload. Note that this is currently only available in YARN mode. This requires spark.shuffle.service.enabled to be set. The following configurations are also relevant: spark.dynamicAllocation.minExecutors, spark.dynamicAllocation.maxExecutors, and spark.dynamicAllocation.initialExecutors. |
| spark.dynamicAllocation.executorIdleTimeout | 60s | If dynamic allocation is enabled and an executor has been idle for more than this duration, the executor will be removed. |
| spark.dynamicAllocation.cachedExecutorIdleTimeout | infinity | If dynamic allocation is enabled and an executor which has cached data blocks has been idle for more than this duration, the executor will be removed. |
| spark.dynamicAllocation.initialExecutors | spark.dynamicAllocation.minExecutors | Initial number of executors to run if dynamic allocation is enabled. |
| spark.dynamicAllocation.maxExecutors | infinity | Upper bound for the number of executors if dynamic allocation is enabled. |
| spark.dynamicAllocation.minExecutors | 0 | Lower bound for the number of executors if dynamic allocation is enabled. |
| spark.dynamicAllocation.schedulerBacklogTimeout | 1s | If dynamic allocation is enabled and there have been pending tasks backlogged for more than this duration, new executors will be requested. |
| spark.dynamicAllocation.sustainedSchedulerBacklogTimeout | schedulerBacklogTimeout | Same as spark.dynamicAllocation.schedulerBacklogTimeout, but used only for subsequent executor requests. |
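
As a sketch (not part of the original article), dynamic allocation can also be enabled per application through SparkConf; the property names match the table above, while the app name and concrete executor counts are only placeholders:

  import org.apache.spark.{SparkConf, SparkContext}

  // Sketch: enable dynamic allocation for a single application.
  // Assumes the YARN shuffle service configured above is already running.
  val conf = new SparkConf()
    .setAppName("dynamic-allocation-demo")               // hypothetical app name
    .set("spark.shuffle.service.enabled", "true")
    .set("spark.dynamicAllocation.enabled", "true")
    .set("spark.dynamicAllocation.minExecutors", "1")    // placeholder values
    .set("spark.dynamicAllocation.maxExecutors", "20")
    .set("spark.dynamicAllocation.initialExecutors", "2")

  val sc = new SparkContext(conf)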

4. Restart the YARN service (the NodeManagers) so that the new auxiliary service and jar take effect.
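
One way to sanity-check the setup (an assumption on my part, not a step from the original article) is to start spark-shell on YARN with dynamic allocation enabled and run a job: executor counts in the YARN UI should grow while tasks are backlogged and shrink again after executorIdleTimeout.

  // Run inside spark-shell (sc is provided by the shell).
  // A job with many partitions should trigger executor scale-up;
  // once it finishes and the shell sits idle, executors are released.
  sc.parallelize(1 to 1000000, 200).map(_ * 2).reduce(_ + _)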


Copyright notice: this is an original article by kimsungho, licensed under CC 4.0 BY-SA. Please include a link to the original source and this notice when reposting.