Setting Avro parameters in Spark
Date: 2023-10-17 12:31:07
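You can set Avro parameters for Spark manually with the --conf option of spark-submit. Avro options are Hadoop configuration properties, so they are passed with the spark.hadoop. prefix, which Spark strips before copying the property into the job's Hadoop configuration. The exact command depends on whether you run the Spark application in cluster mode or in client mode. In cluster mode, the following command sets the Avro parameter: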
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--conf spark.yarn.am.memory=1024m \
--conf spark.executor.memory=3g \
--conf spark.driver.memory=2g \
--conf spark.hadoop.avro.mapred.ignore.inputs.without.extension=false \
/usr/lib/spark/lib/spark-examples-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar \
100000
In client mode, the following command sets the Avro parameter:
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--conf spark.yarn.am.memory=1024m \
--conf spark.executor.memory=3g \
--conf spark.driver.memory=2g \
--conf spark.hadoop.avro.mapred.ignore.inputs.without.extension=false \
/usr/lib/spark/lib/spark-examples-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar \
100000
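In the commands above, spark.yarn.am.memory sets the memory of the YARN ApplicationMaster, spark.executor.memory sets the memory of each executor, and spark.driver.memory sets the memory of the driver. Note that spark.yarn.am.memory only takes effect in client mode; in cluster mode the driver runs inside the ApplicationMaster, so spark.driver.memory determines its memory and spark.yarn.am.memory has no effect. The spark.hadoop.avro.mapred.ignore.inputs.without.extension parameter controls whether Avro input files that lack the .avro extension are ignored; setting it to false, as above, makes such files be read as well. Adjust these values to suit your workload. *1* *2* *3*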
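Because the two commands differ only in deploy mode, they can be generated from one place. The following is a minimal sketch, not part of the original answer: the function name build_submit_args is hypothetical, and it assumes Spark 2.0 or later, where `--master yarn --deploy-mode <mode>` is the preferred replacement for the older `yarn-cluster` and `yarn-client` master values. It only prints the arguments (a dry run) and adds spark.yarn.am.memory only for client mode, where that setting applies.

```shell
#!/bin/sh
# Hypothetical helper: print spark-submit arguments for a given deploy mode.
# Assumes Spark 2.0+ syntax (--master yarn --deploy-mode <mode>).
build_submit_args() {
  deploy_mode="$1"   # "cluster" or "client"

  # Settings shared by both modes, including the Avro property,
  # which is forwarded to Hadoop via the spark.hadoop. prefix.
  set -- --class org.apache.spark.examples.SparkPi \
         --master yarn --deploy-mode "$deploy_mode" \
         --conf spark.executor.memory=3g \
         --conf spark.driver.memory=2g \
         --conf spark.hadoop.avro.mapred.ignore.inputs.without.extension=false

  # spark.yarn.am.memory is only honored in client mode.
  if [ "$deploy_mode" = "client" ]; then
    set -- "$@" --conf spark.yarn.am.memory=1024m
  fi

  # Dry run: print the arguments instead of launching a job.
  printf '%s ' "$@"
  printf '\n'
}

build_submit_args cluster
```

To launch a real job, the printed arguments would be passed to spark-submit, followed by the application jar and its own arguments.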
#### References
- *1* *2* [spark-on-yarn配置参数](https://blog.csdn.net/okbin1991/article/details/129156627)
- *3* [spark yarn 参数分析](https://blog.csdn.net/ZYC88888/article/details/78533944)