spark-submit
Time: 2023-09-29 07:02:57
`spark-submit` is a command-line tool for submitting Spark applications written in Java, Scala, or Python (R is also supported) to a cluster. It takes the application and its dependencies as input and hands them to the cluster for execution.
The syntax for using spark-submit is as follows:
```
spark-submit [options] <app jar | python file> [app arguments]
```
Commonly used options include:
- `--class`: The fully qualified name of the class containing the application's `main` method (Java and Scala applications only).
- `--master`: The master URL of the cluster manager to submit to, for example `yarn`, `spark://host:7077`, or `local[*]` for local execution.
- `--deploy-mode`: Where the driver runs: on a node inside the cluster (`cluster`) or on the submitting machine (`client`, the default).
- `--num-executors`: The number of executors to launch for the application (not honored by every cluster manager; it applies to YARN, for example).
- `--executor-memory`: The memory allocated to each executor, for example `2g`.
- `--driver-memory`: The memory allocated to the driver program, for example `1g`.
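To make the ordering of options, application, and application arguments concrete, here is a minimal Python sketch that assembles a `spark-submit` command line from the options listed above. The helper `build_spark_submit_command`, the class `com.example.WordCount`, the file `wordcount.jar`, and the paths are all hypothetical placeholders, not part of Spark; the sketch only builds and prints the command and does not run it.

```python
import shlex


def build_spark_submit_command(app, app_args=(), main_class=None,
                               master="local[*]", deploy_mode="client",
                               num_executors=None, executor_memory=None,
                               driver_memory=None):
    """Assemble an argument list: options first, then the app, then its arguments."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class is not None:
        # --class only applies to Java/Scala applications packaged as a JAR.
        cmd += ["--class", main_class]
    if num_executors is not None:
        cmd += ["--num-executors", str(num_executors)]
    if executor_memory is not None:
        cmd += ["--executor-memory", executor_memory]
    if driver_memory is not None:
        cmd += ["--driver-memory", driver_memory]
    cmd.append(app)          # <app jar | python file>
    cmd.extend(app_args)     # [app arguments]
    return cmd


# Hypothetical application: the class name, JAR, and paths are placeholders.
command = build_spark_submit_command(
    "wordcount.jar",
    app_args=["input.txt", "output"],
    main_class="com.example.WordCount",
    master="yarn",
    deploy_mode="cluster",
    num_executors=4,
    executor_memory="2g",
    driver_memory="1g",
)
print(shlex.join(command))
```

If Spark is installed and the placeholders are replaced with a real application, the resulting list could be passed to `subprocess.run(command, check=True)` to perform the actual submission.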
Once the application is submitted, spark-submit launches the driver program: on a node inside the cluster in `cluster` mode, or on the submitting machine in `client` mode. The driver splits the application into tasks that run on the executors, and the executors return their task results to the driver, which collects and aggregates them.
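As an illustration of results flowing back to the driver, the following is a minimal PySpark word-count sketch. It assumes PySpark is available wherever the job runs; the file name `wordcount_sketch.py`, the application name, and the helper `split_words` are hypothetical. The master is deliberately not set in the code, so the value passed to `spark-submit` via `--master` applies.

```python
import sys


def split_words(line):
    """Split one line of text into whitespace-separated words."""
    return line.split()


def main(input_path):
    # Imported here so the file can be loaded on a machine without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCountSketch").getOrCreate()
    lines = spark.sparkContext.textFile(input_path)
    counts = (
        lines.flatMap(split_words)
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )
    # collect() is an action: the executors compute the counts and the
    # driver receives them as an ordinary Python list.
    for word, count in counts.collect():
        print(word, count)
    spark.stop()


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: spark-submit wordcount_sketch.py <input_path>")
    else:
        main(sys.argv[1])
```

A submission might look like `spark-submit --master yarn --deploy-mode client wordcount_sketch.py input.txt`, where `input.txt` is a placeholder path. In `client` mode the printed counts appear in the submitting terminal, because that is where the driver runs.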