
[Doc] Spark on K8S (#13605)

Branch: 3.2.0-release
Aaron Wang committed 1 year ago (via GitHub)
Parent commit: 938f13b568
Changed files:
1. docs/docs/en/guide/task/spark.md (35 lines changed)
2. docs/docs/zh/guide/task/spark.md (5 lines changed)

docs/docs/en/guide/task/spark.md

@@ -20,23 +20,24 @@ Spark task type for executing Spark application. When executing the Spark task,
- Please refer to [DolphinScheduler Task Parameters Appendix](appendix.md) `Default Task Parameters` section for default parameters.
- | **Parameter**              | **Description**                                                                                                                                     |
- |----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
- | Program type               | Supports Java, Scala, Python, and SQL.                                                                                                              |
- | The class of main function | The **full path** of Main Class, the entry point of the Spark program.                                                                              |
- | Main jar package           | The Spark jar package (upload by Resource Center).                                                                                                  |
- | SQL scripts                | SQL statements in .sql files that Spark sql runs.                                                                                                   |
- | Deployment mode            | <ul><li>spark submit supports three modes: yarn-clusetr, yarn-client and local.</li><li>spark sql supports yarn-client and local modes.</li></ul>   |
- | Task name                  | Spark task name.                                                                                                                                    |
- | Driver core number         | Set the number of Driver core, which can be set according to the actual production environment.                                                    |
- | Driver memory size         | Set the size of Driver memories, which can be set according to the actual production environment.                                                  |
- | Number of Executor         | Set the number of Executor, which can be set according to the actual production environment.                                                       |
- | Executor memory size       | Set the size of Executor memories, which can be set according to the actual production environment.                                                |
- | Main program parameters    | Set the input parameters of the Spark program and support the substitution of custom parameter variables.                                          |
- | Optional parameters        | Support `--jars`, `--files`,` --archives`, `--conf` format.                                                                                         |
- | Resource                   | Appoint resource files in the `Resource` if parameters refer to them.                                                                              |
- | Custom parameter           | It is a local user-defined parameter for Spark, and will replace the content with `${variable}` in the script.                                     |
- | Predecessor task           | Selecting a predecessor task for the current task, will set the selected predecessor task as upstream of the current task.                         |
+ | **Parameter**              | **Description**                                                                                                                                              |
+ |----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | Program type               | Supports Java, Scala, Python, and SQL.                                                                                                                       |
+ | The class of main function | The **full path** of the Main Class, the entry point of the Spark program.                                                                                   |
+ | Main jar package           | The Spark jar package (uploaded via the Resource Center).                                                                                                    |
+ | SQL scripts                | SQL statements in .sql files that Spark SQL runs.                                                                                                            |
+ | Deployment mode            | <ul><li>spark submit supports three modes: cluster, client and local.</li><li>spark sql supports client and local modes.</li></ul>                           |
+ | Namespace (cluster)        | If a namespace is selected, the application is submitted natively to the chosen Kubernetes cluster; otherwise it is submitted to the Yarn cluster (default). |
+ | Task name                  | Spark task name.                                                                                                                                             |
+ | Driver core number         | Set the number of Driver cores according to the actual production environment.                                                                               |
+ | Driver memory size         | Set the Driver memory size according to the actual production environment.                                                                                   |
+ | Number of Executor         | Set the number of Executors according to the actual production environment.                                                                                  |
+ | Executor memory size       | Set the Executor memory size according to the actual production environment.                                                                                 |
+ | Main program parameters    | Set the input parameters of the Spark program; supports the substitution of custom parameter variables.                                                      |
+ | Optional parameters        | Supports the `--jars`, `--files`, `--archives`, and `--conf` formats.                                                                                        |
+ | Resource                   | Select resource files in `Resource` if the parameters refer to them.                                                                                         |
+ | Custom parameter           | A local user-defined parameter for Spark that replaces `${variable}` in the script.                                                                          |
+ | Predecessor task           | Selecting a predecessor task for the current task will set it as upstream of the current task.                                                               |
## Task Example
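To make the mapping concrete, here is a minimal sketch of the `spark-submit` invocations these fields roughly correspond to. The class name, jar paths, namespace, and container image are hypothetical placeholders, not values from this commit or the DolphinScheduler codebase.

```shell
# Default: no namespace selected, so the application goes to the Yarn cluster.
# Flag-to-field mapping: --class = "The class of main function",
# --driver-cores = "Driver core number", --driver-memory = "Driver memory size",
# --num-executors = "Number of Executor", --executor-memory = "Executor memory size".
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.example.WordCount \
  --driver-cores 1 \
  --driver-memory 512M \
  --num-executors 2 \
  --executor-memory 2G \
  wordcount.jar /input /output

# With a namespace selected, the application is submitted natively to the
# chosen Kubernetes cluster instead; this sketch assumes a reachable K8S API
# server and a container image that bundles Spark.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.namespace=spark-apps \
  --conf spark.kubernetes.container.image=<spark-image> \
  --class org.example.WordCount \
  local:///opt/spark/examples/wordcount.jar
```

In other words, selecting a namespace effectively amounts to swapping the `--master` target from Yarn to the Kubernetes API server; the remaining fields map onto the same flags in both cases.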

docs/docs/zh/guide/task/spark.md

@@ -24,8 +24,9 @@ Spark task type is used to execute Spark applications. For the Spark node, the worker sup…
- Class of the main function: the full path of the Main class, the entry point of the Spark program.
- Main program package: the jar package that runs the Spark program (uploaded via the Resource Center).
- SQL script: the SQL statements in the .sql file that Spark SQL runs.
- - Deployment mode: (1) spark submit supports three modes: yarn-clusetr, yarn-client and local.
-   (2) spark sql supports two modes: yarn-client and local.
+ - Deployment mode: (1) spark submit supports three modes: cluster, client and local.
+   (2) spark sql supports two modes: client and local.
+ - Namespace (cluster): if a namespace (cluster) is selected, the task is submitted natively to the selected K8S cluster; if none is selected, it is submitted to the Yarn cluster (default).
- Task name (optional): the name of the Spark program.
- Number of Driver cores: sets the number of Driver cores, which can be configured according to the actual production environment.
- Driver memory size: sets the Driver memory size, which can be configured according to the actual production environment.
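For the SQL program type, the same deployment-mode restriction applies (client and local only). Below is a rough sketch of equivalent `spark-sql` invocations, assuming the `spark-sql` CLI is available and `etl.sql` is a hypothetical script uploaded via the Resource Center:

```shell
# client mode against Yarn (cluster mode is not available for spark-sql,
# matching the modes listed in the tables above):
spark-sql --master yarn --deploy-mode client -f etl.sql

# local mode, e.g. for testing on a single machine:
spark-sql --master local[2] -f etl.sql
```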
