diff --git a/docs/docs/en/guide/task/seatunnel.md b/docs/docs/en/guide/task/seatunnel.md
new file mode 100644
index 0000000000..e748360665
--- /dev/null
+++ b/docs/docs/en/guide/task/seatunnel.md
@@ -0,0 +1,82 @@
+# Apache SeaTunnel
+
+## Overview
+
+`SeaTunnel` task type for creating and executing `SeaTunnel` tasks. When the worker executes this task, it parses the config file through the `start-seatunnel-spark.sh` or `start-seatunnel-flink.sh` command.
+Click [here](https://seatunnel.apache.org/) for more information about `Apache SeaTunnel`.
+
+## Create Task
+
+- Click Project Management -> Project Name -> Workflow Definition, and click the "Create Workflow" button to enter the DAG editing page.
+- Drag the SeaTunnel task node from the toolbar onto the canvas.
+
+## Task Parameter
+
+- Node name: The node name in a workflow definition is unique.
+- Run flag: Identifies whether this node can be scheduled normally. If it does not need to be executed, turn on the prohibition switch.
+- Descriptive information: Describes the function of the node.
+- Task priority: When the number of worker threads is insufficient, tasks are executed in order from high to low priority; tasks with the same priority are executed on a first-in, first-out basis.
+- Worker grouping: Tasks are assigned to the machines of the worker group to execute. If Default is selected, a worker machine is selected at random.
+- Environment Name: Configure the environment name in which to run the script.
+- Number of failed retry attempts: The number of times a failed task is resubmitted.
+- Failed retry interval: The interval, in minutes, before a failed task is resubmitted.
+- Cpu quota: Assign the specified CPU time quota to the executed task. Takes a percentage value. The default -1 means unlimited. For example, the full CPU load of one core is 100%, and that of 16 cores is 1600%. This function is controlled by [task.resource.limit.state](../../architecture/configuration.md).
+- Max memory: Assign the specified maximum memory to the executed task. Exceeding this limit triggers an OOM kill, and the task is not retried automatically. Takes an MB value. The default -1 means unlimited. This function is controlled by [task.resource.limit.state](../../architecture/configuration.md).
+- Delayed execution time: The time, in minutes, that execution of the task is delayed.
+- Timeout alarm: Check the timeout alarm and timeout failure. When the task exceeds the "timeout period", an alarm email is sent and the task execution fails.
+- Engine: Supports FLINK and SPARK
+  - FLINK
+    - Run model: supports the `run` and `run-application` modes
+    - Option parameters: used to add parameters of the Flink engine, such as `-m yarn-cluster -ynm seatunnel`
+  - SPARK
+    - Deployment mode: specify the deployment mode: `cluster`, `client`, or `local`
+    - Master: Specify the `Master` model: `yarn`, `local`, `spark`, or `mesos`, where `spark` and `mesos` need to specify the `Master` service address, for example: 127.0.0.1:7077
+  > Click [here](https://seatunnel.apache.org/docs/2.1.2/command/usage) for more information on the usage of the `Apache SeaTunnel command`
+- Custom Configuration: Supports custom configuration or selecting a configuration file from the Resource Center
+  > Click [here](https://seatunnel.apache.org/docs/2.1.2/concept/config) for more information about the `Apache SeaTunnel config` file
+- Script: Customize the configuration information on the task node, including four parts: `env`, `source`, `transform`, and `sink`
+- Resource file: A configuration file from the Resource Center can be referenced in the task node; only one configuration file can be referenced.
+- Predecessor task: Selecting a predecessor task for the current task sets the selected task as upstream of the current task.
+
+## Task Example
+
+This example demonstrates using the Flink engine to read data from a Fake source and print it to the console.
+
+### Configuring the SeaTunnel environment in DolphinScheduler
+
+If you want to use the SeaTunnel task type in a production environment, you need to configure the required environment first. The configuration file is `/dolphinscheduler/conf/env/dolphinscheduler_env.sh`.
+
+![seatunnel_task01](../../../../img/tasks/demo/seatunnel_task01.png)
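+
+The screenshot above shows the kind of entry this refers to. As a minimal, illustrative sketch only (the variable name and install path below are assumptions — adjust them to your own installation and to what your DolphinScheduler version expects), the relevant lines might look like:
+
+```shell
+# Make the SeaTunnel launch scripts (start-seatunnel-spark.sh / start-seatunnel-flink.sh)
+# visible to the DolphinScheduler worker. The path below is a placeholder.
+export SEATUNNEL_HOME=/opt/soft/seatunnel
+export PATH=$SEATUNNEL_HOME/bin:$PATH
+```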
+
+### Configuring SeaTunnel Task Node
+
+According to the above parameter description, configure the required content.
+
+![seatunnel_task02](../../../../img/tasks/demo/seatunnel_task02.png)
+
+### Config example
+
+```Config
+
+env {
+  execution.parallelism = 1
+}
+
+source {
+  FakeSource {
+    result_table_name = "fake"
+    field_name = "name,age"
+  }
+}
+
+transform {
+  sql {
+    sql = "select name,age from fake"
+  }
+}
+
+sink {
+  ConsoleSink {}
+}
+
+```
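+
+For orientation only: with the Flink engine selected, the worker ends up invoking the SeaTunnel launch script against a config file like the one above. A hand-run equivalent might look roughly like the sketch below. The config path is a placeholder and the exact option syntax belongs to the SeaTunnel CLI (see the usage link above); this is not the literal command the plugin builds.
+
+```shell
+# Flink engine: submit the job described by the config file.
+${SEATUNNEL_HOME}/bin/start-seatunnel-flink.sh --config /path/to/fake_to_console.conf
+
+# Spark engine: deployment mode and master map to the task parameters of the same names.
+${SEATUNNEL_HOME}/bin/start-seatunnel-spark.sh --master yarn --deploy-mode client --config /path/to/fake_to_console.conf
+```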
diff --git a/docs/docs/en/guide/task/stored-procedure.md b/docs/docs/en/guide/task/stored-procedure.md
index 5f307878df..5aff75207c 100644
--- a/docs/docs/en/guide/task/stored-procedure.md
+++ b/docs/docs/en/guide/task/stored-procedure.md
@@ -5,7 +5,7 @@
 
 > Drag from the toolbar ![PNG](https://analysys.github.io/easyscheduler_docs_cn/images/toolbar_PROCEDURE.png) task node into the canvas, as shown in the figure below:
 
-
+
 
 ## Task Parameters
diff --git a/docs/docs/en/guide/task/zeppelin.md b/docs/docs/en/guide/task/zeppelin.md
new file mode 100644
index 0000000000..8c8de9dae6
--- /dev/null
+++ b/docs/docs/en/guide/task/zeppelin.md
@@ -0,0 +1,53 @@
+# Apache Zeppelin
+
+## Overview
+
+Use `Zeppelin Task` to create a Zeppelin-type task and execute Zeppelin notebook paragraphs. When the worker executes `Zeppelin Task`,
+it calls the `Zeppelin Client API` to trigger the Zeppelin notebook paragraph. Click [here](https://zeppelin.apache.org/) for details about `Apache Zeppelin Notebook`.
+
+## Create Task
+
+- Click `Project Management -> Project Name -> Workflow Definition`, and click the `Create Workflow` button to enter the DAG editing page.
+- Drag the Zeppelin task node from the toolbar to the canvas.
+
+## Task Parameters
+
+| **Parameter** | **Description** |
+| ------- | ---------- |
+| Node Name | Set the name of the task. Node names within a workflow definition are unique. |
+| Run flag | Indicates whether the node can be scheduled normally. If it does not need to be executed, turn on the prohibit-execution switch. |
+| Description | Describes the function of this node. |
+| Task priority | When the number of worker threads is insufficient, tasks are executed in order from high to low priority; tasks with the same priority are executed on a first-in, first-out basis. |
+| Worker group | The task is assigned to the machines in the worker group for execution. If Default is selected, a worker machine is selected at random. |
+| Task group name | The task group the task belongs to. If not configured, no task group is used. |
+| Environment Name | Configure the environment in which to run the script. |
+| Number of failed retries | The number of times the task is resubmitted after failure. Supports drop-down selection and manual input. |
+| Failure Retry Interval | The time interval for resubmitting the task after a failure. Supports drop-down selection and manual input. |
+| Timeout alarm | Check timeout alarm and timeout failure. When the task exceeds the "timeout duration", an alarm email is sent and the task execution fails. |
+| Zeppelin Note ID | The unique note ID of a Zeppelin notebook note. |
+| Zeppelin Paragraph ID | The unique paragraph ID of a Zeppelin notebook paragraph. If you want to schedule a whole note at a time, leave this field blank. |
+| Zeppelin Production Note Directory | The directory for the cloned note in production mode. |
+| Zeppelin Rest Endpoint | The REST endpoint of your Zeppelin server (a quick way to verify it is sketched below). |
+| Zeppelin Parameters | Parameters in JSON format used for the Zeppelin dynamic form. |
+
+## Production (Clone) Mode
+
+- Fill in the optional `Zeppelin Production Note Directory` parameter to enable `Production Mode`.
+- In `Production Mode`, the target note is copied to the `Zeppelin Production Note Directory` you choose.
+`Zeppelin Task Plugin` executes the cloned note instead of the original one. Once execution is done,
+`Zeppelin Task Plugin` deletes the cloned note automatically.
+This increases stability, as modifications to a running note triggered by `DolphinScheduler`
+will not affect the production task.
+- If you leave the `Zeppelin Production Note Directory` empty, `Zeppelin Task Plugin` executes the original note.
+- `Zeppelin Production Note Directory` should both start and end with a slash, e.g. `/production_note_directory/`.
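+
+The plugin talks to this endpoint through the `Zeppelin Client API`, so you normally do not call Zeppelin by hand. Still, a quick manual check of the endpoint and of the note and paragraph IDs you fill in above can save debugging time. The sketch below uses Zeppelin's public REST API; host, port, and IDs are placeholders, and you should verify the routes against the REST API docs of your Zeppelin version:
+
+```shell
+# List notes to find the Zeppelin Note ID (each entry contains an "id" field).
+curl http://your-zeppelin-host:8080/api/notebook
+
+# Fetch a single note to see its paragraphs and their paragraph IDs.
+curl http://your-zeppelin-host:8080/api/notebook/<note_id>
+```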
+
+## Task Example
+
+### Zeppelin Paragraph Task Example
+
+This example illustrates how to create a Zeppelin paragraph task node.
+
+![demo-zeppelin-paragraph](../../../../img/tasks/demo/zeppelin.png)
+
+![demo-get-zeppelin-id](../../../../img/tasks/demo/zeppelin_id.png)
+
diff --git a/docs/docs/zh/guide/task/seatunnel.md b/docs/docs/zh/guide/task/seatunnel.md
new file mode 100644
index 0000000000..c15844559a
--- /dev/null
+++ b/docs/docs/zh/guide/task/seatunnel.md
@@ -0,0 +1,82 @@
+# Apache SeaTunnel
+
+## Overview
+
+The `SeaTunnel` task type is used to create and execute `SeaTunnel` tasks. When the worker executes this task, it parses the config file through the `start-seatunnel-spark.sh` or `start-seatunnel-flink.sh` command.
+Click [here](https://seatunnel.apache.org/) for more information about `Apache SeaTunnel`.
+
+## Create Task
+
+- Click Project Management -> Project Name -> Workflow Definition, then click the "Create Workflow" button to enter the DAG editing page.
+- Drag the SeaTunnel task node from the toolbar onto the canvas.
+
+## Task Parameters
+
+- Node name: Set the name of the task node. Node names within a workflow definition are unique.
+- Run flag: Identifies whether this node can be scheduled normally. If it does not need to be executed, turn on the prohibition switch.
+- Description: Describes the function of the node.
+- Task priority: When the number of worker threads is insufficient, tasks are executed in order from high to low priority; tasks with the same priority are executed on a first-in, first-out basis.
+- Worker group: The task is assigned to the machines of the worker group for execution. If Default is selected, a worker machine is selected at random.
+- Environment Name: Configure the environment in which to run the script.
+- Number of failed retries: The number of times a failed task is resubmitted.
+- Failed retry interval: The interval, in minutes, before a failed task is resubmitted.
+- Cpu quota: Assign the specified CPU time quota to the executed task, as a percentage. The default -1 means unlimited. For example, the full CPU load of one core is 100%, and that of 16 cores is 1600%. This function is controlled by [task.resource.limit.state](../../architecture/configuration.md).
+- Max memory: Assign the specified maximum memory, in MB, to the executed task. Exceeding this limit triggers an OOM kill, and the task is not retried automatically. The default -1 means unlimited. This function is controlled by [task.resource.limit.state](../../architecture/configuration.md).
+- Delayed execution time: The time, in minutes, that execution of the task is delayed.
+- Timeout alarm: Check the timeout alarm and timeout failure. When the task exceeds the "timeout period", an alarm email is sent and the task execution fails.
+- Engine: Supports FLINK and SPARK
+  - FLINK
+    - Run model: supports the `run` and `run-application` modes
+    - Option parameters: used to add parameters of the Flink engine itself, such as `-m yarn-cluster -ynm seatunnel`
+  - SPARK
+    - Deployment mode: specify the deployment mode: `cluster`, `client`, or `local`
+    - Master: specify the `Master` model: `yarn`, `local`, `spark`, or `mesos`, where `spark` and `mesos` need to specify the `Master` service address, for example: 127.0.0.1:7077
+  > Click [here](https://seatunnel.apache.org/docs/2.1.2/command/usage) for more information on the usage of the `Apache SeaTunnel command`
+- Custom Configuration: Supports custom configuration or selecting a configuration file from the Resource Center
+  > Click [here](https://seatunnel.apache.org/docs/2.1.2/concept/config) for more information about the `Apache SeaTunnel config` file
+- Script: Customize the configuration information on the task node, including four parts: `env`, `source`, `transform`, and `sink`
+- Resource file: A configuration file from the Resource Center can be referenced in the task node; only one configuration file can be referenced.
+- Predecessor task: Selecting a predecessor task for the current task sets the selected task as upstream of the current task.
+
+## Task Example
+
+This example demonstrates using the Flink engine to read data from a Fake source and print it to the console.
+
+### Configuring the SeaTunnel environment in DolphinScheduler
+
+If you want to use the SeaTunnel task type in a production environment, you need to configure the required environment first. The configuration file is `/dolphinscheduler/conf/env/dolphinscheduler_env.sh`.
+
+![seatunnel_task01](../../../../img/tasks/demo/seatunnel_task01.png)
+
+### Configuring the SeaTunnel Task Node
+
+Configure the required content according to the parameter descriptions above.
+
+![seatunnel_task02](../../../../img/tasks/demo/seatunnel_task02.png)
+
+### Config example
+
+```Config
+
+env {
+  execution.parallelism = 1
+}
+
+source {
+  FakeSource {
+    result_table_name = "fake"
+    field_name = "name,age"
+  }
+}
+
+transform {
+  sql {
+    sql = "select name,age from fake"
+  }
+}
+
+sink {
+  ConsoleSink {}
+}
+
+```
diff --git a/docs/img/new_ui/dev/project/workflow_date_manual.png b/docs/img/new_ui/dev/project/workflow_date_manual.png
new file mode 100644
index 0000000000..70d03372f9
Binary files /dev/null and b/docs/img/new_ui/dev/project/workflow_date_manual.png differ
diff --git a/docs/img/new_ui/dev/security/create-cluster.png b/docs/img/new_ui/dev/security/create-cluster.png
new file mode 100644
index 0000000000..539c0ba25d
Binary files /dev/null and b/docs/img/new_ui/dev/security/create-cluster.png differ
diff --git a/docs/img/new_ui/dev/security/create-namespace.png b/docs/img/new_ui/dev/security/create-namespace.png
new file mode 100644
index 0000000000..c7a7632629
Binary files /dev/null and b/docs/img/new_ui/dev/security/create-namespace.png differ
diff --git a/docs/img/tasks/demo/seatunnel_task01.png b/docs/img/tasks/demo/seatunnel_task01.png
new file mode 100644
index 0000000000..d802b95159
Binary files /dev/null and b/docs/img/tasks/demo/seatunnel_task01.png differ
diff --git a/docs/img/tasks/demo/seatunnel_task02.png b/docs/img/tasks/demo/seatunnel_task02.png
new file mode 100644
index 0000000000..0607b06318
Binary files /dev/null and b/docs/img/tasks/demo/seatunnel_task02.png differ
diff --git a/docs/img/tasks/icons/seatunnel.png b/docs/img/tasks/icons/seatunnel.png
new file mode 100644
index 0000000000..4b11920c5f
Binary files /dev/null and b/docs/img/tasks/icons/seatunnel.png differ
diff --git a/dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/service/QueueServiceTest.java b/dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/service/QueueServiceTest.java
index cd8cdacce2..fa7c7ea69b 100644
--- a/dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/service/QueueServiceTest.java
+++ b/dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/service/QueueServiceTest.java
@@ -17,12 +17,8 @@
 
 package org.apache.dolphinscheduler.api.service;
 
-import static org.apache.dolphinscheduler.api.constants.ApiFuncIdentificationConstant.YARN_QUEUE_CREATE;
-import static org.apache.dolphinscheduler.api.constants.ApiFuncIdentificationConstant.YARN_QUEUE_UPDATE;
-
 import org.apache.dolphinscheduler.api.enums.Status;
 import org.apache.dolphinscheduler.api.exceptions.ServiceException;
-import org.apache.dolphinscheduler.api.permission.ResourcePermissionCheckService;
 import org.apache.dolphinscheduler.api.service.impl.BaseServiceImpl;
 import org.apache.dolphinscheduler.api.service.impl.QueueServiceImpl;
 import org.apache.dolphinscheduler.api.utils.PageInfo;
@@ -75,9 +71,6 @@ public class QueueServiceTest {
     @Mock
     private UserMapper userMapper;
 
-    @Mock
-    private ResourcePermissionCheckService resourcePermissionCheckService;
-
     private static final String QUEUE = "queue";
     private static final String QUEUE_NAME = "queueName";
     private static final String EXISTS = "exists";
@@ -94,10 +87,7 @@ public class QueueServiceTest {
 
     @Test
     public void testQueryList() {
-        Set