Browse Source

[fix][doc] Fix sql-hive and hive-cli doc (#12765)

3.2.0-release
Tq 2 years ago committed by GitHub
parent
commit
9bba4b105c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. 2
      docs/docs/en/guide/task/hive-cli.md
  2. 21
      docs/docs/en/guide/task/sql.md
  3. 2
      docs/docs/zh/guide/task/hive-cli.md
  4. 5
      docs/docs/zh/guide/task/sql.md
  5. BIN
      docs/img/tasks/demo/hive_cli_from_script.png
  6. BIN
      docs/img/tasks/demo/pre_post_sql.png

2
docs/docs/en/guide/task/hive-cli.md

@ -33,7 +33,7 @@ You could choose between these two based on your needs.
|------------------------------|------------------------------------------------------------------------------------------------------|
| Hive Cli Task Execution Type | The type of hive cli task execution, choose either `FROM_SCRIPT` or `FROM_FILE`. |
| Hive SQL Script | If you choose `FROM_SCRIPT` for `Hive Cli Task Execution Type`, you need to fill in your SQL script. |
| Hive Cli Options | Extra options for hive cli, such as `--verbose` |
| Hive Cli Options | Extra options for hive cli, such as `--verbose` to check execution result. |
| Resources | If you choose `FROM_FILE` for `Hive Cli Task Execution Type`, you need to select your SQL file. |
## Task Example

21
docs/docs/en/guide/task/sql.md

@ -20,16 +20,16 @@ Refer to [datasource-setting](../howto/datasource-setting.md) `DataSource Center
- Please refer to [DolphinScheduler Task Parameters Appendix](appendix.md) `Default Task Parameters` section for default parameters.
| **Parameter** | **Description** |
|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Data source | Select the corresponding DataSource. |
| SQL type | Supports query and non-query. <ul><li>Query: supports `DML select` type commands, which return a result set. You can specify three templates for email notification as form, attachment or form attachment;</li><li>Non-query: support `DDL` all commands and `DML update, delete, insert` three types of commands;<ul><li>Segmented execution symbol: When the data source does not support executing multiple SQL statements at a time, the symbol for splitting SQL statements is provided to call the data source execution method multiple times. Example: 1. When the Hive data source is selected as the data source, this parameter does not need to be filled in. Because the Hive data source itself supports executing multiple SQL statements at one time; 2. When the MySQL data source is selected as the data source, and multi-segment SQL statements are to be executed, this parameter needs to be filled in with a semicolon `;. Because the MySQL data source does not support executing multiple SQL statements at one time.</li></ul></li></ul> |
| SQL parameter | The input parameter format is `key1=value1;key2=value2...`. |
| SQL statement | SQL statement. |
| UDF function | For Hive DataSources, you can refer to UDF functions created in the resource center, but other DataSource do not support UDF functions. |
| Custom parameters | SQL task type, and stored procedure is a custom parameter order, to set customized parameter type and data type for the method is the same as the stored procedure task type. The difference is that the custom parameter of the SQL task type replaces the `${variable}` in the SQL statement. |
| Pre-SQL | Pre-SQL executes before the SQL statement. |
| Post-SQL | Post-SQL executes after the SQL statement. |
| **Parameter** | **Description** |
|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Data source | Select the corresponding DataSource. |
| SQL type | Supports query and non-query. <ul><li>Query: supports `DML select` type commands, which return a result set. You can specify three templates for email notification as form, attachment or form attachment;</li><li>Non-query: support `DDL` all commands and `DML update, delete, insert` three types of commands;<ul><li>Segmented execution symbol: When the data source does not support executing multiple SQL statements at a time, the symbol for splitting SQL statements is provided to call the data source execution method multiple times. Example: 1. When the Hive data source is selected as the data source, please do not use `;\n` due to Hive JDBC does not support executing multiple SQL statements at one time; 2. When the MySQL data source is selected as the data source, and multi-segment SQL statements are to be executed, this parameter needs to be filled in with a semicolon `;. Because the MySQL data source does not support executing multiple SQL statements at one time.</li></ul></li></ul> |
| SQL parameter | The input parameter format is `key1=value1;key2=value2...`. |
| SQL statement | SQL statement. |
| UDF function | For Hive DataSources, you can refer to UDF functions created in the resource center, but other DataSource do not support UDF functions. |
| Custom parameters | SQL task type, and stored procedure is a custom parameter order, to set customized parameter type and data type for the method is the same as the stored procedure task type. The difference is that the custom parameter of the SQL task type replaces the `${variable}` in the SQL statement. |
| Pre-SQL | Pre-SQL executes before the SQL statement. |
| Post-SQL | Post-SQL executes after the SQL statement. |
## Task Example
@ -38,6 +38,7 @@ Refer to [datasource-setting](../howto/datasource-setting.md) `DataSource Center
#### Create a Temporary Table in Hive and Write Data
This example creates a temporary table `tmp_hello_world` in Hive and writes a row of data. Before creating a temporary table, we need to ensure that the table does not exist. So we use custom parameters to obtain the time of the day as the suffix of the table name every time we run, this task can run every different day. The format of the created table name is: `tmp_hello_world_{yyyyMMdd}`.
**Note**:the hive datasource in the SQL task based on JDBC to execute SQLs, SQL statement does not support multi-statements, please avoid using ';' at the end of the statement. To process multi-statements please use [Hive-Cli](./hive-cli.md) task.
![hive-sql](../../../../img/tasks/demo/hive-sql.png)

2
docs/docs/zh/guide/task/hive-cli.md

@ -31,7 +31,7 @@
|---------------|-----------------------------------------------------|
| Hive Cli 任务类型 | Hive Cli任务执行方式,可以选择`FROM_SCRIPT`或者`FROM_FILE`。 |
| Hive SQL 脚本 | 手动填入您的Hive SQL脚本语句。 |
| Hive Cli 选项 | Hive Cli的其他选项,如`--verbose`。 |
| Hive Cli 选项 | Hive Cli的其他选项,如`--verbose`来查看任务结果。 |
| 资源 | 如果您选择`FROM_FILE`作为Hive Cli任务类型,您需要在资源中选择Hive SQL文件。 |
## 任务样例

5
docs/docs/zh/guide/task/sql.md

@ -23,7 +23,7 @@ SQL任务类型,用于连接数据库并执行相应SQL。
- sql类型:支持查询和非查询两种。
- 查询:支持 `DML select` 类型的命令,是有结果集返回的,可以指定邮件通知为表格、附件或表格附件三种模板;
- 非查询:支持 `DDL`全部命令 和 `DML update、delete、insert` 三种类型的命令;
- 默认采用`;\n`作为SQL分隔符,拆分成多段SQL语句执行。Hive支持一次执行多段SQL语句,故不会拆分
- 默认采用`;\n`作为SQL分隔符,拆分成多段SQL语句执行。Hive的JDBC不支持一次执行多段SQL语句,请不要使用`;\n`
- sql参数:输入参数格式为key1=value1;key2=value2…
- sql语句:SQL语句
- UDF函数:对于HIVE类型的数据源,可以引用资源中心中创建的UDF函数,其他类型的数据源暂不支持UDF函数。
@ -38,6 +38,7 @@ SQL任务类型,用于连接数据库并执行相应SQL。
#### 在hive中创建临时表并写入数据
该样例向hive中创建临时表`tmp_hello_world`并写入一行数据。选择SQL类型为非查询,在创建临时表之前需要确保该表不存在,所以我们使用自定义参数,在每次运行时获取当天时间作为表名后缀,这样这个任务就可以每天运行。创建的表名格式为:`tmp_hello_world_{yyyyMMdd}`。
**注意**:sql任务组件的hive应用是基于JDBC去调用,SQL statement 不支持多行执行,请注意不要在语句末尾使用';'。如果要执行多行语句请使用[Hive-Cli](./hive-cli.md)任务。
![hive-sql](../../../../img/tasks/demo/hive-sql.png)
@ -49,7 +50,7 @@ SQL任务类型,用于连接数据库并执行相应SQL。
### 使用前置sql和后置sql示例
在前置sql中执行建表操作,在sql语句中执行操作,在后置sql中执行清理操作
在前置sql中执行建表操作,在sql语句中执行操作,在后置sql中执行清理操作
![pre_post_sql](../../../../img/tasks/demo/pre_post_sql.png)

BIN
docs/img/tasks/demo/hive_cli_from_script.png

Binary file not shown.

Before

Width:  |  Height:  |  Size: 386 KiB

After

Width:  |  Height:  |  Size: 68 KiB

BIN
docs/img/tasks/demo/pre_post_sql.png

Binary file not shown.

Before

Width:  |  Height:  |  Size: 85 KiB

After

Width:  |  Height:  |  Size: 21 KiB

Loading…
Cancel
Save