diff --git a/docker/build/README.md b/docker/build/README.md index f1c952b4cb..6612307445 100644 --- a/docker/build/README.md +++ b/docker/build/README.md @@ -1,12 +1,12 @@ -## What is Dolphin Scheduler? +## What is DolphinScheduler? -Dolphin Scheduler is a distributed and easy-to-expand visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing, making the scheduling system out of the box for data processing. +DolphinScheduler is a distributed and easy-to-expand visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing, making the scheduling system out of the box for data processing. GitHub URL: https://github.com/apache/incubator-dolphinscheduler Official Website: https://dolphinscheduler.apache.org -![Dolphin Scheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg) +![DolphinScheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg) [![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md) [![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md) @@ -23,7 +23,7 @@ The default **postgres** user `root`, postgres password `root` and database `dol The default **zookeeper** is created in the `docker-compose.yml`. -Access the Web UI:http://192.168.xx.xx:12345/dolphinscheduler +Access the Web UI: http://192.168.xx.xx:12345/dolphinscheduler The default username is `admin` and the default password is `dolphinscheduler123` @@ -62,7 +62,7 @@ $ docker run -d --name dolphinscheduler-master \ apache/dolphinscheduler:latest master-server ``` -* Start a **worker server**, For example: +* Start a **worker server** (including **logger server**), For example: ``` $ docker run -d --name dolphinscheduler-worker \ @@ -118,7 +118,7 @@ Please read `./docker/build/hooks/build` `./docker/build/hooks/build.bat` script ## Environment Variables -The Dolphin Scheduler image uses several environment variables which are easy to miss. 
While none of the variables are required, they may significantly aid you in using the image. +The DolphinScheduler Docker container is configured through environment variables, and the default value will be used if an environment variable is not set. **`DATABASE_TYPE`** @@ -168,9 +168,41 @@ This environment variable sets the database name for the database. The default value is `dolphinscheduler`. **Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server`, or `alert-server`. -**`DOLPHINSCHEDULER_ENV_PATH`** +**`HADOOP_HOME`** + +This environment variable sets `HADOOP_HOME`. The default value is `/opt/soft/hadoop`. + +**`HADOOP_CONF_DIR`** + +This environment variable sets `HADOOP_CONF_DIR`. The default value is `/opt/soft/hadoop/etc/hadoop`. + +**`SPARK_HOME1`** + +This environment variable sets `SPARK_HOME1`. The default value is `/opt/soft/spark1`. + +**`SPARK_HOME2`** + +This environment variable sets `SPARK_HOME2`. The default value is `/opt/soft/spark2`. + +**`PYTHON_HOME`** + +This environment variable sets `PYTHON_HOME`. The default value is `/usr/bin/python`. + +**`JAVA_HOME`** + +This environment variable sets `JAVA_HOME`. The default value is `/usr/lib/jvm/java-1.8-openjdk`. -This environment variable sets the runtime environment for task. The default value is `/opt/dolphinscheduler/conf/env/dolphinscheduler_env.sh`. +**`HIVE_HOME`** + +This environment variable sets `HIVE_HOME`. The default value is `/opt/soft/hive`. + +**`FLINK_HOME`** + +This environment variable sets `FLINK_HOME`. The default value is `/opt/soft/flink`. + +**`DATAX_HOME`** + +This environment variable sets `DATAX_HOME`. The default value is `/opt/soft/datax/bin/datax.py`. **`DOLPHINSCHEDULER_DATA_BASEDIR_PATH`** @@ -268,7 +300,7 @@ This environment variable sets port for `worker-server`. The default value is `1 **`WORKER_GROUPS`** -This environment variable sets group for `worker-server`. The default value is `default`. 
+This environment variable sets groups for `worker-server`. The default value is `default`. **`WORKER_WEIGHT`** @@ -308,3 +340,142 @@ EOF " > ${DOLPHINSCHEDULER_HOME}/conf/${line%.*} done ``` + +## FAQ + +### How to stop dolphinscheduler by docker-compose? + +Stop containers: + +``` +docker-compose stop +``` + +Stop containers and remove containers, networks and volumes: + +``` +docker-compose down -v +``` + +### How to deploy dolphinscheduler on Docker Swarm? + +Assuming that the Docker Swarm cluster has been created (if there is no Docker Swarm cluster, please refer to [create-swarm](https://docs.docker.com/engine/swarm/swarm-tutorial/create-swarm/)) + +Start a stack named dolphinscheduler + +``` +docker stack deploy -c docker-stack.yml dolphinscheduler +``` + +Stop and remove the stack named dolphinscheduler + +``` +docker stack rm dolphinscheduler +``` + +### How to use MySQL as the DolphinScheduler's database instead of PostgreSQL? + +> Because of the commercial license, we cannot directly use the driver and client of MySQL. +> +> If you want to use MySQL, you can build a new image based on the `apache/dolphinscheduler` image as follows. + +1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`) + +2. Create a new `Dockerfile` to add MySQL driver and client: + +``` +FROM apache/dolphinscheduler:latest +COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib +RUN apk add --update --no-cache mysql-client +``` + +3. Build a new docker image including MySQL driver and client: + +``` +docker build -t apache/dolphinscheduler:mysql . +``` + +4. Modify all `image` fields to `apache/dolphinscheduler:mysql` in `docker-compose.yml` + +> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml` + +5. Comment out the `dolphinscheduler-postgresql` block in `docker-compose.yml` + +6. 
Add `dolphinscheduler-mysql` service in `docker-compose.yml` (**Optional**, you can directly use an external MySQL database) + +7. Modify all DATABASE environment variables in `docker-compose.yml` + +``` +DATABASE_TYPE: mysql +DATABASE_DRIVER: com.mysql.jdbc.Driver +DATABASE_HOST: dolphinscheduler-mysql +DATABASE_PORT: 3306 +DATABASE_USERNAME: root +DATABASE_PASSWORD: root +DATABASE_DATABASE: dolphinscheduler +DATABASE_PARAMS: useUnicode=true&characterEncoding=UTF-8 +``` + +> If you have added `dolphinscheduler-mysql` service in `docker-compose.yml`, just set `DATABASE_HOST` to `dolphinscheduler-mysql` + +8. Run dolphinscheduler (see **How to use this docker image**) + +### How to support MySQL datasource in `Datasource manage`? + +> Because of the commercial license, we cannot directly use the driver of MySQL. +> +> If you want to add a MySQL datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows. + +1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`) + +2. Create a new `Dockerfile` to add MySQL driver: + +``` +FROM apache/dolphinscheduler:latest +COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib +``` + +3. Build a new docker image including MySQL driver: + +``` +docker build -t apache/dolphinscheduler:mysql-driver . +``` + +4. Modify all `image` fields to `apache/dolphinscheduler:mysql-driver` in `docker-compose.yml` + +> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml` + +5. Run dolphinscheduler (see **How to use this docker image**) + +6. Add a MySQL datasource in `Datasource manage` + +### How to support Oracle datasource in `Datasource manage`? + +> Because of the commercial license, we cannot directly use the driver of Oracle. 
+> +> If you want to add an Oracle datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows. + +1. Download the Oracle driver [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`) + +2. Create a new `Dockerfile` to add Oracle driver: + +``` +FROM apache/dolphinscheduler:latest +COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib +``` + +3. Build a new docker image including Oracle driver: + +``` +docker build -t apache/dolphinscheduler:oracle-driver . +``` + +4. Modify all `image` fields to `apache/dolphinscheduler:oracle-driver` in `docker-compose.yml` + +> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml` + +5. Run dolphinscheduler (see **How to use this docker image**) + +6. Add an Oracle datasource in `Datasource manage` + +For more information, please refer to the [incubator-dolphinscheduler](https://github.com/apache/incubator-dolphinscheduler.git) documentation. diff --git a/docker/build/README_zh_CN.md b/docker/build/README_zh_CN.md index 993a27435e..311c53c02d 100644 --- a/docker/build/README_zh_CN.md +++ b/docker/build/README_zh_CN.md @@ -1,4 +1,4 @@ -## Dolphin Scheduler是什么? +## DolphinScheduler是什么? 
一个分布式易扩展的可视化DAG工作流任务调度系统。致力于解决数据处理流程中错综复杂的依赖关系,使调度系统在数据处理流程中`开箱即用`。 @@ -6,7 +6,7 @@ GitHub URL: https://github.com/apache/incubator-dolphinscheduler Official Website: https://dolphinscheduler.apache.org -![Dolphin Scheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg) +![DolphinScheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg) [![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md) [![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md) @@ -14,6 +14,7 @@ Official Website: https://dolphinscheduler.apache.org ## 如何使用docker镜像 #### 以 docker-compose 的方式启动dolphinscheduler(推荐) + ``` $ docker-compose -f ./docker/docker-swarm/docker-compose.yml up -d ``` @@ -22,7 +23,9 @@ $ docker-compose -f ./docker/docker-swarm/docker-compose.yml up -d 同时,默认的`Zookeeper`也会在`docker-compose.yml`文件中被创建。 -访问前端界面:http://192.168.xx.xx:12345/dolphinscheduler +访问前端页面:http://192.168.xx.xx:12345/dolphinscheduler + +默认的用户是`admin`,默认的密码是`dolphinscheduler123` #### 或者通过环境变量 **`DATABASE_HOST`** **`DATABASE_PORT`** **`ZOOKEEPER_QUORUM`** 使用已存在的服务 @@ -37,7 +40,7 @@ $ docker run -d --name dolphinscheduler \ apache/dolphinscheduler:latest all ``` -访问前端界面:http://192.168.xx.xx:12345/dolphinscheduler +访问前端页面:http://192.168.xx.xx:12345/dolphinscheduler #### 或者运行dolphinscheduler中的部分服务 @@ -59,7 +62,7 @@ $ docker run -d --name dolphinscheduler-master \ apache/dolphinscheduler:latest master-server ``` -* 启动一个 **worker server**, 如下: +* 启动一个 **worker server** (包括 **logger server**), 如下: ``` $ docker run -d --name dolphinscheduler-worker \ @@ -115,7 +118,7 @@ C:\incubator-dolphinscheduler>.\docker\build\hooks\build.bat ## 环境变量 -Dolphin Scheduler映像使用了几个容易遗漏的环境变量。虽然这些变量不是必须的,但是可以帮助你更容易配置镜像并根据你的需求定义相应的服务配置。 +DolphinScheduler Docker 容器通过环境变量进行配置,缺省时将会使用默认值 **`DATABASE_TYPE`** @@ -165,9 +168,41 @@ Dolphin Scheduler映像使用了几个容易遗漏的环境变量。虽然这些 **注意**: 
当运行`dolphinscheduler`中`master-server`、`worker-server`、`api-server`、`alert-server`这些服务时,必须指定这个环境变量,以便于你更好的搭建分布式服务。 -**`DOLPHINSCHEDULER_ENV_PATH`** +**`HADOOP_HOME`** + +配置`dolphinscheduler`的`HADOOP_HOME`,默认值 `/opt/soft/hadoop`。 + +**`HADOOP_CONF_DIR`** + +配置`dolphinscheduler`的`HADOOP_CONF_DIR`,默认值 `/opt/soft/hadoop/etc/hadoop`。 + +**`SPARK_HOME1`** + +配置`dolphinscheduler`的`SPARK_HOME1`,默认值 `/opt/soft/spark1`。 + +**`SPARK_HOME2`** + +配置`dolphinscheduler`的`SPARK_HOME2`,默认值 `/opt/soft/spark2`。 + +**`PYTHON_HOME`** + +配置`dolphinscheduler`的`PYTHON_HOME`,默认值 `/usr/bin/python`。 + +**`JAVA_HOME`** -任务执行时的环境变量配置文件, 默认值 `/opt/dolphinscheduler/conf/env/dolphinscheduler_env.sh`。 +配置`dolphinscheduler`的`JAVA_HOME`,默认值 `/usr/lib/jvm/java-1.8-openjdk`。 + +**`HIVE_HOME`** + +配置`dolphinscheduler`的`HIVE_HOME`,默认值 `/opt/soft/hive`。 + +**`FLINK_HOME`** + +配置`dolphinscheduler`的`FLINK_HOME`,默认值 `/opt/soft/flink`。 + +**`DATAX_HOME`** + +配置`dolphinscheduler`的`DATAX_HOME`,默认值 `/opt/soft/datax/bin/datax.py`。 **`DOLPHINSCHEDULER_DATA_BASEDIR_PATH`** @@ -305,3 +340,142 @@ EOF " > ${DOLPHINSCHEDULER_HOME}/conf/${line%.*} done ``` + +## FAQ + +### 如何通过 docker-compose 停止 dolphinscheduler? + +停止所有容器: + +``` +docker-compose stop +``` + +停止所有容器并移除所有容器,网络和存储卷: + +``` +docker-compose down -v +``` + +### 如何在 Docker Swarm 上部署 dolphinscheduler? + +假设 Docker Swarm 集群已经部署(如果还没有创建 Docker Swarm 集群,请参考 [create-swarm](https://docs.docker.com/engine/swarm/swarm-tutorial/create-swarm/)) + +启动名为 dolphinscheduler 的 stack + +``` +docker stack deploy -c docker-stack.yml dolphinscheduler +``` + +停止并移除名为 dolphinscheduler 的 stack + +``` +docker stack rm dolphinscheduler +``` + +### 如何用 MySQL 替代 PostgreSQL 作为 DolphinScheduler 的数据库? + +> 由于商业许可证的原因,我们不能直接使用 MySQL 的驱动包和客户端. +> +> 如果你要使用 MySQL, 你可以基于官方镜像 `apache/dolphinscheduler` 进行构建. + +1. 下载 MySQL 驱动包 [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (要求 `>=5.1.47`) + +2. 
创建一个新的 `Dockerfile`,用于添加 MySQL 的驱动包和客户端: + +``` +FROM apache/dolphinscheduler:latest +COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib +RUN apk add --update --no-cache mysql-client +``` + +3. 构建一个包含 MySQL 的驱动包和客户端的新镜像: + +``` +docker build -t apache/dolphinscheduler:mysql . +``` + +4. 修改 `docker-compose.yml` 文件中的所有 image 字段为 `apache/dolphinscheduler:mysql` + +> 如果你想在 Docker Swarm 上部署 dolphinscheduler,你需要修改 `docker-stack.yml` + +5. 注释 `docker-compose.yml` 文件中的 `dolphinscheduler-postgresql` 块 + +6. 在 `docker-compose.yml` 文件中添加 `dolphinscheduler-mysql` 服务(**可选**,你可以直接使用一个外部的 MySQL 数据库) + +7. 修改 `docker-compose.yml` 文件中的所有 DATABASE 环境变量 + +``` +DATABASE_TYPE: mysql +DATABASE_DRIVER: com.mysql.jdbc.Driver +DATABASE_HOST: dolphinscheduler-mysql +DATABASE_PORT: 3306 +DATABASE_USERNAME: root +DATABASE_PASSWORD: root +DATABASE_DATABASE: dolphinscheduler +DATABASE_PARAMS: useUnicode=true&characterEncoding=UTF-8 +``` + +> 如果你已经添加了 `dolphinscheduler-mysql` 服务,设置 `DATABASE_HOST` 为 `dolphinscheduler-mysql` 即可 + +8. 运行 dolphinscheduler (详见**如何使用docker镜像**) + +### 如何在数据源中心支持 MySQL 数据源? + +> 由于商业许可证的原因,我们不能直接使用 MySQL 的驱动包. +> +> 如果你要添加 MySQL 数据源, 你可以基于官方镜像 `apache/dolphinscheduler` 进行构建. + +1. 下载 MySQL 驱动包 [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (要求 `>=5.1.47`) + +2. 创建一个新的 `Dockerfile`,用于添加 MySQL 驱动包: + +``` +FROM apache/dolphinscheduler:latest +COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib +``` + +3. 构建一个包含 MySQL 驱动包的新镜像: + +``` +docker build -t apache/dolphinscheduler:mysql-driver . +``` + +4. 将 `docker-compose.yml` 文件中的所有 image 字段 修改为 `apache/dolphinscheduler:mysql-driver` + +> 如果你想在 Docker Swarm 上部署 dolphinscheduler,你需要修改 `docker-stack.yml` + +5. 运行 dolphinscheduler (详见**如何使用docker镜像**) + +6. 在数据源中心添加一个 MySQL 数据源 + +### 如何在数据源中心支持 Oracle 数据源? + +> 由于商业许可证的原因,我们不能直接使用 Oracle 的驱动包. +> +> 如果你要添加 Oracle 数据源, 你可以基于官方镜像 `apache/dolphinscheduler` 进行构建. + +1. 
下载 Oracle 驱动包 [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`) + +2. 创建一个新的 `Dockerfile`,用于添加 Oracle 驱动包: + +``` +FROM apache/dolphinscheduler:latest +COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib +``` + +3. 构建一个包含 Oracle 驱动包的新镜像: + +``` +docker build -t apache/dolphinscheduler:oracle-driver . +``` + +4. 将 `docker-compose.yml` 文件中的所有 image 字段 修改为 `apache/dolphinscheduler:oracle-driver` + +> 如果你想在 Docker Swarm 上部署 dolphinscheduler,你需要修改 `docker-stack.yml` + +5. 运行 dolphinscheduler (详见**如何使用docker镜像**) + +6. 在数据源中心添加一个 Oracle 数据源 + +更多信息请查看 [incubator-dolphinscheduler](https://github.com/apache/incubator-dolphinscheduler.git) 文档. diff --git a/docker/build/conf/dolphinscheduler/worker.properties.tpl b/docker/build/conf/dolphinscheduler/worker.properties.tpl index d3ef35a813..cab729b6aa 100644 --- a/docker/build/conf/dolphinscheduler/worker.properties.tpl +++ b/docker/build/conf/dolphinscheduler/worker.properties.tpl @@ -30,7 +30,7 @@ worker.reserved.memory=${WORKER_RESERVED_MEMORY} # worker listener port worker.listen.port=${WORKER_LISTEN_PORT} -# default worker group +# default worker groups worker.groups=${WORKER_GROUPS} # default worker weight diff --git a/docker/docker-swarm/docker-compose.yml b/docker/docker-swarm/docker-compose.yml index a4a221c56e..01ac4bfb52 100644 --- a/docker/docker-swarm/docker-compose.yml +++ b/docker/docker-swarm/docker-compose.yml @@ -58,11 +58,15 @@ services: - 12345:12345 environment: TZ: Asia/Shanghai + DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m -Xmn256m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 RESOURCE_STORAGE_TYPE: HDFS RESOURCE_UPLOAD_PATH: /dolphinscheduler @@ -92,11 +96,15 @@ services: environment: TZ: 
Asia/Shanghai ALERT_PLUGIN_DIR: lib/plugin/alert + DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m -Xmn256m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 healthcheck: test: ["CMD", "/root/checkpoint.sh", "AlertServer"] interval: 30s @@ -126,11 +134,16 @@ services: MASTER_TASK_COMMIT_INTERVAL: "1000" MASTER_MAX_CPULOAD_AVG: "100" MASTER_RESERVED_MEMORY: "0.1" + DOLPHINSCHEDULER_DATA_BASEDIR_PATH: /tmp/dolphinscheduler + DOLPHINSCHEDULER_OPTS: "-Xms1g -Xmx1g -Xmn512m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 healthcheck: test: ["CMD", "/root/checkpoint.sh", "MasterServer"] @@ -162,6 +175,7 @@ services: WORKER_RESERVED_MEMORY: "0.1" WORKER_GROUPS: "default" WORKER_WEIGHT: "100" + ALERT_LISTEN_HOST: dolphinscheduler-alert HADOOP_HOME: "/opt/soft/hadoop" HADOOP_CONF_DIR: "/opt/soft/hadoop/etc/hadoop" SPARK_HOME1: "/opt/soft/spark1" @@ -172,12 +186,15 @@ services: FLINK_HOME: "/opt/soft/flink" DATAX_HOME: "/opt/soft/datax/bin/datax.py" DOLPHINSCHEDULER_DATA_BASEDIR_PATH: /tmp/dolphinscheduler - ALERT_LISTEN_HOST: dolphinscheduler-alert + DOLPHINSCHEDULER_OPTS: "-Xms1g -Xmx1g -Xmn512m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 RESOURCE_STORAGE_TYPE: HDFS RESOURCE_UPLOAD_PATH: /dolphinscheduler diff --git a/docker/docker-swarm/docker-stack.yml 
b/docker/docker-swarm/docker-stack.yml index 7206e4e678..4a34b37916 100644 --- a/docker/docker-swarm/docker-stack.yml +++ b/docker/docker-swarm/docker-stack.yml @@ -58,11 +58,15 @@ services: - 12345:12345 environment: TZ: Asia/Shanghai + DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m -Xmn256m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 RESOURCE_STORAGE_TYPE: HDFS RESOURCE_UPLOAD_PATH: /dolphinscheduler @@ -89,11 +93,15 @@ services: environment: TZ: Asia/Shanghai ALERT_PLUGIN_DIR: lib/plugin/alert + DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m -Xmn256m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 healthcheck: test: ["CMD", "/root/checkpoint.sh", "AlertServer"] interval: 30s @@ -122,11 +130,16 @@ services: MASTER_TASK_COMMIT_INTERVAL: "1000" MASTER_MAX_CPULOAD_AVG: "100" MASTER_RESERVED_MEMORY: "0.1" + DOLPHINSCHEDULER_DATA_BASEDIR_PATH: /tmp/dolphinscheduler + DOLPHINSCHEDULER_OPTS: "-Xms1g -Xmx1g -Xmn512m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 healthcheck: test: ["CMD", "/root/checkpoint.sh", "MasterServer"] @@ -156,6 +169,7 @@ services: WORKER_RESERVED_MEMORY: "0.1" WORKER_GROUPS: "default" WORKER_WEIGHT: "100" + ALERT_LISTEN_HOST: dolphinscheduler-alert HADOOP_HOME: "/opt/soft/hadoop" HADOOP_CONF_DIR: "/opt/soft/hadoop/etc/hadoop" 
SPARK_HOME1: "/opt/soft/spark1" @@ -166,12 +180,15 @@ services: FLINK_HOME: "/opt/soft/flink" DATAX_HOME: "/opt/soft/datax/bin/datax.py" DOLPHINSCHEDULER_DATA_BASEDIR_PATH: /tmp/dolphinscheduler - ALERT_LISTEN_HOST: dolphinscheduler-alert + DOLPHINSCHEDULER_OPTS: "-Xms1g -Xmx1g -Xmn512m" + DATABASE_TYPE: postgresql + DATABASE_DRIVER: org.postgresql.Driver DATABASE_HOST: dolphinscheduler-postgresql DATABASE_PORT: 5432 DATABASE_USERNAME: root DATABASE_PASSWORD: root DATABASE_DATABASE: dolphinscheduler + DATABASE_PARAMS: characterEncoding=utf8 ZOOKEEPER_QUORUM: dolphinscheduler-zookeeper:2181 RESOURCE_STORAGE_TYPE: HDFS RESOURCE_UPLOAD_PATH: /dolphinscheduler diff --git a/docker/kubernetes/dolphinscheduler/README.md b/docker/kubernetes/dolphinscheduler/README.md index 318c3a9132..0a5efe3163 100644 --- a/docker/kubernetes/dolphinscheduler/README.md +++ b/docker/kubernetes/dolphinscheduler/README.md @@ -1,14 +1,14 @@ -# Dolphin Scheduler +# DolphinScheduler -[Dolphin Scheduler](https://dolphinscheduler.apache.org) is a distributed and easy-to-expand visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing, making the scheduling system out of the box for data processing. +[DolphinScheduler](https://dolphinscheduler.apache.org) is a distributed and easy-to-expand visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing, making the scheduling system out of the box for data processing. ## Introduction -This chart bootstraps a [Dolphin Scheduler](https://dolphinscheduler.apache.org) distributed deployment on a [Kubernetes](http://kubernetes.io) cluster using the [Helm](https://helm.sh) package manager. +This chart bootstraps a [DolphinScheduler](https://dolphinscheduler.apache.org) distributed deployment on a [Kubernetes](http://kubernetes.io) cluster using the [Helm](https://helm.sh) package manager. 
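The chart lists Helm 3.1.0+ and Kubernetes 1.12+ as prerequisites. As a hedged sketch (not part of the chart; the `ver_ge` helper name and the `sed` normalization are assumptions for illustration), a portable shell function can compare a version reported by `helm version --short` against the required minimum:

```shell
# ver_ge A B — succeeds when dot-separated version A is >= version B.
# Hypothetical pre-flight helper; usage sketch against a real cluster:
#   ver_ge "$(helm version --short | sed 's/^v//;s/[+-].*//')" 3.1.0 || echo "helm too old"
ver_ge() {
  # sort -V orders version strings numerically per component,
  # so A >= B exactly when B is first (or equal) in the sorted pair
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

ver_ge 3.5.2 3.1.0 && echo "Helm version OK"
ver_ge 1.11.0 1.12.0 || echo "Kubernetes too old"
```

The `sort -V` comparison avoids the lexicographic trap where `3.10.0` would otherwise compare as smaller than `3.9.0`.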
## Prerequisites -- Helm 3.1.0+ -- Kubernetes 1.12+ +- [Helm](https://helm.sh/) 3.1.0+ +- [Kubernetes](https://kubernetes.io/) 1.12+ - PV provisioner support in the underlying infrastructure ## Installing the Chart @@ -22,10 +22,38 @@ $ helm repo add bitnami https://charts.bitnami.com/bitnami $ helm dependency update . $ helm install dolphinscheduler . ``` -These commands deploy Dolphin Scheduler on the Kubernetes cluster in the default configuration. The [configuration](#configuration) section lists the parameters that can be configured during installation. + +To install the chart with a namespace named `test`: + +```bash +$ helm install dolphinscheduler . -n test +``` + +> **Tip**: If a namespace named `test` is used, the option `-n test` needs to be added to the `helm` and `kubectl` commands + +These commands deploy DolphinScheduler on the Kubernetes cluster in the default configuration. The [configuration](#configuration) section lists the parameters that can be configured during installation. > **Tip**: List all releases using `helm list` +## Access DolphinScheduler UI + +If `ingress.enabled` in `values.yaml` is set to `true`, you can simply access `http://${ingress.host}/dolphinscheduler` in your browser. 
+ +> **Tip**: If there is a problem with ingress access, please contact the Kubernetes administrator and refer to [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) + +Otherwise, you need to execute the `port-forward` command, for example: + +```bash +$ kubectl port-forward --address 0.0.0.0 svc/dolphinscheduler-api 12345:12345 +$ kubectl port-forward --address 0.0.0.0 -n test svc/dolphinscheduler-api 12345:12345 # with test namespace +``` + +> **Tip**: If the error `unable to do port forwarding: socat not found` appears, you need to install `socat` first + +Then access the web UI: http://192.168.xx.xx:12345/dolphinscheduler + +The default username is `admin` and the default password is `dolphinscheduler123` + ## Uninstalling the Chart To uninstall/delete the `dolphinscheduler` deployment: @@ -34,63 +62,78 @@ To uninstall/delete the `dolphinscheduler` deployment: $ helm uninstall dolphinscheduler ``` -The command removes all the Kubernetes components associated with the chart and deletes the release. +The command removes all the Kubernetes components except the PVCs associated with the chart and deletes the release. + +To delete the PVCs associated with `dolphinscheduler`: + +```bash +$ kubectl delete pvc -l app.kubernetes.io/instance=dolphinscheduler +``` + +> **Note**: Deleting the PVCs will delete all data as well. Please be cautious before doing it. ## Configuration -The following tables lists the configurable parameters of the Dolphins Scheduler chart and their default values. +The configuration file is `values.yaml`, and the following table lists the configurable parameters of the DolphinScheduler chart and their default values. 
| Parameter | Description | Default | | --------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------- | +| `nameOverride` | String to partially override common.names.fullname | `nil` | +| `fullnameOverride` | String to fully override common.names.fullname | `nil` | | `timezone` | World time and date for cities in all time zones | `Asia/Shanghai` | -| `image.registry` | Docker image registry for the Dolphins Scheduler | `docker.io` | -| `image.repository` | Docker image repository for the Dolphins Scheduler | `dolphinscheduler` | -| `image.tag` | Docker image version for the Dolphins Scheduler | `1.2.1` | -| `image.imagePullPolicy` | Image pull policy. One of Always, Never, IfNotPresent | `IfNotPresent` | -| `image.pullSecres` | PullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images | `[]` | -| | | | -| `postgresql.enabled` | If not exists external PostgreSQL, by default, the Dolphins Scheduler will use a internal PostgreSQL | `true` | +| `image.registry` | Docker image registry for the DolphinScheduler | `docker.io` | +| `image.repository` | Docker image repository for the DolphinScheduler | `dolphinscheduler` | +| `image.tag` | Docker image version for the DolphinScheduler | `latest` | +| `image.pullPolicy` | Image pull policy. One of Always, Never, IfNotPresent | `IfNotPresent` | +| `image.pullSecrets` | Image pull secrets. 
An optional list of references to secrets in the same namespace to use for pulling any of the images | `[]` | +| | | | +| `postgresql.enabled` | If there is no external PostgreSQL, by default, DolphinScheduler will use an internal PostgreSQL | `true` | | `postgresql.postgresqlUsername` | The username for internal PostgreSQL | `root` | | `postgresql.postgresqlPassword` | The password for internal PostgreSQL | `root` | | `postgresql.postgresqlDatabase` | The database for internal PostgreSQL | `dolphinscheduler` | | `postgresql.persistence.enabled` | Set `postgresql.persistence.enabled` to `true` to mount a new volume for internal PostgreSQL | `false` | | `postgresql.persistence.size` | `PersistentVolumeClaim` Size | `20Gi` | | `postgresql.persistence.storageClass` | PostgreSQL data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` | -| `externalDatabase.type` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database type will use it. | `postgresql` | -| `externalDatabase.driver` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database driver will use it. | `org.postgresql.Driver` | -| `externalDatabase.host` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database host will use it. | `localhost` | -| `externalDatabase.port` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database port will use it. | `5432` | -| `externalDatabase.username` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database username will use it. | `root` | -| `externalDatabase.password` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database password will use it. 
| `root` | -| `externalDatabase.database` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database database will use it. | `dolphinscheduler` | -| `externalDatabase.params` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database params will use it. | `characterEncoding=utf8` | -| | | | -| `zookeeper.enabled` | If not exists external Zookeeper, by default, the Dolphin Scheduler will use a internal Zookeeper | `true` | -| `zookeeper.taskQueue` | Specify task queue for `master` and `worker` | `zookeeper` | +| `externalDatabase.type` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database type will use it | `postgresql` | +| `externalDatabase.driver` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database driver will use it | `org.postgresql.Driver` | +| `externalDatabase.host` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database host will use it | `localhost` | +| `externalDatabase.port` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database port will use it | `5432` | +| `externalDatabase.username` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database username will use it | `root` | +| `externalDatabase.password` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database password will use it | `root` | +| `externalDatabase.database` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. DolphinScheduler's database name will use it | `dolphinscheduler` | +| `externalDatabase.params` | If exists external PostgreSQL, and set `postgresql.enabled` value to false. 
DolphinScheduler's database params will use it | `characterEncoding=utf8` | +| | | | +| `zookeeper.enabled` | If there is no external Zookeeper, by default, DolphinScheduler will use an internal Zookeeper | `true` | +| `zookeeper.fourlwCommandsWhitelist` | A list of comma-separated Four Letter Words commands to use | `srvr,ruok,wchs,cons` | +| `zookeeper.service.port` | ZooKeeper port | `2181` | | `zookeeper.persistence.enabled` | Set `zookeeper.persistence.enabled` to `true` to mount a new volume for internal Zookeeper | `false` | | `zookeeper.persistence.size` | `PersistentVolumeClaim` Size | `20Gi` | | `zookeeper.persistence.storageClass` | Zookeeper data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` | -| `externalZookeeper.taskQueue` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify task queue for `master` and `worker` | `zookeeper` | -| `externalZookeeper.zookeeperQuorum` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify Zookeeper quorum | `127.0.0.1:2181` | -| `externalZookeeper.zookeeperRoot` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify Zookeeper root path for `master` and `worker` | `dolphinscheduler` | -| | | | -| `common.configmap.DOLPHINSCHEDULER_ENV_PATH` | Extra env file path. | `/tmp/dolphinscheduler/env` | -| `common.configmap.DOLPHINSCHEDULER_DATA_BASEDIR_PATH` | File uploaded path of DS. | `/tmp/dolphinscheduler/files` | -| `common.configmap.RESOURCE_STORAGE_TYPE` | Resource Storate type, support type are: S3、HDFS、NONE. | `NONE` | -| `common.configmap.RESOURCE_UPLOAD_PATH` | The base path of resource. | `/ds` | -| `common.configmap.FS_DEFAULT_FS` | The default fs of resource, for s3 is the `s3a` prefix and bucket name. | `s3a://xxxx` | -| `common.configmap.FS_S3A_ENDPOINT` | If the resource type is `S3`, you should fill this filed, it's the endpoint of s3. 
| `s3.xxx.amazonaws.com` | -| `common.configmap.FS_S3A_ACCESS_KEY` | The access key for your s3 bucket. | `xxxxxxx` | -| `common.configmap.FS_S3A_SECRET_KEY` | The secret key for your s3 bucket. | `xxxxxxx` | +| `zookeeper.zookeeperRoot` | Specify dolphinscheduler root directory in Zookeeper | `/dolphinscheduler` | +| `externalZookeeper.zookeeperQuorum` | If exists external Zookeeper, and set `zookeeper.enabled` value to false. Specify Zookeeper quorum | `127.0.0.1:2181` | +| `externalZookeeper.zookeeperRoot` | If exists external Zookeeper, and set `zookeeper.enabled` value to false. Specify dolphinscheduler root directory in Zookeeper | `/dolphinscheduler` | +| | | | +| `common.configmap.DOLPHINSCHEDULER_ENV` | System env path, self configuration, please read `values.yaml` | `[]` | +| `common.configmap.DOLPHINSCHEDULER_DATA_BASEDIR_PATH` | User data directory path, self configuration, please make sure the directory exists and have read write permissions | `/tmp/dolphinscheduler` | +| `common.configmap.RESOURCE_STORAGE_TYPE` | Resource storage type: HDFS, S3, NONE | `HDFS` | +| `common.configmap.RESOURCE_UPLOAD_PATH` | Resource store on HDFS/S3 path, please make sure the directory exists on hdfs and have read write permissions | `/dolphinscheduler` | +| `common.configmap.FS_DEFAULT_FS` | Resource storage file system like `file:///`, `hdfs://mycluster:8020` or `s3a://dolphinscheduler` | `file:///` | +| `common.configmap.FS_S3A_ENDPOINT` | S3 endpoint when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `s3.xxx.amazonaws.com` | +| `common.configmap.FS_S3A_ACCESS_KEY` | S3 access key when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `xxxxxxx` | +| `common.configmap.FS_S3A_SECRET_KEY` | S3 secret key when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `xxxxxxx` | +| `common.fsFileResourcePersistence.enabled` | Set `common.fsFileResourcePersistence.enabled` to `true` to mount a new file resource volume for `api` and `worker` | `false` | 
+| `common.fsFileResourcePersistence.accessModes` | `PersistentVolumeClaim` Access Modes, must be `ReadWriteMany` | `[ReadWriteMany]` |
+| `common.fsFileResourcePersistence.storageClassName` | Resource Persistent Volume Storage Class, must support the access mode: ReadWriteMany | `-` |
+| `common.fsFileResourcePersistence.storage` | `PersistentVolumeClaim` Size | `20Gi` |
+| | | |
 | `master.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
-| | | |
 | `master.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
+| `master.annotations` | The `annotations` for master server | `{}` |
+| `master.affinity` | If specified, the pod's scheduling constraints | `{}` |
 | `master.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
 | `master.tolerations` | If specified, the pod's tolerations | `{}` |
-| `master.affinity` | If specified, the pod's scheduling constraints | `{}` |
-| `master.jvmOptions` | The JVM options for master server. | `""` |
-| `master.resources` | The `resource` limit and request config for master server. | `{}` |
-| `master.annotations` | The `annotations` for master server. | `{}` |
+| `master.resources` | The `resource` limit and request config for master server | `{}` |
+| `master.configmap.DOLPHINSCHEDULER_OPTS` | The Java options for master server | `""` |
 | `master.configmap.MASTER_EXEC_THREADS` | Master execute thread num | `100` |
 | `master.configmap.MASTER_EXEC_TASK_NUM` | Master execute task number in parallel | `20` |
 | `master.configmap.MASTER_HEARTBEAT_INTERVAL` | Master heartbeat interval | `10` |
@@ -98,6 +141,7 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
 | `master.configmap.MASTER_TASK_COMMIT_INTERVAL` | Master commit task interval | `1000` |
 | `master.configmap.MASTER_MAX_CPULOAD_AVG` | Only less than cpu avg load, master server can work. default value : the number of cpu cores * 2 | `100` |
 | `master.configmap.MASTER_RESERVED_MEMORY` | Only larger than reserved memory, master server can work. default value : physical memory * 1/10, unit is G | `0.1` |
+| `master.configmap.MASTER_LISTEN_PORT` | Master listen port | `5678` |
 | `master.livenessProbe.enabled` | Turn on and off liveness probe | `true` |
 | `master.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` |
 | `master.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
@@ -114,22 +158,22 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
 | `master.persistentVolumeClaim.accessModes` | `PersistentVolumeClaim` Access Modes | `[ReadWriteOnce]` |
 | `master.persistentVolumeClaim.storageClassName` | `Master` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
 | `master.persistentVolumeClaim.storage` | `PersistentVolumeClaim` Size | `20Gi` |
-| | | |
+| | | |
 | `worker.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
 | `worker.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
+| `worker.annotations` | The `annotations` for worker server | `{}` |
+| `worker.affinity` | If specified, the pod's scheduling constraints | `{}` |
 | `worker.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
 | `worker.tolerations` | If specified, the pod's tolerations | `{}` |
-| `worker.affinity` | If specified, the pod's scheduling constraints | `{}` |
-| `worker.jvmOptions` | The JVM options for worker server. | `""` |
-| `worker.resources` | The `resource` limit and request config for worker server. | `{}` |
-| `worker.annotations` | The `annotations` for worker server. | `{}` |
+| `worker.resources` | The `resource` limit and request config for worker server | `{}` |
+| `worker.configmap.DOLPHINSCHEDULER_OPTS` | The Java options for worker server | `""` |
 | `worker.configmap.WORKER_EXEC_THREADS` | Worker execute thread num | `100` |
 | `worker.configmap.WORKER_HEARTBEAT_INTERVAL` | Worker heartbeat interval | `10` |
-| `worker.configmap.WORKER_FETCH_TASK_NUM` | Submit the number of tasks at a time | `3` |
 | `worker.configmap.WORKER_MAX_CPULOAD_AVG` | Only less than cpu avg load, worker server can work. default value : the number of cpu cores * 2 | `100` |
 | `worker.configmap.WORKER_RESERVED_MEMORY` | Only larger than reserved memory, worker server can work. default value : physical memory * 1/10, unit is G | `0.1` |
-| `worker.configmap.DOLPHINSCHEDULER_DATA_BASEDIR_PATH` | User data directory path, self configuration, please make sure the directory exists and have read write permissions | `/tmp/dolphinscheduler` |
-| `worker.configmap.DOLPHINSCHEDULER_ENV` | System env path, self configuration, please read `values.yaml` | `[]` |
+| `worker.configmap.WORKER_LISTEN_PORT` | Worker listen port | `1234` |
+| `worker.configmap.WORKER_GROUPS` | Worker groups | `default` |
+| `worker.configmap.WORKER_WEIGHT` | Worker weight | `100` |
 | `worker.livenessProbe.enabled` | Turn on and off liveness probe | `true` |
 | `worker.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` |
 | `worker.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
@@ -151,32 +195,18 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
 | `worker.persistentVolumeClaim.logsPersistentVolume.accessModes` | `PersistentVolumeClaim` Access Modes | `[ReadWriteOnce]` |
 | `worker.persistentVolumeClaim.logsPersistentVolume.storageClassName` | `Worker` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
 | `worker.persistentVolumeClaim.logsPersistentVolume.storage` | `PersistentVolumeClaim` Size | `20Gi` |
-| | | |
+| | | |
+| `alert.replicas` | Replicas is the desired number of replicas of the given Template | `1` |
 | `alert.strategy.type` | Type of deployment. Can be "Recreate" or "RollingUpdate" | `RollingUpdate` |
 | `alert.strategy.rollingUpdate.maxSurge` | The maximum number of pods that can be scheduled above the desired number of pods | `25%` |
 | `alert.strategy.rollingUpdate.maxUnavailable` | The maximum number of pods that can be unavailable during the update | `25%` |
-| `alert.replicas` | Replicas is the desired number of replicas of the given Template | `1` |
+| `alert.annotations` | The `annotations` for alert server | `{}` |
+| `alert.affinity` | If specified, the pod's scheduling constraints | `{}` |
 | `alert.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
 | `alert.tolerations` | If specified, the pod's tolerations | `{}` |
-| `alert.affinity` | If specified, the pod's scheduling constraints | `{}` |
-| `alert.jvmOptions` | The JVM options for alert server. | `""` |
-| `alert.resources` | The `resource` limit and request config for alert server. | `{}` |
-| `alert.annotations` | The `annotations` for alert server. | `{}` |
-| `alert.configmap.ALERT_PLUGIN_DIR` | Alert plugin path. | `/opt/dolphinscheduler/alert/plugin` |
-| `alert.configmap.XLS_FILE_PATH` | XLS file path | `/tmp/xls` |
-| `alert.configmap.MAIL_SERVER_HOST` | Mail `SERVER HOST ` | `nil` |
-| `alert.configmap.MAIL_SERVER_PORT` | Mail `SERVER PORT` | `nil` |
-| `alert.configmap.MAIL_SENDER` | Mail `SENDER` | `nil` |
-| `alert.configmap.MAIL_USER` | Mail `USER` | `nil` |
-| `alert.configmap.MAIL_PASSWD` | Mail `PASSWORD` | `nil` |
-| `alert.configmap.MAIL_SMTP_STARTTLS_ENABLE` | Mail `SMTP STARTTLS` enable | `false` |
-| `alert.configmap.MAIL_SMTP_SSL_ENABLE` | Mail `SMTP SSL` enable | `false` |
-| `alert.configmap.MAIL_SMTP_SSL_TRUST` | Mail `SMTP SSL TRUST` | `nil` |
-| `alert.configmap.ENTERPRISE_WECHAT_ENABLE` | `Enterprise Wechat` enable | `false` |
-| `alert.configmap.ENTERPRISE_WECHAT_CORP_ID` | `Enterprise Wechat` corp id | `nil` |
-| `alert.configmap.ENTERPRISE_WECHAT_SECRET` | `Enterprise Wechat` secret | `nil` |
-| `alert.configmap.ENTERPRISE_WECHAT_AGENT_ID` | `Enterprise Wechat` agent id | `nil` |
-| `alert.configmap.ENTERPRISE_WECHAT_USERS` | `Enterprise Wechat` users | `nil` |
+| `alert.resources` | The `resource` limit and request config for alert server | `{}` |
+| `alert.configmap.DOLPHINSCHEDULER_OPTS` | The Java options for alert server | `""` |
+| `alert.configmap.ALERT_PLUGIN_DIR` | Alert plugin directory | `lib/plugin/alert` |
 | `alert.livenessProbe.enabled` | Turn on and off liveness probe | `true` |
 | `alert.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` |
 | `alert.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
@@ -194,16 +224,16 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
 | `alert.persistentVolumeClaim.storageClassName` | `Alert` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
 | `alert.persistentVolumeClaim.storage` | `PersistentVolumeClaim` Size | `20Gi` |
 | | | |
+| `api.replicas` | Replicas is the desired number of replicas of the given Template | `1` |
 | `api.strategy.type` | Type of deployment. Can be "Recreate" or "RollingUpdate" | `RollingUpdate` |
 | `api.strategy.rollingUpdate.maxSurge` | The maximum number of pods that can be scheduled above the desired number of pods | `25%` |
 | `api.strategy.rollingUpdate.maxUnavailable` | The maximum number of pods that can be unavailable during the update | `25%` |
-| `api.replicas` | Replicas is the desired number of replicas of the given Template | `1` |
+| `api.annotations` | The `annotations` for api server | `{}` |
+| `api.affinity` | If specified, the pod's scheduling constraints | `{}` |
 | `api.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
 | `api.tolerations` | If specified, the pod's tolerations | `{}` |
-| `api.affinity` | If specified, the pod's scheduling constraints | `{}` |
-| `api.jvmOptions` | The JVM options for api server. | `""` |
-| `api.resources` | The `resource` limit and request config for api server. | `{}` |
-| `api.annotations` | The `annotations` for api server. | `{}` |
+| `api.resources` | The `resource` limit and request config for api server | `{}` |
+| `api.configmap.DOLPHINSCHEDULER_OPTS` | The Java options for api server | `""` |
 | `api.livenessProbe.enabled` | Turn on and off liveness probe | `true` |
 | `api.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` |
 | `api.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
@@ -228,4 +258,108 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
 | `ingress.tls.hosts` | Ingress tls hosts | `dolphinscheduler.org` |
 | `ingress.tls.secretName` | Ingress tls secret name | `dolphinscheduler-tls` |
 
-For more information please refer to the [chart](https://github.com/apache/incubator-dolphinscheduler.git) documentation.
+## FAQ
+
+### How to use MySQL as the DolphinScheduler's database instead of PostgreSQL?
+
+> Because of the commercial license, we cannot directly use the driver and client of MySQL.
+>
+> If you want to use MySQL, you can build a new image based on the `apache/dolphinscheduler` image as follows.
+
+1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)
+
+2. Create a new `Dockerfile` to add the MySQL driver and client:
+
+```
+FROM apache/dolphinscheduler:latest
+COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
+RUN apk add --update --no-cache mysql-client
+```
+
+3. Build a new docker image including the MySQL driver and client:
+
+```
+docker build -t apache/dolphinscheduler:mysql .
+```
+
+4. Push the docker image `apache/dolphinscheduler:mysql` to a docker registry
+
+5. Modify image `registry` and `repository`, and update `tag` to `mysql` in `values.yaml`
+
+6. Set `postgresql.enabled` to `false`
+
+7. Modify `externalDatabase` (especially `host`, `username` and `password`):
+
+```
+externalDatabase:
+  type: "mysql"
+  driver: "com.mysql.jdbc.Driver"
+  host: "localhost"
+  port: "3306"
+  username: "root"
+  password: "root"
+  database: "dolphinscheduler"
+  params: "useUnicode=true&characterEncoding=UTF-8"
+```
+
+8. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)
+
+### How to support MySQL datasource in `Datasource manage`?
+
+> Because of the commercial license, we cannot directly use the driver of MySQL.
+>
+> If you want to add a MySQL datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.
+
+1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)
+
+2. Create a new `Dockerfile` to add the MySQL driver:
+
+```
+FROM apache/dolphinscheduler:latest
+COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
+```
+
+3. Build a new docker image including the MySQL driver:
+
+```
+docker build -t apache/dolphinscheduler:mysql-driver .
+```
+
+4. Push the docker image `apache/dolphinscheduler:mysql-driver` to a docker registry
+
+5. Modify image `registry` and `repository`, and update `tag` to `mysql-driver` in `values.yaml`
+
+6. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)
+
+7. Add a MySQL datasource in `Datasource manage`
+
+### How to support Oracle datasource in `Datasource manage`?
+
+> Because of the commercial license, we cannot directly use the driver of Oracle.
+>
+> If you want to add an Oracle datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.
+
+1. Download the Oracle driver [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`)
+
+2. Create a new `Dockerfile` to add the Oracle driver:
+
+```
+FROM apache/dolphinscheduler:latest
+COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib
+```
+
+3. Build a new docker image including the Oracle driver:
+
+```
+docker build -t apache/dolphinscheduler:oracle-driver .
+```
+
+4. Push the docker image `apache/dolphinscheduler:oracle-driver` to a docker registry
+
+5. Modify image `registry` and `repository`, and update `tag` to `oracle-driver` in `values.yaml`
+
+6. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)
+
+7. Add an Oracle datasource in `Datasource manage`
+
+For more information, please refer to the [incubator-dolphinscheduler](https://github.com/apache/incubator-dolphinscheduler.git) documentation.
diff --git a/dolphinscheduler-server/src/main/resources/worker.properties b/dolphinscheduler-server/src/main/resources/worker.properties
index 5fdbf1d910..fd249e26bb 100644
--- a/dolphinscheduler-server/src/main/resources/worker.properties
+++ b/dolphinscheduler-server/src/main/resources/worker.properties
@@ -30,7 +30,7 @@
 # worker listener port
 #worker.listen.port=1234
 
-# default worker group
+# default worker groups
 #worker.groups=default
 
 # default worker weight
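The image-customization steps the FAQ hunks add (download a driver, write a `Dockerfile`, build, push) can be scripted. A minimal sketch of the MySQL-database variant, with two stated assumptions: the Dockerfile content is copied from the FAQ, while the `docker build`/`docker push` commands are only echoed (not executed) so the script can be dry-run without a Docker daemon; the registry name would be your own.

```shell
#!/bin/sh
# Sketch of FAQ steps 2-4 for the MySQL-database image.
# Assumes mysql-connector-java-5.1.49.jar is already in the current directory.
set -e

TAG="mysql"  # matches the tag used in values.yaml (step 5)

# Step 2: Dockerfile adding the MySQL driver and client (copied from the FAQ).
cat > Dockerfile <<'EOF'
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
RUN apk add --update --no-cache mysql-client
EOF

# Steps 3-4: build and push, echoed here rather than executed.
echo "docker build -t apache/dolphinscheduler:${TAG} ."
echo "docker push apache/dolphinscheduler:${TAG}"
```

Once the generated `Dockerfile` looks right, run the echoed commands directly (or replace the `echo`s with the real invocations).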