* [Improvement][Docker] Support more configurations and improve image for python
* [Improvement][K8s] Support more configurations, more service accesses, and reduce the duplication configurations
* [Improvement][Python] Improve the compatibility of python home
* [Improvement][Install] Fix install config and install.sh
* [Improvement][Install] Fix workflow ut
* [Improvement][Docker] Optimize docker-swarm/check script
* [Improvement][DB] Update default username and password of database
* Update comments in master/worker.properties
* Specify version for all images
* [Improvement][Docker] Optimize PS1 and WORKDIR
* [Improvement][SQL] Reduce heap size to 64m in create-dolphinscheduler.sh and upgrade-dolphinscheduler.sh
* [Fix-5431][K8s] Fix master and worker cannot get the right address with custom DNS
Shiwen Cheng committed 4 years ago via GitHub
62 changed files with 1059 additions and 2624 deletions
@@ -1 +1,11 @@

-# DolphinScheduler for Docker
+# DolphinScheduler for Docker and Kubernetes

### QuickStart in Docker

[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/docker-deployment.html)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](https://dolphinscheduler.apache.org/zh-cn/docs/latest/user_doc/docker-deployment.html)

### QuickStart in Kubernetes

[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/kubernetes-deployment.html)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](https://dolphinscheduler.apache.org/zh-cn/docs/latest/user_doc/kubernetes-deployment.html)
@@ -1,488 +0,0 @@

## What is DolphinScheduler?

DolphinScheduler is a distributed and easy-to-extend visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing and making the scheduling system work out of the box for data processing.

GitHub URL: https://github.com/apache/dolphinscheduler

Official Website: https://dolphinscheduler.apache.org

![DolphinScheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg)

[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)

## Prerequisites

- [Docker](https://docs.docker.com/engine/) 1.13.1+
- [Docker Compose](https://docs.docker.com/compose/) 1.11.0+

## How to use this docker image

#### You can start DolphinScheduler via docker-compose (recommended)

```
$ docker-compose -f ./docker/docker-swarm/docker-compose.yml up -d
```

The default **postgres** user `root`, postgres password `root` and database `dolphinscheduler` are created in the `docker-compose.yml`.

The default **zookeeper** is created in the `docker-compose.yml`.

Access the Web UI: http://192.168.xx.xx:12345/dolphinscheduler

The default username is `admin` and the default password is `dolphinscheduler123`

> **Tip**: For a quick start in docker, you can create a tenant named `ds` and associate the user `admin` with the tenant `ds`

#### Or via the environment variables **`DATABASE_HOST`** **`DATABASE_PORT`** **`DATABASE_DATABASE`** **`ZOOKEEPER_QUORUM`**

You can specify **existing postgres and zookeeper services**. Example:

```
$ docker run -d --name dolphinscheduler \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-p 12345:12345 \
apache/dolphinscheduler:latest all
```

Access the Web UI: http://192.168.xx.xx:12345/dolphinscheduler

#### Or start a standalone dolphinscheduler server

You can start a standalone dolphinscheduler server.

* Create a **local volume** for resource storage, for example:

```
docker volume create dolphinscheduler-resource-local
```

* Start a **master server**, for example:

```
$ docker run -d --name dolphinscheduler-master \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
apache/dolphinscheduler:latest master-server
```

* Start a **worker server** (including the **logger server**), for example:

```
$ docker run -d --name dolphinscheduler-worker \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-e ALERT_LISTEN_HOST="dolphinscheduler-alert" \
-v dolphinscheduler-resource-local:/dolphinscheduler \
apache/dolphinscheduler:latest worker-server
```

* Start an **api server**, for example:

```
$ docker run -d --name dolphinscheduler-api \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-v dolphinscheduler-resource-local:/dolphinscheduler \
-p 12345:12345 \
apache/dolphinscheduler:latest api-server
```

* Start an **alert server**, for example:

```
$ docker run -d --name dolphinscheduler-alert \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
apache/dolphinscheduler:latest alert-server
```

**Note**: You must specify `DATABASE_HOST` `DATABASE_PORT` `DATABASE_DATABASE` `DATABASE_USERNAME` `DATABASE_PASSWORD` `ZOOKEEPER_QUORUM` when starting a standalone dolphinscheduler server.

## How to build a docker image

You can build a docker image on a Unix-like operating system or on Windows.

On Unix-like systems, for example:

```bash
$ cd path/dolphinscheduler
$ sh ./docker/build/hooks/build
```

On Windows, for example:

```bat
C:\dolphinscheduler>.\docker\build\hooks\build.bat
```

Please read the `./docker/build/hooks/build` and `./docker/build/hooks/build.bat` script files if you don't understand how they work.

## Environment Variables

The DolphinScheduler Docker container is configured through environment variables, and the default value will be used if an environment variable is not set.
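As an illustration, a hedged sketch of checking which of these variables are effective inside a running container (the container name `dolphinscheduler` follows the docker-compose and `all` examples above):

```bash
$ docker exec dolphinscheduler env | grep -E 'DATABASE_|ZOOKEEPER_'
```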
**`DATABASE_TYPE`**

This environment variable sets the type of the database. The default value is `postgresql`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_DRIVER`**

This environment variable sets the driver for the database. The default value is `org.postgresql.Driver`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_HOST`**

This environment variable sets the host for the database. The default value is `127.0.0.1`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_PORT`**

This environment variable sets the port for the database. The default value is `5432`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_USERNAME`**

This environment variable sets the username for the database. The default value is `root`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_PASSWORD`**

This environment variable sets the password for the database. The default value is `root`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_DATABASE`**

This environment variable sets the database name. The default value is `dolphinscheduler`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`DATABASE_PARAMS`**

This environment variable sets the connection parameters for the database. The default value is `characterEncoding=utf8`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server`, `worker-server`, `api-server` or `alert-server`.

**`HADOOP_HOME`**

This environment variable sets `HADOOP_HOME`. The default value is `/opt/soft/hadoop`.

**`HADOOP_CONF_DIR`**

This environment variable sets `HADOOP_CONF_DIR`. The default value is `/opt/soft/hadoop/etc/hadoop`.

**`SPARK_HOME1`**

This environment variable sets `SPARK_HOME1`. The default value is `/opt/soft/spark1`.

**`SPARK_HOME2`**

This environment variable sets `SPARK_HOME2`. The default value is `/opt/soft/spark2`.

**`PYTHON_HOME`**

This environment variable sets `PYTHON_HOME`. The default value is `/usr`.

**`JAVA_HOME`**

This environment variable sets `JAVA_HOME`. The default value is `/usr/lib/jvm/java-1.8-openjdk`.

**`HIVE_HOME`**

This environment variable sets `HIVE_HOME`. The default value is `/opt/soft/hive`.

**`FLINK_HOME`**

This environment variable sets `FLINK_HOME`. The default value is `/opt/soft/flink`.

**`DATAX_HOME`**

This environment variable sets `DATAX_HOME`. The default value is `/opt/soft/datax`.

**`DOLPHINSCHEDULER_DATA_BASEDIR_PATH`**

User data directory path, configured by the user; please make sure the directory exists and has read/write permissions. The default value is `/tmp/dolphinscheduler`.

**`DOLPHINSCHEDULER_OPTS`**

This environment variable sets the java options. The default value is empty.
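For instance, a minimal sketch of passing custom java options through this variable when starting a master server (the heap sizes here are placeholders, not recommended values):

```bash
$ docker run -d --name dolphinscheduler-master \
-e DOLPHINSCHEDULER_OPTS="-Xms1g -Xmx1g" \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
apache/dolphinscheduler:latest master-server
```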
**`RESOURCE_STORAGE_TYPE`**

This environment variable sets the resource storage type for dolphinscheduler, such as `HDFS`, `S3` or `NONE`. The default value is `HDFS`.

**`RESOURCE_UPLOAD_PATH`**

This environment variable sets the resource store path on HDFS/S3 for resource storage. The default value is `/dolphinscheduler`.

**`FS_DEFAULT_FS`**

This environment variable sets fs.defaultFS for resource storage, such as `file:///`, `hdfs://mycluster:8020` or `s3a://dolphinscheduler`. The default value is `file:///`.

**`FS_S3A_ENDPOINT`**

This environment variable sets the s3 endpoint for resource storage. The default value is `s3.xxx.amazonaws.com`.

**`FS_S3A_ACCESS_KEY`**

This environment variable sets the s3 access key for resource storage. The default value is `xxxxxxx`.

**`FS_S3A_SECRET_KEY`**

This environment variable sets the s3 secret key for resource storage. The default value is `xxxxxxx`.
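Taken together, a minimal sketch of switching resource storage to S3 when starting the `all` service (the endpoint, bucket and keys are placeholders):

```bash
$ docker run -d --name dolphinscheduler \
-e RESOURCE_STORAGE_TYPE="S3" \
-e RESOURCE_UPLOAD_PATH="/dolphinscheduler" \
-e FS_DEFAULT_FS="s3a://dolphinscheduler" \
-e FS_S3A_ENDPOINT="s3.eu-west-1.amazonaws.com" \
-e FS_S3A_ACCESS_KEY="<access-key>" \
-e FS_S3A_SECRET_KEY="<secret-key>" \
-p 12345:12345 \
apache/dolphinscheduler:latest all
```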
**`ZOOKEEPER_QUORUM`**

This environment variable sets the zookeeper quorum for `master-server` and `worker-server`. The default value is `127.0.0.1:2181`.

**Note**: You must specify it when starting a standalone dolphinscheduler server, such as `master-server` or `worker-server`.
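A quorum with several ZooKeeper nodes is conventionally written as a comma-separated list of `host:port` pairs; a hedged example (the hostnames are placeholders):

```bash
export ZOOKEEPER_QUORUM="zk1:2181,zk2:2181,zk3:2181"
```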
**`ZOOKEEPER_ROOT`**

This environment variable sets the zookeeper root directory for dolphinscheduler. The default value is `/dolphinscheduler`.

**`MASTER_EXEC_THREADS`**

This environment variable sets the exec thread number for `master-server`. The default value is `100`.

**`MASTER_EXEC_TASK_NUM`**

This environment variable sets the exec task number for `master-server`. The default value is `20`.

**`MASTER_HEARTBEAT_INTERVAL`**

This environment variable sets the heartbeat interval for `master-server`. The default value is `10`.

**`MASTER_TASK_COMMIT_RETRYTIMES`**

This environment variable sets the task commit retry times for `master-server`. The default value is `5`.

**`MASTER_TASK_COMMIT_INTERVAL`**

This environment variable sets the task commit interval for `master-server`. The default value is `1000`.

**`MASTER_MAX_CPULOAD_AVG`**

This environment variable sets the max cpu load avg for `master-server`. The default value is `100`.

**`MASTER_RESERVED_MEMORY`**

This environment variable sets the reserved memory for `master-server`. The default value is `0.1`.

**`MASTER_LISTEN_PORT`**

This environment variable sets the port for `master-server`. The default value is `5678`.

**`WORKER_EXEC_THREADS`**

This environment variable sets the exec thread number for `worker-server`. The default value is `100`.

**`WORKER_HEARTBEAT_INTERVAL`**

This environment variable sets the heartbeat interval for `worker-server`. The default value is `10`.

**`WORKER_MAX_CPULOAD_AVG`**

This environment variable sets the max cpu load avg for `worker-server`. The default value is `100`.

**`WORKER_RESERVED_MEMORY`**

This environment variable sets the reserved memory for `worker-server`. The default value is `0.1`.

**`WORKER_LISTEN_PORT`**

This environment variable sets the port for `worker-server`. The default value is `1234`.

**`WORKER_GROUPS`**

This environment variable sets the groups for `worker-server`. The default value is `default`.

**`WORKER_HOST_WEIGHT`**

This environment variable sets the host weight for `worker-server`. The default value is `100`.

**`ALERT_LISTEN_HOST`**

This environment variable sets the host of `alert-server` for `worker-server`. The default value is `127.0.0.1`.

**`ALERT_PLUGIN_DIR`**

This environment variable sets the alert plugin directory for `alert-server`. The default value is `lib/plugin/alert`.

## Initialization scripts

If you would like to do additional initialization in an image derived from this one, add one or more environment variables in `/root/start-init-conf.sh`, and modify the template files in `/opt/dolphinscheduler/conf/*.tpl`.

For example, to add an environment variable `API_SERVER_PORT` in `/root/start-init-conf.sh`:

```
export API_SERVER_PORT=5555
```

and to modify the `/opt/dolphinscheduler/conf/application-api.properties.tpl` template file, add the server port:

```
server.port=${API_SERVER_PORT}
```

`/root/start-init-conf.sh` will dynamically generate the config files:

```sh
echo "generate dolphinscheduler config"
# render every *.tpl template in the conf directory, substituting the exported
# environment variables, and write the result next to it without the .tpl suffix
ls ${DOLPHINSCHEDULER_HOME}/conf/ | grep ".tpl" | while read line; do
eval "cat << EOF
$(cat ${DOLPHINSCHEDULER_HOME}/conf/${line})
EOF
" > ${DOLPHINSCHEDULER_HOME}/conf/${line%.*}
done
```
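Putting the pieces together, a minimal sketch of a derived image that bakes in the `API_SERVER_PORT` example above (the image tag is a placeholder):

```bash
$ cat > Dockerfile <<'EOF'
FROM apache/dolphinscheduler:latest
# append the extra export so start-init-conf.sh sees it when rendering the *.tpl files
RUN echo "export API_SERVER_PORT=5555" >> /root/start-init-conf.sh
EOF
$ docker build -t apache/dolphinscheduler:custom-port .
```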
## FAQ

### How to stop dolphinscheduler by docker-compose?

Stop containers:

```
docker-compose stop
```

Stop containers and remove containers, networks and volumes:

```
docker-compose down -v
```

### How to deploy dolphinscheduler on Docker Swarm?

Assuming that the Docker Swarm cluster has been created (if there is no Docker Swarm cluster, please refer to [create-swarm](https://docs.docker.com/engine/swarm/swarm-tutorial/create-swarm/))

Start a stack named dolphinscheduler:

```
docker stack deploy -c docker-stack.yml dolphinscheduler
```

Stop and remove the stack named dolphinscheduler:

```
docker stack rm dolphinscheduler
```

### How to use MySQL as DolphinScheduler's database instead of PostgreSQL?

> Because of the commercial license, we cannot directly use the driver and client of MySQL.
>
> If you want to use MySQL, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver and client:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
RUN apk add --update --no-cache mysql-client
```

3. Build a new docker image including the MySQL driver and client:

```
docker build -t apache/dolphinscheduler:mysql .
```

4. Modify all `image` fields to `apache/dolphinscheduler:mysql` in `docker-compose.yml`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Comment out the `dolphinscheduler-postgresql` block in `docker-compose.yml`

6. Add the `dolphinscheduler-mysql` service in `docker-compose.yml` (**optional**, you can directly use an external MySQL database)

7. Modify all DATABASE environment variables in `docker-compose.yml`

```
DATABASE_TYPE: mysql
DATABASE_DRIVER: com.mysql.jdbc.Driver
DATABASE_HOST: dolphinscheduler-mysql
DATABASE_PORT: 3306
DATABASE_USERNAME: root
DATABASE_PASSWORD: root
DATABASE_DATABASE: dolphinscheduler
DATABASE_PARAMS: useUnicode=true&characterEncoding=UTF-8
```

> If you have added the `dolphinscheduler-mysql` service in `docker-compose.yml`, just set `DATABASE_HOST` to `dolphinscheduler-mysql`

8. Run dolphinscheduler (see **How to use this docker image**)

### How to support MySQL datasources in `Datasource manage`?

> Because of the commercial license, we cannot directly use the driver of MySQL.
>
> If you want to add a MySQL datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
```

3. Build a new docker image including the MySQL driver:

```
docker build -t apache/dolphinscheduler:mysql-driver .
```

4. Modify all `image` fields to `apache/dolphinscheduler:mysql-driver` in `docker-compose.yml`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Run dolphinscheduler (see **How to use this docker image**)

6. Add a MySQL datasource in `Datasource manage`

### How to support Oracle datasources in `Datasource manage`?

> Because of the commercial license, we cannot directly use the driver of Oracle.
>
> If you want to add an Oracle datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the Oracle driver [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`)

2. Create a new `Dockerfile` to add the Oracle driver:

```
FROM apache/dolphinscheduler:latest
COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib
```

3. Build a new docker image including the Oracle driver:

```
docker build -t apache/dolphinscheduler:oracle-driver .
```

4. Modify all `image` fields to `apache/dolphinscheduler:oracle-driver` in `docker-compose.yml`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Run dolphinscheduler (see **How to use this docker image**)

6. Add an Oracle datasource in `Datasource manage`

For more information please refer to the [dolphinscheduler](https://github.com/apache/dolphinscheduler.git) documentation.
@@ -1,488 +0,0 @@

## What is DolphinScheduler?

A distributed and easy-to-extend visual DAG workflow task scheduling system, dedicated to solving the complex dependencies in data processing and making the scheduling system `out of the box` for data processing flows.

GitHub URL: https://github.com/apache/dolphinscheduler

Official Website: https://dolphinscheduler.apache.org

![DolphinScheduler](https://dolphinscheduler.apache.org/img/hlogo_colorful.svg)

[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)

## Prerequisites

- [Docker](https://docs.docker.com/engine/) 1.13.1+
- [Docker Compose](https://docs.docker.com/compose/) 1.11.0+

## How to use this docker image

#### Start DolphinScheduler via docker-compose (recommended)

```
$ docker-compose -f ./docker/docker-swarm/docker-compose.yml up -d
```

The `docker-compose.yml` file creates the default **PostgreSQL** user, password and database, whose default values are `root`, `root` and `dolphinscheduler` respectively.

A default **ZooKeeper** is also created in the `docker-compose.yml` file.

Access the Web UI: http://192.168.xx.xx:12345/dolphinscheduler

The default username is `admin` and the default password is `dolphinscheduler123`

> **Tip**: For a quick start in docker, you can create a tenant named `ds` and associate the user `admin` with the tenant `ds`

#### Or use existing services via the environment variables **`DATABASE_HOST`** **`DATABASE_PORT`** **`ZOOKEEPER_QUORUM`**

You can specify existing **PostgreSQL** and **ZooKeeper** services. Example:

```
$ docker run -d --name dolphinscheduler \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-p 12345:12345 \
apache/dolphinscheduler:latest all
```

Access the Web UI: http://192.168.xx.xx:12345/dolphinscheduler

#### Or run some of the DolphinScheduler services

You can run some of the services in DolphinScheduler.

* Create a **local volume** for resource storage, for example:

```
docker volume create dolphinscheduler-resource-local
```

* Start a **master server**, for example:

```
$ docker run -d --name dolphinscheduler-master \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
apache/dolphinscheduler:latest master-server
```

* Start a **worker server** (including the **logger server**), for example:

```
$ docker run -d --name dolphinscheduler-worker \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-e ALERT_LISTEN_HOST="dolphinscheduler-alert" \
-v dolphinscheduler-resource-local:/dolphinscheduler \
apache/dolphinscheduler:latest worker-server
```

* Start an **api server**, for example:

```
$ docker run -d --name dolphinscheduler-api \
-e ZOOKEEPER_QUORUM="192.168.x.x:2181" \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
-v dolphinscheduler-resource-local:/dolphinscheduler \
-p 12345:12345 \
apache/dolphinscheduler:latest api-server
```

* Start an **alert server**, for example:

```
$ docker run -d --name dolphinscheduler-alert \
-e DATABASE_HOST="192.168.x.x" -e DATABASE_PORT="5432" -e DATABASE_DATABASE="dolphinscheduler" \
-e DATABASE_USERNAME="test" -e DATABASE_PASSWORD="test" \
apache/dolphinscheduler:latest alert-server
```

**Note**: When you run some of the DolphinScheduler services, you must specify the environment variables `DATABASE_HOST` `DATABASE_PORT` `DATABASE_DATABASE` `DATABASE_USERNAME` `DATABASE_PASSWORD` `ZOOKEEPER_QUORUM`.
## How to build a docker image

You can build a docker image on Unix-like systems and on Windows.

On Unix-like systems, for example:

```bash
$ cd path/dolphinscheduler
$ sh ./docker/build/hooks/build
```

On Windows, for example:

```bat
C:\dolphinscheduler>.\docker\build\hooks\build.bat
```

If you don't understand the scripts `./docker/build/hooks/build` and `./docker/build/hooks/build.bat`, please read their contents.

## Environment Variables

The DolphinScheduler Docker container is configured through environment variables; the default values will be used when they are not set.

**`DATABASE_TYPE`**

Configures the `TYPE` of the `database`. The default value is `postgresql`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_DRIVER`**

Configures the `DRIVER` of the `database`. The default value is `org.postgresql.Driver`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_HOST`**

Configures the `HOST` of the `database`. The default value is `127.0.0.1`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_PORT`**

Configures the `PORT` of the `database`. The default value is `5432`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_USERNAME`**

Configures the `USERNAME` of the `database`. The default value is `root`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_PASSWORD`**

Configures the `PASSWORD` of the `database`. The default value is `root`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_DATABASE`**

Configures the `DATABASE` of the `database`. The default value is `dolphinscheduler`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`DATABASE_PARAMS`**

Configures the `PARAMS` of the `database`. The default value is `characterEncoding=utf8`.

**Note**: You must specify this environment variable when running the `master-server`, `worker-server`, `api-server` or `alert-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`HADOOP_HOME`**

Configures the `HADOOP_HOME` of `dolphinscheduler`. The default value is `/opt/soft/hadoop`.

**`HADOOP_CONF_DIR`**

Configures the `HADOOP_CONF_DIR` of `dolphinscheduler`. The default value is `/opt/soft/hadoop/etc/hadoop`.

**`SPARK_HOME1`**

Configures the `SPARK_HOME1` of `dolphinscheduler`. The default value is `/opt/soft/spark1`.

**`SPARK_HOME2`**

Configures the `SPARK_HOME2` of `dolphinscheduler`. The default value is `/opt/soft/spark2`.

**`PYTHON_HOME`**

Configures the `PYTHON_HOME` of `dolphinscheduler`. The default value is `/usr`.

**`JAVA_HOME`**

Configures the `JAVA_HOME` of `dolphinscheduler`. The default value is `/usr/lib/jvm/java-1.8-openjdk`.

**`HIVE_HOME`**

Configures the `HIVE_HOME` of `dolphinscheduler`. The default value is `/opt/soft/hive`.

**`FLINK_HOME`**

Configures the `FLINK_HOME` of `dolphinscheduler`. The default value is `/opt/soft/flink`.

**`DATAX_HOME`**

Configures the `DATAX_HOME` of `dolphinscheduler`. The default value is `/opt/soft/datax`.

**`DOLPHINSCHEDULER_DATA_BASEDIR_PATH`**

User data directory, configured by the user; please make sure this directory exists and the user has read/write permissions. The default value is `/tmp/dolphinscheduler`.

**`DOLPHINSCHEDULER_OPTS`**

Configures the `java options` of `dolphinscheduler`. The default value is `""`.
**`RESOURCE_STORAGE_TYPE`**

Configures the resource storage type of `dolphinscheduler`; the options are `HDFS`, `S3` and `NONE`. The default value is `HDFS`.

**`RESOURCE_UPLOAD_PATH`**

Configures the resource storage path on `HDFS/S3`. The default value is `/dolphinscheduler`.

**`FS_DEFAULT_FS`**

Configures the file system protocol of the resource storage, such as `file:///`, `hdfs://mycluster:8020` or `s3a://dolphinscheduler`. The default value is `file:///`.

**`FS_S3A_ENDPOINT`**

When `RESOURCE_STORAGE_TYPE=S3`, the access endpoint of `S3` needs to be configured. The default value is `s3.xxx.amazonaws.com`.

**`FS_S3A_ACCESS_KEY`**

When `RESOURCE_STORAGE_TYPE=S3`, the `s3 access key` of `S3` needs to be configured. The default value is `xxxxxxx`.

**`FS_S3A_SECRET_KEY`**

When `RESOURCE_STORAGE_TYPE=S3`, the `s3 secret key` of `S3` needs to be configured. The default value is `xxxxxxx`.

**`ZOOKEEPER_QUORUM`**

Configures the `ZooKeeper` address for `master-server` and `worker-server`. The default value is `127.0.0.1:2181`.

**Note**: You must specify this environment variable when running the `master-server` or `worker-server` services of `dolphinscheduler`, so that you can better set up distributed services.

**`ZOOKEEPER_ROOT`**

Configures the root directory for `dolphinscheduler` data storage in `zookeeper`. The default value is `/dolphinscheduler`.

**`MASTER_EXEC_THREADS`**

Configures the number of execution threads in `master-server`. The default value is `100`.

**`MASTER_EXEC_TASK_NUM`**

Configures the number of execution tasks in `master-server`. The default value is `20`.

**`MASTER_HEARTBEAT_INTERVAL`**

Configures the heartbeat interval of `master-server`. The default value is `10`.

**`MASTER_TASK_COMMIT_RETRYTIMES`**

Configures the task commit retry times of `master-server`. The default value is `5`.

**`MASTER_TASK_COMMIT_INTERVAL`**

Configures the task commit interval of `master-server`. The default value is `1000`.

**`MASTER_MAX_CPULOAD_AVG`**

Configures the maximum CPU `load average` of `master-server`. The default value is `100`.

**`MASTER_RESERVED_MEMORY`**

Configures the reserved memory of `master-server`. The default value is `0.1`.

**`MASTER_LISTEN_PORT`**

Configures the port of `master-server`. The default value is `5678`.

**`WORKER_EXEC_THREADS`**

Configures the number of execution threads in `worker-server`. The default value is `100`.

**`WORKER_HEARTBEAT_INTERVAL`**

Configures the heartbeat interval of `worker-server`. The default value is `10`.

**`WORKER_MAX_CPULOAD_AVG`**

Configures the maximum CPU `load average` of `worker-server`. The default value is `100`.

**`WORKER_RESERVED_MEMORY`**

Configures the reserved memory of `worker-server`. The default value is `0.1`.

**`WORKER_LISTEN_PORT`**

Configures the port of `worker-server`. The default value is `1234`.

**`WORKER_GROUPS`**

Configures the groups of `worker-server`. The default value is `default`.

**`WORKER_HOST_WEIGHT`**

Configures the host weight of `worker-server`. The default value is `100`.

**`ALERT_LISTEN_HOST`**

Configures the alert host of `worker-server`, i.e. the hostname of `alert-server`. The default value is `127.0.0.1`.

**`ALERT_PLUGIN_DIR`**

Configures the alert plugin directory of `alert-server`. The default value is `lib/plugin/alert`.
## Initialization scripts

If you want to attach additional operations and add environment variables at build time or run time, you can modify the `/root/start-init-conf.sh` file; if the changes involve configuration files, please modify the corresponding template files in `/opt/dolphinscheduler/conf/*.tpl`.

For example, add an environment variable `API_SERVER_PORT` in `/root/start-init-conf.sh`:

```
export API_SERVER_PORT=5555
```

After adding the above environment variable, you should add this configuration to the corresponding template file `/opt/dolphinscheduler/conf/application-api.properties.tpl`:

```
server.port=${API_SERVER_PORT}
```

`/root/start-init-conf.sh` will dynamically generate the config files from the template files:

```sh
echo "generate dolphinscheduler config"
ls ${DOLPHINSCHEDULER_HOME}/conf/ | grep ".tpl" | while read line; do
eval "cat << EOF
$(cat ${DOLPHINSCHEDULER_HOME}/conf/${line})
EOF
" > ${DOLPHINSCHEDULER_HOME}/conf/${line%.*}
done
```

## FAQ

### How to stop dolphinscheduler by docker-compose?

Stop all containers:

```
docker-compose stop
```

Stop all containers and remove all containers, networks and volumes:

```
docker-compose down -v
```

### How to deploy dolphinscheduler on Docker Swarm?

Assuming that the Docker Swarm cluster has been deployed (if there is no Docker Swarm cluster yet, please refer to [create-swarm](https://docs.docker.com/engine/swarm/swarm-tutorial/create-swarm/))

Start a stack named dolphinscheduler:

```
docker stack deploy -c docker-stack.yml dolphinscheduler
```

Stop and remove the stack named dolphinscheduler:

```
docker stack rm dolphinscheduler
```
### How to use MySQL instead of PostgreSQL as DolphinScheduler's database?

> Because of the commercial license, we cannot directly use the MySQL driver and client.
>
> If you want to use MySQL, you can build a new image based on the official `apache/dolphinscheduler` image.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver and client:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
RUN apk add --update --no-cache mysql-client
```

3. Build a new image containing the MySQL driver and client:

```
docker build -t apache/dolphinscheduler:mysql .
```

4. Modify all image fields in `docker-compose.yml` to `apache/dolphinscheduler:mysql`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Comment out the `dolphinscheduler-postgresql` block in `docker-compose.yml`

6. Add the `dolphinscheduler-mysql` service to `docker-compose.yml` (**optional**, you can directly use an external MySQL database)

7. Modify all DATABASE environment variables in `docker-compose.yml`

```
DATABASE_TYPE: mysql
DATABASE_DRIVER: com.mysql.jdbc.Driver
DATABASE_HOST: dolphinscheduler-mysql
DATABASE_PORT: 3306
DATABASE_USERNAME: root
DATABASE_PASSWORD: root
DATABASE_DATABASE: dolphinscheduler
DATABASE_PARAMS: useUnicode=true&characterEncoding=UTF-8
```

> If you have added the `dolphinscheduler-mysql` service, just set `DATABASE_HOST` to `dolphinscheduler-mysql`

8. Run dolphinscheduler (see **How to use this docker image**)

### How to support MySQL datasources in the datasource center?

> Because of the commercial license, we cannot directly use the MySQL driver.
>
> If you want to add a MySQL datasource, you can build a new image based on the official `apache/dolphinscheduler` image.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
```

3. Build a new image containing the MySQL driver:

```
docker build -t apache/dolphinscheduler:mysql-driver .
```

4. Modify all image fields in `docker-compose.yml` to `apache/dolphinscheduler:mysql-driver`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Run dolphinscheduler (see **How to use this docker image**)

6. Add a MySQL datasource in the datasource center

### How to support Oracle datasources in the datasource center?

> Because of the commercial license, we cannot directly use the Oracle driver.
>
> If you want to add an Oracle datasource, you can build a new image based on the official `apache/dolphinscheduler` image.

1. Download the Oracle driver [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`)

2. Create a new `Dockerfile` to add the Oracle driver:

```
FROM apache/dolphinscheduler:latest
COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib
```

3. Build a new image containing the Oracle driver:

```
docker build -t apache/dolphinscheduler:oracle-driver .
```

4. Modify all image fields in `docker-compose.yml` to `apache/dolphinscheduler:oracle-driver`

> If you want to deploy dolphinscheduler on Docker Swarm, you need to modify `docker-stack.yml`

5. Run dolphinscheduler (see **How to use this docker image**)

6. Add an Oracle datasource in the datasource center

For more information please refer to the [dolphinscheduler](https://github.com/apache/dolphinscheduler.git) documentation.
@@ -0,0 +1,122 @@

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#============================================================================
# Database
#============================================================================
# postgresql
DATABASE_TYPE=postgresql
DATABASE_DRIVER=org.postgresql.Driver
DATABASE_HOST=dolphinscheduler-postgresql
DATABASE_PORT=5432
DATABASE_USERNAME=root
DATABASE_PASSWORD=root
DATABASE_DATABASE=dolphinscheduler
DATABASE_PARAMS=characterEncoding=utf8
# mysql
# DATABASE_TYPE=mysql
# DATABASE_DRIVER=com.mysql.jdbc.Driver
# DATABASE_HOST=dolphinscheduler-mysql
# DATABASE_PORT=3306
# DATABASE_USERNAME=root
# DATABASE_PASSWORD=root
# DATABASE_DATABASE=dolphinscheduler
# DATABASE_PARAMS=useUnicode=true&characterEncoding=UTF-8

#============================================================================
# ZooKeeper
#============================================================================
ZOOKEEPER_QUORUM=dolphinscheduler-zookeeper:2181
ZOOKEEPER_ROOT=/dolphinscheduler

#============================================================================
# Common
#============================================================================
# common opts
DOLPHINSCHEDULER_OPTS=
# common env
DATA_BASEDIR_PATH=/tmp/dolphinscheduler
RESOURCE_STORAGE_TYPE=HDFS
RESOURCE_UPLOAD_PATH=/dolphinscheduler
FS_DEFAULT_FS=file:///
FS_S3A_ENDPOINT=s3.xxx.amazonaws.com
FS_S3A_ACCESS_KEY=xxxxxxx
FS_S3A_SECRET_KEY=xxxxxxx
HADOOP_SECURITY_AUTHENTICATION_STARTUP_STATE=false
JAVA_SECURITY_KRB5_CONF_PATH=/opt/krb5.conf
LOGIN_USER_KEYTAB_USERNAME=hdfs@HADOOP.COM
LOGIN_USER_KEYTAB_PATH=/opt/hdfs.keytab
KERBEROS_EXPIRE_TIME=2
HDFS_ROOT_USER=hdfs
RESOURCE_MANAGER_HTTPADDRESS_PORT=8088
YARN_RESOURCEMANAGER_HA_RM_IDS=
YARN_APPLICATION_STATUS_ADDRESS=http://ds1:%s/ws/v1/cluster/apps/%s
YARN_JOB_HISTORY_STATUS_ADDRESS=http://ds1:19888/ws/v1/history/mapreduce/jobs/%s
DATASOURCE_ENCRYPTION_ENABLE=false
DATASOURCE_ENCRYPTION_SALT=!@#$%^&*
SUDO_ENABLE=true
# dolphinscheduler env
HADOOP_HOME=/opt/soft/hadoop
HADOOP_CONF_DIR=/opt/soft/hadoop/etc/hadoop
SPARK_HOME1=/opt/soft/spark1
SPARK_HOME2=/opt/soft/spark2
PYTHON_HOME=/usr/bin/python
JAVA_HOME=/usr/local/openjdk-8
HIVE_HOME=/opt/soft/hive
FLINK_HOME=/opt/soft/flink
DATAX_HOME=/opt/soft/datax

#============================================================================
# Master Server
#============================================================================
MASTER_SERVER_OPTS=-Xms1g -Xmx1g -Xmn512m
MASTER_EXEC_THREADS=100
MASTER_EXEC_TASK_NUM=20
MASTER_DISPATCH_TASK_NUM=3
MASTER_HOST_SELECTOR=LowerWeight
MASTER_HEARTBEAT_INTERVAL=10
MASTER_TASK_COMMIT_RETRYTIMES=5
MASTER_TASK_COMMIT_INTERVAL=1000
MASTER_MAX_CPULOAD_AVG=-1
MASTER_RESERVED_MEMORY=0.3

#============================================================================
# Worker Server
#============================================================================
WORKER_SERVER_OPTS=-Xms1g -Xmx1g -Xmn512m
WORKER_EXEC_THREADS=100
WORKER_HEARTBEAT_INTERVAL=10
WORKER_HOST_WEIGHT=100
WORKER_MAX_CPULOAD_AVG=-1
WORKER_RESERVED_MEMORY=0.3
WORKER_GROUPS=default
ALERT_LISTEN_HOST=dolphinscheduler-alert

#============================================================================
# Alert Server
#============================================================================
ALERT_SERVER_OPTS=-Xms512m -Xmx512m -Xmn256m
ALERT_PLUGIN_DIR=lib/plugin/alert

#============================================================================
# Api Server
#============================================================================
API_SERVER_OPTS=-Xms512m -Xmx512m -Xmn256m

#============================================================================
# Logger Server
#============================================================================
LOGGER_SERVER_OPTS=-Xms512m -Xmx512m -Xmn256m
@@ -1,364 +0,0 @@

# DolphinScheduler

[DolphinScheduler](https://dolphinscheduler.apache.org) is a distributed and easy-to-extend visual DAG workflow scheduling system, dedicated to solving the complex dependencies in data processing and making the scheduling system work out of the box for data processing.

## Introduction

This chart bootstraps a [DolphinScheduler](https://dolphinscheduler.apache.org) distributed deployment on a [Kubernetes](http://kubernetes.io) cluster using the [Helm](https://helm.sh) package manager.

## Prerequisites

- [Helm](https://helm.sh/) 3.1.0+
- [Kubernetes](https://kubernetes.io/) 1.12+
- PV provisioner support in the underlying infrastructure

## Installing the Chart

To install the chart with the release name `dolphinscheduler`:

```bash
$ git clone https://github.com/apache/dolphinscheduler.git
$ cd dolphinscheduler/docker/kubernetes/dolphinscheduler
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm dependency update .
$ helm install dolphinscheduler .
```

To install the chart in a namespace named `test`:

```bash
$ helm install dolphinscheduler . -n test
```

> **Tip**: If a namespace named `test` is used, the option `-n test` needs to be added to the `helm` and `kubectl` commands

These commands deploy DolphinScheduler on the Kubernetes cluster in the default configuration. The [configuration](#configuration) section lists the parameters that can be configured during installation.

> **Tip**: List all releases using `helm list`

## Access DolphinScheduler UI

If `ingress.enabled` in `values.yaml` is set to `true`, you can simply access `http://${ingress.host}/dolphinscheduler` in your browser.

> **Tip**: If there is a problem with ingress access, please contact the Kubernetes administrator and refer to [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/)
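For instance, a hedged sketch of enabling ingress at install time via `--set` (the hostname is a placeholder):

```bash
$ helm install dolphinscheduler . \
--set ingress.enabled=true \
--set ingress.host=dolphinscheduler.example.com
```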
Otherwise, you need to execute a port-forward command like:

```bash
$ kubectl port-forward --address 0.0.0.0 svc/dolphinscheduler-api 12345:12345
$ kubectl port-forward --address 0.0.0.0 -n test svc/dolphinscheduler-api 12345:12345 # with test namespace
```

> **Tip**: If the error `unable to do port forwarding: socat not found` appears, you need to install `socat` first

Then access the web UI: http://192.168.xx.xx:12345/dolphinscheduler

The default username is `admin` and the default password is `dolphinscheduler123`

> **Tip**: For a quick start in docker, you can create a tenant named `ds` and associate the user `admin` with the tenant `ds`

## Uninstalling the Chart

To uninstall/delete the `dolphinscheduler` deployment:

```bash
$ helm uninstall dolphinscheduler
```

The command removes all the Kubernetes components except the PVCs associated with the chart, and deletes the release.

To delete the PVCs associated with `dolphinscheduler`:

```bash
$ kubectl delete pvc -l app.kubernetes.io/instance=dolphinscheduler
```

> **Note**: Deleting the PVCs will delete all data as well. Please be cautious before doing it.

## Configuration

The configuration file is `values.yaml`, and the following table lists the configurable parameters of the DolphinScheduler chart and their default values.
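For example, a hedged sketch of overriding a few of the parameters below at install time, pointing the chart at an external PostgreSQL (the host and credentials are placeholders):

```bash
$ helm install dolphinscheduler . \
--set postgresql.enabled=false \
--set externalDatabase.host=192.168.x.x \
--set externalDatabase.username=test \
--set externalDatabase.password=test \
--set master.replicas=3
```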
| Parameter | Description | Default |
| --- | --- | --- |
| `timezone` | World time and date for cities in all time zones | `Asia/Shanghai` |
| | | |
| `image.repository` | Docker image repository for the DolphinScheduler | `apache/dolphinscheduler` |
| `image.tag` | Docker image version for the DolphinScheduler | `latest` |
| `image.pullPolicy` | Image pull policy. One of Always, Never, IfNotPresent | `IfNotPresent` |
| `image.pullSecret` | Image pull secret. An optional reference to a secret in the same namespace to use for pulling any of the images | `nil` |
| | | |
| `postgresql.enabled` | If no external PostgreSQL exists, DolphinScheduler will use an internal PostgreSQL by default | `true` |
| `postgresql.postgresqlUsername` | The username for the internal PostgreSQL | `root` |
| `postgresql.postgresqlPassword` | The password for the internal PostgreSQL | `root` |
| `postgresql.postgresqlDatabase` | The database for the internal PostgreSQL | `dolphinscheduler` |
| `postgresql.persistence.enabled` | Set `postgresql.persistence.enabled` to `true` to mount a new volume for the internal PostgreSQL | `false` |
| `postgresql.persistence.size` | `PersistentVolumeClaim` size | `20Gi` |
| `postgresql.persistence.storageClass` | PostgreSQL data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `externalDatabase.type` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database type | `postgresql` |
| `externalDatabase.driver` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database driver | `org.postgresql.Driver` |
| `externalDatabase.host` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database host | `localhost` |
| `externalDatabase.port` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database port | `5432` |
| `externalDatabase.username` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database username | `root` |
| `externalDatabase.password` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database password | `root` |
| `externalDatabase.database` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use this database name | `dolphinscheduler` |
| `externalDatabase.params` | If an external PostgreSQL exists and `postgresql.enabled` is set to `false`, DolphinScheduler will use these database params | `characterEncoding=utf8` |
| | | |
| `zookeeper.enabled` | If no external ZooKeeper exists, DolphinScheduler will use an internal ZooKeeper by default | `true` |
| `zookeeper.fourlwCommandsWhitelist` | A list of comma separated Four Letter Words commands to use | `srvr,ruok,wchs,cons` |
| `zookeeper.service.port` | ZooKeeper port | `2181` |
| `zookeeper.persistence.enabled` | Set `zookeeper.persistence.enabled` to `true` to mount a new volume for the internal ZooKeeper | `false` |
| `zookeeper.persistence.size` | `PersistentVolumeClaim` size | `20Gi` |
| `zookeeper.persistence.storageClass` | ZooKeeper data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `zookeeper.zookeeperRoot` | Specify the dolphinscheduler root directory in ZooKeeper | `/dolphinscheduler` |
| `externalZookeeper.zookeeperQuorum` | If an external ZooKeeper exists and `zookeeper.enabled` is set to `false`, specify the ZooKeeper quorum | `127.0.0.1:2181` |
| `externalZookeeper.zookeeperRoot` | If an external ZooKeeper exists and `zookeeper.enabled` is set to `false`, specify the dolphinscheduler root directory in ZooKeeper | `/dolphinscheduler` |
| | | |
| `common.configmap.DOLPHINSCHEDULER_ENV` | System env path, self configuration, please read `values.yaml` | `[]` |
| `common.configmap.DOLPHINSCHEDULER_DATA_BASEDIR_PATH` | User data directory path, self configuration, please make sure the directory exists and has read/write permissions | `/tmp/dolphinscheduler` |
| `common.configmap.RESOURCE_STORAGE_TYPE` | Resource storage type: HDFS, S3, NONE | `HDFS` |
| `common.configmap.RESOURCE_UPLOAD_PATH` | Resource store path on HDFS/S3; please make sure the directory exists on hdfs and has read/write permissions | `/dolphinscheduler` |
| `common.configmap.FS_DEFAULT_FS` | Resource storage file system like `file:///`, `hdfs://mycluster:8020` or `s3a://dolphinscheduler` | `file:///` |
| `common.configmap.FS_S3A_ENDPOINT` | S3 endpoint when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `s3.xxx.amazonaws.com` |
| `common.configmap.FS_S3A_ACCESS_KEY` | S3 access key when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `xxxxxxx` |
| `common.configmap.FS_S3A_SECRET_KEY` | S3 secret key when `common.configmap.RESOURCE_STORAGE_TYPE` is set to `S3` | `xxxxxxx` |
| `common.fsFileResourcePersistence.enabled` | Set `common.fsFileResourcePersistence.enabled` to `true` to mount a new file resource volume for `api` and `worker` | `false` |
| `common.fsFileResourcePersistence.accessModes` | `PersistentVolumeClaim` access modes, must be `ReadWriteMany` | `[ReadWriteMany]` |
| `common.fsFileResourcePersistence.storageClassName` | Resource Persistent Volume Storage Class, must support the access mode: ReadWriteMany | `-` |
| `common.fsFileResourcePersistence.storage` | `PersistentVolumeClaim` size | `20Gi` |
| | | |
| `master.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
| `master.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
| `master.annotations` | The `annotations` for the master server | `{}` |
| `master.affinity` | If specified, the pod's scheduling constraints | `{}` |
| `master.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
| `master.tolerations` | If specified, the pod's tolerations | `{}` |
| `master.resources` | The `resource` limit and request config for the master server | `{}` |
| `master.configmap.DOLPHINSCHEDULER_OPTS` | The java options for the master server | `""` |
| `master.configmap.MASTER_EXEC_THREADS` | Master execute thread number | `100` |
| `master.configmap.MASTER_EXEC_TASK_NUM` | Master execute task number in parallel | `20` |
| `master.configmap.MASTER_HEARTBEAT_INTERVAL` | Master heartbeat interval | `10` |
| `master.configmap.MASTER_TASK_COMMIT_RETRYTIMES` | Master commit task retry times | `5` |
| `master.configmap.MASTER_TASK_COMMIT_INTERVAL` | Master commit task interval | `1000` |
| `master.configmap.MASTER_MAX_CPULOAD_AVG` | The master server can work only if the CPU load average is less than this value; default value: the number of CPU cores * 2 | `100` |
| `master.configmap.MASTER_RESERVED_MEMORY` | The master server can work only if the free memory is larger than this reserved value; default value: physical memory * 1/10, unit G | `0.1` |
| `master.configmap.MASTER_LISTEN_PORT` | Master listen port | `5678` |
| `master.livenessProbe.enabled` | Turn the liveness probe on and off | `true` |
| `master.livenessProbe.initialDelaySeconds` | Delay before the liveness probe is initiated | `30` |
| `master.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
| `master.livenessProbe.timeoutSeconds` | When the probe times out | `5` |
| `master.livenessProbe.failureThreshold` | Minimum consecutive failures for the probe | `3` |
| `master.livenessProbe.successThreshold` | Minimum consecutive successes for the probe | `1` |
| `master.readinessProbe.enabled` | Turn the readiness probe on and off | `true` |
| `master.readinessProbe.initialDelaySeconds` | Delay before the readiness probe is initiated | `30` |
| `master.readinessProbe.periodSeconds` | How often to perform the probe | `30` |
| `master.readinessProbe.timeoutSeconds` | When the probe times out | `5` |
| `master.readinessProbe.failureThreshold` | Minimum consecutive failures for the probe | `3` |
| `master.readinessProbe.successThreshold` | Minimum consecutive successes for the probe | `1` |
| `master.persistentVolumeClaim.enabled` | Set `master.persistentVolumeClaim.enabled` to `true` to mount a new volume for `master` | `false` |
| `master.persistentVolumeClaim.accessModes` | `PersistentVolumeClaim` access modes | `[ReadWriteOnce]` |
| `master.persistentVolumeClaim.storageClassName` | `Master` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `master.persistentVolumeClaim.storage` | `PersistentVolumeClaim` size | `20Gi` |
| | | |
| `worker.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
| `worker.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
| `worker.annotations` | The `annotations` for the worker server | `{}` |
| `worker.affinity` | If specified, the pod's scheduling constraints | `{}` |
| `worker.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
| `worker.tolerations` | If specified, the pod's tolerations | `{}` |
| `worker.resources` | The `resource` limit and request config for the worker server | `{}` |
| `worker.configmap.DOLPHINSCHEDULER_OPTS` | The java options for the worker server | `""` |
| `worker.configmap.WORKER_EXEC_THREADS` | Worker execute thread number | `100` |
| `worker.configmap.WORKER_HEARTBEAT_INTERVAL` | Worker heartbeat interval | `10` |
| `worker.configmap.WORKER_MAX_CPULOAD_AVG` | The worker server can work only if the CPU load average is less than this value; default value: the number of CPU cores * 2 | `100` |
| `worker.configmap.WORKER_RESERVED_MEMORY` | The worker server can work only if the free memory is larger than this reserved value; default value: physical memory * 1/10, unit G | `0.1` |
| `worker.configmap.WORKER_LISTEN_PORT` | Worker listen port | `1234` |
| `worker.configmap.WORKER_GROUPS` | Worker groups | `default` |
| `worker.configmap.WORKER_HOST_WEIGHT` | Worker host weight | `100` |
| `worker.livenessProbe.enabled` | Turn the liveness probe on and off | `true` |
| `worker.livenessProbe.initialDelaySeconds` | Delay before the liveness probe is initiated | `30` |
| `worker.livenessProbe.periodSeconds` | How often to perform the probe | `30` |
| `worker.livenessProbe.timeoutSeconds` | When the probe times out | `5` |
| `worker.livenessProbe.failureThreshold` | Minimum consecutive failures for the probe | `3` |
| `worker.livenessProbe.successThreshold` | Minimum consecutive successes for the probe | `1` |
| `worker.readinessProbe.enabled` | Turn the readiness probe on and off | `true` |
| `worker.readinessProbe.initialDelaySeconds` | Delay before the readiness probe is initiated | `30` |
| `worker.readinessProbe.periodSeconds` | How often to perform the probe | `30` |
| `worker.readinessProbe.timeoutSeconds` | When the probe times out | `5` |
| `worker.readinessProbe.failureThreshold` | Minimum consecutive failures for the probe | `3` |
| `worker.readinessProbe.successThreshold` | Minimum consecutive successes for the probe | `1` |
| `worker.persistentVolumeClaim.enabled` | Set `worker.persistentVolumeClaim.enabled` to `true` to enable `persistentVolumeClaim` for `worker` | `false` |
| `worker.persistentVolumeClaim.dataPersistentVolume.enabled` | Set `worker.persistentVolumeClaim.dataPersistentVolume.enabled` to `true` to mount a data volume for `worker` | `false` |
| `worker.persistentVolumeClaim.dataPersistentVolume.accessModes` | `PersistentVolumeClaim` access modes | `[ReadWriteOnce]` |
| `worker.persistentVolumeClaim.dataPersistentVolume.storageClassName` | `Worker` data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `worker.persistentVolumeClaim.dataPersistentVolume.storage` | `PersistentVolumeClaim` size | `20Gi` |
| `worker.persistentVolumeClaim.logsPersistentVolume.enabled` | Set `worker.persistentVolumeClaim.logsPersistentVolume.enabled` to `true` to mount a logs volume for `worker` | `false` |
| `worker.persistentVolumeClaim.logsPersistentVolume.accessModes` | `PersistentVolumeClaim` access modes | `[ReadWriteOnce]` |
| `worker.persistentVolumeClaim.logsPersistentVolume.storageClassName` | `Worker` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `worker.persistentVolumeClaim.logsPersistentVolume.storage` | `PersistentVolumeClaim` size | `20Gi` |
| | | |
| `alert.replicas` | Replicas is the desired number of replicas of the given Template | `1` |
| `alert.strategy.type` | Type of deployment. Can be "Recreate" or "RollingUpdate" | `RollingUpdate` |
| `alert.strategy.rollingUpdate.maxSurge` | The maximum number of pods that can be scheduled above the desired number of pods | `25%` |
| `alert.strategy.rollingUpdate.maxUnavailable` | The maximum number of pods that can be unavailable during the update | `25%` |
| `alert.annotations` | The `annotations` for the alert server | `{}` |
||||
| `alert.affinity` | If specified, the pod's scheduling constraints | `{}` | |
||||
| `alert.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` | |
||||
| `alert.tolerations` | If specified, the pod's tolerations | `{}` | |
||||
| `alert.resources` | The `resource` limit and request config for alert server | `{}` | |
||||
| `alert.configmap.DOLPHINSCHEDULER_OPTS` | The java options for alert server | `""` | |
||||
| `alert.configmap.ALERT_PLUGIN_DIR` | Alert plugin directory | `lib/plugin/alert` | |
||||
| `alert.livenessProbe.enabled` | Turn on and off liveness probe | `true` | |
||||
| `alert.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` | |
||||
| `alert.livenessProbe.periodSeconds` | How often to perform the probe | `30` | |
||||
| `alert.livenessProbe.timeoutSeconds` | When the probe times out | `5` | |
||||
| `alert.livenessProbe.failureThreshold` | Minimum consecutive successes for the probe | `3` | |
||||
| `alert.livenessProbe.successThreshold` | Minimum consecutive failures for the probe | `1` | |
||||
| `alert.readinessProbe.enabled` | Turn on and off readiness probe | `true` | |
||||
| `alert.readinessProbe.initialDelaySeconds` | Delay before readiness probe is initiated | `30` | |
||||
| `alert.readinessProbe.periodSeconds` | How often to perform the probe | `30` | |
||||
| `alert.readinessProbe.timeoutSeconds` | When the probe times out | `5` | |
||||
| `alert.readinessProbe.failureThreshold` | Minimum consecutive successes for the probe | `3` | |
||||
| `alert.readinessProbe.successThreshold` | Minimum consecutive failures for the probe | `1` | |
||||
| `alert.persistentVolumeClaim.enabled` | Set `alert.persistentVolumeClaim.enabled` to `true` to mount a new volume for `alert` | `false` | |
||||
| `alert.persistentVolumeClaim.accessModes` | `PersistentVolumeClaim` Access Modes | `[ReadWriteOnce]` | |
||||
| `alert.persistentVolumeClaim.storageClassName` | `Alert` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` | |
||||
| `alert.persistentVolumeClaim.storage` | `PersistentVolumeClaim` Size | `20Gi` | |
||||
| | | | |
||||
| `api.replicas` | Replicas is the desired number of replicas of the given Template | `1` | |
||||
| `api.strategy.type` | Type of deployment. Can be "Recreate" or "RollingUpdate" | `RollingUpdate` | |
||||
| `api.strategy.rollingUpdate.maxSurge` | The maximum number of pods that can be scheduled above the desired number of pods | `25%` | |
||||
| `api.strategy.rollingUpdate.maxUnavailable` | The maximum number of pods that can be unavailable during the update | `25%` | |
||||
| `api.annotations` | The `annotations` for api server | `{}` | |
||||
| `api.affinity` | If specified, the pod's scheduling constraints | `{}` | |
||||
| `api.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` | |
||||
| `api.tolerations` | If specified, the pod's tolerations | `{}` | |
||||
| `api.resources` | The `resource` limit and request config for api server | `{}` | |
||||
| `api.configmap.DOLPHINSCHEDULER_OPTS` | The java options for api server | `""` | |
||||
| `api.livenessProbe.enabled` | Turn on and off liveness probe | `true` | |
||||
| `api.livenessProbe.initialDelaySeconds` | Delay before liveness probe is initiated | `30` | |
||||
| `api.livenessProbe.periodSeconds` | How often to perform the probe | `30` | |
||||
| `api.livenessProbe.timeoutSeconds` | When the probe times out | `5` | |
||||
| `api.livenessProbe.failureThreshold` | Minimum consecutive successes for the probe | `3` | |
||||
| `api.livenessProbe.successThreshold` | Minimum consecutive failures for the probe | `1` | |
||||
| `api.readinessProbe.enabled` | Turn on and off readiness probe | `true` | |
||||
| `api.readinessProbe.initialDelaySeconds` | Delay before readiness probe is initiated | `30` | |
||||
| `api.readinessProbe.periodSeconds` | How often to perform the probe | `30` | |
||||
| `api.readinessProbe.timeoutSeconds` | When the probe times out | `5` | |
||||
| `api.readinessProbe.failureThreshold` | Minimum consecutive successes for the probe | `3` | |
||||
| `api.readinessProbe.successThreshold` | Minimum consecutive failures for the probe | `1` | |
||||
| `api.persistentVolumeClaim.enabled` | Set `api.persistentVolumeClaim.enabled` to `true` to mount a new volume for `api` | `false` | |
||||
| `api.persistentVolumeClaim.accessModes` | `PersistentVolumeClaim` Access Modes | `[ReadWriteOnce]` | |
||||
| `api.persistentVolumeClaim.storageClassName` | `api` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` | |
||||
| `api.persistentVolumeClaim.storage` | `PersistentVolumeClaim` Size | `20Gi` | |
||||
| | | | |
||||
| `ingress.enabled` | Enable ingress | `false` | |
||||
| `ingress.host` | Ingress host | `dolphinscheduler.org` | |
||||
| `ingress.path` | Ingress path | `/dolphinscheduler` | |
||||
| `ingress.tls.enabled` | Enable ingress tls | `false` | |
||||
| `ingress.tls.secretName` | Ingress tls secret name | `dolphinscheduler-tls` | |
||||
|
||||
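Any of the parameters above can be overridden without editing the chart itself. A minimal sketch of a custom values file, assuming a hypothetical file name `values-override.yaml` and purely illustrative values:

```
# values-override.yaml (hypothetical) -- overrides a few of the chart parameters above
worker:
  replicas: 3
  configmap:
    WORKER_GROUPS: "default"
  persistentVolumeClaim:
    enabled: true
    dataPersistentVolume:
      enabled: true
      storage: "50Gi"
ingress:
  enabled: true
  host: "dolphinscheduler.example.com"
```

It would then be applied with something like `helm install dolphinscheduler . -f values-override.yaml` (the release name and chart path are assumptions).
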
## FAQ

### How to use MySQL as the DolphinScheduler's database instead of PostgreSQL?

> Because of the commercial license, we cannot include the MySQL driver and client in the image directly.
>
> If you want to use MySQL, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver and client:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
RUN apk add --update --no-cache mysql-client
```

3. Build a new docker image including the MySQL driver and client:

```
docker build -t apache/dolphinscheduler:mysql .
```

4. Push the docker image `apache/dolphinscheduler:mysql` to a docker registry, as sketched below

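A registry-qualified tag is needed before pushing; a minimal sketch, assuming a hypothetical private registry at `registry.example.com`:

```
# retag with a registry-qualified name (registry.example.com is a placeholder)
docker tag apache/dolphinscheduler:mysql registry.example.com/dolphinscheduler:mysql
docker push registry.example.com/dolphinscheduler:mysql
```
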
5. Modify image `repository` and update `tag` to `mysql` in `values.yaml`

6. Modify postgresql `enabled` to `false` in `values.yaml`

7. Modify externalDatabase in `values.yaml` (especially `host`, `username` and `password`):

```
externalDatabase:
  type: "mysql"
  driver: "com.mysql.jdbc.Driver"
  host: "localhost"
  port: "3306"
  username: "root"
  password: "root"
  database: "dolphinscheduler"
  params: "useUnicode=true&characterEncoding=UTF-8"
```

8. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)

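Steps 5-7 can also be expressed as `--set` overrides at install time instead of editing `values.yaml`. A minimal sketch, assuming Helm 3, a release named `dolphinscheduler`, and the chart directory as the working directory (all three are assumptions; the host, registry, and credentials are placeholders):

```
helm install dolphinscheduler . \
  --set image.repository=registry.example.com/dolphinscheduler \
  --set image.tag=mysql \
  --set postgresql.enabled=false \
  --set externalDatabase.type=mysql \
  --set externalDatabase.driver=com.mysql.jdbc.Driver \
  --set externalDatabase.host=mysql-host \
  --set externalDatabase.port=3306 \
  --set externalDatabase.username=root \
  --set externalDatabase.password=root \
  --set externalDatabase.database=dolphinscheduler
```

The `params` value contains `&`, which is awkward to escape on the command line, so it is usually easier to keep it in `values.yaml`.
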
### How to support MySQL datasource in `Datasource manage`?

> Because of the commercial license, we cannot include the MySQL driver in the image directly.
>
> If you want to add a MySQL datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the MySQL driver [mysql-connector-java-5.1.49.jar](https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.49/mysql-connector-java-5.1.49.jar) (requires `>=5.1.47`)

2. Create a new `Dockerfile` to add the MySQL driver:

```
FROM apache/dolphinscheduler:latest
COPY mysql-connector-java-5.1.49.jar /opt/dolphinscheduler/lib
```

3. Build a new docker image including the MySQL driver:

```
docker build -t apache/dolphinscheduler:mysql-driver .
```

4. Push the docker image `apache/dolphinscheduler:mysql-driver` to a docker registry

5. Modify image `repository` and update `tag` to `mysql-driver` in `values.yaml`

6. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)

7. Add a MySQL datasource in `Datasource manage`

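The datasource added in the UI ultimately resolves to a standard MySQL JDBC URL; a hypothetical example of what the configured connection amounts to (host and database name are placeholders):

```
jdbc:mysql://mysql-host:3306/test?useUnicode=true&characterEncoding=UTF-8
```
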
### How to support Oracle datasource in `Datasource manage`?

> Because of the commercial license, we cannot include the Oracle driver in the image directly.
>
> If you want to add an Oracle datasource, you can build a new image based on the `apache/dolphinscheduler` image as follows.

1. Download the Oracle driver [ojdbc8.jar](https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/) (such as `ojdbc8-19.9.0.0.jar`)

2. Create a new `Dockerfile` to add the Oracle driver:

```
FROM apache/dolphinscheduler:latest
COPY ojdbc8-19.9.0.0.jar /opt/dolphinscheduler/lib
```

3. Build a new docker image including the Oracle driver:

```
docker build -t apache/dolphinscheduler:oracle-driver .
```

4. Push the docker image `apache/dolphinscheduler:oracle-driver` to a docker registry

5. Modify image `repository` and update `tag` to `oracle-driver` in `values.yaml`

6. Run a DolphinScheduler release in Kubernetes (see **Installing the Chart**)

7. Add an Oracle datasource in `Datasource manage`

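Likewise, an Oracle datasource amounts to a standard Oracle thin JDBC URL; a hypothetical example (host and service name are placeholders):

```
jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1
```
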
For more information, please refer to the [dolphinscheduler](https://github.com/apache/dolphinscheduler.git) documentation.

@ -0,0 +1,36 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
{{- if .Values.common.sharedStoragePersistence.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "dolphinscheduler.fullname" . }}-shared
  labels:
    app.kubernetes.io/name: {{ include "dolphinscheduler.fullname" . }}-shared
    {{- include "dolphinscheduler.common.labels" . | nindent 4 }}
  annotations:
    "helm.sh/resource-policy": keep
spec:
  accessModes:
  {{- range .Values.common.sharedStoragePersistence.accessModes }}
  - {{ . | quote }}
  {{- end }}
  storageClassName: {{ .Values.common.sharedStoragePersistence.storageClassName | quote }}
  resources:
    requests:
      storage: {{ .Values.common.sharedStoragePersistence.storage | quote }}
{{- end }}
@ -1,82 +0,0 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.dolphinscheduler.server.worker;

import org.apache.dolphinscheduler.common.Constants;
import org.apache.dolphinscheduler.common.utils.StringUtils;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class EnvFileTest {

    private static final Logger logger = LoggerFactory.getLogger(EnvFileTest.class);

    @Test
    public void test() {
        String path = System.getProperty("user.dir") + "/script/env/dolphinscheduler_env.sh";
        String pythonHome = getPythonHome(path);
        logger.info(pythonHome);
    }

    /**
     * Get the PYTHON_HOME value from the given env file.
     *
     * @param path path of the env file
     * @return the PYTHON_HOME value, or null if it is not found
     */
    private static String getPythonHome(String path) {
        BufferedReader br = null;
        String line;
        StringBuilder sb = new StringBuilder();
        try {
            br = new BufferedReader(new InputStreamReader(new FileInputStream(path)));
            while ((line = br.readLine()) != null) {
                if (line.contains(Constants.PYTHON_HOME)) {
                    sb.append(line);
                    break;
                }
            }
            String result = sb.toString();
            if (StringUtils.isEmpty(result)) {
                return null;
            }
            String[] arrs = result.split("=");
            if (arrs.length == 2) {
                return arrs[1];
            }
        } catch (IOException e) {
            logger.error("read file failed", e);
        } finally {
            try {
                if (br != null) {
                    br.close();
                }
            } catch (IOException e) {
                logger.error(e.getMessage(), e);
            }
        }
        return null;
    }
}
@ -0,0 +1,58 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.dolphinscheduler.server.worker.task;

import org.junit.Assert;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PythonCommandExecutorTest {

    private static final Logger logger = LoggerFactory.getLogger(PythonCommandExecutorTest.class);

    @Test
    public void testGetPythonHome() {
        String path = System.getProperty("user.dir") + "/script/env/dolphinscheduler_env.sh";
        if (path.contains("dolphinscheduler-server/")) {
            path = path.replace("dolphinscheduler-server/", "");
        }
        String pythonHome = PythonCommandExecutor.getPythonHome(path);
        logger.info(pythonHome);
        Assert.assertNotNull(pythonHome);
    }

    @Test
    public void testGetPythonCommand() {
        // a null or empty PYTHON_HOME falls back to the default python command
        String pythonCommand = PythonCommandExecutor.getPythonCommand(null);
        Assert.assertEquals(PythonCommandExecutor.PYTHON, pythonCommand);
        pythonCommand = PythonCommandExecutor.getPythonCommand("");
        Assert.assertEquals(PythonCommandExecutor.PYTHON, pythonCommand);
        // paths that already point at a python executable are used as-is
        pythonCommand = PythonCommandExecutor.getPythonCommand("/usr/bin/python");
        Assert.assertEquals("/usr/bin/python", pythonCommand);
        pythonCommand = PythonCommandExecutor.getPythonCommand("/usr/local/bin/python2");
        Assert.assertEquals("/usr/local/bin/python2", pythonCommand);
        pythonCommand = PythonCommandExecutor.getPythonCommand("/opt/python/bin/python3.8");
        Assert.assertEquals("/opt/python/bin/python3.8", pythonCommand);
        // paths that point at a PYTHON_HOME directory get "/bin/python" appended
        pythonCommand = PythonCommandExecutor.getPythonCommand("/opt/soft/python");
        Assert.assertEquals("/opt/soft/python/bin/python", pythonCommand);
        pythonCommand = PythonCommandExecutor.getPythonCommand("/opt/soft/python-3.8");
        Assert.assertEquals("/opt/soft/python-3.8/bin/python", pythonCommand);
    }

}