More documentation please refer to <ahref="https://dolphinscheduler.apache.org/en-us/docs/1.2.0/user_doc/quick-start.html"target="_blank">[DolphinScheduler online documentation]</a>
### Recent R&D plan
Work plan of Dolphin Scheduler: [R&D plan](https://github.com/apache/incubator-dolphinscheduler/projects/1), Under the `In Develop` card is what is currently being developed, TODO card is to be done (including feature ideas)
@ -86,9 +76,8 @@ Welcome to participate in contributing, please refer to the process of submittin
Artifact:
```
dolphinscheduler-dist/dolphinscheduler-backend/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-backend-bin.tar.gz: Binary package of DolphinScheduler-Backend
dolphinscheduler-dist/dolphinscheduler-front/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-front-bin.tar.gz: Binary package of DolphinScheduler-UI
dolphinscheduler-dist/dolphinscheduler-src/target/apache-dolphinscheduler-incubating-${latest.release.version}-src.zip: Source code package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-bin.tar.gz: Binary package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-incubating-${latest.release.version}-src.zip: Source code package of DolphinScheduler
dolphinscheduler-dist/dolphinscheduler-backend/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-backend-bin.tar.gz: Binary package of DolphinScheduler-Backend
dolphinscheduler-dist/dolphinscheduler-front/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-front-bin.tar.gz: Binary package of DolphinScheduler-UI
dolphinscheduler-dist/dolphinscheduler-src/target/apache-dolphinscheduler-incubating-${latest.release.version}-src.zip: Source code package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-bin.tar.gz: Binary package of DolphinScheduler
dolphinscheduler-dist/target/apache-dolphinscheduler-incubating-${latest.release.version}-src.zip: Source code package of DolphinScheduler
- It is generated by executing the command ```mvn -U clean install -Prpmbuild -Dmaven.test.skip=true -X``` in the project root directory (In the directory: dolphinscheduler-dist/target/rpm/apache-dolphinscheduler-incubating/RPMS/noarch )
2. 创建DS的安装用户--权限
2. Create an installation for DS,who have read and write access to the installation directory (/opt/soft)
3. 初始化数据库信息
3. Install with rpm package
- Manual installation (recommended):
- Copy the prepared RPM packages to each node of the cluster.
- Execute with DS installation user: ```rpm -ivh apache-dolphinscheduler-incubating-xxx.noarch.rpm```
- Mysql-connector-java packaged using the default POM file will not be included.
- The RPM package was packaged in the project with the installation path of /opt/soft.
If you use mysql as the database, you need add it manually.
- Automatic installation with ambari
- Each node of the cluster needs to configure the local yum source
- Copy the prepared RPM packages to each node local yum source
4. Copy plug-in directory
- copy directory ambari_plugin/common-services/DOLPHIN to ambari-server/resources/common-services/
- copy directory ambari_plugin/statcks/DOLPHIN to ambari-server/resources/stacks/HDP/2.6/services/--stack version is selected based on the actual situation
5. Initializes the database information
```
-- 创建Dolphin Scheduler的数据库:dolphinscheduler
-- Create the database for the Dolphin Scheduler:dolphinscheduler
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE
utf8_general_ci;
-- 初始化dolphinscheduler数据库的用户和密码,并分配权限
-- 替换下面sql语句中的{user}为dolphinscheduler数据库的用户
-- Initialize the user and password for the dolphinscheduler database and assign permissions
-- Replace the {user} in the SQL statement below with the user of the dolphinscheduler database
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY
'{password}';
@ -30,39 +49,84 @@
##### 二 Ambari安装Dolphin Scheduler
#### Ambari Install Dolphin Scheduler
- **NOTE: You have to install zookeeper first**
1. Ambari界面安装Dolphin Scheduler
1. Install Dolphin Scheduler on ambari web interface
#### Add components to the node through Ambari -- for example, add a DS Worker
***NOTE***: DS Logger is the installation dependent component of DS Worker in Dolphin's Ambari installation (need to add installation first; Prevent the Job log on the corresponding Worker from being checked)
1. Locate the component node to add -- for example, node ark3
**Note**: You must be specify `POSTGRESQL_HOST``POSTGRESQL_PORT``POSTGRESQL_DATABASE``POSTGRESQL_USERNAME``POSTGRESQL_PASSWORD``ZOOKEEPER_QUORUM` when start a standalone dolphinscheduler server.
**Note**: You must be specify `DATABASE_HOST``DATABASE_PORT``DATABASE_DATABASE``DATABASE_USERNAME``DATABASE_PASSWORD``ZOOKEEPER_QUORUM` when start a standalone dolphinscheduler server.
The Dolphin Scheduler image uses several environment variables which are easy to miss. While none of the variables are required, they may significantly aid you in using the image.
**`POSTGRESQL_HOST`**
**`DATABASE_TYPE`**
This environment variable sets the host for PostgreSQL. The default value is `127.0.0.1`.
This environment variable sets the type for database. The default value is `postgresql`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`POSTGRESQL_PORT`**
**`DATABASE_DRIVER`**
This environment variable sets the port for PostgreSQL. The default value is `5432`.
This environment variable sets the type for database. The default value is `org.postgresql.Driver`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`DATABASE_HOST`**
This environment variable sets the host for database. The default value is `127.0.0.1`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`DATABASE_PORT`**
This environment variable sets the port for database. The default value is `5432`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`POSTGRESQL_USERNAME`**
**`DATABASE_USERNAME`**
This environment variable sets the username for database. The default value is `root`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`DATABASE_PASSWORD`**
This environment variable sets the username for PostgreSQL. The default value is `root`.
This environment variable sets the password for database. The default value is `root`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`POSTGRESQL_PASSWORD`**
**`DATABASE_DATABASE`**
This environment variable sets the password for PostgreSQL. The default value is `root`.
This environment variable sets the database for database. The default value is `dolphinscheduler`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
**`POSTGRESQL_DATABASE`**
**`DATABASE_PARAMS`**
This environment variable sets the database for PostgreSQL. The default value is `dolphinscheduler`.
This environment variable sets the database for database. The default value is `characterEncoding=utf8`.
**Note**: You must be specify it when start a standalone dolphinscheduler server. Like `master-server`, `worker-server`, `api-server`, `alert-server`.
# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。"/dolphinscheduler" is recommended
# if resource.storage.type=S3,the value like: s3a://dolphinscheduler ; if resource.storage.type=HDFS, When namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
fs.defaultFS=hdfs://mycluster:8020
fs.defaultFS=${FS_DEFAULT_FS}
# if resource.storage.type=S3,s3 endpoint
#fs.s3a.endpoint=http://192.168.199.91:9010
fs.s3a.endpoint=${FS_S3A_ENDPOINT}
# if resource.storage.type=S3,s3 access key
#fs.s3a.access.key=A3DXS30FO22544RE
fs.s3a.access.key=${FS_S3A_ACCESS_KEY}
# if resource.storage.type=S3,s3 secret key
#fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK
fs.s3a.secret.key=${FS_S3A_SECRET_KEY}
# if not use hadoop resourcemanager, please keep default value; if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty TODO
@ -46,8 +46,8 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
| `image.repository` | Docker image repository for the Dolphins Scheduler | `dolphinscheduler` |
| `image.tag` | Docker image version for the Dolphins Scheduler | `1.2.1` |
| `image.imagePullPolicy` | Image pull policy. One of Always, Never, IfNotPresent | `IfNotPresent` |
| `imagePullSecrets` | ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images | `[]` |
| | | |
| `image.pullSecres` | PullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images | `[]` |
| | | |
| `postgresql.enabled` | If not exists external PostgreSQL, by default, the Dolphins Scheduler will use a internal PostgreSQL | `true` |
| `postgresql.postgresqlUsername` | The username for internal PostgreSQL | `root` |
| `postgresql.postgresqlPassword` | The password for internal PostgreSQL | `root` |
@ -55,12 +55,15 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
| `postgresql.persistence.enabled` | Set `postgresql.persistence.enabled` to `true` to mount a new volume for internal PostgreSQL | `false` |
| `postgresql.persistence.storageClass` | PostgreSQL data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `externalDatabase.host` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database host will use it. | `localhost` |
| `externalDatabase.type` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database type will use it. | `postgresql` |
| `externalDatabase.driver` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database driver will use it. | `org.postgresql.Driver` |
| `externalDatabase.host` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database host will use it. | `localhost` |
| `externalDatabase.port` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database port will use it. | `5432` |
| `externalDatabase.username` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database username will use it. | `root` |
| `externalDatabase.password` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database password will use it. | `root` |
| `externalDatabase.database` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database database will use it. | `dolphinscheduler` |
| | | |
| `externalDatabase.params` | If exists external PostgreSQL, and set `postgresql.enable` value to false. Dolphins Scheduler's database params will use it. | `characterEncoding=utf8` |
| | | |
| `zookeeper.enabled` | If not exists external Zookeeper, by default, the Dolphin Scheduler will use a internal Zookeeper | `true` |
| `zookeeper.taskQueue` | Specify task queue for `master` and `worker` | `zookeeper` |
| `zookeeper.persistence.enabled` | Set `zookeeper.persistence.enabled` to `true` to mount a new volume for internal Zookeeper | `false` |
@ -68,12 +71,25 @@ The following tables lists the configurable parameters of the Dolphins Scheduler
| `zookeeper.persistence.storageClass` | Zookeeper data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `externalZookeeper.taskQueue` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify task queue for `master` and `worker` | `zookeeper` |
| `externalZookeeper.zookeeperQuorum` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify Zookeeper quorum | `127.0.0.1:2181` |
| | | |
| `externalZookeeper.zookeeperRoot` | If exists external Zookeeper, and set `zookeeper.enable` value to false. Specify Zookeeper root path for `master` and `worker` | `dolphinscheduler` |
| | | |
| `common.configmap.DOLPHINSCHEDULER_ENV_PATH` | Extra env file path. | `/tmp/dolphinscheduler/env` |
| `common.configmap.DOLPHINSCHEDULER_DATA_BASEDIR_PATH` | File uploaded path of DS. | `/tmp/dolphinscheduler/files` |
| `common.configmap.RESOURCE_STORAGE_TYPE` | Resource Storate type, support type are: S3、HDFS、NONE. | `NONE` |
| `common.configmap.RESOURCE_UPLOAD_PATH` | The base path of resource. | `/ds` |
| `common.configmap.FS_DEFAULT_FS` | The default fs of resource, for s3 is the `s3a` prefix and bucket name. | `s3a://xxxx` |
| `common.configmap.FS_S3A_ENDPOINT` | If the resource type is `S3`, you should fill this filed, it's the endpoint of s3. | `s3.xxx.amazonaws.com` |
| `common.configmap.FS_S3A_ACCESS_KEY` | The access key for your s3 bucket. | `xxxxxxx` |
| `common.configmap.FS_S3A_SECRET_KEY` | The secret key for your s3 bucket. | `xxxxxxx` |
| `master.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
| | | |
| `master.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
| `master.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
| `master.tolerations` | If specified, the pod's tolerations | `{}` |
| `master.affinity` | If specified, the pod's scheduling constraints | `{}` |
| `master.jvmOptions` | The JVM options for master server. | `""` |
| `master.resources` | The `resource` limit and request config for master server. | `{}` |
| `master.annotations` | The `annotations` for master server. | `{}` |
| `master.configmap.MASTER_EXEC_THREADS` | Master execute thread num | `100` |
| `master.configmap.MASTER_EXEC_TASK_NUM` | Master execute task number in parallel | `20` |
| `master.persistentVolumeClaim.storageClassName` | `Master` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
| `worker.podManagementPolicy` | PodManagementPolicy controls how pods are created during initial scale up, when replacing pods on nodes, or when scaling down | `Parallel` |
| `worker.replicas` | Replicas is the desired number of replicas of the given Template | `3` |
| `worker.nodeSelector` | NodeSelector is a selector which must be true for the pod to fit on a node | `{}` |
| `worker.tolerations` | If specified, the pod's tolerations | `{}` |
| `worker.affinity` | If specified, the pod's scheduling constraints | `{}` |
| `worker.jvmOptions` | The JVM options for worker server. | `""` |
| `worker.resources` | The `resource` limit and request config for worker server. | `{}` |
| `worker.annotations` | The `annotations` for worker server. | `{}` |
| `worker.configmap.WORKER_EXEC_THREADS` | Worker execute thread num | `100` |
| `worker.persistentVolumeClaim.logsPersistentVolume.storageClassName` | `Worker` logs data Persistent Volume Storage Class. If set to "-", storageClassName: "", which disables dynamic provisioning | `-` |
"<thead><tr><th>mysql service name</th><th>mysql address</th><th>port</th><th>no index of number</th><th>database client connections</th></tr></thead>\n"+
"<thead><tr><th>mysql service name</th><th>mysql address</th><th>database client connections</th><th>port</th><th>no index of number</th></tr></thead>\n"+
"<tr><td>{\"mysql service name\":\"mysql200\",\"mysql address\":\"192.168.xx.xx\",\"port\":\"3306\",\"no index of number\":\"80\",\"database client connections\":\"190\"}</td></tr><tr><td>{\"mysql service name\":\"mysql210\",\"mysql address\":\"192.168.xx.xx\",\"port\":\"3306\",\"no index of number\":\"10\",\"database client connections\":\"90\"}</td></tr> </table>\n"+
" </body>\n"+
"</html>";
returnConstants.HTML_HEADER_PREFIX+"<tr><td>{\"mysql service name\":\"mysql200\",\"mysql address\":\"192.168.xx.xx\",\"database client connections\":\"190\",\"port\":\"3306\",\"no index of number\":\"80\"}</td></tr><tr><td>{\"mysql service name\":\"mysql210\",\"mysql address\":\"192.168.xx.xx\",\"database client connections\":\"90\",\"port\":\"3306\",\"no index of number\":\"10\"}</td></tr>"+Constants.TABLE_BODY_HTML_TAIL;
USER_NOT_EXIST(10010,"user {0} not exists","用户[{0}]不存在"),
ALERT_GROUP_NOT_EXIST(10011,"alarm group not found","告警组不存在"),
ALERT_GROUP_EXIST(10012,"alarm group already exists","告警组名称已存在"),
@ -192,7 +192,7 @@ public enum Status {
RESOURCE_IS_USED(20014,"resource file is used by process definition","资源文件被上线的流程定义使用了"),
PARENT_RESOURCE_NOT_EXIST(20015,"parent resource not exist","父资源文件不存在"),
RESOURCE_NOT_EXIST_OR_NO_PERMISSION(20016,"resource not exist or no permission,please view the task node and remove error resource","请检查任务节点并移除无权限或者已删除的资源"),
RESOURCE_IS_AUTHORIZED(20017,"resource is authorized to user {0},suffix not allowed to be modified","资源文件已授权其他用户[{0}],后缀不允许修改"),
USER_NO_OPERATION_PERM(30001,"user has no operation privilege","当前用户没有操作权限"),
USER_NO_OPERATION_PROJECT_PERM(30002,"user {0} is not has project {1} permission","当前用户[{0}]没有[{1}]项目的操作权限"),