
Merge pull request #3 from apache/dev

update code
pull/2/head
samz406 5 years ago committed by GitHub
parent
commit
1059d4b035
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. 36
      .github/ISSUE_TEMPLATE/bug_report.md
  2. 20
      .github/ISSUE_TEMPLATE/feature_request.md
  3. 93
      CONTRIBUTING.md
  4. 41
      Dockerfile
  5. 98
      README.md
  6. 86
      README_zh_CN.md
  7. 310
      conf/install.sh
  8. 105
      conf/run.sh
  9. 146
      dockerfile/Dockerfile
  10. 30
      dockerfile/conf/escheduler/conf/alert.properties
  11. 31
      dockerfile/conf/escheduler/conf/alert_logback.xml
  12. 42
      dockerfile/conf/escheduler/conf/apiserver_logback.xml
  13. 19
      dockerfile/conf/escheduler/conf/application.properties
  14. 1
      dockerfile/conf/escheduler/conf/application_master.properties
  15. 42
      dockerfile/conf/escheduler/conf/common/common.properties
  16. 18
      dockerfile/conf/escheduler/conf/common/hadoop/hadoop.properties
  17. 3
      dockerfile/conf/escheduler/conf/config/install_config.conf
  18. 4
      dockerfile/conf/escheduler/conf/config/run_config.conf
  19. 53
      dockerfile/conf/escheduler/conf/dao/data_source.properties
  20. 3
      dockerfile/conf/escheduler/conf/env/.escheduler_env.sh
  21. 235
      dockerfile/conf/escheduler/conf/i18n/messages.properties
  22. 235
      dockerfile/conf/escheduler/conf/i18n/messages_en_US.properties
  23. 233
      dockerfile/conf/escheduler/conf/i18n/messages_zh_CN.properties
  24. 1
      dockerfile/conf/escheduler/conf/mail_templates/alert_mail_template.ftl
  25. 21
      dockerfile/conf/escheduler/conf/master.properties
  26. 34
      dockerfile/conf/escheduler/conf/master_logback.xml
  27. 39
      dockerfile/conf/escheduler/conf/quartz.properties
  28. 15
      dockerfile/conf/escheduler/conf/worker.properties
  29. 60
      dockerfile/conf/escheduler/conf/worker_logback.xml
  30. 25
      dockerfile/conf/escheduler/conf/zookeeper.properties
  31. 263
      dockerfile/conf/maven/settings.xml
  32. 12
      dockerfile/conf/nginx/default.conf
  33. 2
      dockerfile/conf/zookeeper/zoo.cfg
  34. 8
      dockerfile/hooks/build
  35. 8
      dockerfile/hooks/push
  36. 81
      dockerfile/startup.sh
  37. 16
      docs/en_US/1.0.1-release.md
  38. 49
      docs/en_US/1.0.2-release.md
  39. 30
      docs/en_US/1.0.3-release.md
  40. 2
      docs/en_US/1.0.4-release.md
  41. 2
      docs/en_US/1.0.5-release.md
  42. 55
      docs/en_US/1.1.0-release.md
  43. 299
      docs/en_US/EasyScheduler Proposal.md
  44. 284
      docs/en_US/EasyScheduler-FAQ.md
  45. 96
      docs/en_US/README.md
  46. 50
      docs/en_US/SUMMARY.md
  47. 316
      docs/en_US/architecture-design.md
  48. 207
      docs/en_US/backend-deployment.md
  49. 48
      docs/en_US/backend-development.md
  50. 23
      docs/en_US/book.json
  51. 115
      docs/en_US/frontend-deployment.md
  52. 650
      docs/en_US/frontend-development.md
  53. BIN
      docs/en_US/images/auth-project.png
  54. BIN
      docs/en_US/images/complement.png
  55. BIN
      docs/en_US/images/depend-b-and-c.png
  56. BIN
      docs/en_US/images/depend-last-tuesday.png
  57. BIN
      docs/en_US/images/depend-week.png
  58. BIN
      docs/en_US/images/save-definition.png
  59. BIN
      docs/en_US/images/save-global-parameters.png
  60. BIN
      docs/en_US/images/start-process.png
  61. BIN
      docs/en_US/images/timing.png
  62. 53
      docs/en_US/quick-start.md
  63. 699
      docs/en_US/system-manual.md
  64. 39
      docs/en_US/upgrade.md
  65. 30
      docs/zh_CN/1.0.3-release.md
  66. 28
      docs/zh_CN/1.0.4-release.md
  67. 23
      docs/zh_CN/1.0.5-release.md
  68. 63
      docs/zh_CN/1.1.0-release.md
  69. 287
      docs/zh_CN/EasyScheduler-FAQ.md
  70. 3
      docs/zh_CN/README.md
  71. 16
      docs/zh_CN/SUMMARY.md
  72. 2
      docs/zh_CN/book.json
  73. BIN
      docs/zh_CN/images/cdh_hive_error.png
  74. BIN
      docs/zh_CN/images/complement.png
  75. BIN
      docs/zh_CN/images/create-queue.png
  76. BIN
      docs/zh_CN/images/dag1.png
  77. BIN
      docs/zh_CN/images/dag2.png
  78. BIN
      docs/zh_CN/images/dag3.png
  79. BIN
      docs/zh_CN/images/dag4.png
  80. BIN
      docs/zh_CN/images/depend-node.png
  81. BIN
      docs/zh_CN/images/depend-node2.png
  82. BIN
      docs/zh_CN/images/depend-node3.png
  83. BIN
      docs/zh_CN/images/file-manage.png
  84. BIN
      docs/zh_CN/images/gant-pic.png
  85. BIN
      docs/zh_CN/images/global_parameter.png
  86. BIN
      docs/zh_CN/images/hive_edit.png
  87. BIN
      docs/zh_CN/images/hive_edit2.png
  88. BIN
      docs/zh_CN/images/hive_kerberos.png
  89. BIN
      docs/zh_CN/images/instance-detail.png
  90. BIN
      docs/zh_CN/images/instance-list.png
  91. BIN
      docs/zh_CN/images/local_parameter.png
  92. BIN
      docs/zh_CN/images/master-jk.png
  93. BIN
      docs/zh_CN/images/master2.png
  94. BIN
      docs/zh_CN/images/master_worker_lack_res.png
  95. BIN
      docs/zh_CN/images/mysql-jk.png
  96. BIN
      docs/zh_CN/images/mysql.png
  97. BIN
      docs/zh_CN/images/mysql_edit.png
  98. BIN
      docs/zh_CN/images/postgressql_edit.png
  99. BIN
      docs/zh_CN/images/project.png
  100. BIN
      docs/zh_CN/images/run-work.png
Some files were not shown because too many files have changed in this diff.

36
.github/ISSUE_TEMPLATE/bug_report.md

@ -0,0 +1,36 @@
---
name: Bug report
about: Create a report to help us improve
title: "[BUG] bug title "
labels: bug
assignees: ''
---
*For better global communication, please give priority to using English description, thx! *
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior, for example:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Which version of Easy Scheduler:**
-[1.1.0-preview]
**Additional context**
Add any other context about the problem here.
**Requirement or improvement
- Please describe about your requirements or improvement suggestions.

20
.github/ISSUE_TEMPLATE/feature_request.md

@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: "[Feature]"
labels: new feature
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

93
CONTRIBUTING.md

@ -1,5 +1,70 @@
EasyScheduler code submission process
=====
* First, fork the code from the remote repository *https://github.com/analysys/EasyScheduler.git* to your own repository
* There are currently three branches in the remote repository:
* master: the normal delivery branch
After a stable version is released, the code of the stable version branch is merged into the master branch.
* dev: the daily development branch
Newly submitted code can be sent to this branch as pull requests.
* branch-1.0.0: a release version branch
There will also be 2.0 ... and other version branches. A version
branch only receives bug fixes and does not add new features.
* Clone your own repository to your local machine
`git clone https://github.com/analysys/EasyScheduler.git`
* Add remote repository address, named upstream
`git remote add upstream https://github.com/analysys/EasyScheduler.git`
* View repository:
`git remote -v`
> There will be two repositories at this time: origin (your own repository) and upstream (the remote repository)
* Get/update the remote repository code (skip this step if your code is already up to date)
`git fetch upstream`
* Synchronize remote repository code to local repository
```
git checkout origin/dev
git merge --no-ff upstream/dev
```
If the remote repository has a new branch `dev-1.0`, you need to synchronize this branch to the local repository as well.
```
git checkout -b dev-1.0 upstream/dev-1.0
git push --set-upstream origin dev-1.0
```
* After modifying the code locally, submit it to your own repository:
`git commit -m 'test commit'`
`git push`
* Submit changes to the remote repository
* On the GitHub page, click "New pull request".
<p align="center">
<img src="http://geek.analysys.cn/static/upload/221/2019-04-02/90f3abbf-70ef-4334-b8d6-9014c9cf4c7f.png" width="60%" />
</p>
* Select the modified local branch and the target branch to merge into, then create the pull request.
<p align="center">
<img src="http://geek.analysys.cn/static/upload/221/2019-04-02/fe7eecfe-2720-4736-951b-b3387cf1ae41.png" width="60%" />
</p>
* Next, an administrator is responsible for **merging** to complete the pull request
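
The fetch-and-merge part of the workflow above can be tried without touching GitHub. The following is a minimal sketch, not part of the original guide: local bare repositories stand in for the two remotes, and the directory names (`upstream.git`, `fork.git`, `seed`, `local`), the file `file.txt`, and the user identities are all hypothetical. `--no-edit` is added to the merge only so the script does not open an editor.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for the two GitHub remotes.
set -e
work=$(mktemp -d)

# "upstream.git" plays analysys/EasyScheduler; "fork.git" plays your fork.
git init -q --bare "$work/upstream.git"
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Upstream Dev"
git config user.email "upstream@example.com"
git symbolic-ref HEAD refs/heads/dev   # start on a branch named dev
echo "v1" > file.txt
git add file.txt
git commit -q -m "initial commit"
git push -q "$work/upstream.git" dev
git clone -q --bare "$work/upstream.git" "$work/fork.git"

# Clone your fork and register the original repository as "upstream".
git clone -q -b dev "$work/fork.git" "$work/local"
cd "$work/local"
git config user.name "Contributor"
git config user.email "contributor@example.com"
git remote add upstream "$work/upstream.git"

# Meanwhile, upstream/dev moves ahead by one commit.
cd "$work/seed"
echo "v2" > file.txt
git commit -q -am "upstream change"
git push -q "$work/upstream.git" dev

# Fetch upstream, merge it into the local dev branch, then update the fork.
cd "$work/local"
git fetch -q upstream
git merge -q --no-ff --no-edit upstream/dev
git push -q origin dev
git remote -v
```

Because of `--no-ff`, the merge is recorded as a separate merge commit even when a fast-forward would have been possible, which keeps each synchronization visible in the history. The temporary directory in `$work` can be deleted afterwards.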
---
* 首先从远端仓库*https://github.com/analysys/EasyScheduler.git* fork一份代码到自己的仓库中 * 首先从远端仓库*https://github.com/analysys/EasyScheduler.git* fork一份代码到自己的仓库中
* 远端仓库中目前有三个分支: * 远端仓库中目前有三个分支:
@ -14,7 +79,7 @@ EasyScheduler提交代码流程
* 把自己仓库clone到本地 * 把自己仓库clone到本地
`git clone https://github.com/**/EasyScheduler.git` `git clone https://github.com/analysys/EasyScheduler.git`
* 添加远端仓库地址,命名为upstream * 添加远端仓库地址,命名为upstream
@ -26,17 +91,10 @@ EasyScheduler提交代码流程
> 此时会有两个仓库:origin(自己的仓库)和upstream(远端仓库) > 此时会有两个仓库:origin(自己的仓库)和upstream(远端仓库)
* 获取远端仓库代码(已经是最新代码,就跳过) * 获取/更新远端仓库代码(已经是最新代码,就跳过)
`git fetch upstream ` `git fetch upstream `
* 更新远端仓库代码
```
git checkout upstream/dev
git pull upstream dev
```
* 同步远端仓库代码到本地仓库 * 同步远端仓库代码到本地仓库
@ -54,7 +112,7 @@ git push --set-upstream origin dev1.0
* 在本地修改代码以后,提交到自己仓库: * 在本地修改代码以后,提交到自己仓库:
`git ca -m 'test commit'` `git commit -m 'test commit'`
`git push` `git push`
* 将修改提交到远端仓库 * 将修改提交到远端仓库
@ -68,6 +126,15 @@ git push --set-upstream origin dev1.0
<p align="center"> <p align="center">
<img src="http://geek.analysys.cn/static/upload/221/2019-04-02/fe7eecfe-2720-4736-951b-b3387cf1ae41.png" width="60%" /> <img src="http://geek.analysys.cn/static/upload/221/2019-04-02/fe7eecfe-2720-4736-951b-b3387cf1ae41.png" width="60%" />
</p> </p>
* 接下来由管理员负责将**Merge**完成此次pull request
* 接下来由管理员负责将**Merge**完成此次pull request

41
Dockerfile

@ -1,41 +0,0 @@
#Maintained by jimmy
#Email: zhengge2012@gmail.com
FROM anapsix/alpine-java:8_jdk
WORKDIR /tmp
RUN wget http://archive.apache.org/dist/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz
RUN tar -zxvf apache-maven-3.6.1-bin.tar.gz && rm apache-maven-3.6.1-bin.tar.gz
RUN mv apache-maven-3.6.1 /usr/lib/mvn
RUN chown -R root:root /usr/lib/mvn
RUN ln -s /usr/lib/mvn/bin/mvn /usr/bin/mvn
RUN wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
RUN tar -zxvf zookeeper-3.4.6.tar.gz
RUN mv zookeeper-3.4.6 /opt/zookeeper
RUN rm -rf zookeeper-3.4.6.tar.gz
RUN echo "export ZOOKEEPER_HOME=/opt/zookeeper" >>/etc/profile
RUN echo "export PATH=$PATH:$ZOOKEEPER_HOME/bin" >>/etc/profile
ADD conf/zoo.cfg /opt/zookeeper/conf/zoo.cfg
#RUN source /etc/profile
#RUN zkServer.sh start
RUN apk add --no-cache git npm nginx mariadb mariadb-client mariadb-server-utils pwgen
WORKDIR /opt
RUN git clone https://github.com/analysys/EasyScheduler.git
WORKDIR /opt/EasyScheduler
RUN mvn -U clean package assembly:assembly -Dmaven.test.skip=true
RUN mv /opt/EasyScheduler/target/escheduler-1.0.0-SNAPSHOT /opt/easyscheduler
WORKDIR /opt/EasyScheduler/escheduler-ui
RUN npm install
RUN npm audit fix
RUN npm run build
RUN mkdir -p /opt/escheduler/front/server
RUN cp -rfv dist/* /opt/escheduler/front/server
WORKDIR /
RUN rm -rf /opt/EasyScheduler
#configure mysql server https://github.com/yobasystems/alpine-mariadb/tree/master/alpine-mariadb-amd64
ADD conf/run.sh /scripts/run.sh
RUN mkdir /docker-entrypoint-initdb.d && \
mkdir /scripts/pre-exec.d && \
mkdir /scripts/pre-init.d && \
chmod -R 755 /scripts
RUN rm -rf /var/cache/apk/*
EXPOSE 8888
ENTRYPOINT ["/scripts/run.sh"]
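
A note on the two `echo "export ..." >>/etc/profile` lines in the removed Dockerfile: inside double quotes the shell expands variables when the `echo` runs, not when the profile is later sourced. The sketch below illustrates that behavior only; a temporary file stands in for `/etc/profile`, and the names `ZK_BIN` and `ZK_BIN_LATE` are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: a temp file stands in for /etc/profile.
profile=$(mktemp)
unset ZOOKEEPER_HOME

# Double quotes: $ZOOKEEPER_HOME is expanded now (here to an empty string).
echo "export ZK_BIN=$ZOOKEEPER_HOME/bin" >>"$profile"
# Single quotes: the text is written literally and expanded only when sourced.
echo 'export ZK_BIN_LATE=$ZOOKEEPER_HOME/bin' >>"$profile"

early=$(sed -n '1p' "$profile")
late=$(sed -n '2p' "$profile")
echo "$early"
echo "$late"
rm -f "$profile"
```

The first line written to the file no longer mentions `ZOOKEEPER_HOME` at all, while the second keeps the reference intact for whichever shell sources the file later.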

98
README.md

@ -1,72 +1,90 @@
Easy Scheduler Easy Scheduler
============ ============
[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![Total Lines](https://tokei.rs/b1/github/analysys/EasyScheduler?category=lines)](https://github.com/analysys/EasyScheduler)
> Easy Scheduler for Big Data > Easy Scheduler for Big Data
**设计特点:** 一个分布式易扩展的可视化DAG工作流任务调度系统。致力于解决数据处理流程中错综复杂的依赖关系,使调度系统在数据处理流程中`开箱即用`。 [![Stargazers over time](https://starchart.cc/analysys/EasyScheduler.svg)](https://starchart.cc/analysys/EasyScheduler)
其主要目标如下:
- 以DAG图的方式将Task按照任务的依赖关系关联起来,可实时可视化监控任务的运行状态
- 支持丰富的任务类型:Shell、MR、Spark、SQL(mysql、postgresql、hive、sparksql),Python,Sub_Process、Procedure等
- 支持工作流定时调度、依赖调度、手动调度、手动暂停/停止/恢复,同时支持失败重试/告警、从指定节点恢复失败、Kill任务等操作
- 支持工作流优先级、任务优先级及任务的故障转移及任务超时告警/失败
- 支持工作流全局参数及节点自定义参数设置
- 支持资源文件的在线上传/下载,管理等,支持在线文件创建、编辑
- 支持任务日志在线查看及滚动、在线下载日志等
- 实现集群HA,通过Zookeeper实现Master集群和Worker集群去中心化
- 支持对`Master/Worker` cpu load,memory,cpu在线查看
- 支持工作流运行历史树形/甘特图展示、支持任务状态统计、流程状态统计
- 支持补数
- 支持多租户
- 支持国际化
- 还有更多等待伙伴们探索
### 与同类调度系统的对比 [![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)
![调度系统对比](http://geek.analysys.cn/static/upload/47/2019-03-01/9609ca82-cf8b-4d91-8dc0-0e2805194747.jpeg)
### 系统部分截图 ### Design features:
![](http://geek.analysys.cn/static/upload/221/2019-03-29/0a9dea80-fb02-4fa5-a812-633b67035ffc.jpeg) A distributed and easy-to-expand visual DAG workflow scheduling system. Dedicated to solving the complex dependencies in data processing, making the scheduling system `out of the box` for data processing.
Its main objectives are as follows:
![](http://geek.analysys.cn/static/upload/221/2019-04-01/83686def-a54f-4169-8cae-77b1f8300cc1.png) - Associate the Tasks according to the dependencies of the tasks in a DAG graph, which can visualize the running state of task in real time.
- Support for many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc.
- Support process scheduling, dependency scheduling, manual scheduling, manual pause/stop/recovery, support for failed retry/alarm, recovery from specified nodes, Kill task, etc.
- Support process priority, task priority and task failover and task timeout alarm/failure
- Support process global parameters and node custom parameter settings
- Support online upload/download of resource files, management, etc. Support online file creation and editing
- Support task log online viewing and scrolling, online download log, etc.
- Implement cluster HA, decentralize Master cluster and Worker cluster through Zookeeper
- Support online viewing of `Master/Worker` cpu load, memory
- Support process running history tree/gantt chart display, support task status statistics, process status statistics
- Support backfilling data
- Support multi-tenant
- Support internationalization
- There are more waiting partners to explore
![](http://geek.analysys.cn/static/upload/221/2019-03-29/83c937c7-1793-4d7a-aa28-b98460329fe0.jpeg)
### 文档 ### What's in Easy Scheduler
- <a href="https://analysys.github.io/easyscheduler_docs_cn/后端部署文档.html" target="_blank">后端部署文档</a> Stability | Easy to use | Features | Scalability |
-- | -- | -- | --
Decentralized multi-master and multi-worker | Visualization process defines key information such as task status, task type, retry times, task running machine, visual variables and so on at a glance.  |  Support pause, recover operation | support custom task types
HA is supported by itself | All process definition operations are visualized, dragging tasks to draw DAGs, configuring data sources and resources. At the same time, for third-party systems, the api mode operation is provided. | Users on easyscheduler can achieve many-to-one or one-to-one mapping relationship through tenants and Hadoop users, which is very important for scheduling large data jobs. " | The scheduler uses distributed scheduling, and the overall scheduling capability will increase linearly with the scale of the cluster. Master and Worker support dynamic online and offline.
Overload processing: Task queue mechanism, the number of schedulable tasks on a single machine can be flexibly configured, when too many tasks will be cached in the task queue, will not cause machine jam. | One-click deployment | Supports traditional shell tasks, and also support big data platform task scheduling: MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Procedure, Sub_Process | |
- <a href="https://analysys.github.io/easyscheduler_docs_cn/前端部署文档.html" target="_blank">前端部署文档</a>
- [**使用手册**](https://analysys.github.io/easyscheduler_docs_cn/系统使用手册.html?_blank "系统使用手册")
- [**升级文档**](https://analysys.github.io/easyscheduler_docs_cn/升级文档.html?_blank "升级文档")
- <a href="http://52.82.13.76:8888" target="_blank">我要体验</a> 普通用户登录:demo/demo123 ### System partial screenshot
更多文档请参考 <a href="https://analysys.github.io/easyscheduler_docs_cn/" target="_blank">easyscheduler中文在线文档</a> ![image](https://user-images.githubusercontent.com/48329107/61368744-1f5f3b00-a8c1-11e9-9cf1-10f8557a6b3b.png)
![image](https://user-images.githubusercontent.com/48329107/61368966-9dbbdd00-a8c1-11e9-8dcc-a9469d33583e.png)
### 近期研发计划 ![image](https://user-images.githubusercontent.com/48329107/61372146-f347b800-a8c8-11e9-8882-66e8934ada23.png)
EasyScheduler的工作计划:<a href="https://github.com/analysys/EasyScheduler/projects/1" target="_blank">研发计划</a> ,其中 In Develop卡片下是1.0.2版本的功能,TODO卡片是待做事项(包括 feature ideas)
### 贡献代码 ### Document
非常欢迎大家来参与贡献代码,提交代码流程请参考: - <a href="https://analysys.github.io/easyscheduler_docs_cn/后端部署文档.html" target="_blank">Backend deployment documentation</a>
https://github.com/analysys/EasyScheduler/blob/master/CONTRIBUTING.md
- <a href="https://analysys.github.io/easyscheduler_docs_cn/前端部署文档.html" target="_blank">Front-end deployment documentation</a>
### 感谢 - [**User manual**](https://analysys.github.io/easyscheduler_docs_cn/系统使用手册.html?_blank "User manual")
Easy Scheduler使用了很多优秀的开源项目,比如google的guava、guice、grpc,netty,ali的bonecp,quartz,以及apache的众多开源项目等等, - [**Upgrade document**](https://analysys.github.io/easyscheduler_docs_cn/升级文档.html?_blank "Upgrade document")
正是由于站在这些开源项目的肩膀上,才有Easy Scheduler的诞生的可能。对此我们对使用的所有开源软件表示非常的感谢!我们也希望自己不仅是开源的受益者,也能成为开源的
贡献者,于是我们决定把易调度贡献出来,并承诺长期维护。也希望对开源有同样热情和信念的伙伴加入进来,一起为开源献出一份力!
- <a href="http://106.75.43.194:8888" target="_blank">Online Demo</a>
More documentation please refer to <a href="https://analysys.github.io/easyscheduler_docs_cn/" target="_blank">[EasyScheduler online documentation]</a>
### Recent R&D plan
Work plan of Easy Scheduler: [R&D plan](https://github.com/analysys/EasyScheduler/projects/1), where `In Develop` card is the features of 1.1.0 version , TODO card is to be done (including feature ideas)
### How to contribute code
Welcome to participate in contributing code, please refer to the process of submitting the code:
[[How to contribute code](https://github.com/analysys/EasyScheduler/issues/310)]
### Thanks
Easy Scheduler uses many excellent open source projects, such as google guava, guice, grpc, netty, ali bonecp, quartz, and many open source projects of apache.
It is because we stand on the shoulders of these open source projects that the birth of Easy Scheduler was possible. We are very grateful for all the open source software used! We also hope to be not only beneficiaries of open source but also contributors to it, so we decided to contribute Easy Scheduler and promised long-term updates. We also hope that partners who have the same passion and conviction for open source will join in and contribute to open source!
### Get Help
The fastest way to get response from our developers is to submit issues, or add our wechat : 510570367
### License
Please refer to [LICENSE](https://github.com/analysys/EasyScheduler/blob/dev/LICENSE) file.
### 帮助
The fastest way to get response from our developers is to submit issues, or add our wechat : 510570367

86
README_zh_CN.md

@ -0,0 +1,86 @@
Easy Scheduler
============
[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![Total Lines](https://tokei.rs/b1/github/analysys/EasyScheduler?category=lines)](https://github.com/analysys/EasyScheduler)
> Easy Scheduler for Big Data
[![Stargazers over time](https://starchart.cc/analysys/EasyScheduler.svg)](https://starchart.cc/analysys/EasyScheduler)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)
[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
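**Design features:** A distributed and easily extensible visual DAG workflow task scheduling system. It is dedicated to solving the complex dependencies in data processing flows, so that the scheduling system works `out of the box` in data processing flows.
Its main objectives are as follows:
- Associate Tasks according to their dependencies in the form of a DAG graph, so that the running state of tasks can be monitored visually in real time
- Support many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc.
- Support timed workflow scheduling, dependency scheduling, manual scheduling, and manual pause/stop/recovery, as well as failure retry/alarm, recovery of failures from a specified node, Kill task, and other operations
- Support workflow priority, task priority, task failover, and task timeout alarm/failure
- Support workflow global parameters and custom node parameter settings
- Support online upload/download and management of resource files, and online file creation and editing
- Support online viewing and scrolling of task logs, online log download, etc.
- Implement cluster HA, decentralizing the Master cluster and the Worker cluster through Zookeeper
- Support online viewing of `Master/Worker` cpu load, memory, and cpu
- Support tree/Gantt chart display of workflow run history, task status statistics, and process status statistics
- Support backfilling data
- Support multi-tenancy
- Support internationalization
- And more for partners to explore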
### Comparison with similar scheduling systems
![Scheduling system comparison](http://geek.analysys.cn/static/upload/47/2019-03-01/9609ca82-cf8b-4d91-8dc0-0e2805194747.jpeg)
### System partial screenshot
![](http://geek.analysys.cn/static/upload/221/2019-03-29/0a9dea80-fb02-4fa5-a812-633b67035ffc.jpeg)
![](http://geek.analysys.cn/static/upload/221/2019-04-01/83686def-a54f-4169-8cae-77b1f8300cc1.png)
![](http://geek.analysys.cn/static/upload/221/2019-03-29/83c937c7-1793-4d7a-aa28-b98460329fe0.jpeg)
### Documentation
- <a href="https://analysys.github.io/easyscheduler_docs_cn/后端部署文档.html" target="_blank">Backend deployment documentation</a>
- <a href="https://analysys.github.io/easyscheduler_docs_cn/前端部署文档.html" target="_blank">Front-end deployment documentation</a>
- [**User manual**](https://analysys.github.io/easyscheduler_docs_cn/系统使用手册.html?_blank "User manual")
- [**Upgrade document**](https://analysys.github.io/easyscheduler_docs_cn/升级文档.html?_blank "Upgrade document")
- <a href="http://106.75.43.194:8888" target="_blank">Online Demo</a>
For more documentation, please refer to <a href="https://analysys.github.io/easyscheduler_docs_cn/" target="_blank">the easyscheduler Chinese online documentation</a>
### Recent R&D plan
Work plan of EasyScheduler: <a href="https://github.com/analysys/EasyScheduler/projects/1" target="_blank">R&D plan</a>, where the In Develop card holds the features of version 1.1.0 and the TODO card holds items to be done (including feature ideas)
### How to contribute code
You are very welcome to contribute code. For the code submission process, please refer to:
[[How to contribute code](https://github.com/analysys/EasyScheduler/issues/310)]
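### Thanks
Easy Scheduler uses many excellent open source projects, such as google's guava, guice, grpc, netty, ali's bonecp, quartz, and many open source projects of apache.
It is precisely because we stand on the shoulders of these open source projects that the birth of Easy Scheduler became possible. We are very grateful to all the open source software we use! We also hope to be not only beneficiaries of open source but also
contributors to it, so we decided to contribute Easy Scheduler and promise to maintain it for the long term. We also hope that partners with the same passion and conviction for open source will join us and contribute to open source together!
### Help
The fastest way to get a response from our developers is to submit an issue, or add our WeChat: 510570367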

310
conf/install.sh

@ -1,310 +0,0 @@
#!/bin/bash
# bash is required: the script uses [[ ]] and source
workDir="/opt/easyscheduler"
workDir=$(cd "${workDir}" && pwd)
#To be compatible with MacOS and Linux
txt=""
if [[ "$OSTYPE" == "darwin"* ]]; then
# Mac OSX
txt="''"
elif [[ "$OSTYPE" == "linux-gnu" ]]; then
# linux
txt=""
elif [[ "$OSTYPE" == "cygwin" ]]; then
# POSIX compatibility layer and Linux environment emulation for Windows
echo "Easy Scheduler does not support the Windows operating system"
exit 1
elif [[ "$OSTYPE" == "msys" ]]; then
# Lightweight shell and GNU utilities compiled for Windows (part of MinGW)
echo "Easy Scheduler does not support the Windows operating system"
exit 1
elif [[ "$OSTYPE" == "win32" ]]; then
echo "Easy Scheduler does not support the Windows operating system"
exit 1
elif [[ "$OSTYPE" == "freebsd"* ]]; then
# ...
txt=""
else
# Unknown.
echo "Unknown operating system, please tell us (submit an issue) for better service"
exit 1
fi
source ${workDir}/conf/config/run_config.conf
source ${workDir}/conf/config/install_config.conf
# mysql configuration
# mysql address and port
mysqlHost="127.0.0.1:3306"
# mysql database name
mysqlDb="easyscheduler"
# mysql user name
mysqlUserName="easyscheduler"
# mysql password
mysqlPassword="easyschedulereasyscheduler"
# conf/config/install_config.conf configuration
# installation path; must not be the same as the current path (pwd)
installPath="/opt/easyscheduler"
# deployment user
deployUser="escheduler"
# zk cluster
zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"
# hosts to install on
ips="ark0,ark1,ark2,ark3,ark4"
# conf/config/run_config.conf configuration
# machines that run Master
masters="ark0,ark1"
# machines that run Worker
workers="ark2,ark3,ark4"
# machine that runs Alert
alertServer="ark3"
# machines that run Api
apiServers="ark1"
# alert configuration
# mail protocol
mailProtocol="SMTP"
# mail server host
mailServerHost="smtp.exmail.qq.com"
# mail server port
mailServerPort="25"
# sender
mailSender="xxxxxxxxxx"
# sender password
mailPassword="xxxxxxxxxx"
# path for downloaded Excel files
xlsFilePath="/tmp/xls"
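# hadoop configuration
# whether to enable hdfs; if enabled, set this to true and configure the hadoop parameters below;
# otherwise set it to false, in which case the settings below do not need to be changed
hdfsStartupSate="false"
# namenode address; HA is supported, and core-site.xml and hdfs-site.xml must be placed in the conf directory
namenodeFs="hdfs://mycluster:8020"
# resourcemanager HA configuration; leave this empty for a single resourcemanager
yarnHaIps="192.168.xx.xx,192.168.xx.xx"
# for a single resourcemanager, configure just one host name; for resourcemanager HA, the default is fine
singleYarnIp="ark1"
# hdfs root path; the owner of the root path must be the deployment user
hdfsPath="/escheduler"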
# common configuration
# program path
programPath="/tmp/escheduler"
# download path
downloadPath="/tmp/escheduler/download"
# task execution path
execPath="/tmp/escheduler/exec"
# path of the SHELL environment variable file
shellEnvPath="$installPath/conf/env/.escheduler_env.sh"
# path of the Python environment variable file
pythonEnvPath="$installPath/conf/env/escheduler_env.py"
# suffixes of resource files
resSuffixs="txt,log,sh,conf,cfg,py,java,sql,hql,xml"
# development state: if true, the wrapped SHELL scripts can be inspected in the execPath directory; if false, they are deleted as soon as execution finishes
devState="true"
# zk configuration
# zk root directory
zkRoot="/escheduler"
# zk directory that records dead machines
zkDeadServers="/escheduler/dead-servers"
# masters directory
zkMasters="/escheduler/masters"
# workers directory
zkWorkers="/escheduler/workers"
# zk master distributed lock
mastersLock="/escheduler/lock/masters"
# zk worker distributed lock
workersLock="/escheduler/lock/workers"
# zk master failover distributed lock
mastersFailover="/escheduler/lock/failover/masters"
# zk worker failover distributed lock
workersFailover="/escheduler/lock/failover/workers"
# zk session timeout
zkSessionTimeout="300"
# zk connection timeout
zkConnectionTimeout="300"
# zk retry interval
zkRetrySleep="100"
# maximum number of zk retries
zkRetryMaxtime="5"
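# master configuration
# maximum number of master execution threads, i.e. the maximum parallelism of process instances
masterExecThreads="100"
# maximum number of master task execution threads, i.e. the maximum parallelism within each process instance
masterExecTaskNum="20"
# master heartbeat interval
masterHeartbeatInterval="10"
# number of retries for master task submission
masterTaskCommitRetryTimes="5"
# retry interval for master task submission
masterTaskCommitInterval="100"
# maximum average cpu load of the master, used to decide whether the master still has capacity to execute
masterMaxCupLoadAvg="10"
# memory reserved on the master, used to decide whether the master still has capacity to execute
masterReservedMemory="1"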
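# worker configuration
# worker execution threads
workerExecThreads="100"
# worker heartbeat interval
workerHeartbeatInterval="10"
# number of tasks a worker fetches at a time
workerFetchTaskNum="10"
# maximum average cpu load of the worker, used to decide whether the worker still has capacity to execute
workerMaxCupLoadAvg="10"
# memory reserved on the worker, used to decide whether the worker still has capacity to execute
workerReservedMemory="1"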
# api configuration
# api server port
apiServerPort="12345"
# api session timeout
apiServerSessionTimeout="7200"
# api context path
apiServerContextPath="/escheduler/"
# spring maximum file size
springMaxFileSize="1024MB"
# spring maximum request size
springMaxRequestSize="1024MB"
# api maximum post request size
apiMaxHttpPostSize="5000000"
# 1. replace values in the configuration files
echo "1. replace values in the configuration files"
sed -i ${txt} "s#spring.datasource.url.*#spring.datasource.url=jdbc:mysql://${mysqlHost}/${mysqlDb}?characterEncoding=UTF-8#g" conf/dao/data_source.properties
sed -i ${txt} "s#spring.datasource.username.*#spring.datasource.username=${mysqlUserName}#g" conf/dao/data_source.properties
sed -i ${txt} "s#spring.datasource.password.*#spring.datasource.password=${mysqlPassword}#g" conf/dao/data_source.properties
sed -i ${txt} "s#org.quartz.dataSource.myDs.URL.*#org.quartz.dataSource.myDs.URL=jdbc:mysql://${mysqlHost}/${mysqlDb}?characterEncoding=UTF-8#g" conf/quartz.properties
sed -i ${txt} "s#org.quartz.dataSource.myDs.user.*#org.quartz.dataSource.myDs.user=${mysqlUserName}#g" conf/quartz.properties
sed -i ${txt} "s#org.quartz.dataSource.myDs.password.*#org.quartz.dataSource.myDs.password=${mysqlPassword}#g" conf/quartz.properties
sed -i ${txt} "s#fs.defaultFS.*#fs.defaultFS=${namenodeFs}#g" conf/common/hadoop/hadoop.properties
sed -i ${txt} "s#yarn.resourcemanager.ha.rm.ids.*#yarn.resourcemanager.ha.rm.ids=${yarnHaIps}#g" conf/common/hadoop/hadoop.properties
sed -i ${txt} "s#yarn.application.status.address.*#yarn.application.status.address=http://${singleYarnIp}:8088/ws/v1/cluster/apps/%s#g" conf/common/hadoop/hadoop.properties
sed -i ${txt} "s#data.basedir.path.*#data.basedir.path=${programPath}#g" conf/common/common.properties
sed -i ${txt} "s#data.download.basedir.path.*#data.download.basedir.path=${downloadPath}#g" conf/common/common.properties
sed -i ${txt} "s#process.exec.basepath.*#process.exec.basepath=${execPath}#g" conf/common/common.properties
sed -i ${txt} "s#data.store2hdfs.basepath.*#data.store2hdfs.basepath=${hdfsPath}#g" conf/common/common.properties
sed -i ${txt} "s#hdfs.startup.state.*#hdfs.startup.state=${hdfsStartupSate}#g" conf/common/common.properties
sed -i ${txt} "s#escheduler.env.path.*#escheduler.env.path=${shellEnvPath}#g" conf/common/common.properties
sed -i ${txt} "s#escheduler.env.py.*#escheduler.env.py=${pythonEnvPath}#g" conf/common/common.properties
sed -i ${txt} "s#resource.view.suffixs.*#resource.view.suffixs=${resSuffixs}#g" conf/common/common.properties
sed -i ${txt} "s#development.state.*#development.state=${devState}#g" conf/common/common.properties
sed -i ${txt} "s#zookeeper.quorum.*#zookeeper.quorum=${zkQuorum}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.root.*#zookeeper.escheduler.root=${zkRoot}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.dead.servers.*#zookeeper.escheduler.dead.servers=${zkDeadServers}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.masters.*#zookeeper.escheduler.masters=${zkMasters}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.workers.*#zookeeper.escheduler.workers=${zkWorkers}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.lock.masters.*#zookeeper.escheduler.lock.masters=${mastersLock}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.lock.workers.*#zookeeper.escheduler.lock.workers=${workersLock}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.lock.failover.masters.*#zookeeper.escheduler.lock.failover.masters=${mastersFailover}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.escheduler.lock.failover.workers.*#zookeeper.escheduler.lock.failover.workers=${workersFailover}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.session.timeout.*#zookeeper.session.timeout=${zkSessionTimeout}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.connection.timeout.*#zookeeper.connection.timeout=${zkConnectionTimeout}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.retry.sleep.*#zookeeper.retry.sleep=${zkRetrySleep}#g" conf/zookeeper.properties
sed -i ${txt} "s#zookeeper.retry.maxtime.*#zookeeper.retry.maxtime=${zkRetryMaxtime}#g" conf/zookeeper.properties
sed -i ${txt} "s#master.exec.threads.*#master.exec.threads=${masterExecThreads}#g" conf/master.properties
sed -i ${txt} "s#master.exec.task.number.*#master.exec.task.number=${masterExecTaskNum}#g" conf/master.properties
sed -i ${txt} "s#master.heartbeat.interval.*#master.heartbeat.interval=${masterHeartbeatInterval}#g" conf/master.properties
sed -i ${txt} "s#master.task.commit.retryTimes.*#master.task.commit.retryTimes=${masterTaskCommitRetryTimes}#g" conf/master.properties
sed -i ${txt} "s#master.task.commit.interval.*#master.task.commit.interval=${masterTaskCommitInterval}#g" conf/master.properties
sed -i ${txt} "s#master.max.cpuload.avg.*#master.max.cpuload.avg=${masterMaxCupLoadAvg}#g" conf/master.properties
sed -i ${txt} "s#master.reserved.memory.*#master.reserved.memory=${masterReservedMemory}#g" conf/master.properties
sed -i ${txt} "s#worker.exec.threads.*#worker.exec.threads=${workerExecThreads}#g" conf/worker.properties
sed -i ${txt} "s#worker.heartbeat.interval.*#worker.heartbeat.interval=${workerHeartbeatInterval}#g" conf/worker.properties
sed -i ${txt} "s#worker.fetch.task.num.*#worker.fetch.task.num=${workerFetchTaskNum}#g" conf/worker.properties
sed -i ${txt} "s#worker.max.cpuload.avg.*#worker.max.cpuload.avg=${workerMaxCupLoadAvg}#g" conf/worker.properties
sed -i ${txt} "s#worker.reserved.memory.*#worker.reserved.memory=${workerReservedMemory}#g" conf/worker.properties
sed -i ${txt} "s#server.port.*#server.port=${apiServerPort}#g" conf/application.properties
sed -i ${txt} "s#server.session.timeout.*#server.session.timeout=${apiServerSessionTimeout}#g" conf/application.properties
sed -i ${txt} "s#server.context-path.*#server.context-path=${apiServerContextPath}#g" conf/application.properties
sed -i ${txt} "s#spring.http.multipart.max-file-size.*#spring.http.multipart.max-file-size=${springMaxFileSize}#g" conf/application.properties
sed -i ${txt} "s#spring.http.multipart.max-request-size.*#spring.http.multipart.max-request-size=${springMaxRequestSize}#g" conf/application.properties
sed -i ${txt} "s#server.max-http-post-size.*#server.max-http-post-size=${apiMaxHttpPostSize}#g" conf/application.properties
sed -i ${txt} "s#mail.protocol.*#mail.protocol=${mailProtocol}#g" conf/alert.properties
sed -i ${txt} "s#mail.server.host.*#mail.server.host=${mailServerHost}#g" conf/alert.properties
sed -i ${txt} "s#mail.server.port.*#mail.server.port=${mailServerPort}#g" conf/alert.properties
sed -i ${txt} "s#mail.sender.*#mail.sender=${mailSender}#g" conf/alert.properties
sed -i ${txt} "s#mail.passwd.*#mail.passwd=${mailPassword}#g" conf/alert.properties
sed -i ${txt} "s#xls.file.path.*#xls.file.path=${xlsFilePath}#g" conf/alert.properties
sed -i ${txt} "s#installPath.*#installPath=${installPath}#g" conf/config/install_config.conf
sed -i ${txt} "s#deployUser.*#deployUser=${deployUser}#g" conf/config/install_config.conf
sed -i ${txt} "s#ips.*#ips=${ips}#g" conf/config/install_config.conf
sed -i ${txt} "s#masters.*#masters=${masters}#g" conf/config/run_config.conf
sed -i ${txt} "s#workers.*#workers=${workers}#g" conf/config/run_config.conf
sed -i ${txt} "s#alertServer.*#alertServer=${alertServer}#g" conf/config/run_config.conf
sed -i ${txt} "s#apiServers.*#apiServers=${apiServers}#g" conf/config/run_config.conf
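
Note on the `sed` substitutions above: the patterns are unanchored and treat `.` as a wildcard, so a pattern such as `server.port.*` can also match a comment line like `# server port`. The following is a minimal sketch of a stricter variant, assuming keys contain only letters, digits, dots, dashes and underscores. `set_prop` is a hypothetical helper and is not part of install.sh.

```shell
#!/bin/sh
# Hypothetical helper (not in install.sh): set key=value in a properties file.
# Only lines that start with "key=" are rewritten; comments are left alone.
set_prop() {
    key=$1
    value=$2
    file=$3
    # Escape dots so "server.port" does not match "serverXport".
    escaped_key=$(printf '%s' "$key" | sed 's/\./\\./g')
    # Escape characters that are special in the replacement: &, \ and the # delimiter.
    escaped_value=$(printf '%s' "$value" | sed 's/[&#\]/\\&/g')
    tmp=$(mktemp) || return 1
    # Write to a temp file and copy back, which keeps the original file's permissions.
    if sed "s#^${escaped_key}=.*#${key}=${escaped_value}#" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

With this sketch, a call such as `set_prop server.port "${apiServerPort}" conf/application.properties` would stand in for the corresponding `sed -i` line. Anchoring at the start of the line is a design choice of the sketch, not something the original script does.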

105
conf/run.sh

@ -1,105 +0,0 @@
#!/bin/sh
# execute any pre-init scripts
for i in /scripts/pre-init.d/*sh
do
if [ -e "${i}" ]; then
echo "[i] pre-init.d - processing $i"
. "${i}"
fi
done
if [ -d "/run/mysqld" ]; then
echo "[i] mysqld already present, skipping creation"
chown -R mysql:mysql /run/mysqld
else
echo "[i] mysqld not found, creating...."
mkdir -p /run/mysqld
chown -R mysql:mysql /run/mysqld
fi
if [ -d /var/lib/mysql/mysql ]; then
echo "[i] MySQL directory already present, skipping creation"
chown -R mysql:mysql /var/lib/mysql
else
echo "[i] MySQL data directory not found, creating initial DBs"
chown -R mysql:mysql /var/lib/mysql
mysql_install_db --user=mysql --ldata=/var/lib/mysql > /dev/null
if [ "$MYSQL_ROOT_PASSWORD" = "" ]; then
MYSQL_ROOT_PASSWORD=`pwgen 16 1`
echo "[i] MySQL root Password: $MYSQL_ROOT_PASSWORD"
fi
MYSQL_DATABASE="easyscheduler"
MYSQL_USER="easyscheduler"
MYSQL_PASSWORD="easyschedulereasyscheduler"
tfile=`mktemp`
if [ ! -f "$tfile" ]; then
exit 1
fi
cat << EOF > $tfile
USE mysql;
FLUSH PRIVILEGES ;
GRANT ALL ON *.* TO 'root'@'%' identified by '$MYSQL_ROOT_PASSWORD' WITH GRANT OPTION ;
GRANT ALL ON *.* TO 'root'@'localhost' identified by '$MYSQL_ROOT_PASSWORD' WITH GRANT OPTION ;
SET PASSWORD FOR 'root'@'localhost'=PASSWORD('${MYSQL_ROOT_PASSWORD}') ;
DROP DATABASE IF EXISTS test ;
FLUSH PRIVILEGES ;
EOF
if [ "$MYSQL_DATABASE" != "" ]; then
echo "[i] Creating database: $MYSQL_DATABASE"
echo "CREATE DATABASE IF NOT EXISTS \`$MYSQL_DATABASE\` CHARACTER SET utf8 COLLATE utf8_general_ci;" >> $tfile
if [ "$MYSQL_USER" != "" ]; then
echo "[i] Creating user: $MYSQL_USER with password $MYSQL_PASSWORD"
echo "GRANT ALL ON \`$MYSQL_DATABASE\`.* to '$MYSQL_USER'@'%' IDENTIFIED BY '$MYSQL_PASSWORD';" >> $tfile
fi
fi
/usr/bin/mysqld --user=mysql --bootstrap --verbose=0 --skip-name-resolve --skip-networking=0 < $tfile
rm -f $tfile
for f in /docker-entrypoint-initdb.d/*; do
case "$f" in
*.sql) echo "$0: running $f"; /usr/bin/mysqld --user=mysql --bootstrap --verbose=0 --skip-name-resolve --skip-networking=0 < "$f"; echo ;;
*.sql.gz) echo "$0: running $f"; gunzip -c "$f" | /usr/bin/mysqld --user=mysql --bootstrap --verbose=0 --skip-name-resolve --skip-networking=0; echo ;;
*) echo "$0: ignoring or entrypoint initdb empty $f" ;;
esac
echo
done
echo
echo 'MySQL init process done. Ready for start up.'
echo
echo "exec /usr/bin/mysqld --user=mysql --console --skip-name-resolve --skip-networking=0" "$@"
fi
# execute any pre-exec scripts
for i in /scripts/pre-exec.d/*sh
do
if [ -e "${i}" ]; then
echo "[i] pre-exec.d - processing $i"
. "${i}"
fi
done
mysql -ueasyscheduler -peasyschedulereasyscheduler --one-database easyscheduler -h127.0.0.1 < /opt/easyscheduler/sql/escheduler.sql
mysql -ueasyscheduler -peasyschedulereasyscheduler --one-database easyscheduler -h127.0.0.1 < /opt/easyscheduler/sql/quartz.sql
. /etc/profile
zkServer.sh start
cd /opt/easyscheduler
rm -rf /etc/nginx/conf.d/default.conf
sh ./bin/escheduler-daemon.sh start master-server
sh ./bin/escheduler-daemon.sh start worker-server
sh ./bin/escheduler-daemon.sh start api-server
sh ./bin/escheduler-daemon.sh start logger-server
sh ./bin/escheduler-daemon.sh start alert-server
nginx -c /etc/nginx/nginx.conf
exec /usr/bin/mysqld --user=mysql --console --skip-name-resolve --skip-networking=0 "$@"
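
Note on the init-file loop in run.sh: a `.sql.gz` file has to reach the consumer through the pipe alone, because an extra `< "$f"` redirect on the consumer replaces the pipe as its stdin. A minimal sketch of the dispatch follows, with a stand-in consumer instead of `mysqld --bootstrap`. `consume_sql` and `run_init_files` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Stand-in for the real consumer (mysqld --bootstrap in run.sh).
# It only counts the lines it receives on stdin.
consume_sql() {
    wc -l | tr -d ' '
}

# Feed every init file in a directory to consume_sql.
# Plain .sql files are redirected in; .sql.gz files are decompressed and piped.
run_init_files() {
    dir=$1
    for f in "$dir"/*; do
        case "$f" in
            *.sql)    echo "running $f: $(consume_sql < "$f") lines" ;;
            *.sql.gz) echo "running $f: $(gunzip -c "$f" | consume_sql) lines" ;;
            *)        echo "ignoring $f" ;;
        esac
    done
}
```

Calling `run_init_files /docker-entrypoint-initdb.d` would walk the same directory the original loop uses. An empty directory falls through to the `ignoring` branch, as in the original.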

146
dockerfile/Dockerfile

@ -0,0 +1,146 @@
FROM ubuntu:18.04
MAINTAINER journey "825193156@qq.com"
ENV LANG=C.UTF-8
ARG version
ARG tar_version
#1. Install the JDK
RUN apt-get update \
&& apt-get -y install openjdk-8-jdk \
&& rm -rf /var/lib/apt/lists/*
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
ENV PATH $JAVA_HOME/bin:$PATH
# Install wget
RUN apt-get update && \
apt-get -y install wget
#2. Install ZooKeeper
#RUN cd /opt && \
# wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz && \
# tar -zxvf zookeeper-3.4.6.tar.gz && \
# mv zookeeper-3.4.6 zookeeper && \
# rm -rf ./zookeeper-*tar.gz && \
# mkdir -p /tmp/zookeeper && \
# rm -rf /opt/zookeeper/conf/zoo_sample.cfg
RUN cd /opt && \
wget https://www-us.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz && \
tar -zxvf zookeeper-3.4.14.tar.gz && \
mv zookeeper-3.4.14 zookeeper && \
rm -rf ./zookeeper-*tar.gz && \
mkdir -p /tmp/zookeeper && \
rm -rf /opt/zookeeper/conf/zoo_sample.cfg
ADD ./conf/zookeeper/zoo.cfg /opt/zookeeper/conf
ENV ZK_HOME=/opt/zookeeper
ENV PATH $PATH:$ZK_HOME/bin
#3. Install Maven
RUN cd /opt && \
wget http://apache-mirror.rbc.ru/pub/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz && \
tar -zxvf apache-maven-3.3.9-bin.tar.gz && \
mv apache-maven-3.3.9 maven && \
rm -rf ./apache-maven-*tar.gz && \
rm -rf /opt/maven/conf/settings.xml
ADD ./conf/maven/settings.xml /opt/maven/conf
ENV MAVEN_HOME=/opt/maven
ENV PATH $PATH:$MAVEN_HOME/bin
#4. Install Node.js
RUN cd /opt && \
wget https://nodejs.org/download/release/v8.9.4/node-v8.9.4-linux-x64.tar.gz && \
tar -zxvf node-v8.9.4-linux-x64.tar.gz && \
mv node-v8.9.4-linux-x64 node && \
rm -rf ./node-v8.9.4-*tar.gz
ENV NODE_HOME=/opt/node
ENV PATH $PATH:$NODE_HOME/bin
#5. Download escheduler
RUN cd /opt && \
wget https://github.com/analysys/EasyScheduler/archive/${version}.tar.gz && \
tar -zxvf ${version}.tar.gz && \
mv EasyScheduler-${version} easyscheduler_source && \
rm -rf ./${version}.tar.gz
#6. Build the backend
RUN cd /opt/easyscheduler_source && \
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
#7. Build the frontend
RUN chmod -R 777 /opt/easyscheduler_source/escheduler-ui && \
cd /opt/easyscheduler_source/escheduler-ui && \
rm -rf /opt/easyscheduler_source/escheduler-ui/node_modules && \
npm install node-sass --unsafe-perm && \
npm install && \
npm run build
#8. Install MySQL
RUN echo "deb http://cn.archive.ubuntu.com/ubuntu/ xenial main restricted universe multiverse" >> /etc/apt/sources.list
RUN echo "mysql-server mysql-server/root_password password root" | debconf-set-selections
RUN echo "mysql-server mysql-server/root_password_again password root" | debconf-set-selections
RUN apt-get update && \
apt-get -y install mysql-server-5.7 && \
mkdir -p /var/lib/mysql && \
mkdir -p /var/run/mysqld && \
mkdir -p /var/log/mysql && \
chown -R mysql:mysql /var/lib/mysql && \
chown -R mysql:mysql /var/run/mysqld && \
chown -R mysql:mysql /var/log/mysql
# UTF-8 and bind-address
RUN sed -i -e "$ a [client]\n\n[mysql]\n\n[mysqld]" /etc/mysql/my.cnf && \
sed -i -e "s/\(\[client\]\)/\1\ndefault-character-set = utf8/g" /etc/mysql/my.cnf && \
sed -i -e "s/\(\[mysql\]\)/\1\ndefault-character-set = utf8/g" /etc/mysql/my.cnf && \
sed -i -e "s/\(\[mysqld\]\)/\1\ninit_connect='SET NAMES utf8'\ncharacter-set-server = utf8\ncollation-server=utf8_general_ci\nbind-address = 0.0.0.0/g" /etc/mysql/my.cnf
#9. Install nginx
RUN apt-get update && \
apt-get install -y nginx && \
rm -rf /var/lib/apt/lists/* && \
echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \
chown -R www-data:www-data /var/lib/nginx
#10. Modify the escheduler configuration files
# Backend configuration
RUN mkdir -p /opt/escheduler && \
tar -zxvf /opt/easyscheduler_source/target/escheduler-${tar_version}.tar.gz -C /opt/escheduler && \
rm -rf /opt/escheduler/conf
ADD ./conf/escheduler/conf /opt/escheduler/conf
# Frontend nginx configuration
ADD ./conf/nginx/default.conf /etc/nginx/conf.d
#11. Expose ports
EXPOSE 2181 2888 3888 3306 80 12345 8888
#12. Install sudo, python, vim, ping and ssh
RUN apt-get update && \
apt-get -y install sudo && \
apt-get -y install python && \
apt-get -y install vim && \
apt-get -y install iputils-ping && \
apt-get -y install net-tools && \
apt-get -y install openssh-server && \
apt-get -y install python-pip && \
pip install kazoo
COPY ./startup.sh /root/startup.sh
#13. Set permissions and create the symlink
RUN chmod +x /root/startup.sh && \
chmod +x /opt/escheduler/script/create_escheduler.sh && \
chmod +x /opt/zookeeper/bin/zkServer.sh && \
chmod +x /opt/escheduler/bin/escheduler-daemon.sh && \
rm -rf /bin/sh && \
ln -s /bin/bash /bin/sh && \
mkdir -p /tmp/xls
ENTRYPOINT ["/root/startup.sh"]

30
dockerfile/conf/escheduler/conf/alert.properties

@ -0,0 +1,30 @@
#alert type is EMAIL/SMS
alert.type=EMAIL
# mail server configuration
mail.protocol=SMTP
mail.server.host=smtp.office365.com
mail.server.port=587
mail.sender=qiaozhanwei@outlook.com
mail.passwd=eschedulerBJEG
# TLS
mail.smtp.starttls.enable=true
# SSL
mail.smtp.ssl.enable=false
# xls file path; create the directory if it does not exist
xls.file.path=/tmp/xls
# Enterprise WeChat configuration
enterprise.wechat.corp.id=xxxxxxx
enterprise.wechat.secret=xxxxxxx
enterprise.wechat.agent.id=xxxxxxx
enterprise.wechat.users=xxxxxxx
enterprise.wechat.token.url=https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$secret
enterprise.wechat.push.url=https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token
enterprise.wechat.team.send.msg={\"toparty\":\"$toParty\",\"agentid\":\"$agentId\",\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"},\"safe\":\"0\"}
enterprise.wechat.user.send.msg={\"touser\":\"$toUser\",\"agentid\":\"$agentId\",\"msgtype\":\"markdown\",\"markdown\":{\"content\":\"$msg\"}}

31
dockerfile/conf/escheduler/conf/alert_logback.xml

@ -0,0 +1,31 @@
<!-- Logback configuration. See http://logback.qos.ch/manual/index.html -->
<configuration scan="true" scanPeriod="120 seconds"> <!--debug="true" -->
<property name="log.base" value="logs" />
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<appender name="ALERTLOGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${log.base}/escheduler-alert.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${log.base}/escheduler-alert.%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxHistory>20</maxHistory>
<maxFileSize>64MB</maxFileSize>
</rollingPolicy>
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="ALERTLOGFILE"/>
</root>
</configuration>

42
dockerfile/conf/escheduler/conf/apiserver_logback.xml

@ -0,0 +1,42 @@
<!-- Logback configuration. See http://logback.qos.ch/manual/index.html -->
<configuration scan="true" scanPeriod="120 seconds">
<logger name="org.apache.zookeeper" level="WARN"/>
<logger name="org.apache.hbase" level="WARN"/>
<logger name="org.apache.hadoop" level="WARN"/>
<property name="log.base" value="logs" />
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<appender name="APISERVERLOGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<!-- Log level filter -->
<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
<level>INFO</level>
</filter>
<file>${log.base}/escheduler-api-server.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${log.base}/escheduler-api-server.%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxHistory>168</maxHistory>
<maxFileSize>64MB</maxFileSize>
</rollingPolicy>
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="APISERVERLOGFILE" />
</root>
</configuration>

19
dockerfile/conf/escheduler/conf/application.properties

@ -0,0 +1,19 @@
# server port
server.port=12345
# session config
server.servlet.session.timeout=7200
server.servlet.context-path=/escheduler/
# file size limit for upload
spring.servlet.multipart.max-file-size=1024MB
spring.servlet.multipart.max-request-size=1024MB
#post content
server.jetty.max-http-post-size=5000000
spring.messages.encoding=UTF-8
# i18n classpath folder and file prefix (messages); if there are several files, separate them with ","
spring.messages.basename=i18n/messages

1
dockerfile/conf/escheduler/conf/application_master.properties

@ -0,0 +1 @@
logging.config=classpath:master_logback.xml

42
dockerfile/conf/escheduler/conf/common/common.properties

@ -0,0 +1,42 @@
#task queue implementation, default "zookeeper"
escheduler.queue.impl=zookeeper
# user data directory path (configure as needed); make sure the directory exists and has read/write permissions
data.basedir.path=/tmp/escheduler
# directory path for user data downloads (configure as needed); make sure the directory exists and has read/write permissions
data.download.basedir.path=/tmp/escheduler/download
# process execution directory (configure as needed); make sure the directory exists and has read/write permissions
process.exec.basepath=/tmp/escheduler/exec
# Users who have permission to create directories under the HDFS root path
hdfs.root.user=hdfs
# base directory for resource files on HDFS (configure as needed); make sure the directory exists on HDFS and has read/write permissions. "/escheduler" is recommended
data.store2hdfs.basepath=/escheduler
# resource upload startup type : HDFS,S3,NONE
res.upload.startup.type=NONE
# whether kerberos starts
hadoop.security.authentication.startup.state=false
# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf
# loginUserFromKeytab user
login.user.keytab.username=hdfs-mycluster@ESZ.COM
# loginUserFromKeytab path
login.user.keytab.path=/opt/hdfs.headless.keytab
# system env file path (configure as needed); make sure the directory and file exist and have read/write/execute permissions
escheduler.env.path=/opt/escheduler/conf/env/.escheduler_env.sh
#resource.view.suffixs
resource.view.suffixs=txt,log,sh,conf,cfg,py,java,sql,hql,xml
# is development state? default "false"
development.state=true

18
dockerfile/conf/escheduler/conf/common/hadoop/hadoop.properties

@ -0,0 +1,18 @@
# HA or single NameNode. For NameNode HA, copy core-site.xml and hdfs-site.xml
# to the conf directory. S3 is also supported, for example: s3a://escheduler
fs.defaultFS=hdfs://mycluster:8020
# required for S3: S3 endpoint
fs.s3a.endpoint=http://192.168.199.91:9010
# required for S3: S3 access key
fs.s3a.access.key=A3DXS30FO22544RE
# required for S3: S3 secret key
fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK
# ResourceManager HA: list the ResourceManager IPs here; leave empty for a single ResourceManager
yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx
# If it is a single resourcemanager, you only need to configure one host name. If it is resourcemanager HA, the default configuration is fine
yarn.application.status.address=http://ark1:8088/ws/v1/cluster/apps/%s

3
dockerfile/conf/escheduler/conf/config/install_config.conf

@ -0,0 +1,3 @@
installPath=/data1_1T/escheduler
deployUser=escheduler
ips=ark0,ark1,ark2,ark3,ark4

4
dockerfile/conf/escheduler/conf/config/run_config.conf

@ -0,0 +1,4 @@
masters=ark0,ark1
workers=ark2,ark3,ark4
alertServer=ark3
apiServers=ark1

53
dockerfile/conf/escheduler/conf/dao/data_source.properties

@ -0,0 +1,53 @@
# base spring data source configuration
spring.datasource.type=com.alibaba.druid.pool.DruidDataSource
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/escheduler?characterEncoding=UTF-8
spring.datasource.username=root
spring.datasource.password=root@123
# connection configuration
spring.datasource.initialSize=5
# min connection number
spring.datasource.minIdle=5
# max connection number
spring.datasource.maxActive=50
# max wait time to get a connection, in milliseconds. If maxWait is configured, fair locks are enabled by default and concurrency efficiency decreases.
# If necessary, unfair locks can be used by setting the useUnfairLock attribute to true.
spring.datasource.maxWait=60000
# interval, in milliseconds, between checks for idle connections to close
spring.datasource.timeBetweenEvictionRunsMillis=60000
# the Destroy thread detects the connection interval and closes the physical connection in milliseconds if the connection idle time is greater than or equal to minEvictableIdleTimeMillis.
spring.datasource.timeBetweenConnectErrorMillis=60000
# the longest time a connection remains idle without being evicted, in milliseconds
spring.datasource.minEvictableIdleTimeMillis=300000
# the SQL used to check whether a connection is valid; it must be a query statement. If validationQuery is null, testOnBorrow, testOnReturn, and testWhileIdle will not work.
spring.datasource.validationQuery=SELECT 1
# timeout for the validation query, in seconds
spring.datasource.validationQueryTimeout=3
# when applying for a connection, if the connection has been idle longer than timeBetweenEvictionRunsMillis,
# validationQuery is run to check whether the connection is valid
spring.datasource.testWhileIdle=true
#execute validation to check if the connection is valid when applying for a connection
spring.datasource.testOnBorrow=true
#execute validation to check if the connection is valid when the connection is returned
spring.datasource.testOnReturn=false
spring.datasource.defaultAutoCommit=true
spring.datasource.keepAlive=true
# open PSCache, specify count PSCache for every connection
spring.datasource.poolPreparedStatements=true
spring.datasource.maxPoolPreparedStatementPerConnectionSize=20
# data quality analysis is not currently in use. please ignore the following configuration
# task record flag
task.record.flag=false
task.record.datasource.url=jdbc:mysql://192.168.xx.xx:3306/etl?characterEncoding=UTF-8
task.record.datasource.username=xx
task.record.datasource.password=xx

3
dockerfile/conf/escheduler/conf/env/.escheduler_env.sh vendored

@ -0,0 +1,3 @@
export PYTHON_HOME=/usr/bin/python
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PYTHON_HOME:$JAVA_HOME/bin:$PATH

235
dockerfile/conf/escheduler/conf/i18n/messages.properties

@ -0,0 +1,235 @@
QUERY_SCHEDULE_LIST_NOTES=query schedule list
EXECUTE_PROCESS_TAG=execute process related operation
PROCESS_INSTANCE_EXECUTOR_TAG=process instance executor related operation
RUN_PROCESS_INSTANCE_NOTES=run process instance
START_NODE_LIST=start node list(node name)
TASK_DEPEND_TYPE=task depend type
COMMAND_TYPE=command type
RUN_MODE=run mode
TIMEOUT=timeout
EXECUTE_ACTION_TO_PROCESS_INSTANCE_NOTES=execute action to process instance
EXECUTE_TYPE=execute type
START_CHECK_PROCESS_DEFINITION_NOTES=start check process definition
GET_RECEIVER_CC_NOTES=query receiver cc
DESC=description
GROUP_NAME=group name
GROUP_TYPE=group type
QUERY_ALERT_GROUP_LIST_NOTES=query alert group list
UPDATE_ALERT_GROUP_NOTES=update alert group
DELETE_ALERT_GROUP_BY_ID_NOTES=delete alert group by id
VERIFY_ALERT_GROUP_NAME_NOTES=verify alert group name, check alert group exist or not
GRANT_ALERT_GROUP_NOTES=grant alert group
USER_IDS=user id list
ALERT_GROUP_TAG=alert group related operation
CREATE_ALERT_GROUP_NOTES=create alert group
WORKER_GROUP_TAG=worker group related operation
SAVE_WORKER_GROUP_NOTES=create worker group
WORKER_GROUP_NAME=worker group name
WORKER_IP_LIST=worker ip list, eg. 192.168.1.1,192.168.1.2
QUERY_WORKER_GROUP_PAGING_NOTES=query worker group paging
QUERY_WORKER_GROUP_LIST_NOTES=query worker group list
DELETE_WORKER_GROUP_BY_ID_NOTES=delete worker group by id
DATA_ANALYSIS_TAG=analysis related operation of task state
COUNT_TASK_STATE_NOTES=count task state
COUNT_PROCESS_INSTANCE_NOTES=count process instance state
COUNT_PROCESS_DEFINITION_BY_USER_NOTES=count process definition by user
COUNT_COMMAND_STATE_NOTES=count command state
COUNT_QUEUE_STATE_NOTES=count the running status of the task in the queue
ACCESS_TOKEN_TAG=access token related operation
MONITOR_TAG=monitor related operation
MASTER_LIST_NOTES=master server list
WORKER_LIST_NOTES=worker server list
QUERY_DATABASE_STATE_NOTES=query database state
QUERY_ZOOKEEPER_STATE_NOTES=query zookeeper state
TASK_STATE=task instance state
SOURCE_TABLE=source table
DEST_TABLE=dest table
TASK_DATE=task date
QUERY_HISTORY_TASK_RECORD_LIST_PAGING_NOTES=query history task record list paging
DATA_SOURCE_TAG=data source related operation
CREATE_DATA_SOURCE_NOTES=create data source
DATA_SOURCE_NAME=data source name
DATA_SOURCE_NOTE=data source desc
DB_TYPE=database type
DATA_SOURCE_HOST=data source host
DATA_SOURCE_PORT=data source port
DATABASE_NAME=database name
QUEUE_TAG=queue related operation
QUERY_QUEUE_LIST_NOTES=query queue list
QUERY_QUEUE_LIST_PAGING_NOTES=query queue list paging
CREATE_QUEUE_NOTES=create queue
YARN_QUEUE_NAME=yarn(hadoop) queue name
QUEUE_ID=queue id
TENANT_DESC=tenant desc
QUERY_TENANT_LIST_PAGING_NOTES=query tenant list paging
QUERY_TENANT_LIST_NOTES=query tenant list
UPDATE_TENANT_NOTES=update tenant
DELETE_TENANT_NOTES=delete tenant
RESOURCES_TAG=resource center related operation
CREATE_RESOURCE_NOTES=create resource
RESOURCE_TYPE=resource file type
RESOURCE_NAME=resource name
RESOURCE_DESC=resource file desc
RESOURCE_FILE=resource file
RESOURCE_ID=resource id
QUERY_RESOURCE_LIST_NOTES=query resource list
DELETE_RESOURCE_BY_ID_NOTES=delete resource by id
VIEW_RESOURCE_BY_ID_NOTES=view resource by id
ONLINE_CREATE_RESOURCE_NOTES=online create resource
SUFFIX=resource file suffix
CONTENT=resource file content
UPDATE_RESOURCE_NOTES=edit resource file online
DOWNLOAD_RESOURCE_NOTES=download resource file
CREATE_UDF_FUNCTION_NOTES=create udf function
UDF_TYPE=UDF type
FUNC_NAME=function name
CLASS_NAME=package and class name
ARG_TYPES=arguments
UDF_DESC=udf desc
VIEW_UDF_FUNCTION_NOTES=view udf function
UPDATE_UDF_FUNCTION_NOTES=update udf function
QUERY_UDF_FUNCTION_LIST_PAGING_NOTES=query udf function list paging
VERIFY_UDF_FUNCTION_NAME_NOTES=verify udf function name
DELETE_UDF_FUNCTION_NOTES=delete udf function
AUTHORIZED_FILE_NOTES=authorized file
UNAUTHORIZED_FILE_NOTES=unauthorized file
AUTHORIZED_UDF_FUNC_NOTES=authorized udf func
UNAUTHORIZED_UDF_FUNC_NOTES=unauthorized udf func
VERIFY_QUEUE_NOTES=verify queue
TENANT_TAG=tenant related operation
CREATE_TENANT_NOTES=create tenant
TENANT_CODE=tenant code
TENANT_NAME=tenant name
QUEUE_NAME=queue name
PASSWORD=password
DATA_SOURCE_OTHER=jdbc connection params, format:{"key1":"value1",...}
PROJECT_TAG=project related operation
CREATE_PROJECT_NOTES=create project
PROJECT_DESC=project description
UPDATE_PROJECT_NOTES=update project
PROJECT_ID=project id
QUERY_PROJECT_BY_ID_NOTES=query project info by project id
QUERY_PROJECT_LIST_PAGING_NOTES=query project list paging
QUERY_ALL_PROJECT_LIST_NOTES=query all project list
DELETE_PROJECT_BY_ID_NOTES=delete project by id
QUERY_UNAUTHORIZED_PROJECT_NOTES=query unauthorized project
QUERY_AUTHORIZED_PROJECT_NOTES=query authorized project
TASK_RECORD_TAG=task record related operation
QUERY_TASK_RECORD_LIST_PAGING_NOTES=query task record list paging
CREATE_TOKEN_NOTES=create token, note: please log in first
QUERY_ACCESS_TOKEN_LIST_NOTES=query access token list paging
SCHEDULE=schedule
WARNING_TYPE=warning type(sending strategy)
WARNING_GROUP_ID=warning group id
FAILURE_STRATEGY=failure strategy
RECEIVERS=receivers
RECEIVERS_CC=receivers cc
WORKER_GROUP_ID=worker server group id
PROCESS_INSTANCE_PRIORITY=process instance priority
UPDATE_SCHEDULE_NOTES=update schedule
SCHEDULE_ID=schedule id
ONLINE_SCHEDULE_NOTES=online schedule
OFFLINE_SCHEDULE_NOTES=offline schedule
QUERY_SCHEDULE_NOTES=query schedule
QUERY_SCHEDULE_LIST_PAGING_NOTES=query schedule list paging
LOGIN_TAG=User login related operations
USER_NAME=user name
PROJECT_NAME=project name
CREATE_PROCESS_DEFINITION_NOTES=create process definition
PROCESS_DEFINITION_NAME=process definition name
PROCESS_DEFINITION_JSON=process definition detail info (json format)
PROCESS_DEFINITION_LOCATIONS=process definition node locations info (json format)
PROCESS_INSTANCE_LOCATIONS=process instance node locations info (json format)
PROCESS_DEFINITION_CONNECTS=process definition node connects info (json format)
PROCESS_INSTANCE_CONNECTS=process instance node connects info (json format)
PROCESS_DEFINITION_DESC=process definition desc
PROCESS_DEFINITION_TAG=process definition related operation
SIGNOUT_NOTES=logout
USER_PASSWORD=user password
UPDATE_PROCESS_INSTANCE_NOTES=update process instance
QUERY_PROCESS_INSTANCE_LIST_NOTES=query process instance list
VERIFY_PROCCESS_DEFINITION_NAME_NOTES=verify process definition name
LOGIN_NOTES=user login
UPDATE_PROCCESS_DEFINITION_NOTES=update process definition
PROCESS_DEFINITION_ID=process definition id
PROCESS_DEFINITION_IDS=process definition ids
RELEASE_PROCCESS_DEFINITION_NOTES=release process definition
QUERY_PROCCESS_DEFINITION_BY_ID_NOTES=query process definition by id
QUERY_PROCCESS_DEFINITION_LIST_NOTES=query process definition list
QUERY_PROCCESS_DEFINITION_LIST_PAGING_NOTES=query process definition list paging
QUERY_ALL_DEFINITION_LIST_NOTES=query all definition list
PAGE_NO=page no
PROCESS_INSTANCE_ID=process instance id
PROCESS_INSTANCE_JSON=process instance info(json format)
SCHEDULE_TIME=schedule time
SYNC_DEFINE=update the information of the process instance to the process definition
RECOVERY_PROCESS_INSTANCE_FLAG=whether to recover process instance
SEARCH_VAL=search val
USER_ID=user id
PAGE_SIZE=page size
LIMIT=limit
VIEW_TREE_NOTES=view tree
GET_NODE_LIST_BY_DEFINITION_ID_NOTES=get task node list by process definition id
PROCESS_DEFINITION_ID_LIST=process definition id list
QUERY_PROCCESS_DEFINITION_All_BY_PROJECT_ID_NOTES=query process definition all by project id
DELETE_PROCESS_DEFINITION_BY_ID_NOTES=delete process definition by process definition id
BATCH_DELETE_PROCESS_DEFINITION_BY_IDS_NOTES=batch delete process definition by process definition ids
QUERY_PROCESS_INSTANCE_BY_ID_NOTES=query process instance by process instance id
DELETE_PROCESS_INSTANCE_BY_ID_NOTES=delete process instance by process instance id
TASK_ID=task instance id
SKIP_LINE_NUM=skip line num
QUERY_TASK_INSTANCE_LOG_NOTES=query task instance log
DOWNLOAD_TASK_INSTANCE_LOG_NOTES=download task instance log
USERS_TAG=users related operation
SCHEDULER_TAG=scheduler related operation
CREATE_SCHEDULE_NOTES=create schedule
CREATE_USER_NOTES=create user
TENANT_ID=tenant id
QUEUE=queue
EMAIL=email
PHONE=phone
QUERY_USER_LIST_NOTES=query user list
UPDATE_USER_NOTES=update user
DELETE_USER_BY_ID_NOTES=delete user by id
GRANT_PROJECT_NOTES=grant project
PROJECT_IDS=project ids(string format, multiple projects separated by ",")
GRANT_RESOURCE_NOTES=grant resource file
RESOURCE_IDS=resource ids(string format, multiple resources separated by ",")
GET_USER_INFO_NOTES=get user info
LIST_USER_NOTES=list user
VERIFY_USER_NAME_NOTES=verify user name
UNAUTHORIZED_USER_NOTES=cancel authorization
ALERT_GROUP_ID=alert group id
AUTHORIZED_USER_NOTES=authorized user
GRANT_UDF_FUNC_NOTES=grant udf function
UDF_IDS=udf ids(string format, multiple udf functions separated by ",")
GRANT_DATASOURCE_NOTES=grant datasource
DATASOURCE_IDS=datasource ids(string format, multiple datasources separated by ",")
QUERY_SUBPROCESS_INSTANCE_BY_TASK_ID_NOTES=query subprocess instance by task instance id
QUERY_PARENT_PROCESS_INSTANCE_BY_SUB_PROCESS_INSTANCE_ID_NOTES=query parent process instance info by sub process instance id
QUERY_PROCESS_INSTANCE_GLOBAL_VARIABLES_AND_LOCAL_VARIABLES_NOTES=query process instance global variables and local variables
VIEW_GANTT_NOTES=view gantt
SUB_PROCESS_INSTANCE_ID=sub process instance id
TASK_NAME=task instance name
TASK_INSTANCE_TAG=task instance related operation
LOGGER_TAG=log related operation
PROCESS_INSTANCE_TAG=process instance related operation
EXECUTION_STATUS=running status for workflow and task nodes
HOST=ip address of running task
START_DATE=start date
END_DATE=end date
QUERY_TASK_LIST_BY_PROCESS_INSTANCE_ID_NOTES=query task list by process instance id
UPDATE_DATA_SOURCE_NOTES=update data source
DATA_SOURCE_ID=data source id
QUERY_DATA_SOURCE_NOTES=query data source by id
QUERY_DATA_SOURCE_LIST_BY_TYPE_NOTES=query data source list by database type
QUERY_DATA_SOURCE_LIST_PAGING_NOTES=query data source list paging
CONNECT_DATA_SOURCE_NOTES=connect data source
CONNECT_DATA_SOURCE_TEST_NOTES=connect data source test
DELETE_DATA_SOURCE_NOTES=delete data source
VERIFY_DATA_SOURCE_NOTES=verify data source
UNAUTHORIZED_DATA_SOURCE_NOTES=unauthorized data source
AUTHORIZED_DATA_SOURCE_NOTES=authorized data source
DELETE_SCHEDULER_BY_ID_NOTES=delete scheduler by id

235
dockerfile/conf/escheduler/conf/i18n/messages_en_US.properties

@ -0,0 +1,235 @@
QUERY_SCHEDULE_LIST_NOTES=query schedule list
EXECUTE_PROCESS_TAG=execute process related operation
PROCESS_INSTANCE_EXECUTOR_TAG=process instance executor related operation
RUN_PROCESS_INSTANCE_NOTES=run process instance
START_NODE_LIST=start node list(node name)
TASK_DEPEND_TYPE=task depend type
COMMAND_TYPE=command type
RUN_MODE=run mode
TIMEOUT=timeout
EXECUTE_ACTION_TO_PROCESS_INSTANCE_NOTES=execute action to process instance
EXECUTE_TYPE=execute type
START_CHECK_PROCESS_DEFINITION_NOTES=start check process definition
GET_RECEIVER_CC_NOTES=query receiver cc
DESC=description
GROUP_NAME=group name
GROUP_TYPE=group type
QUERY_ALERT_GROUP_LIST_NOTES=query alert group list
UPDATE_ALERT_GROUP_NOTES=update alert group
DELETE_ALERT_GROUP_BY_ID_NOTES=delete alert group by id
VERIFY_ALERT_GROUP_NAME_NOTES=verify alert group name, check alert group exist or not
GRANT_ALERT_GROUP_NOTES=grant alert group
USER_IDS=user id list
ALERT_GROUP_TAG=alert group related operation
CREATE_ALERT_GROUP_NOTES=create alert group
WORKER_GROUP_TAG=worker group related operation
SAVE_WORKER_GROUP_NOTES=create worker group
WORKER_GROUP_NAME=worker group name
WORKER_IP_LIST=worker ip list, eg. 192.168.1.1,192.168.1.2
QUERY_WORKER_GROUP_PAGING_NOTES=query worker group paging
QUERY_WORKER_GROUP_LIST_NOTES=query worker group list
DELETE_WORKER_GROUP_BY_ID_NOTES=delete worker group by id
DATA_ANALYSIS_TAG=analysis related operation of task state
COUNT_TASK_STATE_NOTES=count task state
COUNT_PROCESS_INSTANCE_NOTES=count process instance state
COUNT_PROCESS_DEFINITION_BY_USER_NOTES=count process definition by user
COUNT_COMMAND_STATE_NOTES=count command state
COUNT_QUEUE_STATE_NOTES=count the running status of the task in the queue
ACCESS_TOKEN_TAG=access token related operation
MONITOR_TAG=monitor related operation
MASTER_LIST_NOTES=master server list
WORKER_LIST_NOTES=worker server list
QUERY_DATABASE_STATE_NOTES=query database state
QUERY_ZOOKEEPER_STATE_NOTES=query zookeeper state
TASK_STATE=task instance state
SOURCE_TABLE=source table
DEST_TABLE=dest table
TASK_DATE=task date
QUERY_HISTORY_TASK_RECORD_LIST_PAGING_NOTES=query history task record list paging
DATA_SOURCE_TAG=data source related operation
CREATE_DATA_SOURCE_NOTES=create data source
DATA_SOURCE_NAME=data source name
DATA_SOURCE_NOTE=data source desc
DB_TYPE=database type
DATA_SOURCE_HOST=data source host
DATA_SOURCE_PORT=data source port
DATABASE_NAME=database name
QUEUE_TAG=queue related operation
QUERY_QUEUE_LIST_NOTES=query queue list
QUERY_QUEUE_LIST_PAGING_NOTES=query queue list paging
CREATE_QUEUE_NOTES=create queue
YARN_QUEUE_NAME=yarn(hadoop) queue name
QUEUE_ID=queue id
TENANT_DESC=tenant desc
QUERY_TENANT_LIST_PAGING_NOTES=query tenant list paging
QUERY_TENANT_LIST_NOTES=query tenant list
UPDATE_TENANT_NOTES=update tenant
DELETE_TENANT_NOTES=delete tenant
RESOURCES_TAG=resource center related operation
CREATE_RESOURCE_NOTES=create resource
RESOURCE_TYPE=resource file type
RESOURCE_NAME=resource name
RESOURCE_DESC=resource file desc
RESOURCE_FILE=resource file
RESOURCE_ID=resource id
QUERY_RESOURCE_LIST_NOTES=query resource list
DELETE_RESOURCE_BY_ID_NOTES=delete resource by id
VIEW_RESOURCE_BY_ID_NOTES=view resource by id
ONLINE_CREATE_RESOURCE_NOTES=online create resource
SUFFIX=resource file suffix
CONTENT=resource file content
UPDATE_RESOURCE_NOTES=edit resource file online
DOWNLOAD_RESOURCE_NOTES=download resource file
CREATE_UDF_FUNCTION_NOTES=create udf function
UDF_TYPE=UDF type
FUNC_NAME=function name
CLASS_NAME=package and class name
ARG_TYPES=arguments
UDF_DESC=udf desc
VIEW_UDF_FUNCTION_NOTES=view udf function
UPDATE_UDF_FUNCTION_NOTES=update udf function
QUERY_UDF_FUNCTION_LIST_PAGING_NOTES=query udf function list paging
VERIFY_UDF_FUNCTION_NAME_NOTES=verify udf function name
DELETE_UDF_FUNCTION_NOTES=delete udf function
AUTHORIZED_FILE_NOTES=authorized file
UNAUTHORIZED_FILE_NOTES=unauthorized file
AUTHORIZED_UDF_FUNC_NOTES=authorized udf func
UNAUTHORIZED_UDF_FUNC_NOTES=unauthorized udf func
VERIFY_QUEUE_NOTES=verify queue
TENANT_TAG=tenant related operation
CREATE_TENANT_NOTES=create tenant
TENANT_CODE=tenant code
TENANT_NAME=tenant name
QUEUE_NAME=queue name
PASSWORD=password
DATA_SOURCE_OTHER=jdbc connection params, format:{"key1":"value1",...}
PROJECT_TAG=project related operation
CREATE_PROJECT_NOTES=create project
PROJECT_DESC=project description
UPDATE_PROJECT_NOTES=update project
PROJECT_ID=project id
QUERY_PROJECT_BY_ID_NOTES=query project info by project id
QUERY_PROJECT_LIST_PAGING_NOTES=query project list paging
QUERY_ALL_PROJECT_LIST_NOTES=query all project list
DELETE_PROJECT_BY_ID_NOTES=delete project by id
QUERY_UNAUTHORIZED_PROJECT_NOTES=query unauthorized project
QUERY_AUTHORIZED_PROJECT_NOTES=query authorized project
TASK_RECORD_TAG=task record related operation
QUERY_TASK_RECORD_LIST_PAGING_NOTES=query task record list paging
CREATE_TOKEN_NOTES=create token, note: please log in first
QUERY_ACCESS_TOKEN_LIST_NOTES=query access token list paging
SCHEDULE=schedule
WARNING_TYPE=warning type(sending strategy)
WARNING_GROUP_ID=warning group id
FAILURE_STRATEGY=failure strategy
RECEIVERS=receivers
RECEIVERS_CC=receivers cc
WORKER_GROUP_ID=worker server group id
PROCESS_INSTANCE_PRIORITY=process instance priority
UPDATE_SCHEDULE_NOTES=update schedule
SCHEDULE_ID=schedule id
ONLINE_SCHEDULE_NOTES=online schedule
OFFLINE_SCHEDULE_NOTES=offline schedule
QUERY_SCHEDULE_NOTES=query schedule
QUERY_SCHEDULE_LIST_PAGING_NOTES=query schedule list paging
LOGIN_TAG=User login related operations
USER_NAME=user name
PROJECT_NAME=project name
CREATE_PROCESS_DEFINITION_NOTES=create process definition
PROCESS_DEFINITION_NAME=process definition name
PROCESS_DEFINITION_JSON=process definition detail info (json format)
PROCESS_DEFINITION_LOCATIONS=process definition node locations info (json format)
PROCESS_INSTANCE_LOCATIONS=process instance node locations info (json format)
PROCESS_DEFINITION_CONNECTS=process definition node connects info (json format)
PROCESS_INSTANCE_CONNECTS=process instance node connects info (json format)
PROCESS_DEFINITION_DESC=process definition desc
PROCESS_DEFINITION_TAG=process definition related operation
SIGNOUT_NOTES=logout
USER_PASSWORD=user password
UPDATE_PROCESS_INSTANCE_NOTES=update process instance
QUERY_PROCESS_INSTANCE_LIST_NOTES=query process instance list
VERIFY_PROCCESS_DEFINITION_NAME_NOTES=verify process definition name
LOGIN_NOTES=user login
UPDATE_PROCCESS_DEFINITION_NOTES=update process definition
PROCESS_DEFINITION_ID=process definition id
PROCESS_DEFINITION_IDS=process definition ids
RELEASE_PROCCESS_DEFINITION_NOTES=release process definition
QUERY_PROCCESS_DEFINITION_BY_ID_NOTES=query process definition by id
QUERY_PROCCESS_DEFINITION_LIST_NOTES=query process definition list
QUERY_PROCCESS_DEFINITION_LIST_PAGING_NOTES=query process definition list paging
QUERY_ALL_DEFINITION_LIST_NOTES=query all definition list
PAGE_NO=page no
PROCESS_INSTANCE_ID=process instance id
PROCESS_INSTANCE_JSON=process instance info(json format)
SCHEDULE_TIME=schedule time
SYNC_DEFINE=update the information of the process instance to the process definition
RECOVERY_PROCESS_INSTANCE_FLAG=whether to recover the process instance
SEARCH_VAL=search val
USER_ID=user id
PAGE_SIZE=page size
LIMIT=limit
VIEW_TREE_NOTES=view tree
GET_NODE_LIST_BY_DEFINITION_ID_NOTES=get task node list by process definition id
PROCESS_DEFINITION_ID_LIST=process definition id list
QUERY_PROCCESS_DEFINITION_All_BY_PROJECT_ID_NOTES=query all process definitions by project id
DELETE_PROCESS_DEFINITION_BY_ID_NOTES=delete process definition by process definition id
BATCH_DELETE_PROCESS_DEFINITION_BY_IDS_NOTES=batch delete process definition by process definition ids
QUERY_PROCESS_INSTANCE_BY_ID_NOTES=query process instance by process instance id
DELETE_PROCESS_INSTANCE_BY_ID_NOTES=delete process instance by process instance id
TASK_ID=task instance id
SKIP_LINE_NUM=skip line num
QUERY_TASK_INSTANCE_LOG_NOTES=query task instance log
DOWNLOAD_TASK_INSTANCE_LOG_NOTES=download task instance log
USERS_TAG=users related operation
SCHEDULER_TAG=scheduler related operation
CREATE_SCHEDULE_NOTES=create schedule
CREATE_USER_NOTES=create user
TENANT_ID=tenant id
QUEUE=queue
EMAIL=email
PHONE=phone
QUERY_USER_LIST_NOTES=query user list
UPDATE_USER_NOTES=update user
DELETE_USER_BY_ID_NOTES=delete user by id
GRANT_PROJECT_NOTES=grant project
PROJECT_IDS=project ids(string format, multiple projects separated by ",")
GRANT_RESOURCE_NOTES=grant resource file
RESOURCE_IDS=resource ids(string format, multiple resources separated by ",")
GET_USER_INFO_NOTES=get user info
LIST_USER_NOTES=list user
VERIFY_USER_NAME_NOTES=verify user name
UNAUTHORIZED_USER_NOTES=cancel authorization
ALERT_GROUP_ID=alert group id
AUTHORIZED_USER_NOTES=authorized user
GRANT_UDF_FUNC_NOTES=grant udf function
UDF_IDS=udf ids(string format, multiple udf functions separated by ",")
GRANT_DATASOURCE_NOTES=grant datasource
DATASOURCE_IDS=datasource ids(string format, multiple datasources separated by ",")
QUERY_SUBPROCESS_INSTANCE_BY_TASK_ID_NOTES=query subprocess instance by task instance id
QUERY_PARENT_PROCESS_INSTANCE_BY_SUB_PROCESS_INSTANCE_ID_NOTES=query parent process instance info by sub process instance id
QUERY_PROCESS_INSTANCE_GLOBAL_VARIABLES_AND_LOCAL_VARIABLES_NOTES=query process instance global variables and local variables
VIEW_GANTT_NOTES=view gantt
SUB_PROCESS_INSTANCE_ID=sub process instance id
TASK_NAME=task instance name
TASK_INSTANCE_TAG=task instance related operation
LOGGER_TAG=log related operation
PROCESS_INSTANCE_TAG=process instance related operation
EXECUTION_STATUS=running status for workflow and task nodes
HOST=ip address of running task
START_DATE=start date
END_DATE=end date
QUERY_TASK_LIST_BY_PROCESS_INSTANCE_ID_NOTES=query task list by process instance id
UPDATE_DATA_SOURCE_NOTES=update data source
DATA_SOURCE_ID=data source id
QUERY_DATA_SOURCE_NOTES=query data source by id
QUERY_DATA_SOURCE_LIST_BY_TYPE_NOTES=query data source list by database type
QUERY_DATA_SOURCE_LIST_PAGING_NOTES=query data source list paging
CONNECT_DATA_SOURCE_NOTES=connect data source
CONNECT_DATA_SOURCE_TEST_NOTES=connect data source test
DELETE_DATA_SOURCE_NOTES=delete data source
VERIFY_DATA_SOURCE_NOTES=verify data source
UNAUTHORIZED_DATA_SOURCE_NOTES=unauthorized data source
AUTHORIZED_DATA_SOURCE_NOTES=authorized data source
DELETE_SCHEDULER_BY_ID_NOTES=delete scheduler by id
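
In Java `.properties` files, a line that ends in an odd number of backslashes continues onto the next line, so a stray trailing `\` folds the following `KEY=value` pair into the current value. A minimal Python sketch of a check for such lines follows; the helper name `find_trailing_backslashes` is hypothetical and not part of this repository.

```python
def find_trailing_backslashes(text):
    """Return (line_number, key) for lines ending in an odd number of backslashes."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith(("#", "!")):
            continue  # skip blank lines and comment lines
        trailing = len(line) - len(line.rstrip("\\"))
        if trailing % 2 == 1:  # an odd count means the line continues
            hits.append((number, line.split("=", 1)[0]))
    return hits


# Hypothetical sample: the first entry would swallow the second one.
sample = "A_NOTES=first value\\\nB_TAG=second value\nC=ok\n"
print(find_trailing_backslashes(sample))
```

Running a check like this over each `messages*.properties` file before packaging would flag entries whose value silently absorbs the next key.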

233
dockerfile/conf/escheduler/conf/i18n/messages_zh_CN.properties

@ -0,0 +1,233 @@
QUERY_SCHEDULE_LIST_NOTES=查询定时列表
PROCESS_INSTANCE_EXECUTOR_TAG=流程实例执行相关操作
RUN_PROCESS_INSTANCE_NOTES=运行流程实例
START_NODE_LIST=开始节点列表(节点name)
TASK_DEPEND_TYPE=任务依赖类型
COMMAND_TYPE=指令类型
RUN_MODE=运行模式
TIMEOUT=超时时间
EXECUTE_ACTION_TO_PROCESS_INSTANCE_NOTES=执行流程实例的各种操作(暂停、停止、重跑、恢复等)
EXECUTE_TYPE=执行类型
START_CHECK_PROCESS_DEFINITION_NOTES=检查流程定义
DESC=备注(描述)
GROUP_NAME=组名称
GROUP_TYPE=组类型
QUERY_ALERT_GROUP_LIST_NOTES=告警组列表
UPDATE_ALERT_GROUP_NOTES=编辑(更新)告警组
DELETE_ALERT_GROUP_BY_ID_NOTES=删除告警组通过ID
VERIFY_ALERT_GROUP_NAME_NOTES=检查告警组是否存在
GRANT_ALERT_GROUP_NOTES=授权告警组
USER_IDS=用户ID列表
ALERT_GROUP_TAG=告警组相关操作
WORKER_GROUP_TAG=Worker分组管理
SAVE_WORKER_GROUP_NOTES=创建Worker分组
WORKER_GROUP_NAME=Worker分组名称
WORKER_IP_LIST=Worker ip列表,注意:多个IP地址以逗号分割
QUERY_WORKER_GROUP_PAGING_NOTES=Worker分组管理
QUERY_WORKER_GROUP_LIST_NOTES=查询worker group分组
DELETE_WORKER_GROUP_BY_ID_NOTES=删除worker group通过ID
DATA_ANALYSIS_TAG=任务状态分析相关操作
COUNT_TASK_STATE_NOTES=任务状态统计
COUNT_PROCESS_INSTANCE_NOTES=统计流程实例状态
COUNT_PROCESS_DEFINITION_BY_USER_NOTES=统计用户创建的流程定义
COUNT_COMMAND_STATE_NOTES=统计命令状态
COUNT_QUEUE_STATE_NOTES=统计队列里任务状态
ACCESS_TOKEN_TAG=access token相关操作,需要先登录
MONITOR_TAG=监控相关操作
MASTER_LIST_NOTES=master服务列表
WORKER_LIST_NOTES=worker服务列表
QUERY_DATABASE_STATE_NOTES=查询数据库状态
QUERY_ZOOKEEPER_STATE_NOTES=查询Zookeeper状态
TASK_STATE=任务实例状态
SOURCE_TABLE=源表
DEST_TABLE=目标表
TASK_DATE=任务时间
QUERY_HISTORY_TASK_RECORD_LIST_PAGING_NOTES=分页查询历史任务记录列表
DATA_SOURCE_TAG=数据源相关操作
CREATE_DATA_SOURCE_NOTES=创建数据源
DATA_SOURCE_NAME=数据源名称
DATA_SOURCE_NOTE=数据源描述
DB_TYPE=数据源类型
DATA_SOURCE_HOST=IP主机名
DATA_SOURCE_PORT=数据源端口
DATABASE_NAME=数据库名
QUEUE_TAG=队列相关操作
QUERY_QUEUE_LIST_NOTES=查询队列列表
QUERY_QUEUE_LIST_PAGING_NOTES=分页查询队列列表
CREATE_QUEUE_NOTES=创建队列
YARN_QUEUE_NAME=hadoop yarn队列名
QUEUE_ID=队列ID
TENANT_DESC=租户描述
QUERY_TENANT_LIST_PAGING_NOTES=分页查询租户列表
QUERY_TENANT_LIST_NOTES=查询租户列表
UPDATE_TENANT_NOTES=更新租户
DELETE_TENANT_NOTES=删除租户
RESOURCES_TAG=资源中心相关操作
CREATE_RESOURCE_NOTES=创建资源
RESOURCE_TYPE=资源文件类型
RESOURCE_NAME=资源文件名称
RESOURCE_DESC=资源文件描述
RESOURCE_FILE=资源文件
RESOURCE_ID=资源ID
QUERY_RESOURCE_LIST_NOTES=查询资源列表
DELETE_RESOURCE_BY_ID_NOTES=删除资源通过ID
VIEW_RESOURCE_BY_ID_NOTES=浏览资源通过ID
ONLINE_CREATE_RESOURCE_NOTES=在线创建资源
SUFFIX=资源文件后缀
CONTENT=资源文件内容
UPDATE_RESOURCE_NOTES=在线更新资源文件
DOWNLOAD_RESOURCE_NOTES=下载资源文件
CREATE_UDF_FUNCTION_NOTES=创建UDF函数
UDF_TYPE=UDF类型
FUNC_NAME=函数名称
CLASS_NAME=包名类名
ARG_TYPES=参数
UDF_DESC=udf描述,使用说明
VIEW_UDF_FUNCTION_NOTES=查看udf函数
UPDATE_UDF_FUNCTION_NOTES=更新udf函数
QUERY_UDF_FUNCTION_LIST_PAGING_NOTES=分页查询udf函数列表
VERIFY_UDF_FUNCTION_NAME_NOTES=验证udf函数名
DELETE_UDF_FUNCTION_NOTES=删除UDF函数
AUTHORIZED_FILE_NOTES=授权文件
UNAUTHORIZED_FILE_NOTES=取消授权文件
AUTHORIZED_UDF_FUNC_NOTES=授权udf函数
UNAUTHORIZED_UDF_FUNC_NOTES=取消udf函数授权
VERIFY_QUEUE_NOTES=验证队列
TENANT_TAG=租户相关操作
CREATE_TENANT_NOTES=创建租户
TENANT_CODE=租户编码
TENANT_NAME=租户名称
QUEUE_NAME=队列名
PASSWORD=密码
DATA_SOURCE_OTHER=jdbc连接参数,格式为:{"key1":"value1",...}
PROJECT_TAG=项目相关操作
CREATE_PROJECT_NOTES=创建项目
PROJECT_DESC=项目描述
UPDATE_PROJECT_NOTES=更新项目
PROJECT_ID=项目ID
QUERY_PROJECT_BY_ID_NOTES=通过项目ID查询项目信息
QUERY_PROJECT_LIST_PAGING_NOTES=分页查询项目列表
QUERY_ALL_PROJECT_LIST_NOTES=查询所有项目
DELETE_PROJECT_BY_ID_NOTES=删除项目通过ID
QUERY_UNAUTHORIZED_PROJECT_NOTES=查询未授权的项目
QUERY_AUTHORIZED_PROJECT_NOTES=查询授权项目
TASK_RECORD_TAG=任务记录相关操作
QUERY_TASK_RECORD_LIST_PAGING_NOTES=分页查询任务记录列表
CREATE_TOKEN_NOTES=创建token,注意需要先登录
QUERY_ACCESS_TOKEN_LIST_NOTES=分页查询access token列表
SCHEDULE=定时
WARNING_TYPE=发送策略
WARNING_GROUP_ID=发送组ID
FAILURE_STRATEGY=失败策略
RECEIVERS=收件人
RECEIVERS_CC=收件人(抄送)
WORKER_GROUP_ID=Worker Server分组ID
PROCESS_INSTANCE_PRIORITY=流程实例优先级
UPDATE_SCHEDULE_NOTES=更新定时
SCHEDULE_ID=定时ID
ONLINE_SCHEDULE_NOTES=定时上线
OFFLINE_SCHEDULE_NOTES=定时下线
QUERY_SCHEDULE_NOTES=查询定时
QUERY_SCHEDULE_LIST_PAGING_NOTES=分页查询定时
LOGIN_TAG=用户登录相关操作
USER_NAME=用户名
PROJECT_NAME=项目名称
CREATE_PROCESS_DEFINITION_NOTES=创建流程定义
PROCESS_DEFINITION_NAME=流程定义名称
PROCESS_DEFINITION_JSON=流程定义详细信息(json格式)
PROCESS_DEFINITION_LOCATIONS=流程定义节点坐标位置信息(json格式)
PROCESS_INSTANCE_LOCATIONS=流程实例节点坐标位置信息(json格式)
PROCESS_DEFINITION_CONNECTS=流程定义节点图标连接信息(json格式)
PROCESS_INSTANCE_CONNECTS=流程实例节点图标连接信息(json格式)
PROCESS_DEFINITION_DESC=流程定义描述信息
PROCESS_DEFINITION_TAG=流程定义相关操作
SIGNOUT_NOTES=退出登录
USER_PASSWORD=用户密码
UPDATE_PROCESS_INSTANCE_NOTES=更新流程实例
QUERY_PROCESS_INSTANCE_LIST_NOTES=查询流程实例列表
VERIFY_PROCCESS_DEFINITION_NAME_NOTES=验证流程定义名字
LOGIN_NOTES=用户登录
UPDATE_PROCCESS_DEFINITION_NOTES=更新流程定义
PROCESS_DEFINITION_ID=流程定义ID
RELEASE_PROCCESS_DEFINITION_NOTES=发布流程定义
QUERY_PROCCESS_DEFINITION_BY_ID_NOTES=查询流程定义通过流程定义ID
QUERY_PROCCESS_DEFINITION_LIST_NOTES=查询流程定义列表
QUERY_PROCCESS_DEFINITION_LIST_PAGING_NOTES=分页查询流程定义列表
QUERY_ALL_DEFINITION_LIST_NOTES=查询所有流程定义
PAGE_NO=页码号
PROCESS_INSTANCE_ID=流程实例ID
PROCESS_INSTANCE_IDS=流程实例ID集合
PROCESS_INSTANCE_JSON=流程实例信息(json格式)
SCHEDULE_TIME=定时时间
SYNC_DEFINE=更新流程实例的信息是否同步到流程定义
RECOVERY_PROCESS_INSTANCE_FLAG=是否恢复流程实例
SEARCH_VAL=搜索值
USER_ID=用户ID
PAGE_SIZE=页大小
LIMIT=显示多少条
VIEW_TREE_NOTES=树状图
GET_NODE_LIST_BY_DEFINITION_ID_NOTES=获得任务节点列表通过流程定义ID
PROCESS_DEFINITION_ID_LIST=流程定义id列表
QUERY_PROCCESS_DEFINITION_All_BY_PROJECT_ID_NOTES=查询流程定义通过项目ID
DELETE_PROCESS_DEFINITION_BY_ID_NOTES=删除流程定义通过流程定义ID
BATCH_DELETE_PROCESS_DEFINITION_BY_IDS_NOTES=批量删除流程定义通过流程定义ID集合
QUERY_PROCESS_INSTANCE_BY_ID_NOTES=查询流程实例通过流程实例ID
DELETE_PROCESS_INSTANCE_BY_ID_NOTES=删除流程实例通过流程实例ID
TASK_ID=任务实例ID
SKIP_LINE_NUM=忽略行数
QUERY_TASK_INSTANCE_LOG_NOTES=查询任务实例日志
DOWNLOAD_TASK_INSTANCE_LOG_NOTES=下载任务实例日志
USERS_TAG=用户相关操作
SCHEDULER_TAG=定时相关操作
CREATE_SCHEDULE_NOTES=创建定时
CREATE_USER_NOTES=创建用户
TENANT_ID=租户ID
QUEUE=使用的队列
EMAIL=邮箱
PHONE=手机号
QUERY_USER_LIST_NOTES=查询用户列表
UPDATE_USER_NOTES=更新用户
DELETE_USER_BY_ID_NOTES=删除用户通过ID
GRANT_PROJECT_NOTES=授权项目
PROJECT_IDS=项目IDS(字符串格式,多个项目以","分割)
GRANT_RESOURCE_NOTES=授权资源文件
RESOURCE_IDS=资源ID列表(字符串格式,多个资源ID以","分割)
GET_USER_INFO_NOTES=获取用户信息
LIST_USER_NOTES=用户列表
VERIFY_USER_NAME_NOTES=验证用户名
UNAUTHORIZED_USER_NOTES=取消授权
ALERT_GROUP_ID=报警组ID
AUTHORIZED_USER_NOTES=授权用户
GRANT_UDF_FUNC_NOTES=授权udf函数
UDF_IDS=udf函数id列表(字符串格式,多个udf函数ID以","分割)
GRANT_DATASOURCE_NOTES=授权数据源
DATASOURCE_IDS=数据源ID列表(字符串格式,多个数据源ID以","分割)
QUERY_SUBPROCESS_INSTANCE_BY_TASK_ID_NOTES=查询子流程实例通过任务实例ID
QUERY_PARENT_PROCESS_INSTANCE_BY_SUB_PROCESS_INSTANCE_ID_NOTES=查询父流程实例信息通过子流程实例ID
QUERY_PROCESS_INSTANCE_GLOBAL_VARIABLES_AND_LOCAL_VARIABLES_NOTES=查询流程实例全局变量和局部变量
VIEW_GANTT_NOTES=浏览Gantt图
SUB_PROCESS_INSTANCE_ID=子流程实例ID
TASK_NAME=任务实例名
TASK_INSTANCE_TAG=任务实例相关操作
LOGGER_TAG=日志相关操作
PROCESS_INSTANCE_TAG=流程实例相关操作
EXECUTION_STATUS=工作流和任务节点的运行状态
HOST=运行任务的主机IP地址
START_DATE=开始时间
END_DATE=结束时间
QUERY_TASK_LIST_BY_PROCESS_INSTANCE_ID_NOTES=通过流程实例ID查询任务列表
UPDATE_DATA_SOURCE_NOTES=更新数据源
DATA_SOURCE_ID=数据源ID
QUERY_DATA_SOURCE_NOTES=查询数据源通过ID
QUERY_DATA_SOURCE_LIST_BY_TYPE_NOTES=查询数据源列表通过数据源类型
QUERY_DATA_SOURCE_LIST_PAGING_NOTES=分页查询数据源列表
CONNECT_DATA_SOURCE_NOTES=连接数据源
CONNECT_DATA_SOURCE_TEST_NOTES=连接数据源测试
DELETE_DATA_SOURCE_NOTES=删除数据源
VERIFY_DATA_SOURCE_NOTES=验证数据源
UNAUTHORIZED_DATA_SOURCE_NOTES=未授权的数据源
AUTHORIZED_DATA_SOURCE_NOTES=授权的数据源
DELETE_SCHEDULER_BY_ID_NOTES=根据定时id删除定时数据

1
dockerfile/conf/escheduler/conf/mail_templates/alert_mail_template.ftl

@ -0,0 +1 @@
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'><html><head><title> easyscheduler</title><meta name='Keywords' content=''><meta name='Description' content=''><style type="text/css">table { margin-top:0px; padding-top:0px; border:1px solid; font-size: 14px; color: #333333; border-width: 1px; border-color: #666666; border-collapse: collapse; } table th { border-width: 1px; padding: 8px; border-style: solid; border-color: #666666; background-color: #dedede; } table td { border-width: 1px; padding: 8px; border-style: solid; border-color: #666666; background-color: #ffffff; }</style></head><body style="margin:0;padding:0"><table border="1px" cellpadding="5px" cellspacing="-10px"><thead><#if title??> ${title}</#if></thead><#if content??> ${content}</#if></table></body></html>

21
dockerfile/conf/escheduler/conf/master.properties

@ -0,0 +1,21 @@
# master execute thread num
master.exec.threads=100
# master execute task number in parallel
master.exec.task.number=20
# master heartbeat interval
master.heartbeat.interval=10
# master commit task retry times
master.task.commit.retryTimes=5
# master commit task interval
master.task.commit.interval=100
# master server can work only when the cpu load average is below this value. default value : the number of cpu cores * 2
master.max.cpuload.avg=10
# master server can work only when available memory is larger than this reserved amount. default value : physical memory * 1/10, unit is G.
master.reserved.memory=1
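
The two thresholds above read as a gate: the master takes on work only while the load average is below `master.max.cpuload.avg` and available memory is above `master.reserved.memory`. A minimal sketch of that check, under those assumptions; `can_accept_work` and its parameter names are hypothetical and are not the project's actual code.

```python
def can_accept_work(load_avg, available_memory_gb,
                    max_cpu_load_avg=10.0, reserved_memory_gb=1.0):
    """Return True when both resource thresholds from master.properties hold."""
    # The file's comments say "only less than" and "only larger than",
    # so both comparisons are strict in this sketch.
    return load_avg < max_cpu_load_avg and available_memory_gb > reserved_memory_gb


print(can_accept_work(load_avg=3.2, available_memory_gb=4.0))   # within both limits
print(can_accept_work(load_avg=12.0, available_memory_gb=4.0))  # load too high
```

The defaults mirror the values in this file (`10` and `1`); the comments suggest sizing them from the host's core count and physical memory.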

34
dockerfile/conf/escheduler/conf/master_logback.xml

@ -0,0 +1,34 @@
<!-- Logback configuration. See http://logback.qos.ch/manual/index.html -->
<configuration scan="true" scanPeriod="120 seconds"> <!--debug="true" -->
<property name="log.base" value="logs" />
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<appender name="MASTERLOGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${log.base}/escheduler-master.log</file>
<filter class="cn.escheduler.server.master.log.MasterLogFilter">
<level>INFO</level>
</filter>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${log.base}/escheduler-master.%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxHistory>168</maxHistory>
<maxFileSize>200MB</maxFileSize>
</rollingPolicy>
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="MASTERLOGFILE"/>
</root>
</configuration>

39
dockerfile/conf/escheduler/conf/quartz.properties

@ -0,0 +1,39 @@
#============================================================================
# Configure Main Scheduler Properties
#============================================================================
org.quartz.scheduler.instanceName = EasyScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.scheduler.makeSchedulerThreadDaemon = true
org.quartz.jobStore.useProperties = false
#============================================================================
# Configure ThreadPool
#============================================================================
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.makeThreadsDaemons = true
org.quartz.threadPool.threadCount = 25
org.quartz.threadPool.threadPriority = 5
#============================================================================
# Configure JobStore
#============================================================================
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.clusterCheckinInterval = 5000
org.quartz.jobStore.dataSource = myDs
#============================================================================
# Configure Datasources
#============================================================================
org.quartz.dataSource.myDs.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDs.URL=jdbc:mysql://127.0.0.1:3306/escheduler?characterEncoding=utf8
org.quartz.dataSource.myDs.user=root
org.quartz.dataSource.myDs.password=root@123
org.quartz.dataSource.myDs.maxConnections = 10
org.quartz.dataSource.myDs.validationQuery = select 1

15
dockerfile/conf/escheduler/conf/worker.properties

@ -0,0 +1,15 @@
# worker execute thread num
worker.exec.threads=100
# worker heartbeat interval
worker.heartbeat.interval=10
# number of tasks to fetch at a time
worker.fetch.task.num = 3
# worker server can work only when the cpu load average is below this value. default value : the number of cpu cores * 2
#worker.max.cpuload.avg=10
# worker server can work only when available memory is larger than this reserved amount. default value : physical memory * 1/6, unit is G.
worker.reserved.memory=1

60
dockerfile/conf/escheduler/conf/worker_logback.xml

@ -0,0 +1,60 @@
<!-- Logback configuration. See http://logback.qos.ch/manual/index.html -->
<configuration scan="true" scanPeriod="120 seconds">
<property name="log.base" value="logs"/>
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
</appender>
<appender name="TASKLOGFILE" class="ch.qos.logback.classic.sift.SiftingAppender">
<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
<level>INFO</level>
</filter>
<filter class="cn.escheduler.server.worker.log.TaskLogFilter"></filter>
<Discriminator class="cn.escheduler.server.worker.log.TaskLogDiscriminator">
<key>taskAppId</key>
</Discriminator>
<sift>
<appender name="FILE-${taskAppId}" class="ch.qos.logback.core.FileAppender">
<file>${log.base}/${taskAppId}.log</file>
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
<append>true</append>
</appender>
</sift>
</appender>
<appender name="WORKERLOGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${log.base}/escheduler-worker.log</file>
<filter class="cn.escheduler.server.worker.log.WorkerLogFilter">
<level>INFO</level>
</filter>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${log.base}/escheduler-worker.%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
<maxHistory>168</maxHistory>
<maxFileSize>200MB</maxFileSize>
</rollingPolicy>
     
<encoder>
<pattern>
[%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n
</pattern>
<charset>UTF-8</charset>
</encoder>
  
</appender>
<root level="INFO">
<appender-ref ref="TASKLOGFILE"/>
<appender-ref ref="WORKERLOGFILE"/>
</root>
</configuration>

25
dockerfile/conf/escheduler/conf/zookeeper.properties

@ -0,0 +1,25 @@
#zookeeper cluster
zookeeper.quorum=127.0.0.1:2181
#escheduler root directory
zookeeper.escheduler.root=/escheduler
#zookeeper server directory
zookeeper.escheduler.dead.servers=/escheduler/dead-servers
zookeeper.escheduler.masters=/escheduler/masters
zookeeper.escheduler.workers=/escheduler/workers
#zookeeper lock directory
zookeeper.escheduler.lock.masters=/escheduler/lock/masters
zookeeper.escheduler.lock.workers=/escheduler/lock/workers
#escheduler failover directory
zookeeper.escheduler.lock.failover.masters=/escheduler/lock/failover/masters
zookeeper.escheduler.lock.failover.workers=/escheduler/lock/failover/workers
zookeeper.escheduler.lock.failover.startup.masters=/escheduler/lock/failover/startup-masters
#zookeeper timeout and retry settings
zookeeper.session.timeout=300
zookeeper.connection.timeout=300
zookeeper.retry.sleep=1000
zookeeper.retry.maxtime=5
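
All znode paths in this file sit under the value of `zookeeper.escheduler.root`, so changing the root presumably means changing the other entries with it. A small sketch of a consistency check, assuming a plain `key=value` parse; `parse_properties` and `paths_outside_root` are hypothetical helpers for illustration only.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def paths_outside_root(props, root_key="zookeeper.escheduler.root"):
    """Return the keys whose znode path does not sit under the configured root."""
    root = props[root_key].rstrip("/")
    return sorted(
        key
        for key, value in props.items()
        if key.startswith("zookeeper.escheduler.")
        and key != root_key
        and not value.startswith(root + "/")
    )


# Hypothetical config with one path left behind after a root change.
config = """
zookeeper.quorum=127.0.0.1:2181
zookeeper.escheduler.root=/escheduler
zookeeper.escheduler.masters=/escheduler/masters
zookeeper.escheduler.lock.workers=/other/lock/workers
"""
print(paths_outside_root(parse_properties(config)))
```

An empty result means every `zookeeper.escheduler.*` path is consistent with the root.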

263
dockerfile/conf/maven/settings.xml

@ -0,0 +1,263 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!--
| This is the configuration file for Maven. It can be specified at two levels:
|
| 1. User Level. This settings.xml file provides configuration for a single user,
| and is normally provided in ${user.home}/.m2/settings.xml.
|
| NOTE: This location can be overridden with the CLI option:
|
| -s /path/to/user/settings.xml
|
| 2. Global Level. This settings.xml file provides configuration for all Maven
| users on a machine (assuming they're all using the same Maven
| installation). It's normally provided in
| ${maven.home}/conf/settings.xml.
|
| NOTE: This location can be overridden with the CLI option:
|
| -gs /path/to/global/settings.xml
|
| The sections in this sample file are intended to give you a running start at
| getting the most out of your Maven installation. Where appropriate, the default
| values (values used when the setting is not specified) are provided.
|
|-->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
<!-- localRepository
| The path to the local repository maven will use to store artifacts.
|
| Default: ${user.home}/.m2/repository
<localRepository>/path/to/local/repo</localRepository>
-->
<!-- interactiveMode
| This will determine whether maven prompts you when it needs input. If set to false,
| maven will use a sensible default value, perhaps based on some other setting, for
| the parameter in question.
|
| Default: true
<interactiveMode>true</interactiveMode>
-->
<!-- offline
| Determines whether maven should attempt to connect to the network when executing a build.
| This will have an effect on artifact downloads, artifact deployment, and others.
|
| Default: false
<offline>false</offline>
-->
<!-- pluginGroups
| This is a list of additional group identifiers that will be searched when resolving plugins by their prefix, i.e.
| when invoking a command line like "mvn prefix:goal". Maven will automatically add the group identifiers
| "org.apache.maven.plugins" and "org.codehaus.mojo" if these are not already contained in the list.
|-->
<pluginGroups>
<!-- pluginGroup
| Specifies a further group identifier to use for plugin lookup.
<pluginGroup>com.your.plugins</pluginGroup>
-->
</pluginGroups>
<!-- proxies
| This is a list of proxies which can be used on this machine to connect to the network.
| Unless otherwise specified (by system property or command-line switch), the first proxy
| specification in this list marked as active will be used.
|-->
<proxies>
<!-- proxy
| Specification for one proxy, to be used in connecting to the network.
|
<proxy>
<id>optional</id>
<active>true</active>
<protocol>http</protocol>
<username>proxyuser</username>
<password>proxypass</password>
<host>proxy.host.net</host>
<port>80</port>
<nonProxyHosts>local.net|some.host.com</nonProxyHosts>
</proxy>
-->
</proxies>
<!-- servers
| This is a list of authentication profiles, keyed by the server-id used within the system.
| Authentication profiles can be used whenever maven must make a connection to a remote server.
|-->
<servers>
<!-- server
| Specifies the authentication information to use when connecting to a particular server, identified by
| a unique name within the system (referred to by the 'id' attribute below).
|
| NOTE: You should either specify username/password OR privateKey/passphrase, since these pairings are
| used together.
|
<server>
<id>deploymentRepo</id>
<username>repouser</username>
<password>repopwd</password>
</server>
-->
<!-- Another sample, using keys to authenticate.
<server>
<id>siteServer</id>
<privateKey>/path/to/private/key</privateKey>
<passphrase>optional; leave empty if not used.</passphrase>
</server>
-->
</servers>
<!-- mirrors
| This is a list of mirrors to be used in downloading artifacts from remote repositories.
|
| It works like this: a POM may declare a repository to use in resolving certain artifacts.
| However, this repository may have problems with heavy traffic at times, so people have mirrored
| it to several places.
|
| That repository definition will have a unique id, so we can create a mirror reference for that
| repository, to be used as an alternate download site. The mirror site will be the preferred
| server for that repository.
|-->
<mirrors>
<!-- mirror
| Specifies a repository mirror site to use instead of a given repository. The repository that
| this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
| for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
|
<mirror>
<id>mirrorId</id>
<mirrorOf>repositoryId</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://my.repository.com/repo/path</url>
</mirror>
-->
<mirror>
<id>nexus-aliyun</id>
<mirrorOf>central</mirrorOf>
<name>Nexus aliyun</name>
<url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
</mirrors>
<!-- profiles
| This is a list of profiles which can be activated in a variety of ways, and which can modify
| the build process. Profiles provided in the settings.xml are intended to provide local machine-
| specific paths and repository locations which allow the build to work in the local environment.
|
| For example, if you have an integration testing plugin - like cactus - that needs to know where
| your Tomcat instance is installed, you can provide a variable here such that the variable is
| dereferenced during the build process to configure the cactus plugin.
|
| As noted above, profiles can be activated in a variety of ways. One way - the activeProfiles
| section of this document (settings.xml) - will be discussed later. Another way essentially
| relies on the detection of a system property, either matching a particular value for the property,
| or merely testing its existence. Profiles can also be activated by JDK version prefix, where a
| value of '1.4' might activate a profile when the build is executed on a JDK version of '1.4.2_07'.
| Finally, the list of active profiles can be specified directly from the command line.
|
| NOTE: For profiles defined in the settings.xml, you are restricted to specifying only artifact
| repositories, plugin repositories, and free-form properties to be used as configuration
| variables for plugins in the POM.
|
|-->
<profiles>
<!-- profile
| Specifies a set of introductions to the build process, to be activated using one or more of the
| mechanisms described above. For inheritance purposes, and to activate profiles via <activatedProfiles/>
| or the command line, profiles have to have an ID that is unique.
|
| An encouraged best practice for profile identification is to use a consistent naming convention
| for profiles, such as 'env-dev', 'env-test', 'env-production', 'user-jdcasey', 'user-brett', etc.
| This will make it more intuitive to understand what the set of introduced profiles is attempting
| to accomplish, particularly when you only have a list of profile id's for debug.
|
| This profile example uses the JDK version to trigger activation, and provides a JDK-specific repo.
<profile>
<id>jdk-1.4</id>
<activation>
<jdk>1.4</jdk>
</activation>
<repositories>
<repository>
<id>jdk14</id>
<name>Repository for JDK 1.4 builds</name>
<url>http://www.myhost.com/maven/jdk14</url>
<layout>default</layout>
<snapshotPolicy>always</snapshotPolicy>
</repository>
</repositories>
</profile>
-->
<!--
| Here is another profile, activated by the system property 'target-env' with a value of 'dev',
| which provides a specific path to the Tomcat instance. To use this, your plugin configuration
| might hypothetically look like:
|
| ...
| <plugin>
| <groupId>org.myco.myplugins</groupId>
| <artifactId>myplugin</artifactId>
|
| <configuration>
| <tomcatLocation>${tomcatPath}</tomcatLocation>
| </configuration>
| </plugin>
| ...
|
| NOTE: If you just wanted to inject this configuration whenever someone set 'target-env' to
| anything, you could just leave off the <value/> inside the activation-property.
|
<profile>
<id>env-dev</id>
<activation>
<property>
<name>target-env</name>
<value>dev</value>
</property>
</activation>
<properties>
<tomcatPath>/path/to/tomcat/instance</tomcatPath>
</properties>
</profile>
-->
</profiles>
<!-- activeProfiles
| List of profiles that are active for all builds.
|
<activeProfiles>
<activeProfile>alwaysActiveProfile</activeProfile>
<activeProfile>anotherAlwaysActiveProfile</activeProfile>
</activeProfiles>
-->
</settings>

12
conf/escheduler.conf → dockerfile/conf/nginx/default.conf

@ -1,23 +1,23 @@
server { server {
listen 8888;# access port listen 8888;
server_name localhost; server_name localhost;
#charset koi8-r; #charset koi8-r;
#access_log /var/log/nginx/host.access.log main; #access_log /var/log/nginx/host.access.log main;
location / { location / {
root /opt/escheduler/front/server; # static files directory root /opt/easyscheduler_source/escheduler-ui/dist;
index index.html index.html; index index.html index.html;
} }
location /escheduler { location /escheduler {
proxy_pass http://127.0.0.1:12345; # API address proxy_pass http://127.0.0.1:12345;
proxy_set_header Host $host; proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Real-IP $remote_addr;
proxy_set_header x_real_ipP $remote_addr; proxy_set_header x_real_ipP $remote_addr;
proxy_set_header remote_addr $remote_addr; proxy_set_header remote_addr $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_http_version 1.1; proxy_http_version 1.1;
proxy_connect_timeout 4s; proxy_connect_timeout 300s;
proxy_read_timeout 30s; proxy_read_timeout 300s;
proxy_send_timeout 12s; proxy_send_timeout 300s;
proxy_set_header Upgrade $http_upgrade; proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade"; proxy_set_header Connection "upgrade";
} }

2
conf/zoo.cfg → dockerfile/conf/zookeeper/zoo.cfg

@ -26,5 +26,3 @@ clientPort=2181
# Purge task interval in hours # Purge task interval in hours
# Set to "0" to disable auto purge feature # Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1 #autopurge.purgeInterval=1
dataDir=/opt/zookeeper/data
dataLogDir=/opt/zookeeper/logs

8
dockerfile/hooks/build

@ -0,0 +1,8 @@
#!/bin/bash
echo "------ escheduler start - build -------"
printenv
docker build --build-arg version=$version --build-arg tar_version=$tar_version -t $DOCKER_REPO:$version .
echo "------ escheduler end - build -------"

8
dockerfile/hooks/push

@ -0,0 +1,8 @@
#!/bin/bash
echo "------ push start -------"
printenv
docker push $DOCKER_REPO:$version
echo "------ push end -------"

81
dockerfile/startup.sh

@ -0,0 +1,81 @@
#! /bin/bash
set -e
if [ `netstat -anop|grep mysql|wc -l` -gt 0 ];then
echo "MySQL is Running."
else
MYSQL_ROOT_PWD="root@123"
ESZ_DB="escheduler"
echo "Starting MySQL service"
chown -R mysql:mysql /var/lib/mysql /var/run/mysqld
find /var/lib/mysql -type f -exec touch {} \; && service mysql restart && sleep 10
if [ ! -f /nohup.out ];then
echo "Setting MySQL password"
mysql --user=root --password=root -e "UPDATE mysql.user set authentication_string=password('$MYSQL_ROOT_PWD') where user='root'; FLUSH PRIVILEGES;"
echo "Setting MySQL privileges"
mysql --user=root --password=$MYSQL_ROOT_PWD -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '$MYSQL_ROOT_PWD' WITH GRANT OPTION; FLUSH PRIVILEGES;"
echo "Creating escheduler database"
mysql --user=root --password=$MYSQL_ROOT_PWD -e "CREATE DATABASE IF NOT EXISTS \`$ESZ_DB\` CHARACTER SET utf8 COLLATE utf8_general_ci; FLUSH PRIVILEGES;"
echo "Importing MySQL data"
nohup /opt/escheduler/script/create_escheduler.sh &
sleep 90
fi
if [ `mysql --user=root --password=$MYSQL_ROOT_PWD -s -r -e "SELECT count(TABLE_NAME) FROM information_schema.TABLES WHERE TABLE_SCHEMA='escheduler';" | grep -v count` -eq 38 ];then
echo "\`$ESZ_DB\` table count is correct"
else
echo "\`$ESZ_DB\` table count is incorrect"
mysql --user=root --password=$MYSQL_ROOT_PWD -e "DROP DATABASE \`$ESZ_DB\`;"
echo "Creating escheduler database"
mysql --user=root --password=$MYSQL_ROOT_PWD -e "CREATE DATABASE IF NOT EXISTS \`$ESZ_DB\` CHARACTER SET utf8 COLLATE utf8_general_ci; FLUSH PRIVILEGES;"
echo "Importing MySQL data"
nohup /opt/escheduler/script/create_escheduler.sh &
sleep 90
fi
fi
/opt/zookeeper/bin/zkServer.sh restart
sleep 90
echo "Starting api-server"
/opt/escheduler/bin/escheduler-daemon.sh stop api-server
/opt/escheduler/bin/escheduler-daemon.sh start api-server
echo "Starting master-server"
/opt/escheduler/bin/escheduler-daemon.sh stop master-server
python /opt/escheduler/script/del_zk_node.py 127.0.0.1 /escheduler/masters
/opt/escheduler/bin/escheduler-daemon.sh start master-server
echo "Starting worker-server"
/opt/escheduler/bin/escheduler-daemon.sh stop worker-server
python /opt/escheduler/script/del_zk_node.py 127.0.0.1 /escheduler/workers
/opt/escheduler/bin/escheduler-daemon.sh start worker-server
echo "Starting logger-server"
/opt/escheduler/bin/escheduler-daemon.sh stop logger-server
/opt/escheduler/bin/escheduler-daemon.sh start logger-server
echo "Starting alert-server"
/opt/escheduler/bin/escheduler-daemon.sh stop alert-server
/opt/escheduler/bin/escheduler-daemon.sh start alert-server
echo "Starting nginx"
/etc/init.d/nginx stop
nginx &
while true
do
sleep 101
done
exec "$@"

16
docs/en_US/1.0.1-release.md

@ -0,0 +1,16 @@
Easy Scheduler Release 1.0.1
===
Easy Scheduler 1.0.1 is the second version in the 1.x series. The updates are as follows:
- Outlook TLS email support
- resolved the servlet and protobuf jar conflict
- creating a tenant now also creates the corresponding Linux user
- fixed the re-run time being negative
- stand-alone and cluster deployments can both be done with one click via install.sh
- added queue support interface
- added create_time and update_time fields to escheduler.t_escheduler_queue

49
docs/en_US/1.0.2-release.md

@ -0,0 +1,49 @@
Easy Scheduler Release 1.0.2
===
Easy Scheduler 1.0.2 is the third version in the 1.x series. This version adds an open scheduling API, worker grouping (specifying the machine group on which a task runs), task flow and service monitoring, and support for Oracle, ClickHouse, etc., as follows:
New features:
===
- [[EasyScheduler-79](https://github.com/analysys/EasyScheduler/issues/79)] open scheduling interface using token authentication, so that scheduling can be operated through the API.
- [[EasyScheduler-138](https://github.com/analysys/EasyScheduler/issues/138)] the machine (group) on which a task runs can be specified.
- [[EasyScheduler-139](https://github.com/analysys/EasyScheduler/issues/139)] task process monitoring, and Master, Worker and Zookeeper running status monitoring
- [[EasyScheduler-140](https://github.com/analysys/EasyScheduler/issues/140)] workflow definition: added process timeout alarm
- [[EasyScheduler-134](https://github.com/analysys/EasyScheduler/issues/134)] task types support Oracle, ClickHouse, SQL Server and Impala
- [[EasyScheduler-136](https://github.com/analysys/EasyScheduler/issues/136)] SQL task nodes can independently select CC mail users
- [[EasyScheduler-141](https://github.com/analysys/EasyScheduler/issues/141)] user management: users can bind queues. The user queue takes precedence over the tenant queue; if the user queue is empty, the tenant queue is used.
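
The queue rule described in EasyScheduler-141 amounts to a simple fallback. A minimal sketch, assuming a hypothetical `resolve_queue` helper and made-up queue names (neither is part of EasyScheduler):

```shell
#!/bin/bash
# Hypothetical helper: pick the queue for a task.
# The user-level queue wins; an empty user queue falls back to the tenant queue.
resolve_queue() {
  local user_queue="$1" tenant_queue="$2"
  echo "${user_queue:-$tenant_queue}"
}

resolve_queue "etl_fast" "default"   # user queue is set, so it is used
resolve_queue "" "default"           # user queue is empty, so the tenant queue is used
```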
Enhanced:
===
- [[EasyScheduler-154](https://github.com/analysys/EasyScheduler/issues/154)] Tenant code allows encoding of pure numbers or underscores
Repair:
===
- [[EasyScheduler-135](https://github.com/analysys/EasyScheduler/issues/135)] Python task can specify python version
- [[EasyScheduler-125](https://github.com/analysys/EasyScheduler/issues/125)] The mobile phone number check in the user account did not recognize China Unicom's new number prefix 166
- [[EasyScheduler-178](https://github.com/analysys/EasyScheduler/issues/178)] Fix subtle spelling mistakes in ProcessDao
- [[EasyScheduler-129](https://github.com/analysys/EasyScheduler/issues/129)] Tenant codes containing underscores and other special characters could not pass the check.
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi , chubbyjiang , coreychen , chgxtony, cmdares , datuzi , dingchao, fanguanqun , 风清扬, gaojun416 , googlechorme, hyperknob , hujiang75277381 , huanzui , kinssun, ivivi727 ,jimmy, jiangzhx , kevin5210 , lidongdai , lshmouse , lenboo, lyf198972 , lgcareer , lzy305 , moranrr , millionfor , mazhong8808, programlief, qiaozhanwei , roy110 , swxchappy , sherlock111 , samz406 , swxchappy, qq389401879 , lzy305, vkingnew, William-GuoWei , woniulinux, yyl861, zhangxin1988, yangjiajun2014, yangqinlong, yangjiajun2014, zhzhenqin, zhangluck, zhanghaicheng1, zhuyizhizhi
And many enthusiastic partners in the WeChat group! Thank you very much!

30
docs/en_US/1.0.3-release.md

@ -0,0 +1,30 @@
Easy Scheduler Release 1.0.3
===
Easy Scheduler 1.0.3 is the fourth version in the 1.x series.
Enhanced:
===
- [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) the mail header of SQL tasks now supports custom variables
- [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) if a SQL task fails to send mail, the SQL task itself is marked as failed
- [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) modified the replacement rule for custom variables in SQL tasks to support replacement inside multiple single quotes and double quotes.
- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) when creating a resource file, added a check for whether the resource file already exists on HDFS
Repair:
===
- [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) the process definition list is sorted according to the timing status and update time
- [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) fixed online file creation returning success even though the HDFS file was not created
- [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) fixes the problem that the job does not exist at the same time.
- [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) kills the child processes of a task when the task is killed
- [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) fixed an issue where the update time and size were not updated when updating resource files
- [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) fixed an issue where deleting a tenant failed if HDFS was not started when the tenant was deleted
- [[EasyScheduler-486]](https://github.com/analysys/EasyScheduler/issues/486) when the shell process exits while the YARN state is not yet final, wait before judging the result.
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879,
feloxx, coding-now, hymzcn, nysyxxg, chgxtony
And many enthusiastic partners in the WeChat group! Thank you very much!

2
docs/en_US/1.0.4-release.md

@ -0,0 +1,2 @@
# 1.0.4 release

2
docs/en_US/1.0.5-release.md

@ -0,0 +1,2 @@
# 1.0.5 release

55
docs/en_US/1.1.0-release.md

@ -0,0 +1,55 @@
Easy Scheduler Release 1.1.0
===
Easy Scheduler 1.1.0 is the first release in the 1.1.x series.
New features:
===
- [[EasyScheduler-391](https://github.com/analysys/EasyScheduler/issues/391)] run a process under a specified tenant user
- [[EasyScheduler-288](https://github.com/analysys/EasyScheduler/issues/288)] feature/qiye_weixin
- [[EasyScheduler-189](https://github.com/analysys/EasyScheduler/issues/189)] security support such as Kerberos
- [[EasyScheduler-398](https://github.com/analysys/EasyScheduler/issues/398)] administrator, with tenants (install.sh sets the default tenant), can create resources, projects and data sources (limited to one administrator)
- [[EasyScheduler-293](https://github.com/analysys/EasyScheduler/issues/293)] view and save the parameters selected when running a process
- [[EasyScheduler-401](https://github.com/analysys/EasyScheduler/issues/401)] scheduling is easily set to every second; after a schedule is configured, the next trigger time can be displayed on the page.
- [[EasyScheduler-493](https://github.com/analysys/EasyScheduler/pull/493)] added Kerberos authentication for data sources, updated the FAQ, and added resource upload to S3
Enhanced:
===
- [[EasyScheduler-227](https://github.com/analysys/EasyScheduler/issues/227)] upgrade spring-boot to 2.1.x and spring to 5.x
- [[EasyScheduler-434](https://github.com/analysys/EasyScheduler/issues/434)] number of worker nodes in Zookeeper and MySQL is inconsistent
- [[EasyScheduler-435](https://github.com/analysys/EasyScheduler/issues/435)] validation of the mailbox format
- [[EasyScheduler-441](https://github.com/analysys/EasyScheduler/issues/441)] prohibits running nodes from joining completed node detection
- [[EasyScheduler-400](https://github.com/analysys/EasyScheduler/issues/400)] home page: queue statistics are inconsistent and command statistics have no data
- [[EasyScheduler-395](https://github.com/analysys/EasyScheduler/issues/395)] for fault-tolerant recovery processes, the status cannot be **running**
- [[EasyScheduler-529](https://github.com/analysys/EasyScheduler/issues/529)] optimize polling tasks from Zookeeper
- [[EasyScheduler-242](https://github.com/analysys/EasyScheduler/issues/242)] worker-server node task fetching performance problem
- [[EasyScheduler-352](https://github.com/analysys/EasyScheduler/issues/352)] worker grouping, queue consumption problem
- [[EasyScheduler-461](https://github.com/analysys/EasyScheduler/issues/461)] when viewing data source parameters, account password information needs to be encrypted
- [[EasyScheduler-396](https://github.com/analysys/EasyScheduler/issues/396)] Dockerfile optimization, and linking the Dockerfile with GitHub to build images automatically
- [[EasyScheduler-389](https://github.com/analysys/EasyScheduler/issues/389)] service monitor cannot detect changes of master/worker
- [[EasyScheduler-511](https://github.com/analysys/EasyScheduler/issues/511)] support recovering a process from stopped/killed nodes.
- [[EasyScheduler-399](https://github.com/analysys/EasyScheduler/issues/399)] HadoopUtils performs operations as the specified user instead of the **deployment user**
Repair:
===
- [[EasyScheduler-394](https://github.com/analysys/EasyScheduler/issues/394)] When the master&worker is deployed on the same machine, if the master&worker service is restarted, the previously scheduled tasks cannot be scheduled.
- [[EasyScheduler-469](https://github.com/analysys/EasyScheduler/issues/469)] fix naming errors on the monitor page
- [[EasyScheduler-392](https://github.com/analysys/EasyScheduler/issues/392)] Feature request: fix email regex check
- [[EasyScheduler-405](https://github.com/analysys/EasyScheduler/issues/405)] schedule add/edit page: start time and end time cannot be the same
- [[EasyScheduler-517](https://github.com/analysys/EasyScheduler/issues/517)] complement - subworkflow - time parameter
- [[EasyScheduler-532](https://github.com/analysys/EasyScheduler/issues/532)] fixed Python nodes not executing
- [[EasyScheduler-543](https://github.com/analysys/EasyScheduler/issues/543)] optimize datasource connection params safety
- [[EasyScheduler-569](https://github.com/analysys/EasyScheduler/issues/569)] timed tasks can't really stop
- [[EasyScheduler-463](https://github.com/analysys/EasyScheduler/issues/463)] mailbox verification does not support very suffixed mailboxes
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, chgxtony, Stanfan, lfyee, thisnew, hujiang75277381, sunnyingit, lgbo-ustc, ivivi, lzy305, JackIllkid, telltime, lipengbo2018, wuchunfu, telltime
And many enthusiastic partners in the WeChat group! Thank you very much!

299
docs/en_US/EasyScheduler Proposal.md

@ -0,0 +1,299 @@
# EasyScheduler Proposal
## Abstract
EasyScheduler is a distributed ETL scheduling engine with a powerful DAG visualization interface. EasyScheduler focuses on solving the problem of 'complex task dependencies & triggers' in data processing. As its name suggests, we are dedicated to making the scheduling system work `out of the box`.
## Proposal
EasyScheduler provides many easy-to-use features to improve engineers' efficiency on data ETL workflow jobs. We propose the new concepts of 'instance of process' and 'instance of task' to let developers tune their jobs in the running state of a workflow instead of changing the task's template. Its main objectives are as follows:
- Define the complex tasks' dependencies & triggers in a DAG graph by dragging and dropping.
- Support cluster HA.
- Support multi-tenant and parallel or serial backfilling data.
- Support automatic retry and recovery of failed jobs.
- Support many data task types and process priority, task priority and relative task timeout alarm.
For now, EasyScheduler has a fairly large community in China.
It is also widely adopted by many [companies and organizations](https://github.com/analysys/EasyScheduler/issues/57) as their ETL scheduling tool.
We believe that bringing EasyScheduler into ASF could advance the development of a much stronger and more diverse open source community.
Analysys submits this proposal to donate EasyScheduler's source code and all related documentation to the Apache Software Foundation.
The code is already under Apache License Version 2.0.
- Code base: https://www.github.com/analysys/easyscheduler
- English Documentation: <https://analysys.github.io/easyscheduler_docs>
- Chinese Documentation: <https://analysys.github.io/easyscheduler_docs_cn>
## Background
We want to find a data processing tool with the following features:
- Easy to use: developers can build an ETL process with simple drag-and-drop operations. It is not only for ETL developers; people who can't write code, such as system administrators, can also use this tool for ETL operations.
- Solves the problem of "complex task dependencies" and can monitor the ETL running status.
- Support multi-tenant.
- Support many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc.
- Support HA and linear scalability.
For the above reasons, we realized that no existing product met our requirements, so we decided to develop this tool ourselves. We designed EasyScheduler at the end of 2017. The first internal version was completed in May 2018. We then iterated through several internal versions and the system gradually became stable.
We then open sourced EasyScheduler in March 2019. It soon attracted the interest of many ETL developers and many stars on GitHub.
## Rationale
Many organizations (>30) (refer to [Who is using EasyScheduler](https://github.com/analysys/EasyScheduler/issues/57)) already benefit from running EasyScheduler to make data processing pipelines easier. More than 100 [feature ideas](https://github.com/analysys/EasyScheduler/projects/1) have come from the EasyScheduler community. Some third-party projects also plan to integrate with EasyScheduler through task plugins, such as [Scriptis](https://github.com/WeBankFinTech/Scriptis) and [waterdrop](https://github.com/InterestingLab/waterdrop). These will strengthen the features of EasyScheduler.
## Current Status
### Meritocracy
EasyScheduler was incubated at Analysys in 2017 and open sourced on GitHub in March 2019. Since being open sourced, it has been quickly adopted by multiple organizations and has contributors and users from many companies. We have set up a Committer Team; new contributors are guided and reviewed by existing committers.
Contributions are always welcomed and highly valued.
### Community
We have now set up development teams for EasyScheduler in Analysys, and we already have external developers who have contributed code. We already have a user group of more than 1,000 people.
We hope to grow the base of contributors by inviting all those who offer contributions through The Apache Way.
Right now, we make use of github as code hosting as well as gitter for community communication.
### Core Developers
The core developers, including experienced senior developers, are often guided by mentors.
## Known Risks
### Orphaned products
EasyScheduler is widely adopted in China by many [companies and organizations](https://github.com/analysys/EasyScheduler/issues/57). The core developers of the EasyScheduler team plan to work full time on this project. Currently there are 10 use cases with more than 1000 active tasks per day running EasyScheduler in users' production environments. There is very little risk of EasyScheduler becoming orphaned: at least two large companies (xueqiu, fengjr) use it widely in production, and developers from these companies have joined EasyScheduler's team of contributors. EasyScheduler has had eight major releases so far and has received 373 pull requests from contributors, which further demonstrates that it is a very active project. We also plan to extend and diversify this community further through Apache.
Thus, it is very unlikely that EasyScheduler becomes orphaned.
### Inexperience with Open Source
EasyScheduler's core developers have been running it as a community-oriented open source project for some time. Several of them already have experience working with open source communities and are also active in Presto, Alluxio and other projects. At the same time, we will gain more open source experience by following the Apache Way during our incubator journey.
### Homogenous Developers
The current developers work across a variety of organizations including Analysys, guandata and hydee;
some individual developers are accepted as developers of EasyScheduler as well.
Considering that fengjr and sefonsoft have shown great interest in EasyScheduler, we plan to encourage them to contribute and invite them as contributors to work together.
### Reliance on Salaried Developers
At present, eight of the core developers are paid by their employer to contribute to EasyScheduler project.
We also have some other developers and researchers taking part in the project, and we will make efforts to increase the diversity of the contributors and actively lobby for domain experts in the workflow space to contribute.
### Relationships with Other Apache Products
EasyScheduler integrates Apache Zookeeper as one of its service registration/discovery mechanisms. EasyScheduler is deeply integrated with Apache products. It currently supports many task types such as Apache Hive, Apache Spark, Apache Hadoop, and so on.
### An Excessive Fascination with the Apache Brand
We recognize the value and reputation that the Apache brand will bring to EasyScheduler.
However, what matters more to us is that the community provided by the Apache Software Foundation will enable the project to achieve long-term stable development. EasyScheduler is therefore proposing to enter incubation at Apache in order to help efforts to diversify the community, not so much to capitalize on the Apache brand.
## Documentation
A complete set of EasyScheduler documentation is provided on GitHub in both English and Simplified Chinese.
- [English](https://github.com/analysys/easyscheduler_docs)
- [Chinese](https://github.com/analysys/easyscheduler_docs_cn)
## Initial Source
The project consists of three distinct codebases: the core and two documentation repositories (English and Chinese). The addresses of the existing git repositories are as follows:
- <https://github.com/analysys/easyscheduler>
- <https://github.com/analysys/easyscheduler_docs>
- <https://github.com/analysys/easyscheduler_docs_cn>
## Source and Intellectual Property Submission Plan
As soon as EasyScheduler is approved to join the Apache Incubator, Analysys will provide the Software Grant Agreement (SGA) and the initial committers will submit ICLA(s). The code is already licensed under the Apache Software License, version 2.0.
## External Dependencies
As all backend code dependencies are managed using Apache Maven, none of the external libraries need to be packaged in a source distribution.
Most of the dependencies have Apache-compatible licenses, and the core dependencies are as follows:
### Backend Dependency
| Dependency | License | Comments |
| ------------------------------------------------------ | ------------------------------------------------------------ | ------------- |
| bonecp-0.8.0.RELEASE.jar | Apache v2.0 | |
| byte-buddy-1.9.10.jar | Apache V2.0 | |
| c3p0-0.9.1.1.jar | GNU LESSER GENERAL PUBLIC LICENSE | will remove |
| curator-*-2.12.0.jar | Apache V2.0 | |
| druid-1.1.14.jar | Apache V2.0 | |
| fastjson-1.2.29.jar | Apache V2.0 | |
| fastutil-6.5.6.jar | Apache V2.0 | |
| grpc-*-1.9.0.jar | Apache V2.0 | |
| gson-2.8.5.jar | Apache V2.0 | |
| guava-20.0.jar | Apache V2.0 | |
| guice-*3.0.jar | Apache V2.0 | |
| hadoop-*-2.7.3.jar | Apache V2.0 | |
| hbase-*-1.1.1.jar | Apache V2.0 | |
| hive-*-2.1.0.jar | Apache V2.0 | |
| instrumentation-api-0.4.3.jar | Apache V2.0 | |
| jackson-*-2.9.8.jar | Apache V2.0 | |
| jackson-jaxrs-1.8.3.jar | LGPL Version 2.1 Apache V2.0 | will remove |
| jackson-xc-1.8.3.jar | LGPL Version 2.1 Apache V2.0 | will remove |
| javax.activation-api-1.2.0.jar | CDDL/GPLv2+CE | will remove |
| javax.annotation-api-1.3.2.jar | CDDL + GPLv2 with classpath exception | will remove |
| javax.servlet-api-3.1.0.jar | CDDL + GPLv2 with classpath exception | will remove |
| jaxb-*.jar | (CDDL 1.1) (GPL2 w/ CPE) | will remove |
| jersey-*-1.9.jar | CDDL+GPLv2 | will remove |
| jetty-*-9.4.14.v20181114.jar | Apache V2.0,EPL 1.0 | |
| jna-4.5.2.jar | Apache V2.0,LGPL 2.1 | will remove |
| jna-platform-4.5.2.jar | Apache V2.0,LGPL 2.1 | will remove |
| jsp-api-2.x.jar | CDDL,GPL 2.0 | will remove |
| log4j-1.2.17.jar | Apache V2.0 | |
| log4j-*-2.11.2.jar | Apache V2.0 | |
| logback-x.jar | dual-license EPL 1.0,LGPL 2.1 | |
| mail-1.4.5.jar | CDDL+GPLv2 | will remove |
| mybatis-3.5.1.jar | Apache V2.0 | |
| mybatis-spring-*2.0.1.jar | Apache V2.0 | |
| mysql-connector-java-5.1.34.jar | GPL 2.0 | will remove |
| netty-*-4.1.33.Final.jar | Apache V2.0 | |
| oshi-core-3.5.0.jar | EPL 1.0 | |
| parquet-hadoop-bundle-1.8.1.jar | Apache V2.0 | |
| postgresql-42.1.4.jar | BSD 2-clause | |
| protobuf-java-*3.5.1.jar | BSD 3-clause | |
| quartz-2.2.3.jar | Apache V2.0 | |
| quartz-jobs-2.2.3.jar | Apache V2.0 | |
| slf4j-api-1.7.5.jar | MIT | |
| spring-*-5.1.5.RELEASE.jar | Apache V2.0 | |
| spring-beans-5.1.5.RELEASE.jar | Apache V2.0 | |
| spring-boot-*2.1.3.RELEASE.jar | Apache V2.0 | |
| springfox-*-2.9.2.jar | Apache V2.0 | |
| stringtemplate-3.2.1.jar | BSD | |
| swagger-annotations-1.5.20.jar | Apache V2.0 | |
| swagger-bootstrap-ui-1.9.3.jar | Apache V2.0 | |
| swagger-models-1.5.20.jar | Apache V2.0 | |
| zookeeper-3.4.8.jar | Apache | |
The front-end UI currently relies on many components, and the core dependencies are as follows:
### UI Dependency
| Dependency | License | Comments |
| ------------------------------------------------------- | ------------------------------------ | ----------- |
| autoprefixer | MIT | |
| babel-core | MIT | |
| babel-eslint | MIT | |
| babel-helper-* | MIT | |
| babel-helpers | MIT | |
| babel-loader | MIT | |
| babel-plugin-syntax-* | MIT | |
| babel-plugin-transform-* | MIT | |
| babel-preset-env | MIT | |
| babel-runtime | MIT | |
| bootstrap | MIT | |
| canvg | MIT | |
| clipboard | MIT | |
| codemirror | MIT | |
| copy-webpack-plugin | MIT | |
| cross-env | MIT | |
| css-loader | MIT | |
| cssnano | MIT | |
| cyclist | MIT | |
| d3 | BSD-3-Clause | |
| dayjs | MIT | |
| echarts | Apache V2.0 | |
| env-parse | ISC | |
| extract-text-webpack-plugin | MIT | |
| file-loader | MIT | |
| globby | MIT | |
| html-loader | MIT | |
| html-webpack-ext-plugin | MIT | |
| html-webpack-plugin | MIT | |
| html2canvas | MIT | |
| jsplumb | (MIT OR GPL-2.0) | |
| lodash | MIT | |
| node-sass | MIT | |
| optimize-css-assets-webpack-plugin | MIT | |
| postcss-loader | MIT | |
| rimraf | ISC | |
| sass-loader | MIT | |
| uglifyjs-webpack-plugin | MIT | |
| url-loader | MIT | |
| util.promisify | MIT | |
| vue | MIT | |
| vue-loader | MIT | |
| vue-style-loader | MIT | |
| vue-template-compiler | MIT | |
| vuex-router-sync | MIT | |
| watchpack | MIT | |
| webpack | MIT | |
| webpack-dev-server | MIT | |
| webpack-merge | MIT | |
| xmldom | MIT,LGPL | will remove |
## Required Resources
### Git Repositories
- <https://github.com/analysys/EasyScheduler.git>
- <https://github.com/analysys/easyscheduler_docs.git>
- <https://github.com/analysys/easyscheduler_docs_cn.git>
### Issue Tracking
The community would like to continue using GitHub Issues.
### Continuous Integration tool
Jenkins
### Mailing Lists
- EasyScheduler-dev: for development discussions
- EasyScheduler-private: for PPMC discussions
- EasyScheduler-notifications: for user notifications
## Initial Committers
- William-GuoWei(guowei20m@outlook.com)
- Lidong Dai(lidong.dai@outlook.com)
- Zhanwei Qiao(qiaozhanwei@outlook.com)
- Liang Bao(baoliang.leon@gmail.com)
- Gang Li(lgcareer2019@outlook.com)
- Zijian Gong(quanquansy@gmail.com)
- Jun Gao(gaojun2048@gmail.com)
- Baoqi Wu(wubaoqi@gmail.com)
## Affiliations
- Analysys Inc: William-GuoWei,Zhanwei Qiao,Liang Bao,Gang Li,Jun Gao,Lidong Dai
- Hydee Inc: Zijian Gong
- Guandata Inc: Baoqi Wu
## Sponsors
### Champion
- Sheng Wu ( Apache Incubator PMC, [wusheng@apache.org](mailto:wusheng@apache.org))
### Mentors
- Sheng Wu ( Apache Incubator PMC, [wusheng@apache.org](mailto:wusheng@apache.org))
- ShaoFeng Shi ( Apache Incubator PMC, [shaofengshi@apache.org](mailto:shaofengshi@apache.org))
- Liang Chen ( Apache Software Foundation Member, [chenliang613@apache.org](mailto:chenliang613@apache.org))
### Sponsoring Entity
We expect the Apache Incubator to sponsor this project.

284
docs/en_US/EasyScheduler-FAQ.md

@ -0,0 +1,284 @@
## Q: EasyScheduler service introduction and recommended running memory
A: EasyScheduler consists of 5 backend services (MasterServer, WorkerServer, ApiServer, AlertServer and LoggerServer) plus the UI.
| Service | Description |
| ------------------------- | ------------------------------------------------------------ |
| MasterServer | Mainly responsible for DAG segmentation and task status monitoring |
| WorkerServer/LoggerServer | Mainly responsible for the submission, execution and update of task status. LoggerServer is used for Rest Api to view logs through RPC |
| ApiServer | Provides the Rest Api service for the UI to call |
| AlertServer | Provide alarm service |
| UI | Front page display |
Note: **Because of the large number of services, a single-machine deployment should preferably have 4 cores and 16 GB of memory or more.**
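
As a rough illustration, the backend services in the table can be restarted one by one with the daemon script. The sketch below assumes the `/opt/escheduler/bin/escheduler-daemon.sh` path and the service names used by the Docker startup script in this repository; `restart_cmds` is a hypothetical helper that only prints the commands, so they can be reviewed before anything is executed.

```shell
#!/bin/bash
# Hypothetical helper: print the stop/start commands for each backend service.
# The daemon path and service names are taken from the Docker startup script
# and may differ in other installations.
ESCHEDULER_DAEMON="${ESCHEDULER_DAEMON:-/opt/escheduler/bin/escheduler-daemon.sh}"
SERVICES="api-server master-server worker-server logger-server alert-server"

restart_cmds() {
  local svc
  for svc in $SERVICES; do
    echo "$ESCHEDULER_DAEMON stop $svc"
    echo "$ESCHEDULER_DAEMON start $svc"
  done
}

restart_cmds
```

Once the printed commands look right, they can be executed with `restart_cmds | sh`.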
---
## Q: Why can't an administrator create a project?
A: The administrator is currently "**pure management**". It has no tenant, that is, no corresponding user on Linux, and therefore no execution permission. **As a result it owns no projects, resources or data sources,** and has no permission to create them. **It does have all viewing permissions.** If you need to perform business operations such as creating a project, **use the administrator to create a tenant and a normal user, and then log in as the normal user to operate**. We will open up creation and execution permissions for the administrator in version 1.1.0, after which the administrator will have all permissions.
---
## Q: Which mailboxes does the system support?
A: Most mailboxes are supported, including qq, 163, 126, 139, outlook and aliyun. TLS and SSL are supported and can be optionally configured in alert.properties.
---
## Q: What are the common system variable time parameters and how do I use them?
A: Please refer to 'System parameter' in the system-manual
---
## Q: `pip install kazoo` gives an error. Is it necessary to install it?
A: Yes. kazoo is the library Python needs to connect to ZooKeeper, so it must be installed.
---
## Q: How do I specify the machine that runs a task?
A: Use **the administrator** to create a Worker group, then **specify the Worker group** when the **process definition starts**, or **specify the Worker group on the task node**. If none is specified, Default is used: **one of all the Workers in the cluster is selected for task submission and execution.**
---
## Q: Priority of the task
A: **Both processes and tasks can be given a priority**. There are five levels: **HIGHEST, HIGH, MEDIUM, LOW and LOWEST**. **You can set the priority between different process instances, or between different task instances in the same process instance.** For details, please refer to the task priority design in the architecture-design.
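As a rough illustration of how the five levels could order waiting work, here is a minimal Java sketch. Only the level names come from the answer above. The `PriorityDemo` and `Item` classes, the `seq` counter and the first-in, first-out tie-break between equal priorities are assumptions made for this example, not EasyScheduler's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only: these are not EasyScheduler classes.
final class PriorityDemo {
    // Declared from most to least urgent, so ordinal() gives the sort order.
    enum Priority { HIGHEST, HIGH, MEDIUM, LOW, LOWEST }

    static final class Item {
        final String name;
        final Priority priority;
        final long seq; // arrival order, used as a FIFO tie-break (assumption)

        Item(String name, Priority priority, long seq) {
            this.name = name;
            this.priority = priority;
            this.seq = seq;
        }
    }

    // Returns the item names in the order a priority queue would hand them out.
    static List<String> drainOrder(List<Item> items) {
        PriorityQueue<Item> queue = new PriorityQueue<>(
            Comparator.<Item>comparingInt(i -> i.priority.ordinal())
                      .thenComparingLong(i -> i.seq));
        queue.addAll(items);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        items.add(new Item("report", Priority.LOW, 1));
        items.add(new Item("etl-a", Priority.MEDIUM, 2));
        items.add(new Item("alert", Priority.HIGHEST, 3));
        items.add(new Item("etl-b", Priority.MEDIUM, 4));
        System.out.println(drainOrder(items)); // [alert, etl-a, etl-b, report]
    }
}
```

The two MEDIUM items keep their arrival order, which is the behaviour one would expect if equal priorities fall back to first in, first out.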
----
## Q: Escheduler-grpc gives an error
A: Execute in the root directory: mvn -U clean package assembly:assembly -Dmaven.test.skip=true , then refresh the entire project
----
## Q: Does EasyScheduler support running on windows?
A: In theory, **only the Worker needs to run on Linux**. Other services can run normally on Windows. But it is still recommended to deploy on Linux.
-----
## Q: Compiling the UI on Linux fails at node-sass with: Error: EACCES: permission denied, mkdir xxxx
A: Run **npm install node-sass --unsafe-perm** on its own first, then run **npm install**
---
## Q: UI cannot log in normally.
A: 1, If the UI is started with node, check whether API_BASE in the .env file under escheduler-ui is the Api Server service address.
2, If the UI is served by nginx and was installed via **install-escheduler-ui.sh**, check whether proxy_pass in **/etc/nginx/conf.d/escheduler.conf** is the Api Server service address.
3, If the above configuration is correct, check whether the Api Server service is healthy: run curl http://192.168.xx.xx:12345/escheduler/users/get-user-info and check the Api Server log. If it shows cn.escheduler.api.interceptor.LoginHandlerInterceptor:[76] - session info is null, the Api Server service is running normally.
4, If none of the above reveals a problem, check whether **server.context-path and server.port** in **application.properties** are correct.
---
## Q: After the process definition is manually started or scheduled, no process instance is generated.
A: 1, First **check with jps whether the MasterServer service exists**, or check directly in service monitoring whether a master service is registered in zk.
2, If there is a master service, check **the command status statistics** or whether new records have been added to **t_escheduler_error_command**. If so, **check the message field.**
---
## Q : The task status stays at "submitted successfully".
A: 1, **First check with jps whether the WorkerServer service exists**, or check directly in service monitoring whether a worker service is registered in zk.
2, If the **WorkerServer** service is normal, **check whether the MasterServer has put the task into the zk queue: look in the MasterServer log and the zk queue to see whether the task is blocked.**
3, If none of the above reveals a problem, check whether a Worker group is specified but **none of the machines in that Worker group is online**.
---
## Q: Is there a Docker image and a Dockerfile?
A: Yes, both a Docker image and a Dockerfile are provided.
Docker image address: https://hub.docker.com/r/escheduler/escheduler_images
Dockerfile address: https://github.com/qiaozhanwei/escheduler_dockerfile/tree/master/docker_escheduler
------
## Q : What should I pay attention to in install.sh?
A: 1, If a replacement variable contains special characters, **escape them with the \ escape character**.
2, installPath="/data1_1T/escheduler": **this directory must not be the same as the directory from which install.sh is currently being run for the one-click install.**
3, deployUser="escheduler": **the deployment user must have sudo privileges**, because the worker executes tasks via sudo -u tenant sh xxx.command
4, monitorServerState="false": whether to start the service monitoring script. It is not started by default. **If it is started, the master and worker services are checked every 5 minutes and automatically restarted if they are down.**
5, hdfsStartupSate="false": whether to enable the HDFS resource upload function. It is not enabled by default, **in which case the resource center cannot be used.** If enabled, you need to configure fs.defaultFS and the yarn settings in conf/common/hadoop/hadoop.properties. If you use namenode HA, you need to copy core-site.xml and hdfs-site.xml to the conf root directory.
Note: **The 1.0.x versions do not create the hdfs root directory automatically. You need to create it yourself, and the deployment user needs hdfs operation permission.**
---
## Q : Process definition and process instance offline exception
A : For **versions prior to 1.0.4**, modify the deleteJob method in the cn.escheduler.api.quartz package of escheduler-api as follows:
```
public boolean deleteJob(String jobName, String jobGroupName) {
    lock.writeLock().lock();
    try {
        JobKey jobKey = new JobKey(jobName, jobGroupName);
        if (scheduler.checkExists(jobKey)) {
            logger.info("try to delete job, job name: {}, job group name: {},", jobName, jobGroupName);
            return scheduler.deleteJob(jobKey);
        } else {
            return true;
        }
    } catch (SchedulerException e) {
        logger.error(String.format("delete job : %s failed", jobName), e);
    } finally {
        lock.writeLock().unlock();
    }
    return false;
}
```
---
## Q: Can the tenant created before the HDFS startup use the resource center normally?
A: No. A tenant created while HDFS was not enabled has no tenant directory registered in HDFS, so uploading resources will report an error.
## Q: In a multi-Master, multi-Worker deployment, how is fault tolerance handled when a service is lost?
A: **Note:** **The Master monitors both Master and Worker services.**
1, If a Master service is lost, another Master takes over the processes of the failed Master and continues to monitor the Worker task status.
2, If a Worker service is lost, the Master detects that it is gone. If there is a Yarn task, it is killed and then retried.
Please see the fault-tolerant design in the architecture for details.
---
## Q : Fault tolerance when Master and Worker are deployed on the same machine
A: Version 1.0.3 only implements fault tolerance at Master startup and does not handle Worker fault tolerance. That is, if the Worker goes down while no Master exists, the affected process will have problems. Master and Worker startup fault tolerance will be added in version **1.1.0** to fix this. To work around it manually, **for the process that was running across the restart, set its lost running Worker tasks to the failed state, and set the running process itself to the failed state**. Then resume the process from the failed node.
---
## Q : It is easy to accidentally set a timing to execute every second
A : Take care when setting the timing. If the first field of the expression (* * * * * ? *) is set to *, the timing fires every second. **A list of the upcoming scheduled times will be added in version 1.1.0.** Meanwhile, you can check the next 5 run times online at http://cron.qqe2.com/
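A quick way to catch this before saving a timing is to look at the first field of the expression. The sketch below assumes the Quartz-style field order (seconds, minutes, hours, day-of-month, month, day-of-week, optional year). `CronCheck` and `secondsFieldIsWildcard` are hypothetical names for this example and are not part of EasyScheduler or Quartz.

```java
// Hypothetical helper for illustration: not part of EasyScheduler or Quartz.
final class CronCheck {
    // Assumes Quartz-style field order:
    // seconds minutes hours day-of-month month day-of-week [year]
    static boolean secondsFieldIsWildcard(String cron) {
        String[] fields = cron.trim().split("\\s+");
        if (fields.length < 6 || fields.length > 7) {
            throw new IllegalArgumentException("expected 6 or 7 fields: " + cron);
        }
        // A "*" in the seconds position means "every second" within the
        // minutes and hours selected by the remaining fields.
        return "*".equals(fields[0]);
    }

    public static void main(String[] args) {
        // Every second: almost certainly not what was intended.
        System.out.println(secondsFieldIsWildcard("* * * * * ? *")); // true
        // At 02:00:00 every day.
        System.out.println(secondsFieldIsWildcard("0 0 2 * * ? *")); // false
    }
}
```

The check only flags the wildcard form shown in the answer. Other spellings of "every second" would need their own handling.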
## Q: Is there a valid time range for timing?
A: Yes. **If the start time and end time of a timing are the same, the timing is invalid. If the end time is earlier than the current time, the timing will most likely be deleted automatically.**
## Q : How many ways are there to implement task dependencies?
A: 1, Task dependencies within a **DAG**: the DAG is split starting **from the nodes with zero in-degree**.
2, **Task dependent nodes**, which implement dependencies on tasks or processes across processes. Please refer to the (DEPENDENT) node design in the system-manual.
Note: **Cross-project process or task dependencies are not supported**
## Q: How many ways are there to start a process definition?
A: 1, In **the process definition list**, click the **Start** button.
2, In **the process definition list, add a timer** so that the process definition is started on schedule.
3, On the process definition **view or edit** DAG page, **right click any task node** and start the process definition.
4, When editing the DAG of a process definition, you can set the run flag of some tasks to **prohibit running**. When the process definition is started, those nodes and their connections are removed from the DAG.
## Q : Setting the Python version for Python tasks
A: 1, **For versions after 1.0.3**, you only need to modify PYTHON_HOME in conf/env/.escheduler_env.sh
```
export PYTHON_HOME=/bin/python
```
Note: **PYTHON_HOME** here is the absolute path of the python command itself, not a home directory in the usual sense. For the same reason, when exporting PATH, add $PYTHON_HOME directly, without appending /bin:
```
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
```
2, For versions prior to 1.0.3, Python tasks only support the system's Python version. Specifying a Python version is not supported.
## Q: A Worker task spawns child processes through sudo -u tenant sh xxx.command. Are they killed when the task is killed?
A: Killing a task will be added in 1.0.4 and will kill all the child processes spawned by the task.
## Q : How are queues used in EasyScheduler, and what do the user queue and tenant queue mean?
A : A queue in EasyScheduler can be configured on the user or on the tenant. **The queue specified on the user takes priority over the tenant queue.** For example, to specify a queue for an MR task, the queue is passed through mapreduce.job.queuename.
Note: For the queue specified this way to take effect, the MR program must parse its arguments as follows:
```
Configuration conf = new Configuration();
GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
String[] remainingArgs = optionParser.getRemainingArgs();
```
For a Spark task, specify the queue with the --queue option.
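The precedence rule can be pictured with a small Java sketch. The `QueueResolver` class and its methods are hypothetical and only illustrate "user queue first, tenant queue as fallback". How EasyScheduler actually assembles the command line is not shown in this FAQ, so the argument layout below (Hadoop's generic `-D property=value` option and Spark's `--queue` option) is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not EasyScheduler's real code.
final class QueueResolver {
    // The user's queue wins when it is set; otherwise fall back to the tenant's queue.
    static String resolve(String userQueue, String tenantQueue) {
        if (userQueue != null && !userQueue.trim().isEmpty()) {
            return userQueue.trim();
        }
        return tenantQueue;
    }

    // Generic option that GenericOptionsParser copies into the job Configuration.
    static List<String> mapReduceArgs(String queue) {
        return Arrays.asList("-D", "mapreduce.job.queuename=" + queue);
    }

    // spark-submit option for choosing the YARN queue.
    static List<String> sparkArgs(String queue) {
        return Arrays.asList("--queue", queue);
    }

    public static void main(String[] args) {
        String queue = resolve("etl_user_queue", "tenant_default");
        System.out.println(mapReduceArgs(queue)); // [-D, mapreduce.job.queuename=etl_user_queue]
        System.out.println(sparkArgs(resolve("", "tenant_default"))); // [--queue, tenant_default]
    }
}
```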
## Q : Master or Worker reports the following alarm
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/master_worker_lack_res.png" width="60%" />
</p>
A : Lower the value of **master.reserved.memory** in master.properties under conf, for example to 0.1, or lower the value of **worker.reserved.memory** in worker.properties, for example to 0.1
## Q: The hive version is 1.1.0+cdh5.15.0, and the hive SQL task reports a connection error.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/cdh_hive_error.png" width="60%" />
</p>
A : In the pom, change the hive-jdbc dependency from
```
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.0</version>
</dependency>
```
to
```
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.1.0</version>
</dependency>
```

96
docs/en_US/README.md

@ -0,0 +1,96 @@
Easy Scheduler
============
[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
[![Total Lines](https://tokei.rs/b1/github/analysys/EasyScheduler?category=lines)](https://github.com/analysys/EasyScheduler)
> Easy Scheduler for Big Data
[![Stargazers over time](https://starchart.cc/analysys/EasyScheduler.svg)](https://starchart.cc/analysys/EasyScheduler)
[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)
### Design features:
A distributed and easily extensible visual DAG workflow scheduling system. It is dedicated to solving complex dependencies in data processing, so that the scheduling system works `out of the box` for data processing.
Its main objectives are as follows:
- Associate tasks in a DAG according to their dependencies, and visualize the running state of each task in real time.
- Support for many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc.
- Support process scheduling, dependency scheduling, manual scheduling, manual pause/stop/recovery, failure retry/alarm, recovery from specified nodes, killing tasks, etc.
- Support process priority, task priority, task failover and task timeout alarm/failure
- Support process global parameters and node custom parameter settings
- Support online upload/download of resource files, management, etc. Support online file creation and editing
- Support task log online viewing and scrolling, online download log, etc.
- Implement cluster HA, decentralize Master cluster and Worker cluster through Zookeeper
- Support online viewing of `Master/Worker` cpu load, memory
- Support process running history tree/gantt chart display, support task status statistics, process status statistics
- Support backfilling data
- Support multi-tenant
- Support internationalization
- More features are waiting for partners to explore
### What's in Easy Scheduler
Stability | Easy to use | Features | Scalability
-- | -- | -- | --
Decentralized multi-master and multi-worker | Visualized process definitions show key information such as task status, task type, retry times, task running machine and visual variables at a glance. | Supports pause and recovery operations | Supports custom task types
HA is supported by itself | All process definition operations are visual: drag tasks to draw DAGs and configure data sources and resources. An API mode is also provided for third-party systems. | Users on easyscheduler can be mapped many-to-one or one-to-one to Hadoop users through tenants, which is very important for scheduling big data jobs. | Scheduling is distributed, and the overall scheduling capability grows linearly with the size of the cluster. Masters and Workers can go online and offline dynamically.
Overload processing: with the task queue mechanism, the number of schedulable tasks on a single machine can be flexibly configured. When there are too many tasks they are cached in the task queue and do not jam the machine. | One-click deployment | Supports traditional shell tasks as well as big data platform task scheduling: MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Procedure, Sub_Process |
### System partial screenshot
![image](https://user-images.githubusercontent.com/48329107/61368744-1f5f3b00-a8c1-11e9-9cf1-10f8557a6b3b.png)
![image](https://user-images.githubusercontent.com/48329107/61368966-9dbbdd00-a8c1-11e9-8dcc-a9469d33583e.png)
![image](https://user-images.githubusercontent.com/48329107/61372146-f347b800-a8c8-11e9-8882-66e8934ada23.png)
### Document
- <a href="https://analysys.github.io/easyscheduler_docs/backend-deployment.html" target="_blank">Backend deployment documentation</a>
- <a href="https://analysys.github.io/easyscheduler_docs/frontend-deployment.html" target="_blank">Front-end deployment documentation</a>
- [**User manual**](https://analysys.github.io/easyscheduler_docs/system-manual.html?_blank "User manual")
- [**Upgrade document**](https://analysys.github.io/easyscheduler_docs/upgrade.html?_blank "Upgrade document")
- <a href="http://52.82.13.76:8888" target="_blank">Online Demo</a>
For more documentation, please refer to <a href="https://analysys.github.io/easyscheduler_docs/" target="_blank">[EasyScheduler online documentation]</a>
### Recent R&D plan
Work plan of Easy Scheduler: [R&D plan](https://github.com/analysys/EasyScheduler/projects/1), where the `In Develop` cards are the features for version 1.1.0 and the TODO cards are items still to be done (including feature ideas)
### How to contribute code
Contributions are welcome. Please refer to the code submission process:
[[How to contribute code](https://github.com/analysys/EasyScheduler/issues/310)]
### Thanks
Easy Scheduler uses many excellent open source projects, such as google guava, guice, grpc, netty, ali bonecp, quartz, and many Apache open source projects.
Easy Scheduler was only possible because it stands on the shoulders of these open source projects. We are very grateful for all the open source software we use! We also hope to be not only beneficiaries of open source but also contributors to it, so we decided to open-source Easy Scheduler and have committed to long-term updates. We also hope that partners who share the same passion and conviction for open source will join in and contribute!
### Get Help
The fastest way to get a response from our developers is to submit an issue, or add our WeChat: 510570367
### License
Please refer to [LICENSE](https://github.com/analysys/EasyScheduler/blob/dev/LICENSE) file.

50
docs/en_US/SUMMARY.md

@ -0,0 +1,50 @@
# Summary
* [Instruction](README.md)
* Frontend Deployment
* [Preparations](frontend-deployment.md#Preparations)
* [Deployment](frontend-deployment.md#Deployment)
* [FAQ](frontend-deployment.md#FAQ)
* Backend Deployment
* [Preparations](backend-deployment.md#Preparations)
* [Deployment](backend-deployment.md#Deployment)
* [Quick Start](quick-start.md#Quick Start)
* System Use Manual
* [Operational Guidelines](system-manual.md#Operational Guidelines)
* [Security](system-manual.md#Security)
* [Monitor center](system-manual.md#Monitor center)
* [Task Node Type and Parameter Setting](system-manual.md#Task Node Type and Parameter Setting)
* [System parameter](system-manual.md#System parameter)
* [Architecture Design](architecture-design.md)
* Front-end development
* [Development environment](frontend-development.md#Development environment)
* [Project directory structure](frontend-development.md#Project directory structure)
* [System function module](frontend-development.md#System function module)
* [Routing and state management](frontend-development.md#Routing and state management)
* [specification](frontend-development.md#specification)
* [interface](frontend-development.md#interface)
* [Extended development](frontend-development.md#Extended development)
* Backend development documentation
* [Environmental requirements](backend-development.md#Environmental requirements)
* [Project compilation](backend-development.md#Project compilation)
* [Interface documentation](http://52.82.13.76:8888/escheduler/doc.html?language=en_US&lang=en)
* FAQ
* [FAQ](EasyScheduler-FAQ.md)
* EasyScheduler upgrade documentation
* [upgrade documentation](upgrade.md)
* History release notes
* [1.1.0 release](1.1.0-release.md)
* [1.0.5 release](1.0.5-release.md)
* [1.0.4 release](1.0.4-release.md)
* [1.0.3 release](1.0.3-release.md)
* [1.0.2 release](1.0.2-release.md)
* [1.0.1 release](1.0.1-release.md)
* [1.0.0 release]

316
docs/en_US/architecture-design.md

@ -0,0 +1,316 @@
## Architecture Design
Before explaining the architecture of the scheduling system, let us first go through its common terms.
### 1. Terminology
**DAG:** Short for Directed Acyclic Graph. Tasks in a workflow are assembled into a directed acyclic graph, which is traversed topologically starting from the nodes with zero in-degree until there are no successor nodes left. For example:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/dag_examples_cn.jpg" alt="DAG example" width="60%" />
<p align="center">
<em>dag example</em>
</p>
</p>
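The traversal described in the DAG definition (start from the nodes with zero in-degree and keep going until no successors remain) is what Kahn's algorithm does. The Java sketch below is a generic illustration of that technique. `DagTraversal` and its adjacency-map input are assumptions for this example and do not reflect how EasyScheduler stores its DAGs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Generic Kahn's-algorithm sketch: not EasyScheduler's DAG classes.
final class DagTraversal {
    // edges maps each node to the list of its direct successors.
    static List<String> topologicalOrder(Map<String, List<String>> edges) {
        // Count incoming edges for every node.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String node : edges.keySet()) {
            inDegree.put(node, 0);
        }
        for (List<String> successors : edges.values()) {
            for (String s : successors) {
                inDegree.merge(s, 1, Integer::sum);
            }
        }
        // Start from the nodes that depend on nothing.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            // "Finishing" a node releases its successors one edge at a time.
            for (String s : edges.getOrDefault(node, Collections.<String>emptyList())) {
                if (inDegree.merge(s, -1, Integer::sum) == 0) {
                    ready.add(s);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("A", Arrays.asList("B", "C"));
        edges.put("B", Arrays.asList("D"));
        edges.put("C", Arrays.asList("D"));
        edges.put("D", Collections.<String>emptyList());
        System.out.println(topologicalOrder(edges)); // [A, B, C, D]
    }
}
```

In this example B and C become ready at the same time once A is done, which is the point at which a scheduler could run them in parallel.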
**Process definition**: A visual **DAG** formed by dragging task nodes and establishing associations between them
**Process instance**: A process instance is an instantiation of a process definition, generated by manual startup or by scheduling. Each run of a process definition generates a new process instance
**Task instance**: A task instance is the instantiation of a specific task node when a process instance runs, and indicates the execution status of that task
**Task type**: Currently supports SHELL, SQL, SUB_PROCESS (sub-process), PROCEDURE, MR, SPARK, PYTHON and DEPENDENT (dependency), with dynamic plug-in extension planned. Note: a **SUB_PROCESS** is itself a separate process definition that can be started on its own
**Schedule mode**: The system supports timed scheduling based on cron expressions and manual scheduling. Supported command types: start workflow, start execution from current node, resume fault-tolerant workflow, resume paused process, start execution from failed node, complement, timer, rerun, pause, stop, resume waiting thread. Of these, **resume fault-tolerant workflow** and **resume waiting thread** are used internally by the scheduler and cannot be called externally
**Timed schedule**: The system uses the **quartz** distributed scheduler and supports visual generation of cron expressions
**Dependency**: The system supports not only simple **DAG** dependencies between predecessor and successor nodes, but also **task dependency** nodes, which support **custom task dependencies between processes**
**Priority**: Supports the priority of process instances and task instances. If no priority is set for process instances and task instances, the default is first in, first out.
**Mail Alert**: Supports emailing **SQL task** query results, email alerts for process instance run results, and fault-tolerance alert notifications
**Failure policy**: For tasks running in parallel, two policies are provided for handling a failed task. **Continue** means the other parallel tasks keep running regardless of the failure until they finish, and the process then ends as failed. **End** means that as soon as a failed task is found, the running parallel tasks are killed and the process ends.
**Complement**: Backfills historical data. Two complement methods are supported: **parallel and serial over the date interval**
### 2.System architecture
#### 2.1 System Architecture Diagram
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/62609545-8f973480-b934-11e9-9a58-d8133222f14d.png" alt="System Architecture Diagram" />
<p align="center">
<em>System Architecture Diagram</em>
</p>
</p>
#### 2.2 Architectural description
* **MasterServer**
MasterServer adopts a distributed, non-central design. It is mainly responsible for splitting DAG tasks, submitting and monitoring tasks, and monitoring the health of other MasterServers and WorkerServers.
When the MasterServer service starts, it registers a temporary node with Zookeeper and listens for state changes of the Zookeeper temporary nodes to perform fault tolerance.
##### The service mainly contains:
- **Distributed Quartz**: the distributed scheduling component, mainly responsible for starting and stopping scheduled tasks. When quartz picks up a task, a thread pool inside the Master handles the subsequent operations of the task.
- **MasterSchedulerThread** is a scan thread that periodically scans the **command** table in the database and performs different business operations according to the **command type**
- **MasterExecThread** is mainly responsible for DAG task splitting, task submission monitoring, and the logic for the various command types
- **MasterTaskExecThread** is mainly responsible for task persistence
* **WorkerServer**
- WorkerServer also adopts a distributed, non-central design. It is mainly responsible for executing tasks and providing log services. When the WorkerServer service starts, it registers a temporary node with Zookeeper and maintains a heartbeat.
##### This service contains:
- **FetchTaskThread** is mainly responsible for continuously fetching tasks from the **Task Queue** and calling the **TaskScheduleThread** executor that corresponds to each task type.
- **LoggerServer** is an RPC service that provides functions such as log fragment viewing, refresh and download.
- **ZooKeeper**
ZooKeeper service: the MasterServer and WorkerServer nodes in the system all use ZooKeeper for cluster management and fault tolerance. The system also uses ZooKeeper for event monitoring and distributed locking.
We also implemented queues based on Redis at one point, but we want EasyScheduler to depend on as few components as possible, so the Redis implementation was eventually removed.
- **Task Queue**
Provides task queue operations. The queue is currently also implemented on Zookeeper. Because little information is stored per queue entry, there is no need to worry about the queue holding too much data. In fact, we have stress-tested the queue with millions of entries, with no effect on system stability or performance.
- **Alert**
Provides alarm-related interfaces, which mainly cover the storage, query and notification of the two types of alarm data. There are two notification methods: **mail notification** and **SNMP (not yet implemented)**.
- **API**
The API layer is mainly responsible for processing requests from the front-end UI layer. The service exposes a RESTful API to serve external requests.
Interfaces include workflow creation, definition, query, modification, release, offline, manual start, stop, pause, resume, start execution from a given node, and more.
- **UI**
The front-end page of the system provides various visual operation interfaces of the system. For details, see the **[System User Manual] (System User Manual.md)** section.
#### 2.3 Architectural Design Ideas
##### I. Decentralization vs. centralization
###### Centralization
The centralized design is relatively simple. The nodes in the distributed cluster are divided into two roles, Master and Slave:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/master_slave.png" alt="master-slave role" width="50%" />
</p>
- The Master is mainly responsible for distributing tasks and supervising the health of the Slaves. It can dynamically balance tasks across Slaves so that no Slave node is either "too busy" or "too idle".
- The Slave is mainly responsible for executing tasks and maintaining a heartbeat with the Master so that the Master can assign tasks to it.
Problems with the centralized design:
- Once the Master fails, the cluster is leaderless and collapses as a whole. To solve this, most Master/Slave architectures adopt a primary and standby Master design, with hot or cold standby and automatic or manual switching, and more and more new systems can automatically elect and switch the Master to improve availability.
- Another problem concerns where the Scheduler runs. If it is on the Master, different tasks in one DAG can run on different machines, but the Master becomes overloaded. If it is on the Slave, all tasks in a DAG can only be submitted on one machine, and with many parallel tasks the pressure on that Slave may be high.
###### Decentralization
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/decentralization.png" alt="decentralized" width="50%" />
</p>
- In a decentralized design there is usually no Master/Slave concept: all roles are the same and all nodes are equal. The global Internet is a typical decentralized distributed system, in which the failure of any networked node only affects a small range of functions.
- The core of decentralized design is that no "manager" distinct from the other nodes exists anywhere in the distributed system, so there is no single point of failure. However, because there is no "manager" node, each node has to communicate with other nodes to obtain the necessary machine information, and the unreliability of distributed communication greatly increases the difficulty of implementing this.
- In fact, truly decentralized distributed systems are rare. Instead, dynamically centralized distributed systems keep emerging. In this architecture, the managers in the cluster are elected dynamically rather than preset, and when a failure occurs the nodes of the cluster spontaneously hold a "meeting" to elect a new "manager" to preside over the work. The most typical examples are ZooKeeper and Etcd, which is implemented in Go.
- Decentralization in EasyScheduler means that Masters and Workers register with ZooKeeper. Neither the Master cluster nor the Worker cluster has a center, and a Zookeeper distributed lock is used to elect one Master or Worker as the "manager" to perform a task.
##### II. Distributed lock practice
EasyScheduler uses ZooKeeper distributed locks to ensure that only one Master runs the Scheduler at any given time, or that only one Worker performs task submission.
1. The core flow for obtaining a distributed lock is as follows
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/distributed_lock.png" alt="Get Distributed Lock Process" width="50%" />
</p>
2. Flow chart of the distributed lock implementation for the Scheduler thread in EasyScheduler:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/distributed_lock_procss.png" alt="Get Distributed Lock Process" width="50%" />
</p>
##### III. Circular waiting caused by insufficient threads
- If a DAG has no sub-process and the number of Commands exceeds the threshold set by the thread pool, the process simply waits or fails.
- If a large DAG nests many sub-processes, the situation in the following figure leads to a "dead wait" state:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/lack_thread.png" alt="Thread is not enough to wait for loop" width="50%" />
</p>
In the above figure, MainFlowThread waits for SubFlowThread1 to end, SubFlowThread1 waits for SubFlowThread2 to end, SubFlowThread2 waits for SubFlowThread3 to end, and SubFlowThread3 waits for a free thread in the thread pool. The whole DAG process can therefore never end, so none of its threads can be released. Parent and child processes are now waiting on each other in a loop. At this point the scheduling cluster is unusable unless a new Master is started to add threads and break the deadlock.
Starting a new Master to break the deadlock is unsatisfactory, so we considered the following three options to reduce this risk:
1. Calculate the total number of threads across all Masters and the number of threads each DAG requires, that is, pre-calculate before the DAG process is executed. Because the thread pools are spread over multiple Masters, the total number of threads is unlikely to be obtainable in real time.
2. Check the thread pool of the single Master. If the thread pool is full, let the thread fail directly.
3. Add an "insufficient resources" Command type. If the thread pool is insufficient, the main process is suspended. The thread pool then has a free thread again, and the process suspended for insufficient resources can be woken up and executed again.
Note: The Master Scheduler thread fetches Commands in FIFO order.
So we chose the third option to solve the problem of insufficient threads.
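One possible reading of the third option can be simulated in a few lines of Java. This is a single-threaded model under stated assumptions, not EasyScheduler's implementation: `MasterPool`, `startSubProcess` and `finish` are hypothetical names, and "suspending" a parent is modelled as handing its thread slot to the child and parking the parent in a FIFO queue until a slot frees up.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical single-threaded simulation: not EasyScheduler's real classes.
final class MasterPool {
    private final int capacity;                                  // thread pool size of one Master
    private final Set<String> running = new LinkedHashSet<>();   // processes holding a thread
    private final Deque<String> suspended = new ArrayDeque<>();  // parked processes, FIFO

    MasterPool(int capacity) {
        this.capacity = capacity;
    }

    // Start a top-level process if a thread is free.
    boolean start(String process) {
        if (running.size() < capacity) {
            running.add(process);
            return true;
        }
        return false;
    }

    // The parent needs a sub-process. If no thread is free, the parent is
    // suspended so that the child can use its thread instead of blocking forever.
    void startSubProcess(String parent, String child) {
        if (!running.contains(parent)) {
            throw new IllegalStateException(parent + " is not running");
        }
        if (running.size() >= capacity) {
            running.remove(parent);
            suspended.add(parent);
        }
        running.add(child);
    }

    // A process ends and frees its thread. The oldest suspended process, if any,
    // is woken up on that thread. Returns the woken process or null.
    String finish(String process) {
        if (!running.remove(process)) {
            throw new IllegalStateException(process + " is not running");
        }
        String woken = suspended.poll();
        if (woken != null) {
            running.add(woken);
        }
        return woken;
    }

    Set<String> running() {
        return Collections.unmodifiableSet(running);
    }

    public static void main(String[] args) {
        MasterPool pool = new MasterPool(1);        // a pool with a single thread
        pool.start("main");
        pool.startSubProcess("main", "sub1");       // main is suspended, sub1 runs
        System.out.println(pool.running());         // [sub1]
        System.out.println(pool.finish("sub1"));    // main
    }
}
```

With a blocking wait, the same one-thread pool would never finish: the parent would hold the only thread while waiting for a child that can never start.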
##### IV. Fault Tolerant Design
Fault tolerance is divided into service fault tolerance and task retry. Service fault tolerance is divided into two types: Master Fault Tolerance and Worker Fault Tolerance.
###### 1. Downtime fault tolerance
Service fault tolerance design relies on ZooKeeper's Watcher mechanism. The implementation principle is as follows:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant.png" alt="EasyScheduler Fault Tolerant Design" width="40%" />
</p>
The Master watches the directories of other Masters and Workers. When a remove event is detected, it performs fault tolerance on the process instance or the task instance, depending on the specific business logic.
- Master fault tolerance flow chart:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant_master.png" alt="Master Fault Tolerance Flowchart" width="40%" />
</p>
After ZooKeeper Master fault tolerance completes, the process is rescheduled by the Scheduler thread in EasyScheduler. It traverses the DAG to find tasks in the "Running" and "Submit Successful" states. For "Running" tasks, it monitors the status of the task instance. For "Submit Successful" tasks, it checks whether the task already exists in the Task Queue: if it does, it monitors the status of the task instance; if not, it resubmits the task instance.
- Worker fault tolerance flow chart:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant_worker.png" alt="Worker Fault Tolerance Flowchart" width="40%" />
</p>
Once the Master Scheduler thread finds a task instance marked as "needs fault tolerance", it takes over the task and resubmits it.
Note: "Network jitter" may cause a node to lose its heartbeat with ZooKeeper for a short time, which triggers the node's remove event. We handle this in the simplest way: once a node's connection to ZooKeeper times out, the Master or Worker service on that node is stopped directly.
###### 2. Task failure retry
Here we must first distinguish between three concepts: task failure retry, process failure recovery, and process failure rerun:
- Task failure retry is task level and is performed automatically by the scheduling system. For example, if a Shell task sets the number of retries to 3, the task is retried up to 3 times after it fails.
- Process failure recovery is process level and is done manually. Recovery can only start **from the failed node** or **from the current node**.
- Process failure rerun is also process level and is done manually. A rerun starts from the start node.
Back to the topic: we divide the task nodes in a workflow into two types.
- One is the business node, which corresponds to an actual script or processing statement, such as a Shell node, an MR node, a Spark node, a dependent node, and so on.
- The other is the logical node, which does not run an actual script or statement but handles the logic of the overall process flow, such as a sub-process node.
Each **business node** can be configured with a number of failure retries. When the task node fails, it is retried automatically until it succeeds or exceeds the configured number of retries. A **logical node** does not support failure retry, but the tasks inside a logical node do.
If a task in the workflow fails and has reached its maximum number of retries, the workflow fails and stops. The failed workflow can then be manually rerun or recovered.
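The retry rule for business nodes can be sketched in Java as follows. This is a minimal illustration, not the scheduler's implementation: `TaskRetrySketch`, `Result`, and `runWithRetry` are hypothetical names, and the task is modeled as a `BooleanSupplier` that returns `true` on success.

```java
import java.util.function.BooleanSupplier;

/** Sketch: run a task once, then retry it automatically up to a configured limit. */
final class TaskRetrySketch {

    /** Outcome of running a task with automatic retries. */
    static final class Result {
        final boolean success;
        final int attempts; // total runs, including the first one

        Result(boolean success, int attempts) {
            this.success = success;
            this.attempts = attempts;
        }
    }

    /**
     * Runs the task, retrying after each failure until it succeeds
     * or maxRetryTimes retries have been used up.
     */
    static Result runWithRetry(BooleanSupplier task, int maxRetryTimes) {
        int attempts = 0;
        while (true) {
            attempts++;
            if (task.getAsBoolean()) {
                return new Result(true, attempts);
            }
            // attempts - 1 retries have been used so far
            if (attempts > maxRetryTimes) {
                return new Result(false, attempts);
            }
        }
    }
}
```

With `maxRetryTimes` set to 3, a task that always fails runs four times in total (one initial run plus three retries) before the failure is reported, at which point the workflow would fail and stop.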
##### V. Task priority design
In the early scheduling design, there was no priority design or fair scheduling design. A task submitted first could end up completing at the same time as a task submitted later, and there was no way to set the priority of a process or task. We have redesigned this, and the current design is as follows:
- Tasks are processed from high to low in this order: **priority across different process instances** takes precedence over **priority within the same process instance**, which takes precedence over **task priority within the same process**, which takes precedence over **submission order of tasks within the same process**.
- In the implementation, the priority is parsed from the JSON of the task instance, and then the string **process instance priority_process instance id_task priority_task id** is saved in the ZooKeeper task queue. When tasks are fetched from the queue, a string comparison yields the task that should be executed first.
- Process definition priority covers the case where some processes need to be handled before others. It can be configured when the process is started manually or on a schedule. There are 5 levels, in order: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST, as shown below:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/process_priority.png" alt="Process Priority Configuration" width="40%" />
</p>
- Task priority is also divided into 5 levels, in order: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST, as shown below:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/task_priority.png" alt="task priority configuration" width="35%" />
</p>
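To make the string-comparison idea concrete, here is a small Java sketch of the queue key described above. It is an illustration under assumptions: the class `TaskPriorityKeySketch` and the `key` helper are hypothetical, an in-memory `PriorityQueue` stands in for the ZooKeeper task queue, and encoding each level as its ordinal (HIGHEST = 0) is an assumption chosen so that higher priorities sort first.

```java
import java.util.PriorityQueue;

/** Sketch: order tasks by comparing their priority key strings. */
final class TaskPriorityKeySketch {

    /** The five levels from the text; ordinal 0 sorts first. */
    enum Priority { HIGHEST, HIGH, MEDIUM, LOW, LOWEST }

    /** Builds "processInstancePriority_processInstanceId_taskPriority_taskId". */
    static String key(Priority processPriority, int processInstanceId,
                      Priority taskPriority, int taskId) {
        return processPriority.ordinal() + "_" + processInstanceId + "_"
                + taskPriority.ordinal() + "_" + taskId;
    }

    public static void main(String[] args) {
        // Natural String ordering stands in for the string comparison on the queue.
        PriorityQueue<String> queue = new PriorityQueue<>();
        queue.add(key(Priority.MEDIUM, 7, Priority.HIGH, 21));
        queue.add(key(Priority.HIGHEST, 9, Priority.LOW, 30));
        queue.add(key(Priority.MEDIUM, 7, Priority.HIGHEST, 22));
        while (!queue.isEmpty()) {
            System.out.println(queue.poll());
        }
    }
}
```

In this run the task from the HIGHEST-priority process instance is polled first, and within process instance 7 the HIGHEST-priority task comes before the HIGH one. Note that plain string comparison is lexicographic, so ids with different digit counts (for example 9 and 10) do not compare numerically; the sketch does not address that.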
##### VI. Logback and gRPC implement log access
- Because the Web (UI) and the Worker are not necessarily on the same machine, viewing a log is not as simple as querying a local file. There are two options:
  - Put the logs on the ES search engine
  - Obtain remote log information through gRPC communication
- To keep EasyScheduler as lightweight as possible, gRPC was chosen to implement remote access to log information.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/grpc.png" alt="grpc remote access" width="50%" />
</p>
- We use a custom Logback FileAppender and Filter function to generate a log file for each task instance.
- The main implementation of FileAppender is as follows:
```java
import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.FileAppender;

/**
 * task log appender
 */
public class TaskLogAppender extends FileAppender<ILoggingEvent> {
    ...
    @Override
    protected void append(ILoggingEvent event) {
        if (currentlyActiveFile == null) {
            currentlyActiveFile = getFile();
        }
        String activeFile = currentlyActiveFile;
        // thread name: taskThreadName-processDefineId_processInstanceId_taskInstanceId
        String threadName = event.getThreadName();
        String[] threadNameArr = threadName.split("-");
        // logId = processDefineId_processInstanceId_taskInstanceId
        String logId = threadNameArr[1];
        ...
        super.subAppend(event);
    }
}
```
The log is generated at a path of the form `/process definition id/process instance id/task instance id.log`
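The mapping from thread name to log path, which the elided part of the appender performs, might look like the following Java sketch. The helper `logPathFor`, the class `TaskLogPathSketch`, and the `baseDir` parameter are assumptions for illustration; only the thread-name format and the resulting path layout come from the text above.

```java
/** Sketch: derive a per-task log path from the task thread's name. */
final class TaskLogPathSketch {

    /**
     * Maps "prefix-defId_instId_taskId" to "baseDir/defId/instId/taskId.log".
     */
    static String logPathFor(String baseDir, String threadName) {
        String[] nameParts = threadName.split("-");
        if (nameParts.length != 2) {
            throw new IllegalArgumentException("unexpected thread name: " + threadName);
        }
        // logId = processDefineId_processInstanceId_taskInstanceId
        String[] ids = nameParts[1].split("_");
        if (ids.length != 3) {
            throw new IllegalArgumentException("unexpected log id: " + nameParts[1]);
        }
        return baseDir + "/" + ids[0] + "/" + ids[1] + "/" + ids[2] + ".log";
    }
}
```

For example, `logPathFor("/logs", "TaskLogInfo-1_2_3")` yields `/logs/1/2/3.log`. Validating the two splits up front keeps a malformed thread name from producing an `ArrayIndexOutOfBoundsException` deep inside the appender.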
- Filter matches the thread name starting with TaskLogInfo:
- TaskLogFilter is implemented as follows:
```java
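import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.filter.Filter;
import ch.qos.logback.core.spi.FilterReply;

/**
 * task log filter
 */
public class TaskLogFilter extends Filter<ILoggingEvent> {
    @Override
    public FilterReply decide(ILoggingEvent event) {
        // accept only events logged from task threads
        if (event.getThreadName().startsWith("TaskLogInfo-")) {
            return FilterReply.ACCEPT;
        }
        return FilterReply.DENY;
    }
}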
```
### Summary
Starting from scheduling, this article introduces the architecture principles and implementation ideas of EasyScheduler, a distributed workflow scheduling system for big data. To be continued.

207
docs/en_US/backend-deployment.md

@ -0,0 +1,207 @@
# Backend Deployment Document
There are two deployment modes for the backend:
- automatic deployment
- compile the source code, then deploy
## Preparations
Download the latest version of the installation package from [gitee download](https://gitee.com/easyscheduler/EasyScheduler/attach_files/) or [github download](https://github.com/analysys/EasyScheduler/releases). Download escheduler-backend-x.x.x.tar.gz (the back-end, referred to as escheduler-backend) and escheduler-ui-x.x.x.tar.gz (the front-end, referred to as escheduler-ui).
#### Preparations 1: Installation of basic software (install the mandatory items yourself)
* [MySQL](http://geek.analysys.cn/topic/124) (5.5+) : Mandatory
* [JDK](https://www.oracle.com/technetwork/java/javase/downloads/index.html) (1.8+) : Mandatory
* [ZooKeeper](https://www.jianshu.com/p/de90172ea680) (3.4.6+) : Mandatory
* [Hadoop](https://blog.csdn.net/Evankaka/article/details/51612437) (2.6+) : Optional. Required if you need the resource upload function or MapReduce task submission (uploaded resource files are currently stored on HDFS)
* [Hive](https://staroon.pro/2017/12/09/HiveInstall/) (1.2.1) : Optional. Required for Hive task submission
* Spark (1.x, 2.x) : Optional. Required for Spark task submission
* PostgreSQL (8.2.15+) : Optional. Required for running PostgreSQL stored procedures
```
Note: EasyScheduler itself does not depend on Hadoop, Hive, Spark, or PostgreSQL. It only calls their clients to run the corresponding tasks.
```
#### Preparations 2: Create deployment users
- Create the deployment user on every machine where the scheduler will be deployed. Because the worker service executes jobs with `sudo -u {linux-user}`, the deployment user needs sudo privileges without a password.
```
vi /etc/sudoers
# For example, the deployment user is an escheduler account
escheduler ALL=(ALL) NOPASSWD: ALL
# And you need to comment out the Defaults requiretty line
#Defaults requiretty
```
#### Preparations 3: Passwordless SSH Configuration
Configure passwordless SSH login from the deployment machine to the other installation machines. If EasyScheduler will also be installed on the deployment machine itself, configure passwordless login to the local machine as well.
- [Connect the host and other machines SSH](http://geek.analysys.cn/topic/113)
#### Preparations 4: database initialization
* Create databases and accounts
Execute the following command to create database and account
```
CREATE DATABASE escheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON escheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
GRANT ALL PRIVILEGES ON escheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';
flush privileges;
```
* Create tables and import basic data
Modify the following attributes in ./conf/dao/data_source.properties
```
spring.datasource.url
spring.datasource.username
spring.datasource.password
```
Execute scripts for creating tables and importing basic data
```
sh ./script/create-escheduler.sh
```
#### Preparations 5: Modify the deployment directory permissions and operation parameters
Description of the escheduler-backend directory
```directory
bin : Basic service startup scripts
conf : Project configuration files
lib : Jar packages the project depends on, including the individual module jars and third-party jars
script : Cluster start and stop scripts, and service monitor start and stop scripts
sql : SQL files the project depends on
install.sh : One-click deployment script
```
- Modify permissions (replace 'deployUser' with the actual deployment user) so that the deployment user has operational privileges on the escheduler-backend directory
`sudo chown -R deployUser:deployUser escheduler-backend`
- Modify the environment variables in `.escheduler_env.sh` in the conf/env/ directory
- Modify deployment parameters (depending on your server and business situation):
- Modify the parameters in **install.sh**, replacing the values with those your business requires
- The MonitorServerState switch variable, added in version 1.0.3, controls whether to start the self-start script (which monitors master and worker status and restarts them automatically if they go offline). The default value "false" means the self-start script is not started; change it to "true" to start it.
- The 'hdfsStartupSate' switch variable controls whether to enable HDFS.
The default value "false" means HDFS is not enabled.
Change the variable to "true" if you want to use HDFS. You also need to create the HDFS root path yourself, that is, the 'hdfsPath' in install.sh.
- If you use HDFS-related functions, you need to copy **hdfs-site.xml** and **core-site.xml** to the conf directory
## Deployment
Automated deployment is recommended; experienced users can also deploy from source.
### Automated Deployment
- Install zookeeper tools
`pip install kazoo`
- Switch to deployment user, one-click deployment
`sh install.sh`
- Use the `jps` command to check whether the services are started (`jps` ships with the JDK)
```aidl
MasterServer ----- Master Service
WorkerServer ----- Worker Service
LoggerServer ----- Logger Service
ApiApplicationServer ----- API Service
AlertServer ----- Alert Service
```
If all services are running normally, the automatic deployment is successful.
After a successful deployment, the logs are stored in the specified folder and can be viewed there.
```logPath
logs/
├── escheduler-alert-server.log
├── escheduler-master-server.log
├── escheduler-worker-server.log
├── escheduler-api-server.log
└── escheduler-logger-server.log
```
### Compile source code to deploy
After downloading the release version of the source package, unzip it and enter the root directory
* Execute the compilation command:
```
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
```
* View directory
After normal compilation, ./target/escheduler-{version}/ is generated in the current directory
### Common commands for starting and stopping services (for the purpose of each service, see System Architecture Design)
* stop all services in the cluster
`sh ./bin/stop-all.sh`
* start all services in the cluster
`sh ./bin/start-all.sh`
* start and stop one master server
```master
sh ./bin/escheduler-daemon.sh start master-server
sh ./bin/escheduler-daemon.sh stop master-server
```
* start and stop one worker server
```worker
sh ./bin/escheduler-daemon.sh start worker-server
sh ./bin/escheduler-daemon.sh stop worker-server
```
* start and stop api server
```Api
sh ./bin/escheduler-daemon.sh start api-server
sh ./bin/escheduler-daemon.sh stop api-server
```
* start and stop logger server
```Logger
sh ./bin/escheduler-daemon.sh start logger-server
sh ./bin/escheduler-daemon.sh stop logger-server
```
* start and stop alert server
```Alert
sh ./bin/escheduler-daemon.sh start alert-server
sh ./bin/escheduler-daemon.sh stop alert-server
```
## Database Upgrade
Database upgrade is a function added in version 1.0.2. The database can be upgraded automatically by executing the following command:
```upgrade
sh ./script/upgrade-escheduler.sh
```

48
docs/en_US/backend-development.md

@ -0,0 +1,48 @@
# Backend development documentation
## Environmental requirements
* [Mysql](http://geek.analysys.cn/topic/124) (5.5+) : Must be installed
* [JDK](https://www.oracle.com/technetwork/java/javase/downloads/index.html) (1.8+) : Must be installed
* [ZooKeeper](https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper)(3.4.6+) :Must be installed
* [Maven](http://maven.apache.org/download.cgi)(3.3+) :Must be installed
Because the escheduler-rpc module in EasyScheduler uses gRPC, you need to use Maven to compile the generated classes.
For those who are not familiar with Maven, please refer to [maven in five minutes](http://maven.apache.org/guides/getting-started/maven-in-five-minutes.html) and the installation guide:
http://maven.apache.org/install.html
## Project compilation
After importing the EasyScheduler source code into a development tool such as IntelliJ IDEA, first convert it to a Maven project (right click and select "Add Framework Support")
* Execute the compile command:
```
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
```
* View directory
After normal compilation, it will generate ./target/escheduler-{version}/ in the current directory.
```
bin
conf
lib
script
sql
install.sh
```
- Description
```
bin : basic service startup scripts
conf : project configuration files
lib : jar packages the project depends on, including the module jars and third-party jars
script : cluster start and stop scripts, and service monitoring start and stop scripts
sql : SQL files the project depends on
install.sh : one-click deployment script
```

23
docs/en_US/book.json

@ -0,0 +1,23 @@
{
"title": "EasyScheduler",
"author": "",
"description": "Scheduler",
"language": "en-US",
"gitbook": "3.2.3",
"styles": {
"website": "./styles/website.css"
},
"structure": {
"readme": "README.md"
},
"plugins":[
"expandable-chapters",
"insert-logo-link"
],
"pluginsConfig": {
"insert-logo-link": {
"src": "http://geek.analysys.cn/static/upload/236/2019-03-29/379450b4-7919-4707-877c-4d33300377d4.png",
"url": "https://github.com/analysys/EasyScheduler"
}
}
}

115
docs/en_US/frontend-deployment.md

@ -0,0 +1,115 @@
# frontend-deployment
The front-end has three deployment modes: automated deployment, manual deployment and compiled source deployment.
## Preparations
#### Download the installation package
Please download the latest version of the installation package, download address: [gitee](https://gitee.com/easyscheduler/EasyScheduler/attach_files/)
After downloading escheduler-ui-x.x.x.tar.gz, decompress it with `tar -zxvf escheduler-ui-x.x.x.tar.gz ./` and enter the `escheduler-ui` directory
## Deployment
Choose either of the following two methods; automated deployment is recommended.
### Automated Deployment
Edit the installation file with `vi install-escheduler-ui.sh` in the `escheduler-ui` directory
Change the front-end access port and the back-end proxy interface address
```
# Configure the front-end access port
esc_proxy="8888"
# Configure proxy back-end interface
esc_proxy_port="http://192.168.xx.xx:12345"
```
>Front-end automated deployment relies on `yum` on Linux. Before deployment, please install and update `yum`.
In this directory, execute `./install-escheduler-ui.sh`
### Manual Deployment
Install epel source `yum install epel-release -y`
Install Nginx `yum install nginx -y`
> #### Nginx configuration file address
```
/etc/nginx/conf.d/default.conf
```
> #### Configuration information (self-modifying)
```
server {
    listen 8888; # access port
    server_name localhost;
    #charset koi8-r;
    #access_log /var/log/nginx/host.access.log main;
    location / {
        root /xx/dist; # the dist directory decompressed from the front-end package above (change as needed)
        index index.html;
    }
    location /escheduler {
        proxy_pass http://192.168.xx.xx:12345; # back-end interface address (change as needed)
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header x_real_ipP $remote_addr;
        proxy_set_header remote_addr $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_connect_timeout 4s;
        proxy_read_timeout 30s;
        proxy_send_timeout 12s;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    #error_page 404 /404.html;
    # redirect server error pages to the static page /50x.html
    #
    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
```
> #### Restart the Nginx service
```
systemctl restart nginx
```
#### nginx command
- enable `systemctl enable nginx`
- restart `systemctl restart nginx`
- status `systemctl status nginx`
## FAQ
#### Upload file size limit
Edit the configuration file `vi /etc/nginx/nginx.conf`
```
# change upload size
client_max_body_size 1024m;
```

650
docs/en_US/frontend-development.md

@ -0,0 +1,650 @@
# Front-end development documentation
### Technical selection
```
Vue mvvm framework
Es6 ECMAScript 6.0
Ans-ui Analysys-ui
D3 Visual Library Chart Library
Jsplumb connection plugin library
Lodash high performance JavaScript utility library
```
### Development environment
- #### Node installation
Node package download (note version 8.9.4) `https://nodejs.org/download/release/v8.9.4/`
- #### Front-end project construction
Use the command line to `cd` into the `escheduler-ui` project directory and execute `npm install` to pull the project dependency packages.
> If `npm install` is very slow
> You can install cnpm from the Taobao mirror by running `npm install -g cnpm --registry=https://registry.npm.taobao.org`
> Then run `cnpm install`
- Create a new `.env` file for the interface that interacts with the backend
Create a new `.env` file in the `escheduler-ui` directory and add the IP address and port of the backend service to it, for interacting with the backend. The contents of the `.env` file are as follows:
```
# Proxy interface address (modified by yourself)
API_BASE = http://192.168.xx.xx:12345
# If you need to access the project with ip, you can remove the "#" (example)
#DEV_HOST = 192.168.xx.xx
```
> ##### ! ! ! Special attention here. If the project reports a "node-sass error" while pulling the dependency packages, execute the following command once the installation finishes.
```
npm install node-sass --unsafe-perm # install the node-sass dependency separately
```
- #### Development environment operation
- `npm start` project development environment (after startup address http://localhost:8888/#/)
#### Front-end project release
- `npm run build` project packaging (after packaging, a folder named dist is created in the root directory, for publishing to Nginx online)
Run the `npm run build` command to generate the package folder (dist)
Copy it to the corresponding directory on the server (the directory where the front-end service stores static pages)
Visit address `http://localhost:8888/#/`
#### Start with node and daemon under Linux
Install pm2 `npm install -g pm2`
Execute `pm2 start npm -- run dev` in the `escheduler-ui` project root directory to start the project
#### Commands
- Start `pm2 start npm -- run dev`
- Stop `pm2 stop npm`
- delete `pm2 delete npm`
- Status `pm2 list`
```
[root@localhost escheduler-ui]# pm2 start npm -- run dev
[PM2] Applying action restartProcessId on app [npm](ids: 0)
[PM2] [npm](0) ✓
[PM2] Process successfully started
┌──────────┬────┬─────────┬──────┬──────┬────────┬─────────┬────────┬─────┬──────────┬──────┬──────────┐
│ App name │ id │ version │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ user │ watching │
├──────────┼────┼─────────┼──────┼──────┼────────┼─────────┼────────┼─────┼──────────┼──────┼──────────┤
│ npm │ 0 │ N/A │ fork │ 6168 │ online │ 31 │ 0s │ 0% │ 5.6 MB │ root │ disabled │
└──────────┴────┴─────────┴──────┴──────┴────────┴─────────┴────────┴─────┴──────────┴──────┴──────────┘
Use `pm2 show <id|name>` to get more details about an app
```
### Project directory structure
`build` some webpack configurations for packaging and development environment projects
`node_modules` development environment node dependency package
`src` project required documents
`src => combo` project third-party resource localization `npm run combo` specific view `build/combo.js`
`src => font` Font icon library can be added by visiting https://www.iconfont.cn Note: The font library uses its own secondary development to reintroduce its own library `src/sass/common/_font.scss`
`src => images` public image storage
`src => js` js/vue
`src => lib` internal components of the company (company component library can be deleted after open source)
`src => sass` sass file One page corresponds to a sass file
`src => view` page file One page corresponds to an html file
```
> Projects are developed using vue single page application (SPA)
- All page entry files are in the `src/js/conf/${ corresponding page filename => home} index.js` entry file
- The corresponding sass file is in `src/sass/conf/${corresponding page filename => home}/index.scss`
- The corresponding html file is in `src/view/${corresponding page filename => home}/index.html`
```
Public modules and utils `src/js/module`
`components` => internal project common components
`download` => download component
`echarts` => chart component
`filter` => filter and vue pipeline
`i18n` => internationalization
`io` => io request encapsulation based on axios
`mixin` => vue mixin public part for disabled operation
`permissions` => permission operation
`util` => tool
### System function module
Home => `http://localhost:8888/#/home`
Project Management => `http://localhost:8888/#/projects/list`
```
| Project Home
| Workflow
- Workflow definition
- Workflow instance
- Task instance
```
Resource Management => `http://localhost:8888/#/resource/file`
```
| File Management
| udf Management
- Resource Management
- Function management
```
Data Source Management => `http://localhost:8888/#/datasource/list`
Security Center => `http://localhost:8888/#/security/tenant`
```
| Tenant Management
| User Management
| Alarm Group Management
- master
- worker
```
User Center => `http://localhost:8888/#/user/account`
## Routing and state management
The project `src/js/conf/home` is divided into
`pages` => route to page directory
```
The page file corresponding to the routing address
```
`router` => route management
```
vue router, the entry file index.js in each page will be registered. Specific operations: https://router.vuejs.org/zh/
```
`store` => status management
```
The page corresponding to each route has a state management file divided into:
actions => mapActions => Details:https://vuex.vuejs.org/zh/guide/actions.html
getters => mapGetters => Details:https://vuex.vuejs.org/zh/guide/getters.html
index => entrance
mutations => mapMutations => Details:https://vuex.vuejs.org/zh/guide/mutations.html
state => mapState => Details:https://vuex.vuejs.org/zh/guide/state.html
Specific action:https://vuex.vuejs.org/zh/
```
## Specification
## Vue specification
##### 1.Component name
Component names consist of multiple words joined with hyphens (-), which avoids conflicts with HTML tags and gives a clearer structure.
```
// positive example
export default {
name: 'page-article-item'
}
```
##### 2.Component files
Common components internal to the project live in `src/js/module/components`. Each component's folder has the same name as its file. Subcomponents and util tools split out of a common component are placed in the component's internal `_source` folder.
```
└── components
├── header
├── header.vue
└── _source
└── nav.vue
└── util.js
├── conditions
├── conditions.vue
└── _source
└── serach.vue
└── util.js
```
##### 3.Prop
When you define a Prop, always name it in camel case (camelCase), and use the hyphenated form (kebab-case) when assigning a value from the parent component. This follows the conventions of each language: HTML attributes are case-insensitive, so hyphens are friendlier there, while camel case is more natural in JavaScript.
```
// Vue
props: {
articleStatus: Boolean
}
// HTML
<article-item :article-status="true"></article-item>
```
The definition of Prop should specify its type, defaults, and validation as much as possible.
Example:
```
props: {
attrM: Number,
attrA: {
type: String,
required: true
},
attrZ: {
type: Object,
// The default value of the array/object should be returned by a factory function
default: function () {
return {
msg: 'achieve you and me'
}
}
},
attrE: {
type: String,
validator: function (v) {
return !(['success', 'fail'].indexOf(v) === -1)
}
}
}
```
##### 4.v-for
When performing v-for traversal, you should always bring a key value to make rendering more efficient when updating the DOM.
```
<ul>
<li v-for="item in list" :key="item.id">
{{ item.title }}
</li>
</ul>
```
Avoid using v-for on the same element as v-if (`for example: <li>`), because v-for has a higher priority than v-if. To avoid wasted computation and rendering, try to put v-if on the container's parent element instead.
```
<ul v-if="showList">
<li v-for="item in list" :key="item.id">
{{ item.title }}
</li>
</ul>
```
##### 5.v-if / v-else-if / v-else
If the elements controlled by the same set of v-if branches are structurally identical, Vue reuses the shared parts for more efficient element switching, `such as: value`. To avoid unwanted side effects of this reuse, add a key to each of these elements to tell them apart.
```
<div v-if="hasData" key="mazey-data">
<span>{{ mazeyData }}</span>
</div>
<div v-else key="mazey-none">
<span>no data</span>
</div>
```
##### 6.Instruction abbreviation
To keep the specification uniform, always use the directive shorthand. Writing out `v-bind` and `v-on` is not wrong; this is only for consistency.
```
<input :value="mazeyUser" @click="verifyUser">
```
##### 7.Top-level element order of single file components
Styles are bundled into a single file, so a style defined in one vue file also takes effect on the same class name in other files. For this reason, every component should be given a top-level class name when it is created.
Note: The sass plugin has been added to the project, so sass syntax can be written directly in a single vue file.
For uniformity and ease of reading, they should be placed in the order `<template>`, `<script>`, `<style>`.
```
<template>
<div class="test-model">
test
</div>
</template>
<script>
export default {
name: "test",
data() {
return {}
},
props: {},
methods: {},
watch: {},
beforeCreate() {
},
created() {
},
beforeMount() {
},
mounted() {
},
beforeUpdate() {
},
updated() {
},
beforeDestroy() {
},
destroyed() {
},
computed: {},
components: {},
}
</script>
<style lang="scss" rel="stylesheet/scss">
.test-model {
}
</style>
```
## JavaScript specification
##### 1.var / let / const
It is recommended to stop using var and use let / const instead, preferring const. Every variable must be declared before use, except that functions defined with function can be placed anywhere.
##### 2.quotes
```
const foo = 'after division'
const bar = `${foo}, front-end engineer`
```
##### 3.function
Anonymous functions uniformly use arrow functions. When there are multiple parameters or return values, prefer object destructuring.
```
function getPersonInfo ({name, gender}) {
  // ...
  return {name, gender}
}
```
Function names uniformly use camel case. A name that starts with a capital letter is a constructor; one that starts with a lowercase letter is an ordinary function, and the new operator should not be used on ordinary functions.
##### 4.object
```
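// Deep copy (for JSON-safe data)
const foo = {a: 0, b: 1}
const bar = JSON.parse(JSON.stringify(foo))
// Copy with extra properties via object spread
const baz = {...foo, c: 2}
// Merge properties into an existing object
const qux = {a: 3}
Object.assign(qux, {b: 4})
// Iterate over a Map
const myMap = new Map([])
for (let [key, value] of myMap.entries()) {
  // ...
}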
```
##### 5.module
Unified management of project modules using import / export.
```
// lib.js
export default {}
// app.js
import app from './lib'
```
Import is placed at the top of the file.
If the module has only one output value, use `export default`; otherwise do not use it.
## HTML / CSS
##### 1.Label
Do not write the type attribute when referencing external CSS or JavaScript. In HTML5 the type defaults to text/css and text/javascript respectively, so there is no need to specify it.
```
<link rel="stylesheet" href="//www.test.com/css/test.css">
<script src="//www.test.com/js/test.js"></script>
```
##### 2.Naming
Class and ID names should be semantic, so that their purpose is clear from the name; multiple words are joined with hyphens (-).
```
// positive example
.test-header{
font-size: 20px;
}
```
##### 3.Attribute abbreviation
CSS attributes use abbreviations as much as possible to improve the efficiency and ease of understanding of the code.
```
// counter example
border-width: 1px;
border-style: solid;
border-color: #ccc;
// positive example
border: 1px solid #ccc;
```
##### 4.Document type
The HTML5 standard should always be used.
```
<!DOCTYPE html>
```
##### 5.Notes
A block comment should be written to a module file.
```
/**
* @module mazey/api
* @author Mazey <mazey@mazey.net>
* @description test.
* */
```
## Interface
##### All interfaces are returned as Promise
Note that a non-zero code indicates an error, which should be handled in catch
```
const test = () => {
return new Promise((resolve, reject) => {
resolve({
a:1
})
})
}
// call
test().then(res => {
console.log(res)
// {a:1}
})
```
Normal return
```
{
code:0,
data:{},
msg:'success'
}
```
Error return
```
{
code:10000,
data:{},
msg:'failed'
}
```
##### Related interface path
dag related interface `src/js/conf/home/store/dag/actions.js`
Data Source Center Related Interfaces `src/js/conf/home/store/datasource/actions.js`
Project Management Related Interfaces `src/js/conf/home/store/projects/actions.js`
Resource Center Related Interfaces `src/js/conf/home/store/resource/actions.js`
Security Center Related Interfaces `src/js/conf/home/store/security/actions.js`
User Center Related Interfaces `src/js/conf/home/store/user/actions.js`
## Extended development
##### 1.Add node
(1) First place the node's icon in the `src/js/conf/home/pages/dag/img` folder, named `toolbar_${English name of the node type defined by the backend, for example SHELL}.png`.
(2) Find the `tasksType` object in `src/js/conf/home/pages/dag/_source/config.js` and add the new node type to it.
```
'DEPENDENT': { // The English node type name defined by the backend is used as the key
desc: 'DEPENDENT', // tooltip desc
color: '#2FBFD8' // The color represented is mainly used for tree and gantt
}
```
(3) Add a `${node type (lowercase)}`.vue file in `src/js/conf/home/pages/dag/_source/formModel/tasks`. The component content for the current node is written here. Every node component must have a `_verification()` function; after verification succeeds, it passes the relevant data of the current component to the parent component.
```
/**
* Verification
*/
_verification () {
// datasource subcomponent verification
if (!this.$refs.refDs._verifDatasource()) {
return false
}
// verification function
if (!this.method) {
this.$message.warning(`${i18n.$t('Please enter method')}`)
return false
}
// localParams subcomponent validation
if (!this.$refs.refLocalParams._verifProp()) {
return false
}
// store
this.$emit('on-params', {
type: this.type,
datasource: this.datasource,
method: this.method,
localParams: this.localParams
})
return true
}
```
(4) Common components used inside the node components are under `_source`, and `commcon.js` is used to configure public data.
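The naming rules in steps (1) to (3) can be summarised in a small sketch. `nodeFiles` and `addTaskType` are hypothetical helpers written for this illustration, not functions from the project; only the folder paths and the `DEPENDENT` entry come from the steps above.

```javascript
// Root of the DAG page, taken from the paths named in the steps above
const DAG_ROOT = 'src/js/conf/home/pages/dag'

// Hypothetical helper: derive the files a new node type needs
const nodeFiles = (type) => ({
  icon: `${DAG_ROOT}/img/toolbar_${type}.png`,
  component: `${DAG_ROOT}/_source/formModel/tasks/${type.toLowerCase()}.vue`
})

// Hypothetical helper: return a copy of tasksType with the new entry added
const addTaskType = (tasksType, type, color) => ({
  ...tasksType,
  [type]: { desc: type, color }
})

const tasksType = addTaskType({}, 'DEPENDENT', '#2FBFD8')
console.log(tasksType.DEPENDENT.color) // #2FBFD8
console.log(nodeFiles('DEPENDENT').icon)
// src/js/conf/home/pages/dag/img/toolbar_DEPENDENT.png
```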
##### 2.Add a status type
(1) Find the `tasksState` object in `src/js/conf/home/pages/dag/_source/config.js` and add the new status to it.
```
'WAITTING_DEPEND': { // status type defined by the backend; the frontend uses it as the key
id: 11, // front-end definition id is used as a sort
desc: `${i18n.$t('waiting for dependency')}`, // tooltip desc
color: '#5101be', // The color represented is mainly used for tree and gantt
icoUnicode: '&#xe68c;', // font icon
isSpin: false // whether to rotate (requires code judgment)
}
```
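Since the `id` field is described as a sort key, a list of states in display order could be derived as in this sketch. Only `WAITTING_DEPEND` and its id come from the example above; the other two entries and their ids are made up for illustration.

```javascript
const tasksState = {
  WAITTING_DEPEND: { id: 11, desc: 'waiting for dependency' },
  EXAMPLE_SECOND: { id: 5, desc: 'made-up state' }, // hypothetical entry
  EXAMPLE_FIRST: { id: 1, desc: 'made-up state' }   // hypothetical entry
}

// Sort the state keys by their front-end id
const sortedStateKeys = (states) =>
  Object.entries(states)
    .sort(([, a], [, b]) => a.id - b.id)
    .map(([key]) => key)

console.log(sortedStateKeys(tasksState).join(','))
// EXAMPLE_FIRST,EXAMPLE_SECOND,WAITTING_DEPEND
```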
##### 3.Add the action bar tool
(1) Find the `toolOper` object in `src/js/conf/home/pages/dag/_source/config.js` and add the new tool to it.
```
{
code: 'pointer', // tool identifier
icon: '&#xe781;', // tool icon
disable: disable, // disable
desc: `${i18n.$t('Drag node and selected item')}` // tooltip desc
}
```
(2) Tool classes are provided as constructors under `src/js/conf/home/pages/dag/_source/plugIn`
`downChart.js` => dag image download processing
`dragZoom.js` => mouse zoom effect processing
`jsPlumbHandle.js` => drag and drop line processing
`util.js` => belongs to the `plugIn` tool class
The operation is handled in the `src/js/conf/home/pages/dag/_source/dag.js` => `toolbarEvent` event.
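One way to picture how a toolbar click might be routed by the tool's `code` is the sketch below. It is an assumption-laden illustration: the real `toolbarEvent` in `dag.js` is not shown in this document, the `handlers` map is hypothetical, and the `download` tool entry is made up.

```javascript
const toolOper = [
  { code: 'pointer', disable: false, desc: 'Drag node and selected item' },
  { code: 'download', disable: true, desc: 'made-up tool' } // hypothetical entry
]

// Hypothetical dispatcher: run the handler registered for a tool code,
// skipping unknown or disabled tools. Returns true when a handler ran.
const dispatchTool = (code, tools, handlers) => {
  const tool = tools.find(t => t.code === code)
  if (!tool || tool.disable || typeof handlers[code] !== 'function') {
    return false
  }
  handlers[code](tool)
  return true
}

const calls = []
const handlers = {
  pointer: tool => calls.push(tool.code),
  download: tool => calls.push(tool.code)
}

console.log(dispatchTool('pointer', toolOper, handlers))  // true
console.log(dispatchTool('download', toolOper, handlers)) // false (disabled)
```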
##### 3.Add a routing page
(1) First add a route in the route management file `src/js/conf/home/router/index.js`
```
{
path: '/test', // routing address
name: 'test', // alias
component: resolve => require(['../pages/test/index'], resolve), // route corresponding component entry file
meta: {
title: `${i18n.$t('test')} - EasyScheduler` // title display
}
},
```
(2) Create a `test` folder in `src/js/conf/home/pages` and create an `index.vue` entry file in the folder.
This gives you direct access to `http://localhost:8888/#/test`
##### 4.Add preset mailboxes
Edit `src/lib/localData/email.js`. The addresses listed there are offered as automatic drop-down matches in the email address inputs of the start and timing dialogs.
```
export default ["test@analysys.com.cn","test1@analysys.com.cn","test3@analysys.com.cn"]
```
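The drop-down matching could work along the lines of this sketch. The prefix-match rule and the `matchEmails` name are assumptions made for illustration; the document does not describe the actual matching logic.

```javascript
const presetEmails = ['test@analysys.com.cn', 'test1@analysys.com.cn', 'test3@analysys.com.cn']

// Hypothetical matcher: case-insensitive prefix match against the preset list
const matchEmails = (input, emails) => {
  const query = input.trim().toLowerCase()
  if (!query) {
    return []
  }
  return emails.filter(email => email.toLowerCase().startsWith(query))
}

console.log(matchEmails('test1', presetEmails).length) // 1
console.log(matchEmails('test', presetEmails).length)  // 3
```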
##### 5.Authority management and disabled state processing
Permissions are based on the `userType` returned by the backend `getUserInfo` interface (`"ADMIN_USER/GENERAL_USER"`), which controls whether the page operation buttons are `disabled`.
Specific operation: `src/js/module/permissions/index.js`
Disabled processing: `src/js/module/mixin/disabledState.js`
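As a rough sketch of the idea, a button's `disabled` flag can be computed from `userType`. Which user type is restricted is an assumption here (the sketch restricts `ADMIN_USER` by default); `isDisabled` is a hypothetical name and does not reflect the actual contents of `disabledState.js`.

```javascript
// Hypothetical check: a button is disabled when the current userType
// is in the list of restricted types (assumed default: ADMIN_USER).
const isDisabled = (userType, restrictedTypes = ['ADMIN_USER']) =>
  restrictedTypes.includes(userType)

// Hypothetical getUserInfo result
const userInfo = { userType: 'GENERAL_USER' }

console.log(isDisabled(userInfo.userType)) // false
console.log(isDisabled('ADMIN_USER'))      // true
```

In a Vue component, a value like this would typically be bound to the button as `:disabled="isDisabled"`.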

BIN
docs/en_US/images/auth-project.png (binary file not shown, 37 KiB)

BIN
docs/en_US/images/complement.png (binary file not shown, 359 KiB)

BIN
docs/en_US/images/depend-b-and-c.png (binary file not shown, 509 KiB)

BIN
docs/en_US/images/depend-last-tuesday.png (binary file not shown, 549 KiB)

BIN
docs/en_US/images/depend-week.png (binary file not shown, 504 KiB)

BIN
docs/en_US/images/save-definition.png (binary file not shown, 287 KiB)

BIN
docs/en_US/images/save-global-parameters.png (binary file not shown, 83 KiB)

BIN
docs/en_US/images/start-process.png (binary file not shown, 332 KiB)

BIN
docs/en_US/images/timing.png (binary file not shown, 101 KiB)

53
docs/en_US/quick-start.md

@ -0,0 +1,53 @@
# Quick Start
* Administrator user login
> Address: 192.168.xx.xx:8888 Username and password: admin/escheduler123
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61701549-ee738000-ad70-11e9-8d75-87ce04a0152f.png" width="60%" />
</p>
* Create queue
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61701943-896c5a00-ad71-11e9-99b8-a279762f1bc8.png" width="60%" />
</p>
* Create tenant
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61702051-bb7dbc00-ad71-11e9-86e1-1c328cafe916.png" width="60%" />
</p>
* Create ordinary user
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61704402-3517a900-ad76-11e9-865a-6325041d97e2.png" width="60%" />
</p>
* Create an alarm group
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61704553-845dd980-ad76-11e9-85f1-05f33111409e.png" width="60%" />
</p>
* Log in as an ordinary user
> Click the user name in the upper right corner, choose "exit", and log in again as the ordinary user.
* Project Management -> Create Project -> Click on Project Name
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61704688-dd2d7200-ad76-11e9-82ee-0833b16bd88f.png" width="60%" />
</p>
* Click Workflow Definition -> Create Workflow Definition -> Online Process Definition
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61705638-c425c080-ad78-11e9-8619-6c21b61a24c9.png" width="60%" />
</p>
* Run Process Definition -> Click Workflow Instance -> Click Process Instance Name -> Double-click Task Node -> View Task Execution Log
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61705356-34801200-ad78-11e9-8d60-9b7494231028.png" width="60%" />
</p>

699
docs/en_US/system-manual.md

@ -0,0 +1,699 @@
# System Use Manual
## Operational Guidelines
### Create a project
- Click "Project -> Create Project", enter the project name and description, and click "Submit" to create a new project.
- Click on the project name to enter the project home page.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61776719-2ee50380-ae2e-11e9-9d11-41de8907efb5.png" width="60%" />
</p>
> The project home page contains task state statistics, process state statistics and process definition statistics.
- Task State Statistics: The number of tasks waiting to run, failed, running, completed and succeeded within a specified time range.
- Process State Statistics: The number of process instances waiting, failed, running, completed and succeeded within a specified time range.
- Process Definition Statistics: Counts the process definitions created by the user and those granted to the user by the administrator.
### Creating Process definitions
- Go to the project home page, click "Process definitions" and enter the list page of process definition.
- Click "Create process" to create a new process definition.
- Drag the "SHELL" node to the canvas and add a shell task.
- Fill in the Node Name, Description, and Script fields.
- Select a "task priority": higher-priority tasks are executed first in the execution queue, and tasks with the same priority are executed in first-in-first-out order.
- Timeout alarm: fill in "Overtime Time". When the task execution time exceeds this value, the task can raise an alarm and fail on timeout.
- Fill in "Custom Parameters" and refer to [Custom Parameters](#Custom Parameters)
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61778402-42459e00-ae31-11e9-96c6-8fd7fed8fed2.png" width="60%" />
</p>
- Add an execution order between nodes: click "line connection". As shown, task 2 and task 3 are executed in parallel: when task 1 has finished, task 2 and task 3 are executed simultaneously.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61778247-f98de500-ae30-11e9-8f11-cce0530c3ff2.png" width="60%" />
</p>
- Delete dependencies: Click the arrow icon ("drag nodes and select items"), select the connection line, then click the delete icon to remove the dependency between nodes.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61778800-052ddb80-ae32-11e9-8ac0-4f13466d3515.png" width="60%" />
</p>
- Click "Save", enter the name of the process definition, the description of the process definition, and set the global parameters.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/save-definition.png" width="60%" />
</p>
- For other types of nodes, refer to [task node types and parameter settings](#task node types and parameter settings)
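The task priority rule described above (higher priority first, first-in-first-out among equal priorities) can be illustrated with a short sketch. The level identifiers follow the five levels this manual names for process priority; their exact spelling in the system, and the `executionOrder` helper, are assumptions for illustration.

```javascript
// Assumed identifiers for the five priority levels, highest first
const PRIORITY_LEVELS = ['HIGHEST', 'HIGH', 'MEDIUM', 'LOW', 'LOWEST']

// Order tasks by priority; ties keep their submission order (FIFO).
const executionOrder = (tasks) =>
  tasks
    .map((task, seq) => ({ ...task, seq })) // remember submission order
    .sort((a, b) =>
      PRIORITY_LEVELS.indexOf(a.priority) - PRIORITY_LEVELS.indexOf(b.priority) ||
      a.seq - b.seq)
    .map(task => task.name)

const submitted = [
  { name: 'task-a', priority: 'MEDIUM' },
  { name: 'task-b', priority: 'HIGHEST' },
  { name: 'task-c', priority: 'MEDIUM' }
]

console.log(executionOrder(submitted).join(',')) // task-b,task-a,task-c
```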
### Execute process definition
- **A process definition in the offline state can be edited but not run**, so bringing the workflow online is the first step.
> Click "Process definition" to return to the list of process definitions, then click the "online" icon to bring the process definition online.
> Before setting a workflow offline, the timed tasks in timing management must be taken offline first; only then can the workflow definition be set offline successfully.
- Click "Run" to execute the process. Description of operation parameters:
* Failure strategy: **the strategy that other parallel task nodes follow when a task node fails to execute**. "Continue" means the other task nodes run normally; "End" means all running tasks are terminated and the entire process is terminated.
* Notification strategy: When the process ends, a notification mail with the process execution information is sent according to the process status.
* Process priority: The priority of a process run has five levels: highest, high, medium, low and lowest. Higher-priority processes are executed first in the execution queue, and processes with the same priority are executed in first-in-first-out order.
* Worker group: The process can only be executed on the specified group of machines. With the default setting, it can be executed on any worker.
* Notification group: When the process ends or fault tolerance occurs, process information is sent by mail to all members of the notification group.
* Recipient: Enter a mailbox and press Enter to save. When the process ends or fault tolerance occurs, an alert message is sent to the recipient list.
* Cc: Enter a mailbox and press Enter to save. When the process ends or fault tolerance occurs, the alert message is copied to the Cc list.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/start-process.png" width="60%" />
</p>
* Complement: To run the workflow definition for specified dates, select the time range of the complement (currently only continuous days are supported), for example the data from May 1 to May 10, as shown in the figure:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/complement.png" width="60%" />
</p>
> Complement execution mode includes serial execution and parallel execution. In serial mode, the complement will be executed sequentially from May 1 to May 10. In parallel mode, the tasks from May 1 to May 10 will be executed simultaneously.
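The difference between the two complement modes can be sketched as follows. The year 2019 and the helper names (`complementDates`, `runSerial`, `runParallel`) are assumptions for illustration; `run` stands for whatever executes the workflow for one date.

```javascript
// List every day in the continuous range [start, end], as YYYY-MM-DD (UTC)
const complementDates = (start, end) => {
  const dates = []
  const current = new Date(`${start}T00:00:00Z`)
  const last = new Date(`${end}T00:00:00Z`)
  while (current <= last) {
    dates.push(current.toISOString().slice(0, 10))
    current.setUTCDate(current.getUTCDate() + 1)
  }
  return dates
}

// Serial mode: one date after another
const runSerial = async (dates, run) => {
  for (const date of dates) {
    await run(date)
  }
}

// Parallel mode: all dates started at the same time
const runParallel = (dates, run) => Promise.all(dates.map(date => run(date)))

const dates = complementDates('2019-05-01', '2019-05-10')
console.log(dates.length) // 10
```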
### Timing Process Definition
- Create Timing: "Process Definition - > Timing"
- Choose the start and end time. Within this range the timer works normally; outside it, no further timed workflow instances are produced.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/timing.png" width="60%" />
</p>
- Add a timer to be executed once a day at 5:00 a.m. as shown below:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61781968-d9adef80-ae37-11e9-9e90-3d9f0b3eb998.png" width="60%" />
</p>
- Bring the timer online: **a newly created timer is offline. You need to click "Timing Management -> online" for it to work properly.**
### View process instances
> Click on "Process Instances" to view the list of process instances.
> Click on the process name to see the status of task execution.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61855837-6ff31b80-aef3-11e9-8464-2fb5773709df.png" width="60%" />
</p>
> Click on the task node, click "View Log" to view the task execution log.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783070-bdab4d80-ae39-11e9-9ada-355614fbb7f7.png" width="60%" />
</p>
> Click on the task instance node, click **View History** to view the list of task instances that the process instance runs.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783240-05ca7000-ae3a-11e9-8c10-591a7635834a.png" width="60%" />
</p>
> Operations on workflow instances:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783291-21357b00-ae3a-11e9-837c-fc3d85404410.png" width="60%" />
</p>
* Edit: You can edit a terminated process. When saving after editing, you can choose whether to update the process definition.
* Rerun: A process that has been terminated can be re-executed.
* Recover failure: For a failed process, a recover-failure operation can be performed, starting from the failed node.
* Stop: Stop the running process. The background first sends `kill` to the worker process and then performs `kill -9`.
* Pause: The running process can be **suspended**: the system state becomes **waiting to be executed**, the task being executed is allowed to finish, and the next task to be executed is suspended.
* Restore pause: **The suspended process** can be resumed and runs directly from the suspended node.
* Delete: Delete the process instance and the task instances under it.
* Gantt diagram: The vertical axis of Gantt diagram is the topological ordering of task instances under a process instance, and the horizontal axis is the running time of task instances, as shown in the figure:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783596-aa4cb200-ae3a-11e9-9798-e795f80dae96.png" width="60%" />
</p>
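The topological ordering used for the vertical axis can be computed with a standard algorithm such as Kahn's, sketched below. This is a generic illustration rather than the system's own implementation; the node names and edges mirror the earlier example in which task 1 is followed by tasks 2 and 3.

```javascript
// Kahn's algorithm: repeatedly emit nodes that have no unprocessed predecessors
const topoSort = (nodes, edges) => {
  const indegree = new Map(nodes.map(n => [n, 0]))
  const successors = new Map(nodes.map(n => [n, []]))
  for (const [from, to] of edges) {
    successors.get(from).push(to)
    indegree.set(to, indegree.get(to) + 1)
  }
  const queue = nodes.filter(n => indegree.get(n) === 0)
  const order = []
  while (queue.length > 0) {
    const node = queue.shift()
    order.push(node)
    for (const next of successors.get(node)) {
      indegree.set(next, indegree.get(next) - 1)
      if (indegree.get(next) === 0) {
        queue.push(next)
      }
    }
  }
  if (order.length !== nodes.length) {
    throw new Error('cycle detected: not a DAG')
  }
  return order
}

const order = topoSort(
  ['task1', 'task2', 'task3'],
  [['task1', 'task2'], ['task1', 'task3']]
)
console.log(order.join(',')) // task1,task2,task3
```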
### View task instances
> Click on "Task Instance" to enter the Task List page and query the performance of the task.
>
>
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783544-91dc9780-ae3a-11e9-9dca-dfd901f1fe83.png" width="60%" />
</p>
> Click "View Log" in the action column to view the log of task execution.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783441-60fc6280-ae3a-11e9-8631-963dcf78467b.png" width="60%" />
</p>
### Create data source
> Data Source Center supports MySQL, POSTGRESQL, HIVE and Spark data sources.
#### Create and edit MySQL data source
- Click "Datasource -> Create Datasources" to create different types of datasources as required.
- Datasource: Select MYSQL
- Datasource Name: Enter the name of the datasource
- Description: Enter a description of the datasource
- IP: Enter the IP to connect to MySQL
- Port: Enter the port to connect to MySQL
- User name: Set the username to connect to MySQL
- Password: Set the password to connect to MySQL
- Database name: Enter the name of the MySQL database to connect to
- Jdbc connection parameters: parameter settings for the MySQL connection, filled in as JSON
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783812-129b9380-ae3b-11e9-9b9c-77870371c5f3.png" width="60%" />
</p>
> Click "Test Connect" to test whether the data source can be successfully connected.
>
>
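To make the "filled in as JSON" field concrete, the sketch below shows one plausible way the form values could be combined into a MySQL JDBC URL. How the system actually assembles the URL is not stated in this manual, so `buildJdbcUrl`, the field names and the sample values are assumptions; the URL shape itself (`jdbc:mysql://host:port/database?key=value&...`) is the standard MySQL Connector/J format.

```javascript
// Hypothetical helper: combine form fields and JSON parameters into a JDBC URL
const buildJdbcUrl = ({ ip, port, database, other }) => {
  const params = other ? JSON.parse(other) : {}
  const query = Object.entries(params)
    .map(([key, value]) => `${key}=${value}`)
    .join('&')
  const base = `jdbc:mysql://${ip}:${port}/${database}`
  return query ? `${base}?${query}` : base
}

const url = buildJdbcUrl({
  ip: '127.0.0.1',   // sample value
  port: 3306,        // sample value
  database: 'test',  // sample value
  other: '{"useUnicode":"true","characterEncoding":"UTF-8"}'
})
console.log(url)
// jdbc:mysql://127.0.0.1:3306/test?useUnicode=true&characterEncoding=UTF-8
```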
#### Create and edit POSTGRESQL data source
- Datasource: Select POSTGRESQL
- Datasource Name: Enter the name of the datasource
- Description: Enter a description of the datasource
- IP: Enter the IP to connect to POSTGRESQL
- Port: Enter the port to connect to POSTGRESQL
- Username: Set the username to connect to POSTGRESQL
- Password: Set the password to connect to POSTGRESQL
- Database name: Enter the name of the POSTGRESQL database to connect to
- Jdbc connection parameters: parameter settings for the POSTGRESQL connection, filled in as JSON
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61783968-60180080-ae3b-11e9-91b7-36d49246a205.png" width="60%" />
</p>
#### Create and edit HIVE data source
1.Connect with HiveServer 2
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61784129-b9802f80-ae3b-11e9-8a27-7be23e0953be.png" width="60%" />
</p>
- Datasource: Select HIVE
- Datasource Name: Enter the name of the datasource
- Description: Enter a description of the datasource
- IP: Enter the IP to connect to HIVE
- Port: Enter the port to connect to HIVE
- Username: Set the username to connect to HIVE
- Password: Set the password to connect to HIVE
- Database Name: Enter the name of the HIVE database to connect to
- Jdbc connection parameters: parameter settings for the HIVE connection, filled in as JSON
2.Connect using Hive Server 2 HA Zookeeper mode
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61784420-3dd2b280-ae3c-11e9-894a-5b896863d37a.png" width="60%" />
</p>
Note: If **kerberos** is turned on, you need to fill in **Principal**
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61784847-0adcee80-ae3d-11e9-8ac7-ba8a13aef90c.png" width="60%" />
</p>
#### Create and edit Spark data source
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61853431-7af77d00-aeee-11e9-8e2e-95ba6cea43c8.png" width="60%" />
</p>
- Datasource: Select Spark
- Datasource Name: Enter the name of the datasource
- Description: Enter a description of the datasource
- IP: Enter the IP to connect to Spark
- Port: Enter the port to connect to Spark
- Username: Set the username to connect to Spark
- Password: Set the password to connect to Spark
- Database name: Enter the name of the Spark database to connect to
- Jdbc Connection Parameters: parameter settings for the Spark connection, filled in as JSON
Note: If **kerberos** is turned on, you need to fill in **Principal**
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61853668-0709a480-aeef-11e9-8960-92107dd1a9ca.png" width="60%" />
</p>
### Upload Resources
- Upload resource files and UDF functions. All uploaded files and resources are stored on HDFS, so the following configuration items are required:
```
conf/common/common.properties
-- hdfs.startup.state=true
conf/common/hadoop.properties
-- fs.defaultFS=hdfs://xxxx:8020
-- yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx
-- yarn.application.status.address=http://xxxx:8088/ws/v1/cluster/apps/%s
```
#### File Manage
> It is the management of various resource files, including creating basic txt/log/sh/conf files, uploading jar packages and other types of files, editing, downloading, deleting and other operations.
>
>
> <p align="center">
> <img src="https://user-images.githubusercontent.com/53217792/61785274-ed5c5480-ae3d-11e9-8461-2178f49b228d.png" width="60%" />
> </p>
* Create file
> The following file formats are supported: txt, log, sh, conf, cfg, py, java, sql, xml, hql
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841049-f133b980-aec5-11e9-8ac8-db97cdccc599.png" width="60%" />
</p>
* Upload Files
> Upload Files: Click the Upload button, or drag the file to the upload area; the file name field is filled in automatically with the name of the uploaded file.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841179-73bc7900-aec6-11e9-8780-28756e684754.png" width="60%" />
</p>
* File View
> For viewable file types, click on the file name to view file details
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841247-9cdd0980-aec6-11e9-9f6f-0a7dd145f865.png" width="60%" />
</p>
* Download files
> You can download a file by clicking the download button in the top right corner of the file details page, or by clicking the download button in the file's row of the file list.
* File rename
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841322-f47b7500-aec6-11e9-93b1-b00328e7b69e.png" width="60%" />
</p>
#### Delete
> File List -> Click the Delete button to delete the specified file
#### Resource management
> Resource management is similar to file management. The difference is that resource management uploads UDF resources, while file management uploads user programs, scripts and configuration files.
* Upload UDF resources
> The same as uploading files.
#### Function management
* Create UDF Functions
> Click "Create UDF Function", enter parameters of udf function, select UDF resources, and click "Submit" to create udf function.
>
>
>
> Currently only temporary udf functions for HIVE are supported
>
>
>
> - UDF function name: Enter the name of the UDF function
> - Package Name: Enter the full class path of the UDF function
> - Parameter: Input parameters, used to annotate the function
> - Database Name: Reserved field for creating permanent UDF functions
> - UDF Resources: Set the resource file corresponding to the UDF being created
>
>
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841562-c6e2fb80-aec7-11e9-9481-4202d63dab6f.png" width="60%" />
</p>
## Security
- The security center provides queue management, tenant management, user management, alarm group management, worker group management and token management. It can also authorize resources, data sources, projects, etc.
- Administrator login, default username and password: admin/escheduler123
### Create queues
- Queues are used to execute spark, mapreduce and other programs, which require the use of "queue" parameters.
- "Security" -> "Queue Manage" -> "Create Queue"
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61841945-078f4480-aec9-11e9-92fb-05b6f42f07d6.png" width="60%" />
</p>
### Create Tenants
- A tenant corresponds to a Linux account, which the worker server uses to submit jobs. If Linux does not have this user, the worker creates the account when executing the task.
- Tenant Code: **the tenant code is the Linux account name; it must be unique and cannot be duplicated.**
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61842372-8042d080-aeca-11e9-8c54-e3dee583eeff.png" width="60%" />
</p>
### Create Ordinary Users
- User types are **ordinary users** and **administrator users**.
* Administrators have **authorization and user management** privileges, but no privileges to **create projects or operate on process definitions**.
* Ordinary users can **create projects and create, edit, and execute process definitions**.
* Note: **If the user switches the tenant, all resources under the tenant will be copied to the switched new tenant.**
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61842461-da439600-aeca-11e9-98e3-f8327dbafa60.png" width="60%" />
</p>
### Create alarm group
* The alarm group is a parameter set at process start-up. After the process finishes, the status of the process and other information are sent to the alarm group by mail.
* Create and edit alarm group
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61842553-34445b80-aecb-11e9-84a8-3cc66b6c6135.png" width="60%" />
</p>
### Create Worker Group
- A worker group provides a mechanism for tasks to run on specified workers. Administrators create worker groups, which can be specified in task nodes and in the operation parameters. If the specified group is deleted or no group is specified, the task runs on any worker.
- Enter multiple IP addresses within a worker group (**aliases cannot be used**), separated by **English (half-width) commas**
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61842630-6b1a7180-aecb-11e9-8988-b4444de16b36.png" width="60%" />
</p>
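The input rule for worker groups (IP addresses only, separated by half-width commas) could be validated as in this sketch. Restricting the check to IPv4 and the `parseWorkerIps` name are assumptions for illustration.

```javascript
// Dotted-quad IPv4 address, each part 0-255
const IPV4 = /^(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}$/

// Hypothetical parser: split on half-width commas and reject anything
// that is not an IP address (host aliases, full-width commas, ...)
const parseWorkerIps = (text) => {
  const ips = text.split(',').map(part => part.trim())
  const invalid = ips.filter(ip => !IPV4.test(ip))
  if (invalid.length > 0) {
    throw new Error(`invalid IP address: ${invalid.join(', ')}`)
  }
  return ips
}

console.log(parseWorkerIps('192.168.1.1,192.168.1.2').length) // 2
```

An alias such as `worker-1`, or a list separated by a full-width comma, fails the check and throws.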
### Token manage
- Because the back-end interfaces require a login check, token management provides a way to operate the system by calling the interfaces directly with a token.
- Call examples:
```
/**
* test token
*/
public void doPOSTParam()throws Exception{
// create HttpClient
CloseableHttpClient httpclient = HttpClients.createDefault();
// create http post request
HttpPost httpPost = new HttpPost("http://127.0.0.1:12345/escheduler/projects/create");
httpPost.setHeader("token", "123");
// set parameters
List<NameValuePair> parameters = new ArrayList<NameValuePair>();
parameters.add(new BasicNameValuePair("projectName", "qzw"));
parameters.add(new BasicNameValuePair("desc", "qzw"));
UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(parameters);
httpPost.setEntity(formEntity);
CloseableHttpResponse response = null;
try {
// execute
response = httpclient.execute(httpPost);
// response status code 200
if (response.getStatusLine().getStatusCode() == 200) {
String content = EntityUtils.toString(response.getEntity(), "UTF-8");
System.out.println(content);
}
} finally {
if (response != null) {
response.close();
}
httpclient.close();
}
}
```
### Grant authority
- Granting permissions covers project permissions, resource permissions, datasource permissions and UDF Function permissions.
> Administrators can authorize ordinary users for projects, resources, data sources and UDF Functions that those users did not create. Because projects, resources, data sources and UDF Functions are all authorized in the same way, project authorization is used as the example.
> Note: For projects created by the user, the user already has all permissions, so these projects are not shown in the project list or the selected project list.
- 1.Click the authorization button of the designated user as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843204-71a9e880-aecd-11e9-83ad-365d7bf99375.png" width="60%" />
</p>
- 2.Select the project button to authorize the project
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/auth-project.png" width="60%" />
</p>
### Monitor center
- Service management is mainly to monitor and display the health status and basic information of each service in the system.
#### Master monitor
- Mainly related information about master.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843245-8edeb700-aecd-11e9-9916-ea50080e7d08.png" width="60%" />
</p>
#### Worker monitor
- Mainly related information of worker.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843277-ae75df80-aecd-11e9-9667-b9f1615b6f3b.png" width="60%" />
</p>
#### Zookeeper monitor
- Mainly the configuration information of each worker and master in zookeeper.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843323-c64d6380-aecd-11e9-8392-1ca9b84cd794.png" width="60%" />
</p>
#### MySQL monitor
- Mainly the health status of MySQL
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843358-e11fd800-aecd-11e9-86d1-9490e48dc955.png" width="60%" />
</p>
## Task Node Type and Parameter Setting
### Shell
- The shell node, when the worker executes, generates a temporary shell script, which is executed by a Linux user with the same name as the tenant.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_SHELL.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843728-6788e980-aecf-11e9-8006-241a7ec5024b.png" width="60%" />
</p>
- Node name: The node name in a process definition is unique
- Run flag: Identifies whether the node can be scheduled normally; if it does not need to be executed, turn on the forbidden-execution switch.
- Description: Describes the function of the node
- Number of failed retries: Number of times a failed task is resubmitted; supports drop-down selection and manual entry
- Failure Retry Interval: Interval between resubmissions of a failed task; supports drop-down selection and manual entry
- Script: User-developed SHELL program
- Resources: A list of resource files that need to be invoked in a script
- Custom parameters: User-defined parameters local to the SHELL task; they replace the `${variable}` placeholders in the script
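The placeholder replacement performed with custom parameters can be pictured with the following sketch. `renderScript` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption; the manual does not say how undefined variables are handled.

```javascript
// Replace each ${name} in the script with the matching custom parameter.
// Placeholders without a matching parameter are left as they are (assumption).
const renderScript = (script, params) =>
  script.replace(/\$\{(\w+)\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(params, name)
      ? String(params[name])
      : placeholder)

const script = 'echo ${dt} ${unknown}'
console.log(renderScript(script, { dt: '20190501' }))
// echo 20190501 ${unknown}
```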
### SUB_PROCESS
- The sub-process node executes an external workflow definition as a task node.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs_cn/images/toolbar_SUB_PROCESS.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61843799-adde4880-aecf-11e9-846e-f1696107029f.png" width="60%" />
</p>
- Node name: The node name in a process definition is unique
- Run flag: Identifies whether the node can be scheduled normally
- Description: Describes the function of the node
- Sub-node: Select the process definition of the sub-process. Clicking "enter the sub-node" in the upper right corner jumps to the process definition of the selected sub-process.
### DEPENDENT
- Dependent nodes are **dependent checking nodes**. For example, process A depends on the successful execution of process B yesterday, and the dependent node checks whether process B has a successful execution instance yesterday.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_DEPENDENT.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61844369-be8fbe00-aed1-11e9-965d-ddb9aeeba9db.png" width="60%" />
</p>
> Dependent nodes provide logical judgment functions, such as checking whether yesterday's B process was successful or whether the C process was successfully executed.
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/depend-b-and-c.png" width="80%" />
</p>
> For example, process A is a weekly task and process B and C are daily tasks. Task A requires that task B and C be successfully executed every day of the last week, as shown in the figure:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/depend-week.png" width="80%" />
</p>
> If weekly task A also requires that it was itself executed successfully last Tuesday:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/depend-last-tuesday.png" width="80%" />
</p>
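The logical judgment of a dependent node can be modelled as a small rule tree of "and" / "or" conditions over past runs. The rule format, the `SUCCESS` state name and the sample dates below are assumptions for illustration, not the system's internal representation.

```javascript
// True if the given process has a successful run on the given date
const succeeded = (runs, processName, date) =>
  runs.some(run =>
    run.process === processName && run.date === date && run.state === 'SUCCESS')

// Evaluate a rule tree: { and: [...] }, { or: [...] } or a leaf { process, date }
const checkDependency = (rule, runs) => {
  if (rule.and) {
    return rule.and.every(child => checkDependency(child, runs))
  }
  if (rule.or) {
    return rule.or.some(child => checkDependency(child, runs))
  }
  return succeeded(runs, rule.process, rule.date)
}

// Sample history: yesterday B succeeded, C failed
const runs = [
  { process: 'B', date: '2019-05-09', state: 'SUCCESS' },
  { process: 'C', date: '2019-05-09', state: 'FAILURE' }
]
const yesterdayB = { process: 'B', date: '2019-05-09' }
const yesterdayC = { process: 'C', date: '2019-05-09' }

console.log(checkDependency({ and: [yesterdayB, yesterdayC] }, runs)) // false
console.log(checkDependency({ or: [yesterdayB, yesterdayC] }, runs))  // true
```

A weekly check such as "B and C succeeded every day of last week" is the same structure with one leaf per process and day under a single `and`.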
### PROCEDURE
- The procedure is executed according to the selected data source.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_PROCEDURE.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61844464-1af2dd80-aed2-11e9-9486-6cf1b8585aa5.png" width="60%" />
</p>
- Datasource: The stored procedure supports MySQL and POSTGRESQL data source types; select the corresponding data source.
- Method: The method name of the stored procedure
- Custom parameters: Custom parameter types of stored procedures support IN and OUT, and data types support nine data types: VARCHAR, INTEGER, LONG, FLOAT, DOUBLE, DATE, TIME, TIMESTAMP and BOOLEAN.
### SQL
- Execute non-query SQL functionality
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61850397-d7569e80-aee6-11e9-9da0-c4d96deaa8a1.png" width="60%" />
</p>
- Execute query SQL; you can choose to mail the result to the designated recipients as a table and/or an attachment.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_SQL.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61850594-4d5b0580-aee7-11e9-9c9e-1934c91962b9.png" width="60%" />
</p>
- Datasource: Select the corresponding datasource
- sql type: Query and non-query are supported. A query is a SELECT statement that returns a result set; you can choose one of three mail notification templates: table, attachment, or table and attachment. A non-query returns no result set and covers the UPDATE, DELETE and INSERT operations.
- sql parameter: The input parameter format is key1=value1;key2=value2...
- sql statement: The SQL statement to execute.
- UDF function: For HIVE data sources, you can reference UDF functions created in the resource center. Other data source types do not support UDF functions for the time being.
- Custom parameters: For stored procedures, custom parameters set values for the method in the order they are defined. For the SQL task type, the custom parameter types and data types are the same as for the stored procedure task type; the difference is that custom parameters of the SQL task type replace the ${variable} placeholders in the SQL statement.
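
The sql parameter format can be read as semicolon-separated key=value pairs. The following Python sketch shows one way to parse such a string; the function name and the decision to trim surrounding spaces are assumptions for illustration, not the behaviour of EasyScheduler itself.

```python
def parse_sql_params(text):
    """Parse 'key1=value1;key2=value2' into a dict (illustrative helper)."""
    params = {}
    for pair in text.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        params[key.strip()] = value.strip()
    return params


print(parse_sql_params("key1 = value1; key2 = value2"))
```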
### SPARK
- The SPARK node executes Spark programs directly. For Spark nodes, the worker submits tasks using `spark-submit`.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_SPARK.png) task node in the toolbar onto the palette and double-click the task node as follows:
>
>
<p align="center">
<img src="https://user-images.githubusercontent.com/48329107/61852935-3d462480-aeed-11e9-8241-415314bfc2e5.png" width="60%" />
</p>
- Program Type: Java, Scala and Python are supported
- Class of the main function: The fully qualified name of the main class, the entry point of the Spark program
- Main jar package: The jar package of the Spark program
- Deployment: Three modes are supported: yarn-cluster, yarn-client, and local
- Driver Core Number: The number of driver cores and the driver memory can be set
- Executor Number: The number of executors, the executor memory and the number of executor cores can be set
- Command Line Parameters: The input parameters of the Spark program; custom parameter variables in them are replaced
- Other parameters: The --jars, --files, --archives and --conf formats are supported
- Resource: If a resource file is referenced in other parameters, you need to select the specified resource.
- Custom parameters: User-defined parameters local to the Spark task; they replace the ${variable} placeholders in the script
Note: JAVA and Scala are only used for identification and make no difference. If the Spark program is written in Python, there is no class of the main function; everything else is the same.
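
To show how the form fields above could map onto a `spark-submit` invocation, here is a minimal Python sketch. It is not the worker's actual code: the function name, the default values, and the jar and class names are hypothetical. Only the `spark-submit` flags themselves are standard.

```python
def build_spark_submit(main_jar, main_class=None, deploy="yarn-cluster",
                       driver_cores=1, driver_memory="512M",
                       num_executors=2, executor_memory="2G", executor_cores=2,
                       other_params=(), app_args=()):
    """Assemble a spark-submit argument list from the form fields (illustrative)."""
    master, _, mode = deploy.partition("-")  # "yarn-cluster" -> "yarn", "cluster"
    cmd = ["spark-submit", "--master", master]
    if mode:  # "local" has no deploy mode
        cmd += ["--deploy-mode", mode]
    if main_class:  # Python programs have no main class
        cmd += ["--class", main_class]
    cmd += [
        "--driver-cores", str(driver_cores),
        "--driver-memory", driver_memory,
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--executor-cores", str(executor_cores),
    ]
    cmd += list(other_params)  # e.g. ["--jars", "dep.jar"]
    cmd.append(main_jar)
    cmd += list(app_args)      # command line parameters of the program
    return cmd


print(" ".join(build_spark_submit("wordcount.jar", "com.example.WordCount",
                                  app_args=["/input", "/output"])))
```

Building the command as a list rather than a single string avoids quoting problems when an argument contains spaces.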
### MapReduce(MR)
- The MR node executes MapReduce programs directly. For MR nodes, the worker submits tasks using `hadoop jar`.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_MR.png) task node in the toolbar onto the palette and double-click the task node as follows:
1. JAVA program
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61851102-91023f00-aee8-11e9-9ac0-dbe588d860c2.png" width="60%" />
</p>
- Class of the main function: The fully qualified name of the main class, the entry point of the MR program
- Program Type: Select JAVA
- Main jar package: The MR jar package
- Command Line Parameters: The input parameters of the MR program; custom parameter variables in them are replaced
- Other parameters: The -D, -files, -libjars and -archives formats are supported
- Resource: If a resource file is referenced in other parameters, you need to select the specified resource.
- Custom parameters: User-defined parameters local to the MR task; they replace the ${variable} placeholders in the script
2. Python program
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61851224-f3f3d600-aee8-11e9-8862-435220bbda93.png" width="60%" />
</p>
- Program Type: Select Python
- Main jar package: The Python jar package that runs MR
- Other parameters: The -D, -mapper, -reducer, -input and -output formats are supported; user-defined parameters can be set here, for example:
- -mapper "mapper.py 1" -file mapper.py -reducer reducer.py -file reducer.py -input /journey/words.txt -output /journey/out/mr/${currentTimeMillis}
- Here, the "mapper.py 1" following -mapper consists of two parameters: the first is mapper.py and the second is 1.
- Resource: If a resource file is referenced in other parameters, you need to select the specified resource.
- Custom parameters: User-defined parameters local to the MR task; they replace the ${variable} placeholders in the script
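
The quoting in the `-mapper` example matters: the quoted string is one option value that itself contains a script name and an argument. The Python sketch below uses the standard `shlex` module to show the split. The output path with a literal timestamp is a made-up stand-in for `${currentTimeMillis}`.

```python
import shlex

# Hypothetical "other parameters" value with the variable already replaced.
other_params = ('-mapper "mapper.py 1" -file mapper.py '
                '-reducer reducer.py -file reducer.py '
                '-input /journey/words.txt -output /journey/out/mr/1564000000000')

tokens = shlex.split(other_params)          # shell-style splitting keeps quotes together
mapper_command = tokens[tokens.index("-mapper") + 1]
script, *script_args = mapper_command.split()

print(mapper_command)   # the single value passed to -mapper
print(script, script_args)
```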
### Python
- The Python node executes Python scripts directly. For Python nodes, the worker submits tasks using `python **`.
> Drag the ![PNG](https://analysys.github.io/easyscheduler_docs/images/toolbar_PYTHON.png) task node in the toolbar onto the palette and double-click the task node as follows:
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61851959-daec2480-aeea-11e9-83fd-3e00a030cb84.png" width="60%" />
</p>
- Script: The user-developed Python program
- Resource: A list of resource files that the script needs to call
- Custom parameters: User-defined parameters local to the Python task; they replace the ${variable} placeholders in the script
### System parameter
<table>
<tr><th>variable</th><th>meaning</th></tr>
<tr>
<td>${system.biz.date}</td>
<td>The day before the scheduled time of a routine scheduling instance, in yyyyMMdd format. When data is supplemented (complement run), the date is incremented by 1</td>
</tr>
<tr>
<td>${system.biz.curdate}</td>
<td>The scheduled time of a routine scheduling instance, in yyyyMMdd format. When data is supplemented (complement run), the date is incremented by 1</td>
</tr>
<tr>
<td>${system.datetime}</td>
<td>The scheduled time of a routine scheduling instance, in yyyyMMddHHmmss format. When data is supplemented (complement run), the date is incremented by 1</td>
</tr>
</table>
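
As a rough illustration of the table above, the three system parameters can be derived from a scheduled time as in the following Python sketch. The function name and the sample time are assumptions, and the adjustment applied for complement runs is deliberately not modelled.

```python
from datetime import datetime, timedelta


def system_parameters(schedule_time):
    """Derive the system parameters for a routine (non-complement) instance."""
    return {
        "system.biz.date": (schedule_time - timedelta(days=1)).strftime("%Y%m%d"),
        "system.biz.curdate": schedule_time.strftime("%Y%m%d"),
        "system.datetime": schedule_time.strftime("%Y%m%d%H%M%S"),
    }


params = system_parameters(datetime(2019, 7, 25, 1, 30, 0))  # hypothetical schedule time
for name, value in params.items():
    print(name, "=", value)
```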
### Time Customization Parameters
> Variable names can be customized in code and are declared as ${variable name}. A variable can reference "system parameters" or specify "constants".
> When the base variable is defined in the $[...] format, [yyyyMMddHHmmss] can be decomposed and recombined arbitrarily, for example: $[yyyyMMdd], $[HHmmss], $[yyyy-MM-dd], etc.
> The following forms are also supported:
>
>
- Next N years: $[add_months(yyyyMMdd,12*N)]
- Previous N years: $[add_months(yyyyMMdd,-12*N)]
- Next N months: $[add_months(yyyyMMdd,N)]
- Previous N months: $[add_months(yyyyMMdd,-N)]
- Next N weeks: $[yyyyMMdd+7*N]
- Previous N weeks: $[yyyyMMdd-7*N]
- Next N days: $[yyyyMMdd+N]
- Previous N days: $[yyyyMMdd-N]
- Next N hours: $[HHmmss+N/24]
- Previous N hours: $[HHmmss-N/24]
- Next N minutes: $[HHmmss+N/24/60]
- Previous N minutes: $[HHmmss-N/24/60]
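
The arithmetic behind these expressions can be sketched in Python. Offsets are expressed in days, so N/24 is N hours and N/24/60 is N minutes. The `add_months` helper below is a stand-in written for this example; clamping the day to the end of a shorter month is an assumption, not documented EasyScheduler behaviour.

```python
import calendar
from datetime import datetime, timedelta


def add_months(moment, months):
    """Shift by whole months, clamping the day to the end of the target month."""
    index = moment.year * 12 + (moment.month - 1) + months
    year, month = divmod(index, 12)
    day = min(moment.day, calendar.monthrange(year, month + 1)[1])
    return moment.replace(year=year, month=month + 1, day=day)


base = datetime(2019, 7, 25, 10, 30, 0)  # hypothetical base time

examples = {
    "$[yyyyMMdd]": base.strftime("%Y%m%d"),
    "$[yyyyMMdd-7*1]": (base - timedelta(days=7)).strftime("%Y%m%d"),
    "$[add_months(yyyyMMdd,-12*1)]": add_months(base, -12).strftime("%Y%m%d"),
    "$[HHmmss+1/24]": (base + timedelta(hours=1)).strftime("%H%M%S"),
    "$[HHmmss-30/24/60]": (base - timedelta(minutes=30)).strftime("%H%M%S"),
}
for expression, value in examples.items():
    print(expression, "->", value)
```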
### User-defined parameters
> User-defined parameters are divided into global parameters and local parameters. Global parameters are the global parameters passed when the process definition and process instance are saved. Global parameters can be referenced by local parameters of any task node in the whole process.
> For example:
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs/images/save-global-parameters.png" width="60%" />
</p>
> global_bizdate is a global parameter that references a system parameter.
<p align="center">
<img src="https://user-images.githubusercontent.com/53217792/61857313-78992100-aef6-11e9-9ba3-521c6ca33ce3.png" width="60%" />
</p>
> In the task, local_param_bizdate references the global parameter through ${global_bizdate}. In scripts, the value of local_param_bizdate can be referenced with ${local_param_bizdate}, or the value of local_param_bizdate can be set directly through JDBC.
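
The chain from system parameter to global parameter to local parameter can be sketched as a series of placeholder substitutions. The Python below is an assumption-laden illustration: the `resolve` helper, the regular expression, and the sample values are invented for the example and do not describe the actual resolution code.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([\w.]+)\}")


def resolve(text, scope):
    """Replace each ${name} found in `scope`; unknown names are left untouched."""
    return PLACEHOLDER.sub(lambda m: scope.get(m.group(1), m.group(0)), text)


system_params = {"system.biz.date": "20190724"}          # hypothetical value
global_params = {"global_bizdate": "${system.biz.date}"}
local_params = {"local_param_bizdate": "${global_bizdate}"}

resolved_globals = {k: resolve(v, system_params) for k, v in global_params.items()}
resolved_locals = {k: resolve(v, resolved_globals) for k, v in local_params.items()}
script = resolve("echo ${local_param_bizdate}", resolved_locals)
print(script)
```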

39
docs/en_US/upgrade.md

@ -0,0 +1,39 @@
# EasyScheduler upgrade documentation
## 1. Back up the previous version of the files and database
## 2. Stop all services of escheduler
`sh ./script/stop-all.sh`
## 3. Download the new version of the installation package
- [gitee](https://gitee.com/easyscheduler/EasyScheduler/attach_files), download the latest front-end and back-end installation packages (the back end is referred to as escheduler-backend, the front end as escheduler-ui)
- The following upgrade operations must be performed in the directory of the new version
## 4. Database upgrade
- Modify the following properties in conf/dao/data_source.properties
```
spring.datasource.url
spring.datasource.username
spring.datasource.password
```
- Execute database upgrade script
`sh ./script/upgrade-escheduler.sh`
## 5. Backend service upgrade
- Modify the content of the install.sh configuration and execute the upgrade script
`sh install.sh`
## 6. Frontend service upgrade
- Overwrite the previous version of the dist directory
- Restart the nginx service
`systemctl restart nginx`

30
docs/zh_CN/1.0.3-release.md

@ -0,0 +1,30 @@
Easy Scheduler Release 1.0.3
===
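Easy Scheduler 1.0.3 is the fourth release in the 1.x series.
Enhancements:
===
- [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) Added support for custom variables in the mail subject of SQL tasks
- [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) If sending mail fails in a SQL task, the SQL task is marked as failed
- [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) Changed the replacement rules for custom variables in SQL tasks to support replacement inside multiple single and double quotes
- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) When creating a resource file, added a check for whether the file already exists on HDFS
Fixes:
===
- [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) The process definition list is sorted by schedule status and update time
- [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) Fixed online file creation returning success although the HDFS file was not created
- [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) Fixed the issue that a schedule could not be taken offline when the job did not exist
- [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) Killing a task now also kills its child processes
- [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) Fixed the issue that the update time and size were not updated when a resource file was updated
- [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) Fixed the issue that deleting a tenant failed if HDFS was not started
- [[EasyScheduler-486]](https://github.com/analysys/EasyScheduler/issues/486) When the shell process exits and the YARN status is not yet final, wait and check again
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879,
feloxx, coding-now, hymzcn, nysyxxg, chgxtony
and the many enthusiastic partners in the WeChat group! Thank you very much!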

28
docs/zh_CN/1.0.4-release.md

@ -0,0 +1,28 @@
Easy Scheduler Release 1.0.4
===
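Easy Scheduler 1.0.4 is the fifth release in the 1.x series.
**Fixes**:
- [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) The process definition list is sorted by schedule status and update time
- [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) Fixed online file creation returning success although the HDFS file was not created
- [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) Fixed the issue that a schedule could not be taken offline when the job did not exist
- [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) Killing a task now also kills its child processes
- [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) Fixed the issue that the update time and size were not updated when a resource file was updated
- [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) Fixed the issue that deleting a tenant failed if HDFS was not started
- [[EasyScheduler-486]](https://github.com/analysys/EasyScheduler/issues/486) When the shell process exits and the YARN status is not yet final, wait and check again
**Enhancements**:
- [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) Added support for custom variables in the mail subject of SQL tasks
- [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) If sending mail fails in a SQL task, the SQL task is marked as failed
- [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) Changed the replacement rules for custom variables in SQL tasks to support replacement inside multiple single and double quotes
- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) When creating a resource file, added a check for whether the file already exists on HDFS
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners (in no particular order):
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879,
feloxx, coding-now, hymzcn, nysyxxg, chgxtony, lfyee, Crossoverrr, gj-zhang, sunnyingit, xianhu, zhengqiangtan
and the many enthusiastic partners in the WeChat and DingTalk groups! Thank you very much!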

23
docs/zh_CN/1.0.5-release.md

@ -0,0 +1,23 @@
Easy Scheduler Release 1.0.5
===
Easy Scheduler 1.0.5 is the sixth release in the 1.x series.
Enhancements:
===
- [[EasyScheduler-597]](https://github.com/analysys/EasyScheduler/issues/597)child process cannot extend father's receivers and cc
Fixes:
===
- [[EasyScheduler-516]](https://github.com/analysys/EasyScheduler/issues/516)The task instance of MR cannot stop in some cases
- [[EasyScheduler-594]](https://github.com/analysys/EasyScheduler/issues/594) After a soft kill of a task, the processes still exist (parent and child processes)
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, feloxx, coding-now, hymzcn, nysyxxg, chgxtony, gj-zhang, xianhu, sunnyingit,
zhengqiangtan, chinashenkai
and the many enthusiastic partners in the WeChat group! Thank you very much!

63
docs/zh_CN/1.1.0-release.md

@ -0,0 +1,63 @@
Easy Scheduler Release 1.1.0
===
Easy Scheduler 1.1.0 is the first release in the 1.1.x series.
New features:
===
- [[EasyScheduler-391](https://github.com/analysys/EasyScheduler/issues/391)] run a process under a specified tenement user
- [[EasyScheduler-288](https://github.com/analysys/EasyScheduler/issues/288)] Feature/qiye_weixin
- [[EasyScheduler-189](https://github.com/analysys/EasyScheduler/issues/189)] Security support such as Kerberos
- [[EasyScheduler-398](https://github.com/analysys/EasyScheduler/issues/398)] The administrator has a tenant (the default tenant is set in install.sh) and can create resources, projects and data sources (limited to one administrator)
- [[EasyScheduler-293](https://github.com/analysys/EasyScheduler/issues/293)] The parameters selected when clicking to run a process could not be viewed anywhere and were not saved
- [[EasyScheduler-401](https://github.com/analysys/EasyScheduler/issues/401)] It is easy to schedule a run every second by mistake; once a schedule is set, the next trigger time can be shown on the page
- [[EasyScheduler-493](https://github.com/analysys/EasyScheduler/pull/493)]add datasource kerberos auth and FAQ modify and add resource upload s3
Enhancements:
===
- [[EasyScheduler-227](https://github.com/analysys/EasyScheduler/issues/227)] upgrade spring-boot to 2.1.x and spring to 5.x
- [[EasyScheduler-434](https://github.com/analysys/EasyScheduler/issues/434)] The number of worker nodes is inconsistent between ZooKeeper and MySQL
- [[EasyScheduler-435](https://github.com/analysys/EasyScheduler/issues/435)] Validation of the email address format
- [[EasyScheduler-441](https://github.com/analysys/EasyScheduler/issues/441)] Nodes that are prohibited from running are included in the completed-node check
- [[EasyScheduler-400](https://github.com/analysys/EasyScheduler/issues/400)] On the home page, the queue statistics are inconsistent and the command statistics show no data
- [[EasyScheduler-395](https://github.com/analysys/EasyScheduler/issues/395)] For processes recovered by fault tolerance, the status must not be **running**
- [[EasyScheduler-529](https://github.com/analysys/EasyScheduler/issues/529)] optimize poll task from zookeeper
- [[EasyScheduler-242](https://github.com/analysys/EasyScheduler/issues/242)] Performance issue when worker-server nodes fetch tasks
- [[EasyScheduler-352](https://github.com/analysys/EasyScheduler/issues/352)] Worker grouping and queue consumption issue
- [[EasyScheduler-461](https://github.com/analysys/EasyScheduler/issues/461)] When viewing data source parameters, the account and password information needs to be encrypted
- [[EasyScheduler-396](https://github.com/analysys/EasyScheduler/issues/396)] Optimized the Dockerfile and linked the Dockerfile with GitHub to build images automatically
- [[EasyScheduler-389](https://github.com/analysys/EasyScheduler/issues/389)]service monitor cannot find the change of master/worker
- [[EasyScheduler-511](https://github.com/analysys/EasyScheduler/issues/511)]support recovery process from stop/kill nodes.
- [[EasyScheduler-399](https://github.com/analysys/EasyScheduler/issues/399)] HadoopUtils operates as a specified user instead of the **deployment user**
- [[EasyScheduler-378](https://github.com/analysys/EasyScheduler/issues/378)]Mailbox regular match
- [[EasyScheduler-625](https://github.com/analysys/EasyScheduler/issues/625)]EasyScheduler call shell "task instance not set host"
- [[EasyScheduler-622](https://github.com/analysys/EasyScheduler/issues/622)]Front-end interface deployment k8s, background deployment big data cluster session error
Fixes:
===
- [[EasyScheduler-394](https://github.com/analysys/EasyScheduler/issues/394)] When master and worker are deployed on the same machine, restarting the master and worker services prevents previously scheduled tasks from being scheduled further
- [[EasyScheduler-469](https://github.com/analysys/EasyScheduler/issues/469)]Fix naming errors,monitor page
- [[EasyScheduler-392](https://github.com/analysys/EasyScheduler/issues/392)]Feature request: fix email regex check
- [[EasyScheduler-405](https://github.com/analysys/EasyScheduler/issues/405)] On the schedule edit/add page, the start time and end time must not be the same
- [[EasyScheduler-517](https://github.com/analysys/EasyScheduler/issues/517)] Complement data - sub-workflow - time parameters
- [[EasyScheduler-532](https://github.com/analysys/EasyScheduler/issues/532)] The issue that Python nodes were not executed
- [[EasyScheduler-543](https://github.com/analysys/EasyScheduler/issues/543)]optimize datasource connection params safety
- [[EasyScheduler-569](https://github.com/analysys/EasyScheduler/issues/569)] Scheduled tasks could not actually be stopped
- [[EasyScheduler-463](https://github.com/analysys/EasyScheduler/issues/463)] Email validation did not support addresses with uncommon suffixes
- [[EasyScheduler-650](https://github.com/analysys/EasyScheduler/issues/650)]Creating a hive data source without a principal will cause the connection to fail
- [[EasyScheduler-641](https://github.com/analysys/EasyScheduler/issues/641)]The cellphone is not supported for 199 telecom segment when create a user
- [[EasyScheduler-627](https://github.com/analysys/EasyScheduler/issues/627)]Different sql node task logs in parallel in the same workflow will be mixed
- [[EasyScheduler-655](https://github.com/analysys/EasyScheduler/issues/655)]when deploy a spark task,the tentant queue not empty,set with a empty queue name
- [[EasyScheduler-667](https://github.com/analysys/EasyScheduler/issues/667)]HivePreparedStatement can't print the actual SQL executed
Thanks:
===
Last but not least, this release would not have been possible without the contributions of the following partners:
Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, chgxtony, Stanfan, lfyee, thisnew, hujiang75277381, sunnyingit, lgbo-ustc, ivivi, lzy305, JackIllkid, telltime, lipengbo2018, wuchunfu, telltime, chenyuan9028, zhangzhipeng621, thisnew, 307526982, crazycarry
and the many enthusiastic partners in the WeChat group! Thank you very much!

287
docs/zh_CN/EasyScheduler-FAQ.md

@ -0,0 +1,287 @@
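## Q: Introduction to the EasyScheduler services and recommended memory
A: EasyScheduler consists of 5 services, MasterServer, WorkerServer, ApiServer, AlertServer and LoggerServer, plus the UI.
| Service | Description |
| ------------------------- | ------------------------------------------------------------ |
| MasterServer | Mainly responsible for splitting the **DAG** and monitoring task status |
| WorkerServer/LoggerServer | Mainly responsible for submitting and executing tasks and updating task status. LoggerServer lets the Rest Api view logs through **RPC** |
| ApiServer | Provides the Rest Api service for the UI to call |
| AlertServer | Provides the alert service |
| UI | Front-end page display |
Note: **Because there are many services, a single-machine deployment should preferably have at least 4 cores and 16 GB of memory**
---
## Q: Why can't the administrator create projects
A: The administrator is currently a "**pure management**" role with no tenant, that is, no corresponding user on Linux, so it has no execution permission. **It therefore owns no projects, resources or data sources** and has no permission to create them, **but it has all view permissions**. If you need to create projects or perform other business operations, **use the administrator to create a tenant and an ordinary user, then log in as the ordinary user**. In version 1.1.0 we will open up create and execute permissions for the administrator, who will then have all permissions
---
## Q: Which mailboxes does the system support?
A: Most mailboxes are supported, including qq, 163, 126, 139, outlook and aliyun. The **TLS and SSL** protocols are supported and can be configured selectively in alert.properties
---
## Q: What are the commonly used system variable time parameters and how are they used?
A: Please refer to https://analysys.github.io/easyscheduler_docs_cn/%E7%B3%BB%E7%BB%9F%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C.html#%E7%B3%BB%E7%BB%9F%E5%8F%82%E6%95%B0
---
## Q: pip install kazoo reports an error. Is this installation required?
A: It is what Python uses to connect to ZooKeeper, so it must be installed
---
## Q: How to specify the machine on which a task runs
A: Use the **administrator** to create a Worker group. The **Worker group can be specified** when **starting the process definition**, or **on the task node**. If none is specified, Default is used; **Default picks one Worker at random from all Workers in the cluster to submit and execute the task**
---
## Q: Task priority
A: We **support priorities for both processes and tasks**. There are five priority levels: **HIGHEST, HIGH, MEDIUM, LOW and LOWEST**. **You can set priorities between different process instances, and also between different task instances within the same process instance**. For details, see the task priority design: https://analysys.github.io/easyscheduler_docs_cn/%E7%B3%BB%E7%BB%9F%E6%9E%B6%E6%9E%84%E8%AE%BE%E8%AE%A1.html#%E7%B3%BB%E7%BB%9F%E6%9E%B6%E6%9E%84%E8%AE%BE%E8%AE%A1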
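
A two-level priority like this can be modelled with a heap ordered by a tuple. The Python sketch below assumes that the process instance priority is compared first and the task priority second; that ordering, the numeric values of the levels, and the task names are assumptions for illustration only.

```python
import heapq
from enum import IntEnum


class Priority(IntEnum):
    HIGHEST = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3
    LOWEST = 4


def drain(entries):
    """Pop (process_priority, task_priority, name) entries in priority order."""
    heap = list(entries)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


order = drain([
    (Priority.MEDIUM, Priority.HIGH, "report.load"),
    (Priority.HIGHEST, Priority.LOW, "billing.export"),
    (Priority.MEDIUM, Priority.HIGHEST, "report.extract"),
])
print(order)
```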
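----
## Q: escheduler-grpc reports an error
A: Run the following in the root directory: mvn -U clean package assembly:assembly -Dmaven.test.skip=true , then refresh the whole project
----
## Q: Does EasyScheduler support running on Windows
A: In theory **only the Worker needs to run on Linux**; the other services can run normally on Windows. Deploying on Linux is still recommended
-----
## Q: Compiling node-sass for the UI on Linux reports: Error:EACCESS:permission denied,mkdir xxxx
A: Install it separately with **npm install node-sass --unsafe-perm**, then run **npm install**
---
## Q: The UI cannot be logged in to or accessed normally
A: 1. If it is started with node, check whether API_BASE in .env under escheduler-ui is the address of the Api Server service
2. If it is started with nginx and was installed with **install-escheduler-ui.sh**, check whether proxy_pass in **/etc/nginx/conf.d/escheduler.conf** is the address of the Api Server service
3. If the configuration above is correct, check whether the Api Server service is working: run curl http://192.168.xx.xx:12345/escheduler/users/get-user-info and check the Api Server log. If it shows cn.escheduler.api.interceptor.LoginHandlerInterceptor:[76] - session info is null, the Api Server service is working
4. If none of the above is the problem, check whether **server.context-path and server.port** in **application.properties** are configured correctly
---
## Q: No process instance is generated after a process definition is started manually or by the scheduler
A: 1. First **check with jps whether the MasterServer service exists**, or check directly in service monitoring whether the master service exists in zk
2. If the master service exists, check the **command status statistics** or whether new records were added to **t_escheduler_error_command**; if so, **check the message field to locate the cause of the startup failure**
---
## Q: The task status stays at "submitted successfully"
A: 1. First **check with jps whether the WorkerServer service exists**, or check directly in service monitoring whether the worker service exists in zk
2. If the **WorkerServer** service is normal, **check whether MasterServer has put the task into the zk queue**; **check the MasterServer log and whether tasks are blocked in the zk queue**
3. If none of the above is the problem, check whether a Worker group was specified whose **machines are not online**
---
## Q: Are a Docker image and a Dockerfile provided
A: Yes, both a Docker image and a Dockerfile are provided.
Docker image: https://hub.docker.com/r/escheduler/escheduler_images
Dockerfile: https://github.com/qiaozhanwei/escheduler_dockerfile/tree/master/docker_escheduler
---
## Q: Things to note in install.sh
A: 1. If a replacement variable contains special characters, **escape them with the \ escape character**
2. installPath="/data1_1T/escheduler": **this directory must not be the same as the directory of the install.sh currently used for the one-click installation**
3. deployUser="escheduler": **the deployment user must have sudo privileges**, because the worker executes tasks via sudo -u tenant sh xxx.command
4. monitorServerState="false": whether to start the service monitoring script; it is not started by default. **If it is started, it checks every 5 minutes whether the master and worker services are down and restarts them automatically if they are**
5. hdfsStartupSate="false": whether to enable the HDFS resource upload function. It is disabled by default, **in which case the resource center cannot be used**. If it is enabled, fs.defaultFS and the yarn-related settings must be configured in conf/common/hadoop/hadoop.properties; if namenode HA is used, copy core-site.xml and hdfs-site.xml to the conf root directory
Note: **Versions 1.0.x do not create the hdfs root directory automatically; you must create it yourself, and the deployment user needs permission to operate on hdfs**
---
## Q: Errors when taking process definitions and process instances offline
A: For **versions before 1.0.4**, modify the code under the cn.escheduler.api.quartz package in escheduler-api: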
```
public boolean deleteJob(String jobName, String jobGroupName) {
    lock.writeLock().lock();
    try {
        JobKey jobKey = new JobKey(jobName, jobGroupName);
        if (scheduler.checkExists(jobKey)) {
            logger.info("try to delete job, job name: {}, job group name: {},", jobName, jobGroupName);
            return scheduler.deleteJob(jobKey);
        } else {
            return true;
        }
    } catch (SchedulerException e) {
        logger.error(String.format("delete job : %s failed", jobName), e);
    } finally {
        lock.writeLock().unlock();
    }
    return false;
}
```
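---
## Q: Can a tenant created before HDFS was started use the resource center normally
A: No. A tenant created while HDFS is not started has no tenant directory registered in HDFS, so uploading resources reports an error
## Q: With multiple Masters and multiple Workers, how is fault tolerance handled when a service goes down
A: **Note: the Master monitors the Master and Worker services.**
1. If a Master service goes down, another Master takes over the processes of the failed Master and continues to monitor the status of Worker tasks
2. If a Worker service goes down, the Master detects it; if there are Yarn tasks, they are killed and then retried
For details, see the fault tolerance design: https://analysys.github.io/easyscheduler_docs_cn/%E7%B3%BB%E7%BB%9F%E6%9E%B6%E6%9E%84%E8%AE%BE%E8%AE%A1.html#%E7%B3%BB%E7%BB%9F%E6%9E%B6%E6%9E%84%E8%AE%BE%E8%AE%A1
---
## Q: Fault tolerance in pseudo-distributed mode with Master and Worker on one machine
A: Version 1.0.3 only implements fault tolerance for processes at Master startup, not Worker fault tolerance. That is, if no Master exists when the Worker goes down, the process runs into problems. We will add self fault tolerance at Master and Worker startup in version **1.1.0** to fix this. To work around it manually, for **processes that were running across the restart**, **set the running Worker tasks that have already gone down to failed**, and **also set those processes to the failed state**. Then recover the processes from the failed nodes
---
## Q: Schedules are easily set to run every second
A: Be careful when setting a schedule: if the first field of the expression (* * * * * ? *) is set to \*, the schedule runs every second. **In version 1.1.0 we will add a list showing the upcoming scheduled times**. You can use http://cron.qqe2.com/ to view the next 5 run times online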
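
A quick guard against this mistake is to inspect the first field of the expression before saving it. The Python sketch below is a hypothetical helper, not part of EasyScheduler; it assumes the seven-field layout shown above, in which the first field is seconds.

```python
def runs_every_second(cron_expression):
    """Return True if the seconds field (the first field) is a bare '*'."""
    fields = cron_expression.split()
    return bool(fields) and fields[0] == "*"


print(runs_every_second("* * * * * ? *"))   # seconds field is '*'
print(runs_every_second("0 0 2 * * ? *"))   # fires at 02:00:00 only
```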
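## Q: Does a schedule have a valid time range
A: Yes. **If the start and end times of a schedule are the same, the schedule is invalid**. **If the end time is earlier than the current time, the schedule will most likely be deleted automatically**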
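
The two rules can be expressed as a small check. In this Python sketch the function name and the returned labels are invented for illustration; in particular "expired" only marks a schedule whose end time has passed and does not model the automatic deletion itself.

```python
from datetime import datetime


def schedule_state(start, end, now):
    """Classify a schedule by its validity window (illustrative labels)."""
    if start == end:
        return "invalid"   # identical start and end time
    if end < now:
        return "expired"   # end time already in the past
    return "active"


now = datetime(2019, 8, 1, 12, 0, 0)  # hypothetical current time
print(schedule_state(datetime(2019, 7, 1), datetime(2019, 7, 1), now))
print(schedule_state(datetime(2019, 7, 1), datetime(2019, 7, 31), now))
print(schedule_state(datetime(2019, 7, 1), datetime(2019, 12, 31), now))
```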
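## Q: In how many ways can task dependencies be implemented
A: 1. Task dependencies within a **DAG**: the DAG is split starting from the nodes with an **in-degree of zero**
2. **Task dependent nodes**, which implement task or process dependencies across processes; for details, see Dependent (DEPENDENT) nodes: https://analysys.github.io/easyscheduler_docs_cn/%E7%B3%BB%E7%BB%9F%E4%BD%BF%E7%94%A8%E6%89%8B%E5%86%8C.html#%E4%BB%BB%E5%8A%A1%E8%8A%82%E7%82%B9%E7%B1%BB%E5%9E%8B%E5%92%8C%E5%8F%82%E6%95%B0%E8%AE%BE%E7%BD%AE
Note: **Process or task dependencies across projects are not supported**
## Q: In how many ways can a process definition be started
A: 1. In the **process definition list**, click the **Start** button
2. **Add a timer in the process definition list** so that the process definition is started by the scheduler
3. On the **view or edit** DAG page of the process definition, **right-click any task node** to start the process definition
4. You can edit the DAG of a process definition and set the run flag of some tasks to **prohibit running**; when the process definition is started, the edges of those nodes are removed from the DAG
## Q: Setting the Python version for Python tasks
A: 1. For **versions after 1.0.3**, you only need to modify PYTHON_HOME in conf/env/.escheduler_env.sh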
```
export PYTHON_HOME=/bin/python
```
Note: **PYTHON_HOME** here is the absolute path of the python command, not a conventional PYTHON_HOME directory. Also note that when exporting PATH, it must be added directly:
```
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
```
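2. For versions before 1.0.3, Python tasks only support the system's Python version; specifying a Python version is not supported
## Q: Worker tasks spawn child processes through sudo -u tenant sh xxx.command; are they killed when the task is killed
A: In 1.0.4 we will make killing a task also kill all child processes spawned by the task
## Q: How are queues used in EasyScheduler, and what do user queue and tenant queue mean
A: In EasyScheduler a queue can be specified on the user or on the tenant. **The queue specified for the user takes priority over the tenant's queue.** For example, the queue for an MR task is specified through mapreduce.job.queuename.
Note: When MR specifies the queue in this way, pass the parameters as follows: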
```
Configuration conf = new Configuration();
GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
String[] remainingArgs = optionParser.getRemainingArgs();
```
For Spark tasks, the queue is specified with --queue
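
The precedence rule and the two ways of passing the queue can be summarised in a short Python sketch. The helper names and queue names are hypothetical; only the `mapreduce.job.queuename` property and the `--queue` option come from the answer above.

```python
def effective_queue(user_queue, tenant_queue):
    """The user's queue wins; fall back to the tenant's queue if it is unset."""
    return user_queue or tenant_queue


def queue_arguments(task_type, queue):
    """Return the command-line arguments that select the queue (illustrative)."""
    if task_type == "MR":
        return ["-D", f"mapreduce.job.queuename={queue}"]
    if task_type == "SPARK":
        return ["--queue", queue]
    return []


queue = effective_queue(user_queue="", tenant_queue="etl")
print(queue_arguments("MR", queue))
print(queue_arguments("SPARK", effective_queue("adhoc", "etl")))
```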
## Q: Master or Worker reports the following alert
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/master_worker_lack_res.png" width="60%" />
</p>
A: In master.properties under conf, change **master.reserved.memory** to a smaller value such as 0.1, or
in worker.properties change **worker.reserved.memory** to a smaller value such as 0.1
## Q: The hive version is 1.1.0+cdh5.15.0 and the SQL hive task reports a connection error
<p align="center">
<img src="https://analysys.github.io/easyscheduler_docs_cn/images/cdh_hive_error.png" width="60%" />
</p>
A: Change the hive dependency in the pom
```
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>2.1.0</version>
</dependency>
```
to
```
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>1.1.0</version>
</dependency>
```

3
docs/zh_CN/README.md

@ -43,7 +43,7 @@ Easy Scheduler
- [**Upgrade documentation**](https://analysys.github.io/easyscheduler_docs_cn/升级文档.html?_blank "Upgrade documentation") - [**Upgrade documentation**](https://analysys.github.io/easyscheduler_docs_cn/升级文档.html?_blank "Upgrade documentation")
- <a href="http://52.82.13.76:8888" target="_blank">Try it out</a> Ordinary user login: demo/demo123 - <a href="http://52.82.13.76:8888" target="_blank">Try it out</a>
For more documentation, see the <a href="https://analysys.github.io/easyscheduler_docs_cn/" target="_blank">easyscheduler Chinese online documentation</a> For more documentation, see the <a href="https://analysys.github.io/easyscheduler_docs_cn/" target="_blank">easyscheduler Chinese online documentation</a>
@ -63,3 +63,4 @@ The fastest way to get response from our developers is to submit issues, or ad

16
docs/zh_CN/SUMMARY.md

@ -8,7 +8,14 @@
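* Back-end deployment documentation * Back-end deployment documentation
* [Preparation](后端部署文档.md#1、准备工作) * [Preparation](后端部署文档.md#1、准备工作)
* [Deployment](后端部署文档.md#2、部署) * [Deployment](后端部署文档.md#2、部署)
* [System user manual](系统使用手册.md#使用手册) * [Quick start](快速上手.md#快速上手)
* System user manual
* [Quick start](系统使用手册.md#快速上手)
* [Operation guide](系统使用手册.md#操作指南)
* [Security center (permission system)](系统使用手册.md#安全中心(权限系统))
* [Monitoring center](系统使用手册.md#监控中心)
* [Task node types and parameter settings](系统使用手册.md#任务节点类型和参数设置)
* [System parameters](系统使用手册.md#系统参数)
* [System architecture design](系统架构设计.md#系统架构设计) * [System architecture design](系统架构设计.md#系统架构设计)
* Front-end development documentation * Front-end development documentation
* [Development environment setup](前端开发文档.md#开发环境搭建) * [Development environment setup](前端开发文档.md#开发环境搭建)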
@ -22,9 +29,16 @@
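* [Development environment setup](后端开发文档.md#项目编译) * [Development environment setup](后端开发文档.md#项目编译)
* [Custom task plugin documentation](任务插件开发.md#任务插件开发) * [Custom task plugin documentation](任务插件开发.md#任务插件开发)
* [API documentation](http://52.82.13.76:8888/escheduler/doc.html?language=zh_CN&lang=cn)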
* FAQ
* [FAQ](EasyScheduler-FAQ.md)
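* System version upgrade documentation * System version upgrade documentation
* [Version upgrade](升级文档.md) * [Version upgrade](升级文档.md)
* Release notes of previous versions * Release notes of previous versions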
* [1.1.0 release](1.1.0-release.md)
* [1.0.5 release](1.0.5-release.md)
* [1.0.4 release](1.0.4-release.md)
* [1.0.3 release](1.0.3-release.md)
* [1.0.2 release](1.0.2-release.md) * [1.0.2 release](1.0.2-release.md)
* [1.0.1 release](1.0.1-release.md) * [1.0.1 release](1.0.1-release.md)
* [1.0.0 release, officially open-sourced] * [1.0.0 release, officially open-sourced]

2
docs/zh_CN/book.json

@ -1,6 +1,6 @@
{ {
"title": "调度系统-EasyScheduler", "title": "调度系统-EasyScheduler",
"author": "YIGUAN", "author": "",
"description": "调度系统", "description": "调度系统",
"language": "zh-hans", "language": "zh-hans",
"gitbook": "3.2.3", "gitbook": "3.2.3",

BIN docs/zh_CN/images/cdh_hive_error.png
Binary file not shown. After: 104 KiB

BIN docs/zh_CN/images/complement.png
Binary file not shown. After: 57 KiB

BIN docs/zh_CN/images/create-queue.png
Binary file not shown. After: 83 KiB

BIN docs/zh_CN/images/dag1.png
Binary file not shown. After: 278 KiB

BIN docs/zh_CN/images/dag2.png
Binary file not shown. After: 243 KiB

BIN docs/zh_CN/images/dag3.png
Binary file not shown. After: 267 KiB

BIN docs/zh_CN/images/dag4.png
Binary file not shown. After: 24 KiB

BIN docs/zh_CN/images/depend-node.png
Binary file not shown. After: 242 KiB

BIN docs/zh_CN/images/depend-node2.png
Binary file not shown. After: 243 KiB

BIN docs/zh_CN/images/depend-node3.png
Binary file not shown. After: 215 KiB

BIN docs/zh_CN/images/file-manage.png
Binary file not shown. After: 116 KiB

BIN docs/zh_CN/images/gant-pic.png
Binary file not shown. After: 179 KiB

BIN docs/zh_CN/images/global_parameter.png
Binary file not shown. Before: 101 KiB, After: 113 KiB

BIN docs/zh_CN/images/hive_edit.png
Binary file not shown. Before: 29 KiB, After: 46 KiB

BIN docs/zh_CN/images/hive_edit2.png
Binary file not shown. Before: 33 KiB, After: 47 KiB

BIN docs/zh_CN/images/hive_kerberos.png
Binary file not shown. After: 36 KiB

BIN docs/zh_CN/images/instance-detail.png
Binary file not shown. After: 245 KiB

BIN docs/zh_CN/images/instance-list.png
Binary file not shown. After: 193 KiB

BIN docs/zh_CN/images/local_parameter.png
Binary file not shown. Before: 23 KiB, After: 25 KiB

BIN docs/zh_CN/images/master-jk.png
Binary file not shown. After: 114 KiB

BIN docs/zh_CN/images/master2.png
Binary file not shown. After: 100 KiB

BIN docs/zh_CN/images/master_worker_lack_res.png
Binary file not shown. After: 106 KiB

BIN docs/zh_CN/images/mysql-jk.png
Binary file not shown. After: 85 KiB

BIN docs/zh_CN/images/mysql.png
Binary file not shown. After: 82 KiB

BIN docs/zh_CN/images/mysql_edit.png
Binary file not shown. Before: 100 KiB, After: 47 KiB

BIN docs/zh_CN/images/postgressql_edit.png
Binary file not shown. After: 44 KiB

BIN docs/zh_CN/images/project.png
Binary file not shown. After: 136 KiB

BIN docs/zh_CN/images/run-work.png
Binary file not shown. After: 42 KiB

Some files were not shown because too many files have changed in this diff
