Merge remote-tracking branch 'upstream/dev' into dev

5 years ago · db2ac7f74a
7 changed files with 65 additions and 66 deletions
--- a/README.md
+++ b/README.md
@@ -31,21 +31,21 @@ Its main objectives are as follows:
   | EasyScheduler | Azkaban | Airflow
 -- | -- | -- | --
 **Stability** |   |   |  
-Single point of failure | Decentralized   multi-master and multi-worker | Yes     Single Web and Scheduler Combination Node | Yes.     Single Scheduler
+Single point of failure | Decentralized   multi-master and multi-worker | Yes <br/> Single Web and Scheduler Combination Node | Yes <br/>    Single Scheduler
 Additional HA requirements | Not   required (HA is supported by itself) | DB | Celery   / Dask / Mesos + Load Balancer + DB
 Overload processing | Task   queue mechanism, the number of schedulable tasks on a single machine can be   flexibly configured, when too many tasks will be cached in the task queue,   will not cause machine jam. | Jammed   the server when there are too many tasks | Jammed   the server when there are too many tasks
 **Easy to use** |   |   |  
-DAG Monitoring Interface | Visualization   process defines key information such as task status, task type, retry times,   task running machine, visual variables and so on at a glance. | Only   task status can be seen | Can't   visually distinguish task types
+DAG Monitoring Interface | Visualization process defines key information such as task status, task type, retry times,   task running machine, visual variables and so on at a glance. | Only task status can be seen | Can't visually distinguish task types
-Visual process definition | Yes     All process definition operations are visualized, dragging tasks to draw   DAGs, configuring data sources and resources. At the same time, for   third-party systems, the api mode operation is provided. | No     DAG and custom upload via custom DSL | No     DAG is drawn through Python code, which is inconvenient to use, especially   for business people who can't write code.
+Visual process definition | Yes <br/> All process definition operations are visualized, dragging tasks to draw   DAGs, configuring data sources and resources. At the same time, for   third-party systems, the api mode operation is provided. | No <br/> DAG and custom upload via custom DSL | No <br/> DAG is drawn through Python code, which is inconvenient to use, especially   for business people who can't write code.
-Quick deployment | One-click   deployment | Complex   clustering deployment | Complex   clustering deployment
+Quick deployment | One-click   deployment | Complex  clustering deployment | Complex  clustering deployment
 **Features** |   |   |  
-Suspend and resume | Support   pause, recover operation | No     Can only kill the workflow first and then re-run | No     Can only kill the workflow first and then re-run
+Suspend and resume | Support   pause, recover operation | No <br/>  Can only kill the workflow first and then re-run | No <br/>  Can only kill the workflow first and then re-run
 Whether to support multiple tenants | Users   on easyscheduler can achieve many-to-one or one-to-one mapping relationship   through tenants and Hadoop users, which is very important for scheduling   large data jobs. "     Supports traditional shell tasks, while supporting large data platform task   scheduling: MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python,   Procedure, Sub_Process | No | No
 Task type | Supports   traditional shell tasks, and also support big data platform task scheduling:   MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Procedure,   Sub_Process | shell、gobblin、hadoopJava、java、hive、pig、spark、hdfsToTeradata、teradataToHdfs | BashOperator、DummyOperator、MySqlOperator、HiveOperator、EmailOperator、HTTPOperator、SqlOperator
 Compatibility | Support   the scheduling of big data jobs like spark, hive, Mr. At the same time, it is   more compatible with big data business because it supports multiple tenants. | Because   it does not support multi-tenant, it is not flexible enough to use business   in big data platform. | Because   it does not support multi-tenant, it is not flexible enough to use business   in big data platform.
 **Scalability** |   |   |  
 Whether to support custom task types | Yes | Yes | Yes
-Is Cluster Extension Supported? | Yes     The scheduler uses distributed scheduling, and the overall scheduling   capability will increase linearly with the scale of the cluster. Master and  Worker support dynamic online and offline. | Yes,   but complicated     Executor horizontal extend | Yes,   but complicated     Executor horizontal extend
+Is Cluster Extension Supported? | Yes <br/> The scheduler uses distributed scheduling, and the overall scheduling   capability will increase linearly with the scale of the cluster. Master and  Worker support dynamic online and offline. | Yes <br/>    but complicated     Executor horizontal extend | Yes  <br/>   but complicated     Executor horizontal extend
@@ -84,10 +84,13 @@ https://github.com/analysys/EasyScheduler/blob/master/CONTRIBUTING.md
 ### Thanks
 Easy Scheduler uses a lot of excellent open source projects, such as google guava, guice, grpc, netty, ali bonecp, quartz, and many open source projects of apache, etc.
-It is because of the shoulders of these open source projects that the birth of the Easy Scheduler is possible. We are very grateful for all the open source software used! We also hope that we will not only be the beneficiaries of open source, but also be open source contributors, so we decided to contribute to easy scheduling and promised long-term updates. I also hope that partners who have the same passion and conviction for open source will join in and contribute to open source!
+It is because of the shoulders of these open source projects that the birth of the Easy Scheduler is possible. We are very grateful for all the open source software used! We also hope that we will not only be the beneficiaries of open source, but also be open source contributors, so we decided to contribute to easy scheduling and promised long-term updates. We also hope that partners who have the same passion and conviction for open source will join in and contribute to open source!
-### Help
+### Get Help
 The fastest way to get response from our developers is to submit issues,  or add our wechat : 510570367
 ### License
 Please refer to [LICENSE](https://github.com/analysys/EasyScheduler/blob/dev/LICENSE) file.
--- a/docs/zh_CN/1.0.3-release.md
+++ b/docs/zh_CN/1.0.3-release.md
@@ -2,45 +2,22 @@ Easy Scheduler Release 1.0.3
 ===
 Easy Scheduler 1.0.3 is the fourth release in the 1.x series.
 New features:
 ===
 - [[EasyScheduler-254](https://github.com/analysys/EasyScheduler/issues/254)] Delete and batch-delete process definitions
 - [[EasyScheduler-347](https://github.com/analysys/EasyScheduler/issues/347)] Add a "today" option to task dependencies
 - [[EasyScheduler-273](https://github.com/analysys/EasyScheduler/issues/273)] Add a title to SQL tasks
 - [[EasyScheduler-247](https://github.com/analysys/EasyScheduler/issues/247)] Online API documentation
 - [[EasyScheduler-319](https://github.com/analysys/EasyScheduler/issues/319)] Single-machine fault tolerance
 - [[EasyScheduler-253](https://github.com/analysys/EasyScheduler/issues/253)] Add process definition statistics and running process instance statistics to projects
 - [[EasyScheduler-292](https://github.com/analysys/EasyScheduler/issues/292)] Send mail through SSL-enabled mailboxes
 - [[EasyScheduler-77](https://github.com/analysys/EasyScheduler/issues/77)] Add delete functions to schedule management and workflow definitions
 - [[EasyScheduler-380](https://github.com/analysys/EasyScheduler/issues/380)] Service monitoring
 - [[EasyScheduler-380](https://github.com/analysys/EasyScheduler/issues/382)] Add process definition statistics and running process instance statistics to projects
 Enhancements:
 ===
- [[EasyScheduler-192](https://github.com/analysys/EasyScheduler/issues/192)] Consider validating the tenant and its resources before deleting a tenant
+-  [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) Email titles in SQL tasks now support custom variables
- [[EasyScheduler-376](https://github.com/analysys/EasyScheduler/issues/294)] Deleting an instance did not delete the corresponding tasks in the ZooKeeper queue
+-  [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) If sending email fails in a SQL task, the SQL task is marked as failed
- [[EasyScheduler-185](https://github.com/analysys/EasyScheduler/issues/185)] Workflow definitions still exist after their project is deleted
+-  [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) Change the replacement rule for custom variables in SQL tasks to support replacing multiple single and double quotes
- [[EasyScheduler-206](https://github.com/analysys/EasyScheduler/issues/206)] Optimize deployment and improve Docker support
+-  [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) When creating a resource file, verify whether it already exists on HDFS
 - [[EasyScheduler-381](https://github.com/analysys/EasyScheduler/issues/381)] The front-end one-click deployment script supports Ubuntu
 Fixes:
 ===
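- [[EasyScheduler-255](https://github.com/analysys/EasyScheduler/issues/255)] Global variable overriding between parent and child processes: a child process inherits the parent's global variables and can override them
+-  [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) Sort the process definition list by schedule status and update time
- [[EasyScheduler-256](https://github.com/analysys/EasyScheduler/issues/256)] Parent and child process parameters are displayed incorrectly
+-  [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) Fix online file creation returning success although the HDFS file was not created
- [[EasyScheduler-186](https://github.com/analysys/EasyScheduler/issues/186)] Entering % in any query returns all data
+-  [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) Fix schedules that could not be taken offline when the job does not exist
- [[EasyScheduler-185](https://github.com/analysys/EasyScheduler/issues/185)] Workflow definitions still exist after their project is deleted
+-  [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) Killing a task now also kills its child processes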
- [[EasyScheduler-266](https://github.com/analysys/EasyScheduler/issues/266)]Stop process return: process definition 1 not on line 
+-  [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) Fix the update time and size not being refreshed when a resource file is updated
- [[EasyScheduler-300](https://github.com/analysys/EasyScheduler/issues/300)] Time unit of the timeout alarm
+-  [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) Fix tenant deletion failing when HDFS is not started
- [[EasyScheduler-235](https://github.com/analysys/EasyScheduler/issues/235)] Fix the nginx connection timeout problem
+-  [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/486) When the shell process exits, wait and re-check if the YARN state is not yet final
 - [[EasyScheduler-272](https://github.com/analysys/EasyScheduler/issues/272)] Administrators cannot generate tokens
 - [[EasyScheduler-272](https://github.com/analysys/EasyScheduler/issues/277)]save global parameters error
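 - [[EasyScheduler-183](https://github.com/analysys/EasyScheduler/issues/183)] Creating a Worker group with a Chinese name raises an error
 - [[EasyScheduler-377](https://github.com/analysys/EasyScheduler/issues/377)] Renaming a resource file while changing only its description reports a "name already exists" error
 - [[EasyScheduler-235](https://github.com/analysys/EasyScheduler/issues/235)] When creating a Spark data source, clicking "Test Connection" sends the user back to the login page
 - [[EasyScheduler-83](https://github.com/analysys/EasyScheduler/issues/83)] Version 1.0.1 reports an error when starting the API server
 - [[EasyScheduler-379](https://github.com/analysys/EasyScheduler/issues/379)] Time parameters are wrong when a scheduled task is resumed across days
 - [[EasyScheduler-383](https://github.com/analysys/EasyScheduler/issues/383)] SQL emails do not show leading blank lines
 Thanks: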
 ===
--- a/docs/zh_CN/1.0.4-release.md
+++ b/docs/zh_CN/1.0.4-release.md
@@ -2,30 +2,27 @@ Easy Scheduler Release 1.0.4
 ===
 Easy Scheduler 1.0.4 is the fifth release in the 1.x series.
-Enhancements:
+**Fixes**:
-===
+-  [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) Sort the process definition list by schedule status and update time
- [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) Email titles in SQL tasks now support custom variables
+-  [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) Fix online file creation returning success although the HDFS file was not created
- [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) If sending email fails in a SQL task, the SQL task is marked as failed
+-  [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) Fix schedules that could not be taken offline when the job does not exist
- [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) Change the replacement rule for custom variables in SQL tasks to support replacing multiple single and double quotes
+-  [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) Killing a task now also kills its child processes
- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) When creating a resource file, verify whether it already exists on HDFS
+-  [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) Fix the update time and size not being refreshed when a resource file is updated
- [[EasyScheduler-486]](https://github.com/analysys/EasyScheduler/issues/486) When the shell process exits, wait and re-check if the YARN state is not yet final
+-  [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) Fix tenant deletion failing when HDFS is not started
 -  [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/486) When the shell process exits, wait and re-check if the YARN state is not yet final
-Fixes
+**Enhancements**:
-===
+-  [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482) Email titles in SQL tasks now support custom variables
- [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) Sort the process definition list by schedule status and update time
+-  [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483) If sending email fails in a SQL task, the SQL task is marked as failed
- [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) Fix online file creation returning success although the HDFS file was not created
+-  [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484) Change the replacement rule for custom variables in SQL tasks to support replacing multiple single and double quotes
- [[EasyScheduler-481]](https://github.com/analysys/EasyScheduler/issues/481) Fix schedules that could not be taken offline when the job does not exist
+-  [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485) When creating a resource file, verify whether it already exists on HDFS
 - [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) Killing a task now also kills its child processes
 - [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) Fix the update time and size not being refreshed when a resource file is updated
 - [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) Fix tenant deletion failing when HDFS is not started
 Thanks:
 ===
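-Last but not least, this release would not exist without the contributions of the following partners:
+Last but not least, this release would not exist without the contributions of the following partners (in no particular order):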
 Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, feloxx, coding-now, hymzcn, nysyxxg, chgxtony, gj-zhang, xianhu, sunnyingit,
 zhengqiangtan
-and the many enthusiastic partners in the WeChat group! Thank you very much!
+Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, 
 feloxx, coding-now, hymzcn, nysyxxg, chgxtony, lfyee, Crossoverrr, gj-zhang, sunnyingit, xianhu, zhengqiangtan
 and the many enthusiastic partners in the WeChat and DingTalk groups! Thank you very much!
--- a/escheduler-dao/src/main/java/cn/escheduler/dao/ProcessDao.java
+++ b/escheduler-dao/src/main/java/cn/escheduler/dao/ProcessDao.java
@@ -931,6 +931,9 @@ public class ProcessDao extends AbstractBaseDao {
            cmdParam.put(CMDPARAM_COMPLEMENT_DATA_START_DATE, startTime);
            processMapStr = JSONUtils.toJson(cmdParam);
        }
+        updateSubProcessDefinitionByParent(parentProcessInstance, childDefineId);
        Command command = new Command();
        command.setWarningType(parentProcessInstance.getWarningType());
        command.setWarningGroupId(parentProcessInstance.getWarningGroupId());
@@ -945,6 +948,16 @@ public class ProcessDao extends AbstractBaseDao {
        logger.info("sub process command created: {} ", command.toString());
    }
+    private void updateSubProcessDefinitionByParent(ProcessInstance parentProcessInstance, int childDefinitionId) {
+        ProcessDefinition fatherDefinition = this.findProcessDefineById(parentProcessInstance.getProcessDefinitionId());
+        ProcessDefinition childDefinition = this.findProcessDefineById(childDefinitionId);
+        if(childDefinition != null && fatherDefinition != null){
+            childDefinition.setReceivers(fatherDefinition.getReceivers());
+            childDefinition.setReceiversCc(fatherDefinition.getReceiversCc());
+            processDefineMapper.update(childDefinition);
+        }
+    }
    /**
     * submit task to mysql
     * @param taskInstance
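
Review note on the `ProcessDao.java` hunks above: the new helper copies the parent definition's alert recipients onto the child definition before the sub-process command is created. The sketch below isolates that copy step so it can be read and tested on its own. `Definition` and `inheritReceivers` are hypothetical stand-ins introduced for illustration; they are not the project's `ProcessDefinition` or DAO API, and persistence is deliberately left out.

```java
public class ReceiverInheritanceSketch {

    /** Hypothetical stand-in for ProcessDefinition, reduced to the two fields the diff touches. */
    static final class Definition {
        String receivers;
        String receiversCc;

        Definition(String receivers, String receiversCc) {
            this.receivers = receivers;
            this.receiversCc = receiversCc;
        }
    }

    /**
     * Copies the alert recipients from the parent to the child.
     * Returns true if the child was changed, false if either side is missing
     * (mirroring the null check in the diff).
     */
    static boolean inheritReceivers(Definition parent, Definition child) {
        if (parent == null || child == null) {
            return false;
        }
        child.receivers = parent.receivers;
        child.receiversCc = parent.receiversCc;
        return true;
    }

    public static void main(String[] args) {
        Definition parent = new Definition("ops@example.com", "lead@example.com");
        Definition child = new Definition(null, null);
        System.out.println("updated: " + inheritReceivers(parent, child));
        System.out.println("child receivers: " + child.receivers);
    }
}
```

One design point worth checking in review: the diff writes the inherited values back through `processDefineMapper.update(childDefinition)`, so the stored child definition itself changes. If several parents reference the same child definition, the recipients of whichever parent ran last would remain on it.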
--- a/escheduler-dao/src/main/java/cn/escheduler/dao/mapper/ProcessDefinitionMapperProvider.java
+++ b/escheduler-dao/src/main/java/cn/escheduler/dao/mapper/ProcessDefinitionMapperProvider.java
@@ -55,6 +55,8 @@ public class ProcessDefinitionMapperProvider {
                VALUES("`connects`", "#{processDefinition.connects}");
                VALUES("`create_time`", "#{processDefinition.createTime}");
                VALUES("`update_time`", "#{processDefinition.updateTime}");
+                VALUES("`receivers` ","#{processDefinition.receivers}");
+                VALUES("`receivers_cc`", "#{processDefinition.receiversCc}");
                VALUES("`timeout`", "#{processDefinition.timeout}");
                VALUES("`tenant_id`", "#{processDefinition.tenantId}");
                VALUES("`flag`", EnumFieldUtil.genFieldStr("processDefinition.flag", ReleaseState.class));
@@ -102,6 +104,8 @@ public class ProcessDefinitionMapperProvider {
                SET("`global_params`=#{processDefinition.globalParams}");
                SET("`create_time`=#{processDefinition.createTime}");
                SET("`update_time`=#{processDefinition.updateTime}");
+                SET("`receivers`=#{processDefinition.receivers}");
+                SET("`receivers_cc`=#{processDefinition.receiversCc}");
                SET("`timeout`=#{processDefinition.timeout}");
                SET("`tenant_id`=#{processDefinition.tenantId}");
                SET("`flag`="+EnumFieldUtil.genFieldStr("processDefinition.flag", Flag.class));
--- a/escheduler-dao/src/main/resources/dao/data_source.properties
+++ b/escheduler-dao/src/main/resources/dao/data_source.properties
@@ -1,9 +1,9 @@
 # base spring data source configuration
 spring.datasource.type=com.alibaba.druid.pool.DruidDataSource
 spring.datasource.driver-class-name=com.mysql.jdbc.Driver
-spring.datasource.url=jdbc:mysql://192.168.220.188:3306/escheduler_new?characterEncoding=UTF-8
+spring.datasource.url=jdbc:mysql://192.168.xx.xx:3306/escheduler?characterEncoding=UTF-8
-spring.datasource.username=root
+spring.datasource.username=xx
-spring.datasource.password=root@123
+spring.datasource.password=xx
 # connection configuration
 spring.datasource.initialSize=5
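
Review note on the `data_source.properties` hunk above: the change replaces a real host, user and password with `xx` placeholders, so each deployment has to supply its own values. One possible way to do that without editing the committed file is to overlay environment variables on the loaded properties, sketched below. The variable names `ES_DB_URL`, `ES_DB_USERNAME` and `ES_DB_PASSWORD` and the class itself are assumptions made for illustration; nothing in the diff shows the project reading them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

public class DataSourceConfigSketch {

    /**
     * Parses properties text and lets selected keys be overridden from an
     * environment-style map. The env names are hypothetical.
     */
    static Properties resolve(String propertiesText, Map<String, String> env) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        override(props, env, "ES_DB_URL", "spring.datasource.url");
        override(props, env, "ES_DB_USERNAME", "spring.datasource.username");
        override(props, env, "ES_DB_PASSWORD", "spring.datasource.password");
        return props;
    }

    /** Replaces one property when the matching variable is set and non-empty. */
    private static void override(Properties props, Map<String, String> env,
                                 String envName, String key) {
        String value = env.get(envName);
        if (value != null && !value.isEmpty()) {
            props.setProperty(key, value);
        }
    }

    public static void main(String[] args) {
        String committed =
                "spring.datasource.url=jdbc:mysql://192.168.xx.xx:3306/escheduler?characterEncoding=UTF-8\n"
              + "spring.datasource.username=xx\n"
              + "spring.datasource.password=xx\n";
        Properties props = resolve(committed, System.getenv());
        // Print only non-secret values.
        System.out.println(props.getProperty("spring.datasource.url"));
        System.out.println(props.getProperty("spring.datasource.username"));
    }
}
```

Keys without a matching variable keep the value from the file, so the placeholders stay visible and an unconfigured deployment is likely to fail at connection time rather than silently use someone else's database.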
--- a/escheduler-server/src/main/java/cn/escheduler/server/worker/task/AbstractCommandExecutor.java
+++ b/escheduler-server/src/main/java/cn/escheduler/server/worker/task/AbstractCommandExecutor.java
@@ -162,7 +162,12 @@ public abstract class AbstractCommandExecutor {
                exitStatusCode = updateState(processDao, exitStatusCode, pid, taskInstId);
            } else {
-                cancelApplication();
+                TaskInstance taskInstance = processDao.findTaskInstanceById(taskInstId);
+                if (taskInstance == null) {
+                    logger.error("task instance id:{} not exist", taskInstId);
+                } else {
+                    ProcessUtils.kill(taskInstance);
+                }
                exitStatusCode = -1;
                logger.warn("process timeout, work dir:{}, pid:{}", taskDir, pid);
            }
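
Review note on the `AbstractCommandExecutor.java` hunk above: on timeout the worker now looks up the task instance and calls `ProcessUtils.kill(taskInstance)` instead of `cancelApplication()`, then reports exit status `-1`. The body of `ProcessUtils.kill` is not part of this diff, so the sketch below does not reproduce it. It shows the same timeout-then-kill pattern with the JDK's own `ProcessHandle` API (Java 9 or later), which is a substitute technique chosen for illustration; `runWithTimeout` is a hypothetical helper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

public class TimeoutKillSketch {

    /**
     * Starts the command and waits up to the given timeout.
     * Returns the real exit code, or -1 if the process had to be killed,
     * matching the exitStatusCode = -1 convention in the diff.
     */
    static int runWithTimeout(ProcessBuilder builder, long timeout, TimeUnit unit) {
        Process process;
        try {
            // Discard output so a chatty child cannot block on a full pipe.
            process = builder
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            if (process.waitFor(timeout, unit)) {
                return process.exitValue();
            }
            // Timed out: kill the child processes first, then the process itself.
            process.toHandle().descendants().forEach(ProcessHandle::destroyForcibly);
            process.destroyForcibly();
            process.waitFor();
            return -1;
        } catch (InterruptedException e) {
            process.destroyForcibly();
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        // Assumes a POSIX shell is available on the PATH.
        int code = runWithTimeout(new ProcessBuilder("sh", "-c", "sleep 30"), 1, TimeUnit.SECONDS);
        System.out.println("exit status: " + code);
    }
}
```

Killing descendants before the parent matters for shell tasks, because the shell often forks the real workload; destroying only the shell can leave that workload running, which is the situation the EasyScheduler-425 release note describes.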