分布式调度框架。
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

417 lines
11 KiB

/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.dolphinscheduler.server.zk;
import org.apache.curator.framework.recipes.cache.TreeCacheEvent;
import org.apache.dolphinscheduler.common.Constants;
import org.apache.dolphinscheduler.common.enums.ExecutionStatus;
import org.apache.dolphinscheduler.common.enums.ZKNodeType;
import org.apache.dolphinscheduler.common.model.Server;
import org.apache.dolphinscheduler.common.zk.AbstractZKClient;
import org.apache.dolphinscheduler.dao.AlertDao;
import org.apache.dolphinscheduler.dao.DaoFactory;
import org.apache.dolphinscheduler.dao.ProcessDao;
import org.apache.dolphinscheduler.dao.entity.ProcessInstance;
import org.apache.dolphinscheduler.dao.entity.TaskInstance;
import org.apache.dolphinscheduler.server.utils.ProcessUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.utils.ThreadUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import java.util.Date;
import java.util.List;
import java.util.concurrent.ThreadFactory;
/**
* zookeeper master client
*
* single instance
*/
@Component
public class ZKMasterClient extends AbstractZKClient {
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* logger
*/
private static final Logger logger = LoggerFactory.getLogger(ZKMasterClient.class);
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* thread factory
*/
private static final ThreadFactory defaultThreadFactory = ThreadUtils.newGenericThreadFactory("Master-Main-Thread");
/**
* master znode
*/
private String masterZNode = null;
/**
* alert database access
*/
private AlertDao alertDao = null;
/**
* flow database access
*/
@Autowired
private ProcessDao processDao;
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* default constructor
*/
private ZKMasterClient(){}
/**
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
* init
*/
public void init(){
logger.info("initialize master client...");
// init dao
this.initDao();
InterProcessMutex mutex = null;
try {
// create distributed lock with the root node path of the lock space as /dolphinscheduler/lock/failover/master
String znodeLock = getMasterStartUpLockPath();
mutex = new InterProcessMutex(zkClient, znodeLock);
mutex.acquire();
// init system znode
this.initSystemZNode();
// register master
this.registerMaster();
// check if fault tolerance is required,failure and tolerance
if (getActiveMasterNum() == 1) {
5 years ago
failoverWorker(null, true);
failoverMaster(null);
}
}catch (Exception e){
logger.error("master start up exception",e);
}finally {
releaseMutex(mutex);
}
}
/**
* init dao
*/
public void initDao(){
this.alertDao = DaoFactory.getDaoInstance(AlertDao.class);
}
/**
* get alert dao
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
*
* @return AlertDao
*/
public AlertDao getAlertDao() {
return alertDao;
}
/**
* register master znode
*/
public void registerMaster(){
try {
String serverPath = registerServer(ZKNodeType.MASTER);
if(StringUtils.isEmpty(serverPath)){
System.exit(-1);
}
masterZNode = serverPath;
} catch (Exception e) {
logger.error("register master failure ",e);
System.exit(-1);
}
}
/**
* handle path events that this class cares about
* @param client zkClient
* @param event path event
* @param path zk path
*/
@Override
protected void dataChanged(CuratorFramework client, TreeCacheEvent event, String path) {
if(path.startsWith(getZNodeParentPath(ZKNodeType.MASTER)+Constants.SINGLE_SLASH)){ //monitor master
handleMasterEvent(event,path);
}else if(path.startsWith(getZNodeParentPath(ZKNodeType.WORKER)+Constants.SINGLE_SLASH)){ //monitor worker
handleWorkerEvent(event,path);
}
//other path event, ignore
}
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* remove zookeeper node path
*
* @param path zookeeper node path
* @param zkNodeType zookeeper node type
* @param failover is failover
*/
private void removeZKNodePath(String path, ZKNodeType zkNodeType, boolean failover) {
logger.info("{} node deleted : {}", zkNodeType.toString(), path);
InterProcessMutex mutex = null;
try {
String failoverPath = getFailoverLockPath(zkNodeType);
// create a distributed lock
mutex = new InterProcessMutex(getZkClient(), failoverPath);
mutex.acquire();
String serverHost = getHostByEventDataPath(path);
// handle dead server
handleDeadServer(path, zkNodeType, Constants.ADD_ZK_OP);
//alert server down.
alertServerDown(serverHost, zkNodeType);
//failover server
if(failover){
failoverServerWhenDown(serverHost, zkNodeType);
}
}catch (Exception e){
logger.error("{} server failover failed.", zkNodeType.toString());
logger.error("failover exception ",e);
}
finally {
releaseMutex(mutex);
}
}
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* failover server when server down
*
* @param serverHost server host
* @param zkNodeType zookeeper node type
* @throws Exception exception
*/
private void failoverServerWhenDown(String serverHost, ZKNodeType zkNodeType) throws Exception {
if(StringUtils.isEmpty(serverHost)){
return ;
}
switch (zkNodeType){
case MASTER:
failoverMaster(serverHost);
break;
case WORKER:
failoverWorker(serverHost, true);
default:
break;
}
}
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* get failover lock path
*
* @param zkNodeType zookeeper node type
* @return fail over lock path
*/
private String getFailoverLockPath(ZKNodeType zkNodeType){
switch (zkNodeType){
case MASTER:
return getMasterFailoverLockPath();
case WORKER:
return getWorkerFailoverLockPath();
default:
return "";
}
}
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
/**
* send alert when server down
*
* @param serverHost server host
* @param zkNodeType zookeeper node type
*/
private void alertServerDown(String serverHost, ZKNodeType zkNodeType) {
String serverType = zkNodeType.toString();
alertDao.sendServerStopedAlert(1, serverHost, serverType);
}
/**
* monitor master
*/
public void handleMasterEvent(TreeCacheEvent event, String path){
switch (event.getType()) {
case NODE_ADDED:
logger.info("master node added : {}", path);
break;
case NODE_REMOVED:
String serverHost = getHostByEventDataPath(path);
if (checkServerSelfDead(serverHost, ZKNodeType.MASTER)) {
return;
}
removeZKNodePath(path, ZKNodeType.MASTER, true);
break;
default:
break;
}
}
/**
* monitor worker
*/
public void handleWorkerEvent(TreeCacheEvent event, String path){
switch (event.getType()) {
case NODE_ADDED:
logger.info("worker node added : {}", path);
break;
case NODE_REMOVED:
logger.info("worker node deleted : {}", path);
removeZKNodePath(path, ZKNodeType.WORKER, true);
break;
default:
break;
}
}
/**
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
* get master znode
*
* @return master zookeeper node
*/
public String getMasterZNode() {
return masterZNode;
}
/**
* task needs failover if task start before worker starts
*
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
* @param taskInstance task instance
* @return true if task instance need fail over
*/
private boolean checkTaskInstanceNeedFailover(TaskInstance taskInstance) throws Exception {
boolean taskNeedFailover = true;
//now no host will execute this task instance,so no need to failover the task
if(taskInstance.getHost() == null){
return false;
}
// if the worker node exists in zookeeper, we must check the task starts after the worker
if(checkZKNodeExists(taskInstance.getHost(), ZKNodeType.WORKER)){
//if task start after worker starts, there is no need to failover the task.
if(checkTaskAfterWorkerStart(taskInstance)){
taskNeedFailover = false;
}
}
return taskNeedFailover;
}
/**
* check task start after the worker server starts.
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
*
* @param taskInstance task instance
* @return true if task instance start time after worker server start date
*/
private boolean checkTaskAfterWorkerStart(TaskInstance taskInstance) {
if(StringUtils.isEmpty(taskInstance.getHost())){
return false;
}
Date workerServerStartDate = null;
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
List<Server> workerServers = getServersList(ZKNodeType.WORKER);
for(Server workerServer : workerServers){
if(workerServer.getHost().equals(taskInstance.getHost())){
workerServerStartDate = workerServer.getCreateTime();
break;
}
}
if(workerServerStartDate != null){
return taskInstance.getStartTime().after(workerServerStartDate);
}else{
return false;
}
}
/**
* failover worker tasks
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
*
* 1. kill yarn job if there are yarn jobs in tasks.
* 2. change task state from running to need failover.
* 3. failover all tasks when workerHost is null
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
* @param workerHost worker host
*/
/**
* failover worker tasks
*
* 1. kill yarn job if there are yarn jobs in tasks.
* 2. change task state from running to need failover.
* 3. failover all tasks when workerHost is null
* @param workerHost worker host
* @param needCheckWorkerAlive need check worker alive
* @throws Exception exception
*/
private void failoverWorker(String workerHost, boolean needCheckWorkerAlive) throws Exception {
logger.info("start worker[{}] failover ...", workerHost);
List<TaskInstance> needFailoverTaskInstanceList = processDao.queryNeedFailoverTaskInstances(workerHost);
for(TaskInstance taskInstance : needFailoverTaskInstanceList){
if(needCheckWorkerAlive){
if(!checkTaskInstanceNeedFailover(taskInstance)){
continue;
}
}
ProcessInstance instance = processDao.findProcessInstanceDetailById(taskInstance.getProcessInstanceId());
if(instance!=null){
taskInstance.setProcessInstance(instance);
}
// only kill yarn job if exists , the local thread has exited
ProcessUtils.killYarnJob(taskInstance);
taskInstance.setState(ExecutionStatus.NEED_FAULT_TOLERANCE);
processDao.saveTaskInstance(taskInstance);
}
logger.info("end worker[{}] failover ...", workerHost);
}
/**
* failover master tasks
Add method and parameters comments (#1220) * rename from DatasourceUserMapper to DataSourceUserMapper * add unit test in UserMapper and WorkerGroupMapper * change cn.escheduler to org.apache.dolphinscheduler * add unit test in UdfFuncMapperTest * add unit test in UdfFuncMapperTest * remove DatabaseConfiguration * add ConnectionFactoryTest * cal duration in processInstancesList * change desc to description * change table name in mysql ddl * change table name in mysql ddl * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * change escheduler to dolphinscheduler * remove log4j-1.2-api and modify AlertMapperTest * remove log4j-1.2-api * Add alertDao to spring management * Add alertDao to spring management * get SqlSessionFactory from MybatisSqlSessionFactoryBean * get processDao by DaoFactory * read druid properties in ConneciontFactory * read druid properties in ConneciontFactory * change get alertDao by spring to DaoFactory * add log4j to resolve #967 * resole verify udf name error and delete udf error * Determine if principal is empty * Determine whether the logon user has the right to delete the project * Fixed an issue that produced attatch file named such as ATT00002.bin * fix too many connection in upgrade or create * fix NEED_FAULT_TOLERANCE and WAITTING_THREAD count fail * Added a judgment on whether the currently login user is an administrator * fix update udf database not change and create time is changed * add enterprise.wechat.enable to decide whether to send enterprise WeChat * change method check * Remove the administrator's judgment on query access token list * only admin can create worker group * delete alert group need delete the relation of user and alert group * add timeout in proxy when upload large resource * add gets scheduled times by expect fire times * add gets scheduled times by expect fire times * Increase the judgment of whether it is admin * Increase the judgment of whether it is admin * when delete access token add whether login user has perm to delete * change mysql-connector-java scope to test * update scm test * add profile test * Add method and parameters comments * roll back
5 years ago
*
* @param masterHost master host
*/
private void failoverMaster(String masterHost) {
logger.info("start master failover ...");
List<ProcessInstance> needFailoverProcessInstanceList = processDao.queryNeedFailoverProcessInstances(masterHost);
//updateProcessInstance host is null and insert into command
for(ProcessInstance processInstance : needFailoverProcessInstanceList){
processDao.processNeedFailoverProcessInstances(processInstance);
}
logger.info("master failover end");
}
}