diff --git a/docs/docs/en/development/api-standard.md b/docs/docs/en/development/api-standard.md
new file mode 100644
index 0000000000..7a6421cd73
--- /dev/null
+++ b/docs/docs/en/development/api-standard.md
@@ -0,0 +1,100 @@
+# API design standard
+
+A standardized and unified API is the cornerstone of project design. The DolphinScheduler API follows the RESTful standard. RESTful is currently the most popular Internet software architecture: it has a clear structure, conforms to standards, and is easy to understand and extend.
+
+This article uses the DolphinScheduler API as an example to explain how to construct a RESTful API.
+
+## 1. URI design
+
+REST stands for "Representational State Transfer". RESTful URI design is based on resources. A resource corresponds to an entity on the network, for example: a piece of text, a picture, or a service. Each resource corresponds to a URI.
+
++ One kind of resource: expressed in the plural, such as `task-instances`, `groups`;
++ A single resource: expressed in the singular, or with an ID representing the corresponding resource, such as `group`, `groups/{groupId}`;
++ Sub resources: resources under a certain resource, such as `/instances/{instanceId}/tasks`;
++ A single sub resource: `/instances/{instanceId}/tasks/{taskId}`;
+
+## 2. Method design
+
+We need to locate a resource by URI, and then use an HTTP method, or an action declared in the path suffix, to reflect the operation on the resource.
+
+### ① Query - GET
+
+Use the URI to locate the resource, and use GET to indicate a query.
+
++ When the URI is a type of resource, it means querying resources of that type. For example, the following example indicates a paged query of `alert-groups`.
+```
+Method: GET
+/api/dolphinscheduler/alert-groups
+```
+
++ When the URI is a single resource, it means querying this resource. For example, the following example means querying the specified `alert-group`.
+```
+Method: GET
+/api/dolphinscheduler/alert-groups/{id}
+```
+
++ In addition, we can also express querying sub-resources based on the URI, as follows:
+```
+Method: GET
+/api/dolphinscheduler/projects/{projectId}/tasks
+```
+
+**The above examples all represent paged queries. If we need to query all data, we need to append `/list` to the URI to distinguish the two. Do not mix the same API for both paged and unpaged queries.**
+```
+Method: GET
+/api/dolphinscheduler/alert-groups/list
+```
+
+### ② Create - POST
+
+Use the URI to locate the resource, use POST to indicate creation, and then return the created id to the requester.
+
++ create an `alert-group`:
+
+```
+Method: POST
+/api/dolphinscheduler/alert-groups
+```
+
++ creating a sub-resource works the same way:
+```
+Method: POST
+/api/dolphinscheduler/alert-groups/{alertGroupId}/tasks
+```
+
+### ③ Modify - PUT
+
+Use the URI to locate the resource, use PUT to indicate modification.
++ modify an `alert-group`
+```
+Method: PUT
+/api/dolphinscheduler/alert-groups/{alertGroupId}
+```
+
+### ④ Delete - DELETE
+
+Use the URI to locate the resource, use DELETE to indicate deletion.
+
++ delete an `alert-group`
+```
+Method: DELETE
+/api/dolphinscheduler/alert-groups/{alertGroupId}
+```
+
++ batch deletion: to batch delete an array of ids, we should use POST.
+**(Do not use the DELETE method, because the body of a DELETE request has no semantic meaning, and some gateways, proxies, and firewalls may directly strip off the request body after receiving a DELETE request.)**
+```
+Method: POST
+/api/dolphinscheduler/alert-groups/batch-delete
+```
+
+### ⑤ Others
+
+In addition to creating, deleting, modifying, and querying, we can also locate the corresponding resource through the URI and then append an operation to the path, such as:
+```
+/api/dolphinscheduler/alert-groups/verify-name
+/api/dolphinscheduler/projects/{projectCode}/process-instances/{code}/view-gantt
+```
+
+## 3. Parameter design
+
+There are two types of parameters: request parameters and path parameters. Parameters must use lower camel case.
+
+In the case of paging, if the page number entered by the user is less than 1, the front end needs to turn it into 1 automatically, indicating that the first page is requested; when the backend finds that the page number entered by the user is greater than the total number of pages, it should directly return the last page.
+
+## 4. Others design
+
+### base URL
+
+The URIs of the project need to use `/api/` as the base path, to identify that these APIs belong to this project.
+```
+/api/dolphinscheduler
+```
\ No newline at end of file
diff --git a/docs/docs/en/development/architecture-design.md b/docs/docs/en/development/architecture-design.md
new file mode 100644
index 0000000000..09f932b90e
--- /dev/null
+++ b/docs/docs/en/development/architecture-design.md
@@ -0,0 +1,315 @@
+## Architecture Design
+
+Before explaining the architecture of the scheduling system, let us first understand its common terms.
+
+### 1. Noun Interpretation
+
+**DAG:** Full name Directed Acyclic Graph, referred to as DAG. Tasks in a workflow are assembled in the form of a directed acyclic graph, which is topologically traversed from the nodes with zero in-degree until there are no successor nodes. For example, the following picture (a small traversal sketch follows the figure):

+*(figure: DAG example)*
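To make the traversal rule above concrete, here is a minimal sketch of a Kahn-style topological traversal that starts from zero in-degree nodes (a generic illustration, not DolphinScheduler's actual DAG code):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopoTraversal {
    public static List<String> traverse(Map<String, List<String>> dag) {
        // Count the in-degree of every node
        Map<String, Integer> inDegree = new HashMap<>();
        dag.keySet().forEach(n -> inDegree.putIfAbsent(n, 0));
        dag.values().forEach(succs -> succs.forEach(s -> inDegree.merge(s, 1, Integer::sum)));

        // Start from the nodes whose in-degree is zero
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((n, d) -> { if (d == 0) ready.add(n); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            // "Executing" a node unblocks its successors
            for (String succ : dag.getOrDefault(node, List.of())) {
                if (inDegree.merge(succ, -1, Integer::sum) == 0) {
                    ready.add(succ);
                }
            }
        }
        return order; // the nodes in a valid execution order
    }

    public static void main(String[] args) {
        Map<String, List<String>> dag = Map.of(
                "A", List.of("B", "C"),
                "B", List.of("D"),
                "C", List.of("D"),
                "D", List.of());
        System.out.println(traverse(dag)); // e.g. [A, B, C, D]
    }
}
```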

+
+**Process definition**: Visualize a **DAG** by dragging task nodes and establishing their associations
+
+**Process instance**: A process instance is an instantiation of a process definition, generated by manual start or by scheduling. Each time a process definition runs, a new process instance is generated
+
+**Task instance**: A task instance is the instantiation of a specific task node when a process instance runs; it indicates the specific task execution status
+
+**Task type**: Currently supports SHELL, SQL, SUB_PROCESS (sub-process), PROCEDURE, MR, SPARK, PYTHON, DEPENDENT (dependency), with plans to support dynamic plug-in extension. Note: a **SUB_PROCESS** is also a separate process definition that can be launched on its own
+
+**Schedule mode**: The system supports timed schedules based on cron expressions and manual schedules. Supported command types: start workflow, start execution from current node, resume fault-tolerant workflow, resume paused process, start execution from failed node, complement, timer, rerun, pause, stop, resume waiting thread. Of these, **resume fault-tolerant workflow** and **resume waiting thread** are used internally by the scheduler and cannot be called externally
+
+**Timed schedule**: The system uses the **quartz** distributed scheduler and supports visual generation of cron expressions
+
+**Dependency**: The system supports not only simple **DAG** dependencies between predecessor and successor nodes, but also provides **task dependency** nodes, supporting **custom task dependencies between processes**
+
+**Priority**: Supports priorities for process instances and task instances. If no priority is set, the default is first in, first out
+
+**Mail alert**: Supports emailing **SQL task** query results, process instance run result alerts, and fault-tolerance alert notifications
+
+**Failure policy**: For tasks running in parallel, if one task fails, two failure policies are provided. **Continue** means the remaining parallel tasks keep running until the process ends. **End** means that once a failed task is found, the running parallel tasks are killed and the process ends
+
+**Complement**: Complements (backfills) historical data, supporting two modes: **interval parallel and serial**
+
+
+
+### 2. System architecture
+
+#### 2.1 System Architecture Diagram

+*(figure: System Architecture Diagram)*

+
+
+
+#### 2.2 Architectural description
+
+* **MasterServer**
+
+  MasterServer adopts a distributed, decentralized design. MasterServer is mainly responsible for DAG task splitting, task submission monitoring, and monitoring the health status of other MasterServers and WorkerServers.
+  When the MasterServer service starts, it registers a temporary node with ZooKeeper and listens for ZooKeeper temporary node state changes for fault tolerance processing.
+
+  ##### The service mainly contains:
+
+  - **Distributed Quartz**, the distributed scheduling component, mainly responsible for starting and stopping scheduled tasks. When quartz picks up a task, the Master internally has a thread pool responsible for the task's subsequent operations
+
+  - **MasterSchedulerThread** is a scan thread that periodically scans the **command** table in the database and performs different business operations based on the different **command types**
+
+  - **MasterExecThread** is mainly responsible for DAG task segmentation, task submission monitoring, and the logical processing of the various command types
+
+  - **MasterTaskExecThread** is mainly responsible for task persistence
+
+* **WorkerServer**
+
+  - WorkerServer also adopts a distributed, decentralized design. WorkerServer is mainly responsible for task execution and providing log services. When the WorkerServer service starts, it registers a temporary node with ZooKeeper and maintains a heartbeat.
+
+  ##### This service contains:
+
+  - **FetchTaskThread** is mainly responsible for continuously receiving tasks from the **Task Queue** and, according to the task type, calling **TaskScheduleThread** to run the corresponding executor.
+
+  - **ZooKeeper**
+
+    The MasterServer and WorkerServer nodes in the system all use ZooKeeper for cluster management and fault tolerance. In addition, the system performs event monitoring and distributed locking based on ZooKeeper.
+    We also implemented queues based on Redis, but we want DolphinScheduler to depend on as few components as possible, so the Redis implementation was finally removed.
+
+  - **Task Queue**
+
+    Provides the task queue operations. Currently the queue is also implemented on ZooKeeper. Since the information stored in the queue is small, there is no need to worry about too much data accumulating in it. In fact, we have stress-tested a million-entry queue, which had no effect on system stability and performance.
+
+  - **Alert**
+
+    Provides alarm-related interfaces, mainly covering the storage, query, and notification of the two types of alarm data. The notification function has two types: **mail notification** and **SNMP (not yet implemented)**.
+
+  - **API**
+
+    The API interface layer is mainly responsible for processing requests from the front-end UI layer. The service provides a RESTful API to serve external requests.
+    Interfaces include workflow creation, definition, query, modification, release, offline, manual start, stop, pause, resume, start execution from this node, and more.
+
+  - **UI**
+
+    The front-end pages of the system, providing the system's various visual operation interfaces. For details, see the [quick start](https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/guide/quick-start.html) section.
+
+#### 2.3 Architectural Design Ideas
+
+##### I. Decentralized vs. centralized
+
+###### Centralized thought
+
+The centralized design concept is relatively simple. The nodes in a distributed cluster are divided into two roles:

+*(figure: master-slave roles)*

+
+- The Master is mainly responsible for task distribution and for supervising the health of the Slaves, and it can dynamically balance tasks across Slaves so that no Slave node is overwhelmed or idle.
+- The Worker is mainly responsible for executing tasks and maintaining a heartbeat with the Master, so that the Master can assign tasks to it.
+
+Problems with centralized design:
+
+- Once the Master has a problem, the cluster is leaderless and collapses as a whole. To solve this problem, most Master/Slave architectures adopt an active/standby Master design, which can be hot or cold standby, with automatic or manual switchover; more and more new systems can automatically elect and switch the Master to improve availability.
+- Another problem is that if the Scheduler runs on the Master, although it can support different tasks of one DAG running on different machines, it overloads the Master. If the Scheduler runs on the Slaves, all tasks of one DAG can only be submitted on a single machine; when there are many parallel tasks, the pressure on that Slave may be high.
+
+###### Decentralization

+*(figure: decentralization)*

+
+- In decentralized design there is usually no Master/Slave concept: all roles are the same and their status is equal. The global Internet is a typical decentralized distributed system; any networked node going down only affects a small range of features.
+- The core of decentralized design is that there is no "manager" distinct from the other nodes in the whole distributed system, so there is no single point of failure. However, since there is no "manager" node, each node needs to communicate with other nodes to obtain the necessary machine information, and the unreliability of distributed-system communication greatly increases the difficulty of implementing the functions above.
+- In fact, truly decentralized distributed systems are rare. Instead, dynamically centralized distributed systems keep emerging. Under this architecture, the manager in the cluster is elected dynamically rather than preset, and when the cluster fails, the nodes spontaneously hold a "meeting" to elect a new "manager" to preside over the work. The most typical cases are ZooKeeper and Etcd (implemented in Go).
+
+- The decentralization of DolphinScheduler means that Masters and Workers register with ZooKeeper: the Master cluster and the Worker cluster have no center, and a ZooKeeper distributed lock is used to elect one Master or Worker as the "manager" to perform the task.
+
+##### II. Distributed lock practice
+
+DolphinScheduler uses ZooKeeper distributed locks to ensure that only one Master executes the Scheduler at a time, and that only one Worker performs task submission at a time.
+
+1. The core process for acquiring a distributed lock is as follows (a minimal code sketch follows the flowcharts below):

+*(figure: obtain distributed lock process)*

+
+2. Scheduler thread distributed lock implementation flowchart in DolphinScheduler:

+*(figure: scheduler thread distributed lock flow)*
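The flowcharts above describe lock acquisition abstractly. Here is a minimal sketch of the same idea using Apache Curator's `InterProcessMutex` on ZooKeeper — the connection string and lock path are illustrative, not DolphinScheduler's actual values:

```java
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SchedulerLockDemo {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // All Masters contend for the same ZNode path; only one holds the lock at a time
        InterProcessMutex lock = new InterProcessMutex(client, "/dolphinscheduler/lock/masters");
        if (lock.acquire(5, TimeUnit.SECONDS)) {
            try {
                System.out.println("Lock held: safe to run one scheduler iteration");
            } finally {
                lock.release(); // always release so other Masters can proceed
            }
        }
        client.close();
    }
}
```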

+
+##### III. Insufficient threads: the circular-wait problem
+
+- If a DAG has no sub-processes, then when the number of entries in the Command table exceeds the threshold set for the thread pool, the process directly waits or fails.
+- If many sub-processes are nested in a large DAG, the situation in the following figure produces a "deadlocked" state:

+*(figure: insufficient threads cause circular waiting)*

+
+In the figure above, MainFlowThread waits for SubFlowThread1 to end, SubFlowThread1 waits for SubFlowThread2 to end, SubFlowThread2 waits for SubFlowThread3 to end, and SubFlowThread3 waits for a new thread from the thread pool. The whole DAG process can therefore never end, the threads cannot be released, and the child and parent processes end up waiting on each other in a loop. At this point the scheduling cluster is no longer available unless a new Master is started to add threads and break the deadlock.
+
+Starting a new Master to break the deadlock is unsatisfactory, so we proposed the following three options to reduce this risk:
+
+1. Calculate the sum of the threads of all Masters, and then calculate the number of threads required by each DAG, i.e. pre-compute before the DAG process is executed. Because it is a multi-Master thread pool, the total number of threads is unlikely to be obtained in real time.
+2. Check the single-Master thread pool: if the pool is full, let the thread fail directly.
+3. Add a Command type for insufficient resources: if the thread pool is insufficient, suspend the main process. When the thread pool later has a new thread available, the suspended process can be woken up again.
+
+Note: the Master Scheduler thread fetches Commands in FIFO order.
+
+So we chose the third way to solve the problem of insufficient threads.
+
+##### IV. Fault-Tolerant Design
+
+Fault tolerance is divided into service fault tolerance and task retry. Service fault tolerance is further divided into two types: Master fault tolerance and Worker fault tolerance.
+
+###### 1. Downtime fault tolerance
+
+Service fault tolerance relies on ZooKeeper's Watcher mechanism. The implementation principle is as follows:

+*(figure: DolphinScheduler fault-tolerant design)*

+
+The Master watches the ZooKeeper directories of other Masters and Workers. If a remove event is detected, the corresponding process instances or task instances are fault-tolerated according to the specific business logic.
+
+
+
+- Master fault tolerance flowchart:

+*(figure: Master fault tolerance flowchart)*

+
+After a Master completes fault tolerance via ZooKeeper, the process is rescheduled by the Scheduler thread in DolphinScheduler: it traverses the DAG to find the "running" and "submitted successfully" tasks. For "running" tasks it monitors the status of their task instances; for "submitted successfully" tasks it needs to determine whether the task already exists in the Task Queue — if it exists, it likewise monitors the status of the task instance; if not, it resubmits the task instance.
+
+
+
+- Worker fault tolerance flowchart:

+*(figure: Worker fault tolerance flowchart)*

+
+Once the Master Scheduler thread finds a task instance marked as "needs fault tolerance", it takes over the task and resubmits it.
+
+ Note: because "network jitter" may cause a node to lose its ZooKeeper heartbeat for a short time, triggering a remove event for the node, we take the simplest approach: once a node's connection to ZooKeeper times out, the Master or Worker service on it stops directly.
+
+###### 2. Task failure retry
+
+Here we must first distinguish between the concepts of task failure retry, process failure recovery, and process failure rerun:
+
+- Task failure retry is at the task level and is performed automatically by the scheduling system. For example, if a shell task is configured with 3 retries, the shell task will be run at most 3 more times after it fails.
+- Process failure recovery is at the process level and is done manually; recovery can only be performed **from the failed node** or **from the current node**.
+- Process failure rerun is also at the process level and is done manually; a rerun starts from the start node.
+
+Next, back to the topic. We divide the task nodes in a workflow into two types:
+
+- Business nodes, which correspond to an actual script or processing statement, such as a Shell node, an MR node, a Spark node, or a dependent node.
+- Logical nodes, which do not process an actual script or statement but handle the logic of the overall process flow, such as sub-process nodes.
+
+Each **business node** can be configured with a number of failure retries. When the task node fails, it will automatically retry until it succeeds or exceeds the configured number of retries. A **logical node** does not support failure retry, but the tasks inside a logical node do.
+
+If a task in the workflow fails and reaches its maximum number of retries, the workflow fails and stops; the failed workflow can then be rerun manually or recovered manually.
+
+
+
+##### V. Task priority design
+
+In the early scheduling design, without a priority design and with fair scheduling, a task submitted first might finish at the same time as a task submitted later, and the priority of a process or task could not be set. We have redesigned this, and the current design is as follows:
+
+- Tasks are processed from high priority to low according to **the priority of different process instances**, then **the priority within the same process instance**, then **the priority of tasks within the same process**, and finally **the submission order within the same process**.
+
+  - The specific implementation is to parse the priority from the task instance's JSON and then save the **process instance priority_process instance id_task priority_task id** information to the ZooKeeper task queue. When consuming from the task queue, a string comparison is enough to find the task that should be executed first (a sketch of this key format follows the task-priority figure below).
+
+  - The priority of a process definition takes into account that some processes need to be handled before others. It can be configured when the process is started or scheduled to start. There are 5 levels: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST. As shown below

+*(figure: process priority configuration)*

+
+  - The priority of a task is also divided into 5 levels: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST. As shown below

+*(figure: task priority configuration)*
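A minimal sketch of the priority-key idea referenced above: build a sortable string key per task in the `processInstancePriority_processInstanceId_taskPriority_taskId` layout and compare lexicographically (the helper names and zero-padding are illustrative, not the actual implementation):

```java
import java.util.PriorityQueue;

public class TaskPriorityKeyDemo {
    // Illustrative: map priority names to sortable digits, HIGHEST first
    static int weight(String priority) {
        switch (priority) {
            case "HIGHEST": return 0;
            case "HIGH":    return 1;
            case "MEDIUM":  return 2;
            case "LOW":     return 3;
            default:        return 4; // LOWEST
        }
    }

    // processInstancePriority_processInstanceId_taskPriority_taskId
    // Ids are zero-padded so that plain string order matches numeric order.
    static String taskKey(String procPriority, long procId, String taskPriority, long taskId) {
        return String.format("%d_%010d_%d_%010d",
                weight(procPriority), procId, weight(taskPriority), taskId);
    }

    public static void main(String[] args) {
        PriorityQueue<String> queue = new PriorityQueue<>();
        queue.add(taskKey("MEDIUM", 10, "LOW", 101));
        queue.add(taskKey("HIGHEST", 11, "HIGH", 201));
        // The task of the HIGHEST-priority process instance is consumed first
        System.out.println(queue.poll());
    }
}
```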

+
+##### VI. Logback and gRPC to implement log access
+
+- Since the Web (UI) and the Worker are not necessarily on the same machine, viewing the logs is not the same as querying local files. There are two options:
+  - Put the logs on an ES search engine
+  - Obtain remote log information through gRPC communication
+- To keep DolphinScheduler as lightweight as possible, gRPC was chosen to implement remote access to log information.

+*(figure: gRPC remote access)*

+
+- We use a custom Logback FileAppender and Filter to generate a separate log file for each task instance.
+- The main implementation of the FileAppender is as follows:
+
+```java
+/**
+ * task log appender
+ */
+public class TaskLogAppender extends FileAppender<ILoggingEvent> {
+
+    ...
+
+    @Override
+    protected void append(ILoggingEvent event) {
+
+        if (currentlyActiveFile == null) {
+            currentlyActiveFile = getFile();
+        }
+        String activeFile = currentlyActiveFile;
+        // thread name: taskThreadName-processDefineId_processInstanceId_taskInstanceId
+        String threadName = event.getThreadName();
+        String[] threadNameArr = threadName.split("-");
+        // logId = processDefineId_processInstanceId_taskInstanceId
+        String logId = threadNameArr[1];
+        ...
+        super.subAppend(event);
+    }
+}
+```
+
+Logs are generated in the form /process definition id/process instance id/task instance id.log
+
+- The Filter matches thread names starting with TaskLogInfo:
+- TaskLogFilter is implemented as follows:
+
+```java
+/**
+ * task log filter
+ */
+public class TaskLogFilter extends Filter<ILoggingEvent> {
+
+    @Override
+    public FilterReply decide(ILoggingEvent event) {
+        if (event.getThreadName().startsWith("TaskLogInfo-")) {
+            return FilterReply.ACCEPT;
+        }
+        return FilterReply.DENY;
+    }
+}
+```
+
+
+
+### Summary
+
+Starting from the scheduler itself, this article introduced the architecture principles and implementation ideas of the big-data distributed workflow scheduling system DolphinScheduler. To be continued
diff --git a/docs/docs/en/development/backend/mechanism/global-parameter.md b/docs/docs/en/development/backend/mechanism/global-parameter.md
new file mode 100644
index 0000000000..53b73747d8
--- /dev/null
+++ b/docs/docs/en/development/backend/mechanism/global-parameter.md
@@ -0,0 +1,61 @@
+# Global Parameter development document
+
+After the user defines a parameter with direction OUT, it is saved in the localParam of the task.
+
+## Usage of parameters
+
+Get the direct predecessor nodes `preTasks` of the current `taskInstance` to be created from the DAG, get the `varPool` of each of the `preTasks`, and merge these varPools (`List<Property>`) into one `varPool`. During the merge, if parameters with the same name are found, they are handled according to the following logic:
+
+* If all the values are null, the merged value is null
+* If one and only one value is non-null, the merged value is that non-null value
+* If none of the values is null, the value from the varPool of the task instance with the earliest end time is taken
+
+The direction of all merged properties is updated to IN during the merge process.
+
+The result of the merge is saved in taskInstance.varPool.
+
+The worker receives the varPool and parses it into the format `Map<String, Property>`, where the key of the map is property.prop, i.e. the parameter name.
+
+When the processor processes the parameters, it merges the varPool, localParam, and globalParam parameters. If parameters have duplicate names during the merge, they are replaced according to the following priorities, with the higher priority retained and the lower priority replaced:
+
+* globalParam: high
+* varPool: middle
+* localParam: low
+
+Before the node content is executed, the `${parameter name}` placeholders are replaced with the corresponding values using regular expressions.
+
+## Parameter setting
+
+Currently, only SQL and SHELL nodes support getting parameters (a small sketch of the `${setValue(...)}` parsing used by SHELL nodes follows).
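The SHELL node section below describes `${setValue(key=value)}` output parameters. A minimal, hypothetical sketch of extracting such key/value pairs from captured output (the regex and class name are illustrative, not the actual DolphinScheduler code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SetValueParser {
    // Matches ${setValue(key=value)} occurrences in task output
    private static final Pattern SET_VALUE = Pattern.compile("\\$\\{setValue\\(([^)]+)\\)\\}");

    public static Map<String, String> parse(String output) {
        Map<String, String> params = new HashMap<>();
        Matcher m = SET_VALUE.matcher(output);
        while (m.find()) {
            // Split "key=value" on the first '=' only
            String[] kv = m.group(1).split("=", 2);
            if (kv.length == 2) {
                params.put(kv[0], kv[1]);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("log line\n${setValue(total=42)}\n")); // {total=42}
    }
}
```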
+
+Get the parameters with direction OUT from localParam, and process them as follows according to the type of node.
+
+### SQL node
+
+The structure returned for the parameter is `List<Map<String, String>>`, where the elements of the List are the rows of data, the key of the Map is the column name, and the value is the value of that column.
+
+* If the SQL statement returns one row of data, match the column names against the OUT parameter names the user defined when defining the task, and discard any that do not match.
+* If the SQL statement returns multiple rows of data, the column names are matched against the OUT parameter names of type LIST that the user defined when defining the task. All rows of the corresponding column are converted to a `List<String>` as the value of this parameter. If there is no match, the column is discarded.
+
+### SHELL node
+
+The result of the processor execution is returned as `Map<String, String>`.
+
+The user needs to define `${setValue(key=value)}` in the output when writing the shell script.
+
+When processing the parameters, remove the `${setValue()}` wrapper and split by "=", with the 0th part being the key and the 1st part being the value.
+
+Similarly, match the key against the OUT parameter names the user defined when defining the task, and use the value as the value of that parameter.
+
+Return parameter processing
+
+* The result acquired from the Processor is a String.
+* Determine whether the processor result is empty, and exit if it is.
+* Determine whether the localParam is empty, and exit if it is.
+* Get the parameters of localParam with direction OUT, and exit if there are none.
+* Format the String as per the formats above (`List<Map<String, String>>` for SQL, `Map<String, String>` for SHELL).
+
+Assign the parameters with matching values to the varPool (`List<Property>`, which contains the original IN parameters)
+
+* Format the varPool as JSON and pass it to the master.
+* The parameters with direction OUT are written into the localParam after the master has received the varPool.
diff --git a/docs/docs/en/development/backend/mechanism/overview.md b/docs/docs/en/development/backend/mechanism/overview.md
new file mode 100644
index 0000000000..4f0d592c46
--- /dev/null
+++ b/docs/docs/en/development/backend/mechanism/overview.md
@@ -0,0 +1,6 @@
+# Overview
+
+
+
+* [Global Parameter](global-parameter.md)
+* [Switch Task type](task/switch.md)
diff --git a/docs/docs/en/development/backend/mechanism/task/switch.md b/docs/docs/en/development/backend/mechanism/task/switch.md
new file mode 100644
index 0000000000..490510405e
--- /dev/null
+++ b/docs/docs/en/development/backend/mechanism/task/switch.md
@@ -0,0 +1,8 @@
+# SWITCH Task development
+
+The switch task workflow steps are as follows:
+
+* The user-defined expressions and branch information are stored in `taskParams` in the `taskdefinition`. When the switch task is executed, it is formatted as `SwitchParameters`.
+* `SwitchTaskExecThread` processes the expressions defined in the `switch` from top to bottom, obtains the values of the variables from the `varPool`, and evaluates each expression with `javascript`. If an expression returns true, it stops checking and records the position of that expression, here called resultConditionLocation. The work of SwitchTaskExecThread is then over.
+* After the `switch` task runs, if there is no error (errors are most commonly caused by a user-defined expression that is out of specification or a problem with a parameter name), `MasterExecThread.submitPostNode` obtains the downstream nodes of the `DAG` to continue execution.
+* If `DagHelper.parsePostNodes` finds that the current node (the node that has just completed its work) is a `switch` node, the `resultConditionLocation` is obtained, and all branches in the SwitchParameters except `resultConditionLocation` are skipped. In this way, only the branches that need to be executed are left.
diff --git a/docs/docs/en/development/backend/spi/alert.md b/docs/docs/en/development/backend/spi/alert.md
new file mode 100644
index 0000000000..d5e94bcafa
--- /dev/null
+++ b/docs/docs/en/development/backend/spi/alert.md
@@ -0,0 +1,75 @@
+### DolphinScheduler Alert SPI main design
+
+#### DolphinScheduler SPI Design
+
+DolphinScheduler is undergoing a change to a microkernel + plug-in architecture. All core capabilities, such as tasks, resource storage, and registry centers, will be designed as extension points. We hope to use SPI to improve DolphinScheduler's flexibility and extensibility.
+
+For alarm-related code, please refer to the `dolphinscheduler-alert-api` module. This module defines the extension interface of the alarm plug-ins and some base code. When we need to make related functions pluggable, it is recommended to read this code first. Of course, it is also recommended that you read the documentation — this will save a lot of time — though documentation always lags somewhat behind the code. When documentation is missing, take the source code as authoritative (and if you are interested, we also welcome you to submit related documentation). In addition, we will hardly ever change the extension interfaces (not counting new additions), unless there is a major structural adjustment with an incompatible upgrade version, so the existing documentation can generally be relied on.
+
+We use native Java SPI. When you need to extend, you in fact only need to pay attention to implementing the `org.apache.dolphinscheduler.alert.api.AlertChannelFactory` interface; the underlying logic, such as plug-in loading, is already implemented, which makes our development more focused and simple.
+
+By the way, we have adopted an excellent front-end component, form-create, which supports generating front-end UI components based on JSON. If plug-in development involves the front end, we use JSON to generate the related front-end UI components. The parameters of the plug-in are encapsulated in `org.apache.dolphinscheduler.spi.params`, which converts all the relevant parameters into the corresponding JSON. This means you can complete the drawing of the front-end components in Java code (here mainly the form; we only care about the data exchanged between the front and back ends).
+
+This article mainly focuses on the design and development of Alert.
+
+#### Main Modules
+
+If you don't care about its internal design and simply want to know how to develop your own alarm plug-in, you can skip this part.
+
+* dolphinscheduler-alert-api
+
+  This module is the core module of the Alert SPI. It defines the interface of the alarm plug-in extension and some base code. An extension plug-in must implement the interface defined by this module: `org.apache.dolphinscheduler.alert.api.AlertChannelFactory`
+
+* dolphinscheduler-alert-plugins
+
+  This module contains the plug-ins we currently provide, such as Email, DingTalk, Script, etc.
+
+
+#### Alert SPI main class information
+
+AlertChannelFactory
+The alarm plug-in factory interface. All alarm plug-ins need to implement this interface.
+This interface is used to define the name of the alarm plug-in and the required parameters; its create method is used to create a specific alarm plug-in instance.
+
+AlertChannel
+The interface of the alert plug-in. An alert plug-in needs to implement this interface. There is only one method, process, in this interface; the upper-level alert system calls this method and obtains the alert's return information through the AlertResult it returns.
+
+AlertData
+The alarm content information, including id, title, content, and log.
+
+AlertInfo
+Alarm-related information. When the upper-level system calls an instance of an alarm plug-in, an instance of this class is passed to the specific alarm plug-in through the process method. It contains the alert content AlertData and the parameter information filled in on the front end for the called alert plug-in instance.
+
+AlertResult
+The return information of the alarm plug-in's alert sending.
+
+org.apache.dolphinscheduler.spi.params
+This package contains the plug-in parameter definitions. Our front end uses the form-create library (http://www.form-create.com), which can dynamically generate the front-end UI based on the parameter list JSON returned by the plug-in definition, so we don't need to care about the front end when doing SPI plug-in development.
+
+Under this package, we currently only encapsulate RadioParam, TextParam, and PasswordParam, which are used to define text-type parameters, radio parameters, and password-type parameters, respectively.
+
+AbsPluginParams is the base class of all parameters; classes such as RadioParam all inherit from it. Each DS alert plug-in returns a list of AbsPluginParams in its implementation of AlertChannelFactory.
+
+The specific design of the alert SPI can be seen in the issue: [Alert Plugin Design](https://github.com/apache/incubator-dolphinscheduler/issues/3049)
+
+#### Alert SPI built-in implementations
+
+* Email
+
+  Email alert notification
+
+* DingTalk
+
+  Alerts for DingTalk group chat bots
+
+* EnterpriseWeChat
+
+  EnterpriseWeChat alert notifications
+
+  Related parameter configuration can refer to the EnterpriseWeChat robot documentation.
+
+* Script
+
+  We have implemented a shell script for alerting. We pass the relevant alert parameters to the script, and you can implement your alert logic in the shell. This is a good way to interface with internal alerting applications.
+
+* SMS
+
+  SMS alerts
diff --git a/docs/docs/en/development/backend/spi/datasource.md b/docs/docs/en/development/backend/spi/datasource.md
new file mode 100644
index 0000000000..5772b4357c
--- /dev/null
+++ b/docs/docs/en/development/backend/spi/datasource.md
@@ -0,0 +1,23 @@
+## DolphinScheduler Datasource SPI main design
+
+#### How do I use data sources?
+
+The data source center supports POSTGRESQL, HIVE/IMPALA, SPARK, CLICKHOUSE, and SQLSERVER data sources by default.
+
+If you are using a MySQL or ORACLE data source, you need to place the corresponding driver package in the lib directory.
+
+#### How to do Datasource plugin development?
+
+org.apache.dolphinscheduler.spi.datasource.DataSourceChannel
+org.apache.dolphinscheduler.spi.datasource.DataSourceChannelFactory
+org.apache.dolphinscheduler.plugin.datasource.api.client.CommonDataSourceClient
+
+1. As the first step, the data source plug-in implements the above interfaces and inherits the general client. For details, refer to the implementation of data source plug-ins such as sqlserver and mysql.
+The addition method is the same for all RDBMS plug-ins.
+
+2. Add the driver configuration in the data source plug-in's pom.xml
+
+We provide APIs for external access to all data sources in the `dolphinscheduler-datasource-api` module
+
+#### **Future plan**
+
+Support data sources such as kafka, http, files, sparkSQL, FlinkSQL, etc.
\ No newline at end of file
diff --git a/docs/docs/en/development/backend/spi/registry.md b/docs/docs/en/development/backend/spi/registry.md
new file mode 100644
index 0000000000..0957ff3cdd
--- /dev/null
+++ b/docs/docs/en/development/backend/spi/registry.md
@@ -0,0 +1,27 @@
+### DolphinScheduler Registry SPI Extension
+
+#### How to use?
+
+Make the following configuration (take ZooKeeper as an example)
+
+* Registry plug-in configuration, taking ZooKeeper as an example (registry.properties)
+  dolphinscheduler-service/src/main/resources/registry.properties
+  ```registry.properties
+  registry.plugin.name=zookeeper
+  registry.servers=127.0.0.1:2181
+  ```
+
+For specific configuration information, please refer to the parameter information provided by the specific plug-in; for zk, for example: `org/apache/dolphinscheduler/plugin/registry/zookeeper/ZookeeperConfiguration.java`
+All configuration keys need to be prefixed with `registry.`; for example, `base.sleep.time.ms` should be configured as: registry.base.sleep.time.ms=100
+
+#### How to extend
+
+`dolphinscheduler-registry-api` defines the standard for implementing plugins. When you need to extend a plugin, you only need to implement `org.apache.dolphinscheduler.registry.api.RegistryFactory`.
+
+Under the `dolphinscheduler-registry-plugin` module are the registry plugins we currently provide.
+
+#### FAQ
+
+1: registry connect timeout
+
+You can increase the relevant timeout parameters.
diff --git a/docs/docs/en/development/backend/spi/task.md b/docs/docs/en/development/backend/spi/task.md
new file mode 100644
index 0000000000..70b01d48ff
--- /dev/null
+++ b/docs/docs/en/development/backend/spi/task.md
@@ -0,0 +1,15 @@
+## DolphinScheduler Task SPI extension
+
+#### How to develop task plugins?
+
+org.apache.dolphinscheduler.spi.task.TaskChannel
+
+A plug-in implements the above interface (a generic Java SPI sketch follows below). It mainly covers creating tasks (task initialization, task running, etc.) and task cancellation. If it is a yarn task, you need to implement org.apache.dolphinscheduler.plugin.task.api.AbstractYarnTask.
+
+We provide APIs for external access to all tasks in the dolphinscheduler-task-api module, while the dolphinscheduler-spi module is the general SPI code library, which defines all the plug-in modules, such as the alarm module, the registry module, etc.; you can read it for the details.
+
+*NOTICE*
+
+Since task plug-ins involve front-end pages, and the front-end SPI has not yet been implemented, you need to implement the front-end page corresponding to the plug-in separately.
+
+If there is a class conflict in a task plugin, you can use [Shade-Relocating Classes](https://maven.apache.org/plugins/maven-shade-plugin/) to solve this problem.
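As mentioned above, DolphinScheduler's plug-in loading builds on native Java SPI. A generic, self-contained sketch of that discovery mechanism — the `GreetingService` interface and implementation are hypothetical; real plug-ins implement interfaces such as `TaskChannel`'s factory or `RegistryFactory` instead:

```java
import java.util.ServiceLoader;

// Hypothetical extension point, purely to show the discovery pattern.
interface GreetingService {
    String greet(String name);
}

class EnglishGreeting implements GreetingService {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

public class SpiDemo {
    public static void main(String[] args) {
        // The JDK scans META-INF/services/<interface FQN> files on the classpath
        // and instantiates every implementation listed there.
        ServiceLoader<GreetingService> loader = ServiceLoader.load(GreetingService.class);
        for (GreetingService service : loader) {
            System.out.println(service.greet("DolphinScheduler"));
        }
    }
}
```

For the JDK to discover `EnglishGreeting`, its fully qualified name must be listed in a `META-INF/services/` provider-configuration file named after the interface; this is exactly how the framework finds factory implementations without hard-coding them.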
\ No newline at end of file
diff --git a/docs/docs/en/development/development-environment-setup.md b/docs/docs/en/development/development-environment-setup.md
new file mode 100644
index 0000000000..ad25b8577f
--- /dev/null
+++ b/docs/docs/en/development/development-environment-setup.md
@@ -0,0 +1,159 @@
+# DolphinScheduler development
+
+## Software Requirements
+
+Before setting up the DolphinScheduler development environment, please make sure you have installed the software below:
+
+* [Git](https://git-scm.com/downloads): DolphinScheduler version control system
+* [JDK](https://www.oracle.com/technetwork/java/javase/downloads/index.html): DolphinScheduler backend language
+* [Maven](http://maven.apache.org/download.cgi): Java package management system
+* [Node](https://nodejs.org/en/download): DolphinScheduler frontend language
+
+### Clone Git Repository
+
+Download the git repository through your git management tool; here we use git-core as an example
+
+```shell
+mkdir dolphinscheduler
+cd dolphinscheduler
+git clone git@github.com:apache/dolphinscheduler.git
+```
+
+### Compile source code
+
+i. If you use a MySQL database, modify pom.xml in the root project and change the scope of the mysql-connector-java dependency to compile.
+
+ii. Run `mvn clean install -Prelease -Dmaven.test.skip=true`
+
+## Notice
+
+There are two ways to configure the DolphinScheduler development environment: standalone mode and normal mode.
+
+* [Standalone mode](#dolphinscheduler-standalone-quick-start): **Recommended**, more convenient for building a development environment; it covers most scenarios.
+* [Normal mode](#dolphinscheduler-normal-mode): Separate master, worker, and api servers, which can cover more test environments than standalone and is closer to a real production environment.
+
+## DolphinScheduler Standalone Quick Start
+
+> **_Note:_** The standalone server is only for development and debugging, because it uses an H2 database and a ZooKeeper testing server, which may not be stable in production.
+> Standalone is only supported in DolphinScheduler 1.3.9 and later versions.
+
+### Git Branch Choice
+
+Use different Git branches to develop different codes:
+
+* If you want to develop based on a binary package, switch the git branch to the specific release branch; for example, if you want to develop based on 1.3.9, you should choose the branch `1.3.9-release`.
+* If you want to develop the latest code, choose the branch `dev`.
+
+### Start backend server
+
+Find the class `org.apache.dolphinscheduler.server.StandaloneServer` in IntelliJ IDEA and click run on the main function to start it up.
+
+### Start frontend server
+
+Install frontend dependencies and run it
+
+```shell
+cd dolphinscheduler-ui
+npm install
+npm run start
+```
+
+The browser address http://localhost:12345/dolphinscheduler opens the DolphinScheduler UI login page. The default username and password are **admin/dolphinscheduler123**
+
+## DolphinScheduler Normal Mode
+
+### Prepare
+
+#### zookeeper
+
+Download [ZooKeeper](https://www.apache.org/dyn/closer.lua/zookeeper/zookeeper-3.6.3), and extract it.
+
+* Create the directories `zkData` and `zkLog`
+* Go to the ZooKeeper installation directory, copy the configuration file `zoo_sample.cfg` to `conf/zoo.cfg`, and change the values of dataDir and dataLogDir in conf/zoo.cfg accordingly:
+
+  ```shell
+  # We use the paths /data/zookeeper/data and /data/zookeeper/datalog here as an example
+  dataDir=/data/zookeeper/data
+  dataLogDir=/data/zookeeper/datalog
+  ```
+
+* Start ZooKeeper in a terminal with the command `./bin/zkServer.sh start` (a quick status check is shown below).
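As an optional sanity check that ZooKeeper came up (standard zkServer.sh subcommand):

```shell
# Should report "Mode: standalone" for a single local server
./bin/zkServer.sh status
```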
+
+#### Database
+
+DolphinScheduler's metadata is stored in a relational database; currently MySQL and PostgreSQL are supported. Here we use MySQL as an example. Start the database and create a new database named dolphinscheduler as the DolphinScheduler metadata database.
+
+After creating the new database, run the sql file under `dolphinscheduler/dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_mysql.sql` directly in MySQL to complete the database initialization.
+
+#### Start Backend Server
+
+The following steps will guide you through starting the DolphinScheduler backend services.
+
+##### Backend Start Prepare
+
+* Open the project: use an IDE to open the project; here we use IntelliJ IDEA as an example. After opening, it will take a while for IntelliJ IDEA to complete the dependency download.
+* Plugin installation (**only required for 2.0 or later**)
+
+  * Registry plug-in configuration, taking ZooKeeper as an example (registry.properties)
+    dolphinscheduler-service/src/main/resources/registry.properties
+    ```registry.properties
+    registry.plugin.name=zookeeper
+    registry.servers=127.0.0.1:2181
+    ```
+* File changes
+  * If you use MySQL as your metadata database, you need to modify `dolphinscheduler/pom.xml` and change the `scope` of the `mysql-connector-java` dependency to `compile`. This step is not necessary when using PostgreSQL.
+  * Modify the database configuration in `dolphinscheduler-dao/src/main/resources/application-mysql.yaml`
+
+  Here we take MySQL with a database named dolphinscheduler and user `ds_user` as an example:
+  ```application-mysql.yaml
+  spring:
+    datasource:
+      driver-class-name: com.mysql.jdbc.Driver
+      url: jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
+      username: ds_user
+      password: dolphinscheduler
+  ```
+
+* Log level: add a line `<appender-ref ref="STDOUT"/>` to the following configuration files to enable the log to be displayed on the command line:
+
+  `dolphinscheduler-server/src/main/resources/logback-worker.xml`
+
+  `dolphinscheduler-server/src/main/resources/logback-master.xml`
+
+  `dolphinscheduler-api/src/main/resources/logback-api.xml`
+
+  here we add the result after the modification as below:
+
+  ```diff
+  <root level="INFO">
+  +    <appender-ref ref="STDOUT"/>
+  </root>
+  ```
+
+> **_Note:_** Only DolphinScheduler 2.0 and later versions need to install the plugin before starting the server. It is not needed before version 2.0.
+
+##### Server start
+
+There are three services that need to be started: MasterServer, WorkerServer, and ApiApplicationServer.
+
+* MasterServer: execute the function `main` in the class `org.apache.dolphinscheduler.server.master.MasterServer` in IntelliJ IDEA, with the configuration *VM Options* `-Dlogging.config=classpath:logback-master.xml -Ddruid.mysql.usePingMethod=false -Dspring.profiles.active=mysql`
+* WorkerServer: execute the function `main` in the class `org.apache.dolphinscheduler.server.worker.WorkerServer` in IntelliJ IDEA, with the configuration *VM Options* `-Dlogging.config=classpath:logback-worker.xml -Ddruid.mysql.usePingMethod=false -Dspring.profiles.active=mysql`
+* ApiApplicationServer: execute the function `main` in the class `org.apache.dolphinscheduler.api.ApiApplicationServer` in IntelliJ IDEA, with the configuration *VM Options* `-Dlogging.config=classpath:logback-api.xml -Dspring.profiles.active=api,mysql`.
+After it starts, you can find the Open API documentation at http://localhost:12345/dolphinscheduler/doc.html
+
+> The `mysql` in the VM Options `-Dspring.profiles.active=mysql` specifies the configuration file to use
+
+### Start Frontend Server
+
+Install frontend dependencies and run it
+
+```shell
+cd dolphinscheduler-ui
+npm install
+npm run start
+```
+
+The browser address http://localhost:12345/dolphinscheduler opens the DolphinScheduler UI login page. The default username and password are **admin/dolphinscheduler123**
diff --git a/docs/docs/en/development/e2e-test.md b/docs/docs/en/development/e2e-test.md
new file mode 100644
index 0000000000..3f5e26af69
--- /dev/null
+++ b/docs/docs/en/development/e2e-test.md
@@ -0,0 +1,197 @@
+# DolphinScheduler E2E Automation Test
+
+## I. Preparatory knowledge
+
+### 1. The difference between E2E tests and unit tests
+
+E2E stands for "End to End", i.e. "end-to-end" testing. It imitates the user: starting from a certain entry point, it progressively performs actions until a certain job is completed. Unit tests are different: they usually test parameters, parameter types and values, the number of arguments, return values, thrown errors, and so on, with the purpose of ensuring that a specific function finishes its work stably and reliably in all cases. Unit testing assumes that if all functions work correctly, the whole product will work.
+
+In contrast, an E2E test does not emphasize covering all usage scenarios; it focuses on whether a complete chain of operations can be completed. For the web front end, it is also concerned with the layout of the interface and whether the content information meets expectations.
+
+For example, the E2E test of the login page is concerned with whether the user is able to enter credentials and log in normally, and whether the error message is correctly displayed if the login fails. Whether illegal input is processed is not a major concern.
+
+### 2. The Selenium test framework
+
+[Selenium](https://www.selenium.dev) is an open-source testing tool for executing automated tests in a web browser. The framework uses WebDriver to transform web service commands into browser-native calls through the browser's native components to perform operations. In simple words, it simulates the browser and performs selection operations on the elements of the page.
+
+A WebDriver is an API and protocol that defines a language-neutral interface for controlling the behavior of a web browser. Every browser has a specific WebDriver implementation, called a driver. The driver is the component responsible for delegating to the browser and handling the communication between Selenium and the browser.
+
+The Selenium framework links all these components together through a user-facing interface that allows transparent work with different browser backends, enabling cross-browser and cross-platform automation.
+
+## II. E2E Tests
+
+### 1. E2E-Pages
+
+DolphinScheduler's E2E tests are deployed using docker-compose (see the sample invocation below). The current tests run in standalone mode and are mainly used to check basic functions such as "add, delete, modify and query". For further cluster validation, such as collaboration between services or communication mechanisms between services, refer to `deploy/docker/docker-compose.yml` for configuration.
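If you want to try the cluster-style setup referenced above, a typical invocation with the standard Docker Compose CLI looks like this (the file path is the one named in the text):

```shell
# Bring up the DolphinScheduler services defined in the compose file
docker compose -f deploy/docker/docker-compose.yml up -d

# Tear everything down when finished
docker compose -f deploy/docker/docker-compose.yml down
```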
+
+For the E2E test (the front-end part), the [page model](https://www.selenium.dev/documentation/guidelines/page_object_models/) form is used, which mainly means creating a corresponding model for each page. The following is the example of the login page.
+
+```java
+package org.apache.dolphinscheduler.e2e.pages;
+
+import org.apache.dolphinscheduler.e2e.pages.common.NavBarPage;
+import org.apache.dolphinscheduler.e2e.pages.security.TenantPage;
+
+import org.openqa.selenium.WebElement;
+import org.openqa.selenium.remote.RemoteWebDriver;
+import org.openqa.selenium.support.FindBy;
+import org.openqa.selenium.support.ui.ExpectedConditions;
+import org.openqa.selenium.support.ui.WebDriverWait;
+
+import lombok.Getter;
+import lombok.SneakyThrows;
+
+@Getter
+public final class LoginPage extends NavBarPage {
+    @FindBy(id = "inputUsername")
+    private WebElement inputUsername;
+
+    @FindBy(id = "inputPassword")
+    private WebElement inputPassword;
+
+    @FindBy(id = "btnLogin")
+    private WebElement buttonLogin;
+
+    public LoginPage(RemoteWebDriver driver) {
+        super(driver);
+    }
+
+    @SneakyThrows
+    public TenantPage login(String username, String password) {
+        inputUsername().sendKeys(username);
+        inputPassword().sendKeys(password);
+        buttonLogin().click();
+
+        new WebDriverWait(driver, 10)
+            .until(ExpectedConditions.urlContains("/#/security"));
+
+        return new TenantPage(driver);
+    }
+}
+```
+
+During the test process, we only test the elements we need to focus on, not all elements of the page. So on the login page, only the username, password, and login button elements are declared. The FindBy annotation is provided by the Selenium test framework to find the corresponding id or class in a Vue file.
+
+In addition, during the testing process, the elements are not manipulated directly. The general choice is to wrap the corresponding methods to achieve reuse. For example, to log in, you pass the username and password to the `public TenantPage login()` method, which manipulates the corresponding elements to achieve the login effect; when the user finishes logging in, they jump to the Security Centre (which goes to the Tenant Management page by default).
+
+The goToTab method is provided in SecurityPage to test the corresponding sidebar jumps, covering TenantPage, UserPage, WorkerGroupPage, and QueuePage. These pages are implemented in the same way, mainly to test that the form's input, add, and delete buttons return the corresponding pages.
+
+```java
+    public <T> T goToTab(Class<T> tab) {
+        if (tab == TenantPage.class) {
+            WebElement menuTenantManageElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(menuTenantManage));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", menuTenantManageElement);
+            return tab.cast(new TenantPage(driver));
+        }
+        if (tab == UserPage.class) {
+            WebElement menUserManageElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(menUserManage));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", menUserManageElement);
+            return tab.cast(new UserPage(driver));
+        }
+        if (tab == WorkerGroupPage.class) {
+            WebElement menWorkerGroupManageElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(menWorkerGroupManage));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", menWorkerGroupManageElement);
+            return tab.cast(new WorkerGroupPage(driver));
+        }
+        if (tab == QueuePage.class) {
+            menuQueueManage().click();
+            return tab.cast(new QueuePage(driver));
+        }
+        throw new UnsupportedOperationException("Unknown tab: " + tab.getName());
+    }
+```
+
+![SecurityPage](/img/e2e-test/SecurityPage.png)
+
+For navigation bar jumps, the goToNav method is provided in `org/apache/dolphinscheduler/e2e/pages/common/NavBarPage.java`. The currently supported pages are: ProjectPage, SecurityPage, and ResourcePage.
+
+```java
+    public <T> T goToNav(Class<T> nav) {
+        if (nav == ProjectPage.class) {
+            WebElement projectTabElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(projectTab));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", projectTabElement);
+            return nav.cast(new ProjectPage(driver));
+        }
+
+        if (nav == SecurityPage.class) {
+            WebElement securityTabElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(securityTab));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", securityTabElement);
+            return nav.cast(new SecurityPage(driver));
+        }
+
+        if (nav == ResourcePage.class) {
+            WebElement resourceTabElement = new WebDriverWait(driver, 60)
+                    .until(ExpectedConditions.elementToBeClickable(resourceTab));
+            ((JavascriptExecutor) driver).executeScript("arguments[0].click();", resourceTabElement);
+            return nav.cast(new ResourcePage(driver));
+        }
+
+        throw new UnsupportedOperationException("Unknown nav bar");
+    }
+```
+
+### E2E-Cases
+
+The E2E test cases currently supported include: File Management, Project Management, Queue Management, Tenant Management, User Management, Worker Group Management, and Workflow Tests.
+
+![E2E_Cases](/img/e2e-test/E2E_Cases.png)
+
+The following is an example of a tenant management test. As explained earlier, we use docker-compose for deployment, so for each test case we need to import the corresponding files in the form of annotations.
+
+The browser is loaded using the RemoteWebDriver provided with Selenium. Before each test case starts, some preparation work needs to be done, for example: logging in the user and jumping to the corresponding page (depending on the specific test case).
+
+```java
+    @BeforeAll
+    public static void setup() {
+        new LoginPage(browser)
+                .login("admin", "dolphinscheduler123")
+                .goToNav(SecurityPage.class)
+                .goToTab(TenantPage.class)
+        ;
+    }
+```
+
+When the preparation is complete, it is time to write the formal test cases.
+We use the @Order() annotation for modularity, to confirm the order of the tests. After the tests have run, assertions are used to determine whether the tests were successful; if the assertion returns true, the tenant creation was successful. The following code can be used as a reference:
+
+```java
+    @Test
+    @Order(10)
+    void testCreateTenant() {
+        final TenantPage page = new TenantPage(browser);
+        page.create(tenant);
+
+        await().untilAsserted(() -> assertThat(page.tenantList())
+                .as("Tenant list should contain newly-created tenant")
+                .extracting(WebElement::getText)
+                .anyMatch(it -> it.contains(tenant)));
+    }
+```
+
+The remaining cases are similar and can be understood by referring to the specific source code.
+
+https://github.com/apache/dolphinscheduler/tree/dev/dolphinscheduler-e2e/dolphinscheduler-e2e-case/src/test/java/org/apache/dolphinscheduler/e2e/cases
+
+## III. Supplements
+
+To run E2E tests locally, you first need to start the local services; you can refer to this page:
+[development-environment-setup](https://dolphinscheduler.apache.org/en-us/development/development-environment-setup.html)
+
+When running E2E tests locally, the `-Dlocal=true` parameter can be configured to connect locally and make it easier to change the UI.
+
+When running E2E tests on an `M1` chip, you can use the `-Dm1_chip=true` parameter to configure containers supported by
+`ARM64`.
+
+![Dlocal](/img/e2e-test/Dlocal.png)
+
+If a connection timeout occurs during a local run, increase the load time; 30 or above is recommended.
+
+![timeout](/img/e2e-test/timeout.png)
+
+The test run will be available as an MP4 file.
+
+![MP4](/img/e2e-test/MP4.png)
diff --git a/docs/docs/en/development/frontend-development.md b/docs/docs/en/development/frontend-development.md
new file mode 100644
index 0000000000..297a7ccee0
--- /dev/null
+++ b/docs/docs/en/development/frontend-development.md
@@ -0,0 +1,639 @@
+# Front-end development documentation
+
+### Technical selection
+```
+Vue mvvm framework
+
+Es6 ECMAScript 6.0
+
+Ans-ui Analysys-ui
+
+D3 Visual Library Chart Library
+
+Jsplumb connection plugin library
+
+Lodash high performance JavaScript utility library
+```
+
+### Development environment
+
+- #### Node installation
+Node package download (note version v12.20.2) `https://nodejs.org/download/release/v12.20.2/`
+
+- #### Front-end project construction
+Use the command line to `cd` into the `dolphinscheduler-ui` project directory and execute `npm install` to pull the project dependency packages.
+
+> If `npm install` is very slow, you can set the taobao mirror
+
+```
+npm config set registry http://registry.npm.taobao.org/
+```
+
+- Modify `API_BASE` in the file `dolphinscheduler-ui/.env` to interact with the backend:
+
+```
+# back end interface address
+API_BASE = http://127.0.0.1:12345
+```
+
+> ##### ! ! ! Special attention here. If the project reports a "node-sass error" while pulling the dependency packages, install node-sass separately with the following command and then execute `npm install` again.
+ +```bash +npm install node-sass --unsafe-perm #Install node-sass dependency separately +``` + +- #### Development environment operation +- `npm start` project development environment (after startup address http://localhost:8888) + +#### Front-end project release + +- `npm run build` project packaging (after packaging, the root directory will create a folder called dist for publishing Nginx online) + +Run the `npm run build` command to generate a package file (dist) package + +Copy it to the corresponding directory of the server (front-end service static page storage directory) + +Visit address` http://localhost:8888` + +#### Start with node and daemon under Linux + +Install pm2 `npm install -g pm2` + +Execute `pm2 start npm -- run dev` to start the project in the project `dolphinscheduler-ui `root directory + +#### command + +- Start `pm2 start npm -- run dev` + +- Stop `pm2 stop npm` + +- delete `pm2 delete npm` + +- Status `pm2 list` + +``` + +[root@localhost dolphinscheduler-ui]# pm2 start npm -- run dev +[PM2] Applying action restartProcessId on app [npm](ids: 0) +[PM2] [npm](0) ✓ +[PM2] Process successfully started +┌──────────┬────┬─────────┬──────┬──────┬────────┬─────────┬────────┬─────┬──────────┬──────┬──────────┐ +│ App name │ id │ version │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ user │ watching │ +├──────────┼────┼─────────┼──────┼──────┼────────┼─────────┼────────┼─────┼──────────┼──────┼──────────┤ +│ npm │ 0 │ N/A │ fork │ 6168 │ online │ 31 │ 0s │ 0% │ 5.6 MB │ root │ disabled │ +└──────────┴────┴─────────┴──────┴──────┴────────┴─────────┴────────┴─────┴──────────┴──────┴──────────┘ + Use `pm2 show ` to get more details about an app + +``` + +### Project directory structure + +`build` some webpack configurations for packaging and development environment projects + +`node_modules` development environment node dependency package + +`src` project required documents + +`src => combo` project third-party resource localization `npm run combo` specific view `build/combo.js` + +`src => font` Font icon library can be added by visiting https://www.iconfont.cn Note: The font library uses its own secondary development to reintroduce its own library `src/sass/common/_font.scss` + +`src => images` public image storage + +`src => js` js/vue + +`src => lib` internal components of the company (company component library can be deleted after open source) + +`src => sass` sass file One page corresponds to a sass file + +`src => view` page file One page corresponds to an html file + +``` +> Projects are developed using vue single page application (SPA) +- All page entry files are in the `src/js/conf/${ corresponding page filename => home} index.js` entry file +- The corresponding sass file is in `src/sass/conf/${corresponding page filename => home}/index.scss` +- The corresponding html file is in `src/view/${corresponding page filename => home}/index.html` +``` + +Public module and utill `src/js/module` + +`components` => internal project common components + +`download` => download component + +`echarts` => chart component + +`filter` => filter and vue pipeline + +`i18n` => internationalization + +`io` => io request encapsulation based on axios + +`mixin` => vue mixin public part for disabled operation + +`permissions` => permission operation + +`util` => tool + +### System function module + +Home => `http://localhost:8888/#/home` + +Project Management => `http://localhost:8888/#/projects/list` +``` +| Project Home +| Workflow + - Workflow definition + - Workflow instance + - 
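+
+For orientation, a sketch of what one page's route registration could look like under this layout is shown below. The page name `home` and the component path are illustrative assumptions, not code quoted from the project:
+
+```
+// hypothetical shape of an entry in src/js/conf/home/router/index.js
+import Vue from 'vue'
+import Router from 'vue-router'
+
+Vue.use(Router)
+
+export default new Router({
+  routes: [
+    {
+      path: '/home',
+      name: 'home',
+      // lazy-load the page component registered for this route
+      component: resolve => require(['../pages/home/index'], resolve)
+    }
+  ]
+})
+```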
+
+`store` => state management
+```
+The page corresponding to each route has a state management file divided into:
+
+actions => mapActions => Details:https://vuex.vuejs.org/zh/guide/actions.html
+
+getters => mapGetters => Details:https://vuex.vuejs.org/zh/guide/getters.html
+
+index => entrance
+
+mutations => mapMutations => Details:https://vuex.vuejs.org/zh/guide/mutations.html
+
+state => mapState => Details:https://vuex.vuejs.org/zh/guide/state.html
+
+Specific action:https://vuex.vuejs.org/zh/
+```
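+
+As a minimal sketch of how these pieces fit together inside one page's store module (the `list` state and the function names are made-up examples, not the project's actual store):
+
+```
+// hypothetical page store module combining state/mutations/actions/getters
+const state = {
+  list: []                                 // read in components via mapState
+}
+
+const mutations = {
+  setList (state, payload) {               // committed via mapMutations
+    state.list = payload
+  }
+}
+
+const actions = {
+  getList ({ commit }) {                   // dispatched via mapActions
+    // an io request based on axios would normally run here
+    commit('setList', [])
+  }
+}
+
+const getters = {
+  listCount: state => state.list.length    // read via mapGetters
+}
+
+export default { namespaced: true, state, mutations, actions, getters }
+```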
+
+## specification
+## Vue specification
+##### 1.Component name
+Component names consist of multiple words connected with hyphens (-), to avoid conflicts with HTML tags and to keep the structure clearer.
+```
+// positive example
+export default {
+    name: 'page-article-item'
+}
+```
+
+##### 2.Component files
+Common internal components of the project live in `src/js/module/components`; the folder name is the same as the file name. Subcomponents and util tools split out of a common component are placed in the component's internal `_source` folder.
+```
+└── components
+    ├── header
+        ├── header.vue
+        └── _source
+            └── nav.vue
+            └── util.js
+    ├── conditions
+        ├── conditions.vue
+        └── _source
+            └── search.vue
+            └── util.js
+```
+
+##### 3.Prop
+When defining a prop, always name it in camelCase, and use the hyphenated form when assigning values from the parent component.
+This follows the characteristics of each language: HTML attributes are case-insensitive, so the hyphenated form is friendlier there, while camelCase is more natural in JavaScript.
+
+```
+// Vue
+props: {
+    articleStatus: Boolean
+}
+// HTML
+<article-item :article-status="true"></article-item>
+```
+
+The definition of a prop should specify its type, default, and validation as much as possible.
+
+Example:
+
+```
+props: {
+    attrM: Number,
+    attrA: {
+        type: String,
+        required: true
+    },
+    attrZ: {
+        type: Object,
+        // The default value of an array/object should be returned by a factory function
+        default: function () {
+            return {
+                msg: 'achieve you and me'
+            }
+        }
+    },
+    attrE: {
+        type: String,
+        validator: function (v) {
+            return !(['success', 'fail'].indexOf(v) === -1)
+        }
+    }
+}
+```
+
+##### 4.v-for
+When performing a v-for traversal, you should always provide a key so that rendering is more efficient when the DOM is updated.
+```
+<ul>
+    <li v-for="item in list" :key="item.id">
+        {{ item.title }}
+    </li>
+</ul>
+```
+
+v-for should be avoided on the same element as v-if (`for example: <li v-for="user in users" v-if="user.isActive">`) because v-for has a higher priority than v-if. To avoid invalid calculation and rendering, you should try to put v-if on the container's parent element instead.
+```
+<ul v-if="showList">
+    <li v-for="item in list" :key="item.id">
+        {{ item.title }}
+    </li>
+</ul>
+```
+
+##### 5.v-if / v-else-if / v-else
+If the elements controlled by the same set of v-if logic are of the same type, Vue reuses the same parts for more efficient element switching; for example, the `value` of an `<input>` is kept when switching. To avoid the side effects of unreasonable reuse, you should add a key to identical elements for identification.
+```
+<div v-if="hasData" key="mazey-data">
+    {{ mazeyData }}
+</div>
+<div v-else key="mazey-none">
+    no data
+</div>
+```
+
+##### 6.Instruction abbreviation
+To unify the specification, the directive abbreviations are always used. Writing `v-bind` and `v-on` in full is not wrong; this is only a unified convention.
+```
+<input :value="mazeyUser" @click="verifyUser">
+```
+
+##### 7.Top-level element order of single file components
+Styles are packaged into one file, so a style defined in a single vue file will also take effect on the same class name in other files. Therefore, every component is given a top-level class name before it is created.
+Note: The sass plugin has been added to the project, and sass syntax can be written directly in a single vue file.
+For uniformity and ease of reading, they should be placed in the order of `<template>`、`<script>`、`<style>`.
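+
+A skeleton in that order, reusing the placeholder names from the component-name example above, might look like this:
+
+```
+<template>
+  <div class="page-article-item">
+    <!-- markup comes first -->
+  </div>
+</template>
+
+<script>
+  export default {
+    name: 'page-article-item'
+  }
+</script>
+
+<style lang="scss">
+  // styles come last, nested under the component's top-level class name
+  .page-article-item {
+  }
+</style>
+```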