From 2c1583d1941f9b2cdfcfced80db359cb738d8981 Mon Sep 17 00:00:00 2001
From: David
Date: Wed, 26 Oct 2022 16:03:30 +0800
Subject: [PATCH] [Doc] Update the readme content (#12500)

* [Doc] Update the readme content
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
---
 README.md | 88 ++++++++++++++++++++++++++-----------------------------
 1 file changed, 42 insertions(+), 46 deletions(-)

diff --git a/README.md b/README.md
index a14bce1bc6..8c49415866 100644
--- a/README.md
+++ b/README.md
@@ -11,46 +11,40 @@ Dolphin Scheduler Official Website
 [![Stargazers over time](https://starchart.cc/apache/dolphinscheduler.svg)](https://starchart.cc/apache/dolphinscheduler)
 
 [![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md)
-[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md)
-
-## Design Features
-
-DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`.
-
-Its main objectives are as follows:
-- Highly Reliable,
-DolphinScheduler adopts a decentralized multi-master and multi-worker architecture design, which naturally supports easy expansion and high availability (not restricted by a single point of bottleneck), and its performance increases linearly with the increase of machines
-- High performance, supporting tens of millions of tasks every day
-- Support multi-tenant.
-- Cloud Native, DolphinScheduler supports multi-cloud/data center workflow management, also
-supports Kubernetes, Docker deployment and custom task types, distributed
-scheduling, with overall scheduling capability increased linearly with the
-scale of the cluster
-- Support various task types: Shell, MR, Spark, SQL (MySQL, PostgreSQL, hive, spark SQL), Python, Sub_Process, Procedure, etc.
-- Support scheduling of workflows and dependencies, manual scheduling to pause/stop/recover task, support failure task retry/alarm, recover specified nodes from failure, kill task, etc.
-- Associate the tasks according to the dependencies of the tasks in a DAG graph, which can visualize the running state of the task in real-time.
-- WYSIWYG online editing tasks
-- Support the priority of workflows & tasks, task failover, and task timeout alarm or failure.
-- Support workflow global parameters and node customized parameter settings.
-- Support online upload/download/management of resource files, etc. Support online file creation and editing.
-- Support task log online viewing and scrolling and downloading, etc.
-- Support the viewing of Master/Worker CPU load, memory, and CPU usage metrics.
-- Support displaying workflow history in tree/Gantt chart, as well as statistical analysis on the task status & process status in each workflow.
-- Support back-filling data.
-- Support internationalization.
-- More features waiting for partners to explore...
-
-## What's in DolphinScheduler
-
-| Stability | Accessibility | Features | Scalability |
-|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Decentralized multi-master and multi-worker | Visualization of workflow key information, such as task status, task type, retry times, task operation machine information, visual variables, and so on at a glance.  |  Support pause, recover operation | Support customized task types |
-| support HA | Visualization of all workflow operations, dragging tasks to draw DAGs, configuring data sources and resources. At the same time, for third-party systems, provide API mode operations. | Users on DolphinScheduler can achieve many-to-one or one-to-one mapping relationship through tenants and Hadoop users, which is very important for scheduling large data jobs. | The scheduler supports distributed scheduling, and the overall scheduling capability will increase linearly with the scale of the cluster. Master and Worker support dynamic adjustment. |
-| Overload processing: By using the task queue mechanism, the number of schedulable tasks on a single machine can be flexibly configured. Machine jam can be avoided with high tolerance to numbers of tasks cached in task queue. | One-click deployment | Support traditional shell tasks, and big data platform task scheduling: MR, Spark, SQL (MySQL, PostgreSQL, hive, spark SQL), Python, Procedure, Sub_Process | |
-## User Interface Screenshots
+## Features
+
+Apache DolphinScheduler is the modern data workflow orchestration platform with powerful user interface, dedicated to solving complex task dependencies in the data pipeline and providing various types of jobs available `out of the box`
+
+The key features for DolphinScheduler are as follows:
+- Easy to deploy, we provide 4 ways to deploy, such as Standalone deployment,Cluster deployment,Docker / Kubernetes deployment and Rainbond deployment
+- Easy to use, there are 3 ways to create workflows:
+  - Visually, create tasks by dragging and dropping tasks
+  - Creating workflows by PyDolphinScheduler(Python way)
+  - Creating workflows through Open API
+
+- Highly Reliable,
+DolphinScheduler uses a decentralized multi-master and multi-worker architecture, which naturally supports horizontal scaling and high availability
+- High performance, its performance is N times faster than other orchestration platform and it can support tens of millions of tasks per day
+- Supports multi-tenancy
+- Supports various task types: Shell, MR, Spark, SQL (MySQL, PostgreSQL, Hive, Spark SQL), Python, Procedure, Sub_Workflow,
+Http, K8s, Jupyter, MLflow, SageMaker, DVC, Pytorch, Amazon EMR, etc
+- Orchestrating workflows and dependencies, you can pause/stop/recover task any time, failed tasks can be set to automatically retry
+- Visualizing the running state of the task in real-time and seeing the task runtime log
+- What you see is what you get when you edit the task on the UI
+- Backfill can be operated on the UI directly
+- Perfect project, resource, data source-level permission control
+- Displaying workflow history in tree/Gantt chart, as well as statistical analysis on the task status & process status in each workflow
+- Supports internationalization
+- Cloud Native, DolphinScheduler supports orchestrating multi-cloud/data center workflow, and
+supports custom task type
+- More features waiting for partners to explore
+
+
+## User Interface Screenshots
 
 ![dag](./images/en_US/dag.png)
+
 ![data-source](./images/en_US/data-source.png)
 ![home](./images/en_US/home.png)
 ![master](./images/en_US/master.png)
@@ -77,26 +71,28 @@ dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-b
 dolphinscheduler-dist/target/apache-dolphinscheduler-${latest.release.version}-src.tar.gz: Source code package of DolphinScheduler
 ```
 
-## Thanks
-
-DolphinScheduler is based on a lot of excellent open-source projects, such as Google guava, grpc, netty, quartz, and many open-source projects of Apache and so on.
-We would like to express our deep gratitude to all the open-source projects used in Dolphin Scheduler. We hope that we are not only the beneficiaries of open-source, but also give back to the community. Besides, we hope everyone who have the same enthusiasm and passion for open source could join in and contribute to the open-source community!
-
 ## Get Help
 
 1. Submit an [issue](https://github.com/apache/dolphinscheduler/issues/new/choose)
-2. [Join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#troubleshooting`
+2. [Join our slack](https://s.apache.org/dolphinscheduler-slack) and send your question to channel `#general`
+3. Send email to users@dolphinscheduler.apache.org or dev@dolphinscheduler.apache.org
 
 ## Community
 
 You are very welcome to communicate with the developers and users of Dolphin Scheduler. There are two ways to find them:
 
-1. Join the Slack channel [Slack](https://asf-dolphinscheduler.slack.com/).
-2. Follow the [Twitter account of DolphinScheduler](https://twitter.com/dolphinschedule) and get the latest news on time.
+1. Join the Slack channel [Slack](https://asf-dolphinscheduler.slack.com/)
+2. Follow the [Twitter account of DolphinScheduler](https://twitter.com/dolphinschedule) and get the latest news on time
 
 ## How to Contribute
 
 The community welcomes everyone to contribute, please refer to this page to find out more: [How to contribute](docs/docs/en/contribute/join/contribute.md).
+
+## Thanks
+
+DolphinScheduler is based on a lot of excellent open-source projects, such as Google guava, grpc, netty, quartz, and many open-source projects of Apache and so on.
+We would like to express our deep gratitude to all the open-source projects used in DolphinScheduler. We hope that we are not only the beneficiaries of open-source, but also give back to the community. Besides, we hope everyone who have the same enthusiasm and passion for open source could join in and contribute to the open-source community
+
 # Landscapes

@@ -109,4 +105,4 @@ DolphinScheduler enriches the