pydolphinscheduler

pydolphinscheduler is the Python API for Apache DolphinScheduler. It lets you define your workflows in Python code, also known as workflows-as-code.

Quick Start

Notice: pydolphinscheduler has not yet been released as a binary tarball or on PyPI, so for now you have to clone the Apache DolphinScheduler code from GitHub to follow this quick start.

Here we show you how to install pydolphinscheduler and run a simple example.

Prepare

# Clone the code from GitHub
git clone git@github.com:apache/dolphinscheduler.git

# Install pydolphinscheduler from source
cd dolphinscheduler/dolphinscheduler-python/pydolphinscheduler
python setup.py install

Start Server And Run Example

Before you run an example, you have to start the backend server. Follow the "DolphinScheduler Standalone Quick Start" section of the development setup guide to set up a developer environment. Start both the backend and the frontend server in this step, so that you can open the DolphinScheduler UI in your browser at http://localhost:12345/dolphinscheduler

Once the backend server is running, all requests from pydolphinscheduler are sent to it. You can now run a simple example:

# Run from the root of the cloned repository
cd dolphinscheduler-python/pydolphinscheduler
python examples/tutorial.py

NOTICE: Apache DolphinScheduler requires a tenant when the command runs, so you might need to change the tenant value in examples/tutorial.py. The current value is tenant_exists; change it to a username that exists in your environment.

After the command runs, you will see a new project with a single process definition named tutorial in the UI.

This completes the quick start: you have installed pydolphinscheduler and run an example. If you want to dig into pydolphinscheduler or contribute to its development, take a look at the Develop section.

Develop

pydolphinscheduler is the Python API for Apache DolphinScheduler. It only defines what a workflow looks like; it does not store or execute the workflow. We use py4j to access the Java Virtual Machine dynamically.

Setup Develop Environment

We already cloned the code in the quick start, so the next step is to open the pydolphinscheduler project in your editor. We recommend PyCharm rather than IntelliJ IDEA. Open the directory dolphinscheduler-python/pydolphinscheduler rather than dolphinscheduler-python.

Brief Concept

Apache DolphinScheduler is designed to define workflows through the UI, while pydolphinscheduler defines them in code. When defining a workflow in code, users usually do not care whether the user, tenant, or queue already exists. All they care about is creating a new workflow from the code they wrote. So we have some side objects in the pydolphinscheduler/side directory. They only check whether an object exists and create it if it does not.
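
The "check whether it exists, create it if not" behaviour of side objects can be illustrated with a small stand-alone sketch. This is not the pydolphinscheduler implementation: FakeBackend and ensure_tenant are hypothetical names, and an in-memory dictionary stands in for the real backend server.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeBackend:
    """In-memory stand-in for the backend server (hypothetical)."""

    tenants: Dict[str, dict] = field(default_factory=dict)
    created: int = 0  # counts how many tenants were actually created

    def tenant_exists(self, code: str) -> bool:
        return code in self.tenants

    def create_tenant(self, code: str, queue: str) -> dict:
        self.tenants[code] = {"code": code, "queue": queue}
        self.created += 1
        return self.tenants[code]


def ensure_tenant(backend: FakeBackend, code: str, queue: str = "default") -> dict:
    """Return the tenant, creating it only when it is missing."""
    if not backend.tenant_exists(code):
        return backend.create_tenant(code, queue)
    return backend.tenants[code]


backend = FakeBackend()
first = ensure_tenant(backend, "tenant_exists")   # missing, so it is created
second = ensure_tenant(backend, "tenant_exists")  # already there, so it is reused
```

Because the call is idempotent, workflow code can request a tenant, user, or queue every time it runs without tracking whether an earlier run already created it.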

Process Definition

Process definition is the pydolphinscheduler workflow object. It uses the same name as the corresponding Java object (this may change to a simpler word later).

Tasks

Tasks are the pydolphinscheduler task objects. We use tasks to define the exact job we want DolphinScheduler to do for us. For now, only the shell task is supported, which executes shell commands. This link lists all tasks supported in DolphinScheduler, which will be implemented in the future.
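
To make the idea of a process definition holding shell tasks more concrete, here is a toy model written from scratch. It does not use or reproduce the pydolphinscheduler API: the Workflow and ShellTask classes, the `>>` dependency operator, and the to_dict layout are all assumptions chosen for illustration. Like the real library, it only describes the workflow and does not store or execute anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Workflow:
    """Toy process definition: a named set of tasks plus their dependencies."""

    name: str
    tenant: str
    tasks: Dict[str, "ShellTask"] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)

    def to_dict(self) -> dict:
        """Describe the workflow as plain data; nothing is executed."""
        return {
            "name": self.name,
            "tenant": self.tenant,
            "tasks": [
                {"name": t.name, "type": "SHELL", "command": t.command}
                for t in self.tasks.values()
            ],
            "relations": [
                {"upstream": up, "downstream": down} for up, down in self.edges
            ],
        }


@dataclass
class ShellTask:
    """Toy shell task that registers itself with its workflow."""

    name: str
    command: str
    workflow: Workflow = field(repr=False)

    def __post_init__(self) -> None:
        self.workflow.tasks[self.name] = self

    def __rshift__(self, other: "ShellTask") -> "ShellTask":
        # `a >> b` records that b runs after a
        self.workflow.edges.append((self.name, other.name))
        return other


wf = Workflow(name="tutorial", tenant="tenant_exists")
extract = ShellTask("extract", "echo extract", wf)
load = ShellTask("load", "echo load", wf)
extract >> load  # load depends on extract
definition = wf.to_dict()
```

In this sketch the resulting dictionary is the kind of description that could be handed to a backend, which would then be responsible for storing and running the workflow.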

Testing

pydolphinscheduler uses pytest to test its codebase. GitHub Actions runs the tests when you create a pull request or commit to the dev branch, with Python versions 3.6, 3.7, 3.8, and 3.9 on Linux, macOS, and Windows.

To test locally, run pytest directly after setting PYTHONPATH:

PYTHONPATH=src/ pytest
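
For contributors new to pytest, a test is a function whose name starts with test_ and that uses plain assert statements. The sketch below shows the shape of such a test. The helper task_names and the test names are hypothetical and are not taken from the pydolphinscheduler test suite.

```python
def task_names(definition: dict) -> list:
    """Hypothetical helper under test: list the task names in a definition."""
    return [task["name"] for task in definition.get("tasks", [])]


def test_task_names_returns_names_in_order():
    definition = {"tasks": [{"name": "a"}, {"name": "b"}]}
    assert task_names(definition) == ["a", "b"]


def test_task_names_handles_missing_tasks():
    # A definition without a "tasks" key yields an empty list
    assert task_names({}) == []
```

Saved in a file whose name starts with test_, these functions would be collected and run by the pytest command.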