Distributed scheduling system architecture and micro-service workflow scheduling method

Document No. 135162 · Published 2021-10-22

Reading note: this technology, "A distributed scheduling system architecture and micro-service workflow scheduling method", was designed and created by 许健, 齐海乐, 宋茹 and 张艳丽 on 2021-07-26. Its main content: the invention relates to a distributed scheduling system architecture and a micro-service workflow scheduling method, belonging to the field of computers. The invention adopts a distributed architecture to design the control center for micro-service scheduling and separates control logic from service logic. Test results show that once the service logic is separated out and tasks are executed asynchronously, the time an execution thread is occupied is greatly shortened and the micro-service response speed is greatly improved. Distributed locks are used to realize ordered scheduling and fault tolerance of background micro-service workflows; with the same thread-pool size, the concurrency and high availability of the scheduling center are markedly improved.

1. A distributed scheduling system architecture is characterized in that the system architecture comprises a micro-service registration center eureka, a scheduling center, an execution node, a scheduling database and a service database, wherein the scheduling center comprises a plurality of scheduling nodes;

the scheduling nodes and the execution nodes are deployed in a distributed manner as micro-services; the role and API address information of each node are registered with the micro-service registration center eureka and uniformly maintained by it;

the scheduling node comprises a remote call controller, a callback controller, a management runtime module and a core scheduler; the core scheduler is built on quartz; the management runtime module implements the various management functions; the core communication between a scheduling node and an execution node comprises remote calls (RMS) and callbacks (Callback): an execution instruction is sent to the execution node through the remote call controller, the job running result returned by the executor of the execution node is received through the callback controller, and a complex job flow sequence can be received from the job chain module of the execution node through the job management component;

the scheduling database is connected with the scheduling center and is used for persistently storing scheduling related data;

the execution node is an execution module embedded in each micro-service and comprises an executor, a job chain and a service bean; the executor executes tasks and returns results to the scheduling center through the callback interface; the job chain combines the execution order and dependencies of tasks to meet complex job scheduling requirements; the service bean is the carrier that embeds the execution node into the micro-service;

the service database is connected with the execution node and is used for persistently storing the server data of the micro-service application.

2. The distributed scheduling system architecture of claim 1, wherein the management functions of the scheduling node comprise job management, monitoring management, log management, configuration management, trigger management and scheduling logs, exposed through Restful interfaces and dynamic web pages.

3. The distributed scheduling system architecture of claim 1 wherein the scheduling database stores data including task sequences, monitoring data, logging data, and configuration data.

4. The distributed scheduling system architecture of claim 1, wherein communication between the scheduling node and the execution node uses API interfaces over the http protocol for remote calls and result callbacks.

5. The distributed scheduling system architecture of claim 1 wherein the scheduling node sends synchronous or asynchronous execution instructions to the execution node via the remote call controller, and the executor supports both synchronous and asynchronous execution of tasks and returns results to the scheduling center via the callback interface.

6. A method for scheduling micro-service workflow based on the distributed scheduling system architecture of any of claims 1-5, characterized in that the method comprises the following steps,

s1, the scheduling node acquires an idle thread from the task scheduling thread pool and accesses the scheduling database through that thread to acquire a task; if a task needs to be executed, go to step S2; otherwise the thread enters a dormant state until it is awakened and this step restarts;

s2, the scheduling node competes to acquire the flow lock; the flow lock is granted to the optimal node in the distributed scheduling center, i.e. the optimal node is elected as the management node; a node that fails to obtain the lock blocks until the flow lock is acquired;

s3, the management node opens a transaction, takes the first task from the task database, judges the instruction type, submits the task to the task queue, remotely calls the execution node, deletes the task from the task database, closes the transaction, and records log information;

s4, the management node releases the flow lock and then releases the thread resource, and the thread returns to the thread pool for the next task to dispatch.

7. The micro-service workflow scheduling method of claim 6, wherein the scheduling node calls the execution node in a non-blocking manner, releasing the flow lock and the thread without waiting for the execution node's result callback.

8. The micro-service workflow scheduling method of claim 6, wherein when the instruction acquired by the management node or execution node is "insufficient resources", the blocked thread is suspended, and the suspended resource-starved flow is awakened and resumed later.

9. The micro-service workflow scheduling method of claim 6, wherein when a management node fails or loses its heartbeat due to network jitter, the following management node fault-tolerance flow is executed:

(1) the dispatching center monitors a management node fault event and triggers a fault tolerance mechanism;

(2) available scheduling nodes compete for the fault-tolerant lock, the scheduling node which obtains the fault-tolerant lock becomes a fault-tolerant management node, and the fault-tolerant management node broadcasts a fault-tolerant alarm notification and records log information;

(3) the fault-tolerant management node queries the task instances whose call source is the failed node, updates the call source of those instances to Null and generates new task instructions;

(4) releasing the fault-tolerant lock and completing fault tolerance.

10. The micro-service workflow scheduling method of claim 9, further comprising, after fault tolerance is complete: the scheduling center performs thread scheduling again, and the new management node takes over by monitoring the different states of newly submitted tasks; for running tasks, the task instance state is monitored; whether a successfully submitted task exists in the task queue is judged: if so, its task instance state is monitored; if not, the task instance is resubmitted.

Technical Field

The invention belongs to the field of computers, and particularly relates to a distributed scheduling system architecture and micro-service workflow scheduling method.

Background

Mainstream micro-service systems manage and control micro-service applications through a micro-service gateway product. The micro-service gateway handles interface service calls between micro-service modules and covers scheduling concerns such as security, routing, proxying, monitoring, logging and rate limiting, forming a centralized scheduling architecture. In this centralized architecture, all API interface services are registered with the micro-service gateway, which wraps the original service API interfaces in an extra layer and publishes them as proxy services. Calls to all micro-service interface services can therefore be intercepted at the micro-service gateway, and the gateway's security, logging, rate-limiting and routing capabilities are all built on this interception; each capability can be configured as an independent plug-in within the interception process.

As the API entry point for all services, a centralized micro-service gateway easily becomes a performance bottleneck as the service scale grows. Whenever a user request reaches a background application, any interaction between services is routed through the micro-service gateway; once services are under load, the large number of internal service calls piles up on the gateway, overloading it and slowing background service responses. A further problem is that once the micro-service gateway itself fails, the cluster is left without a working entry point and the entire cluster goes down.

To address this, some solutions adopt multiple gateway instances behind a load balancer to achieve load sharing and high availability, but scheduling control in this mode is not flexible enough. Other micro-service systems provide decentralized architectures, such as a ServiceMesh service gateway: an SDK package with control functions is embedded in each service, background services interact point-to-point directly, and actual service call requests and data flows do not pass through the control center. The disadvantages are that SDK packages must be designed and embedded for every micro-service, the implementation workload is large, and the approach is not well suited to complex workflow task scheduling.

Disclosure of Invention

Technical problem to be solved

The technical problem to be solved by the invention is to provide a distributed scheduling system architecture and a micro-service workflow scheduling method that avoid the performance bottleneck a centralized micro-service gateway easily runs into.

(II) technical scheme

In order to solve the technical problem, the invention provides a distributed scheduling system architecture, which comprises a micro-service registration center eureka, a scheduling center, an execution node, a scheduling database and a service database, wherein the scheduling center comprises a plurality of scheduling nodes;

the scheduling nodes and the execution nodes are deployed in a distributed manner as micro-services; the role and API address information of each node are registered with the micro-service registration center eureka and uniformly maintained by it;

the scheduling node comprises a remote call controller, a callback controller, a management runtime module and a core scheduler; the core scheduler is built on quartz; the management runtime module implements the various management functions; the core communication between a scheduling node and an execution node comprises remote calls (RMS) and callbacks (Callback): an execution instruction is sent to the execution node through the remote call controller, the job running result returned by the executor of the execution node is received through the callback controller, and a complex job flow sequence can be received from the job chain module of the execution node through the job management component;

the scheduling database is connected with the scheduling center and is used for persistently storing scheduling related data;

the execution node is an execution module embedded in each micro-service and comprises an executor, a job chain and a service bean; the executor executes tasks and returns results to the scheduling center through the callback interface; the job chain combines the execution order and dependencies of tasks to meet complex job scheduling requirements; the service bean is the carrier that embeds the execution node into the micro-service;

the service database is connected with the execution node and is used for persistently storing the server data of the micro-service application.

Further, the management functions of the scheduling node comprise job management, monitoring management, log management, configuration management, trigger management and scheduling logs, exposed through a Restful interface and dynamic web pages.

Further, the data stored in the scheduling database comprises a task sequence, monitoring data, log data and configuration data.

Furthermore, communication between the scheduling node and the execution node performs remote calls and result callbacks through API interfaces over the http protocol.

Furthermore, the scheduling node sends a synchronous or asynchronous execution instruction to the execution node through the remote call controller, the executor supports synchronous and asynchronous modes to execute tasks, and the result is returned to the scheduling center through the callback interface.

The invention also provides a micro-service workflow scheduling method based on the distributed scheduling system architecture, which is characterized by comprising the following steps,

s1, the scheduling node acquires an idle thread from the task scheduling thread pool and accesses the scheduling database through that thread to acquire a task; if a task needs to be executed, go to step S2; otherwise the thread enters a dormant state until it is awakened and this step restarts;

s2, the scheduling node competes to acquire the flow lock; the flow lock is granted to the optimal node in the distributed scheduling center, i.e. the optimal node is elected as the management node; a node that fails to obtain the lock blocks until the flow lock is acquired;

s3, the management node opens a transaction, takes the first task from the task database, judges the instruction type, submits the task to the task queue, remotely calls the execution node, deletes the task from the task database, closes the transaction, and records log information;

s4, the management node releases the flow lock and then releases the thread resource, and the thread returns to the thread pool for the next task to dispatch.

Furthermore, the scheduling node calls the execution node in a non-blocking mode, and the flow lock and the thread can be released without waiting for the result callback of the execution node.

Further, when the instruction acquired by the management node or execution node is "insufficient resources", the blocked thread is suspended, and the suspended resource-starved flow is awakened again for execution later.

Further, when the management node fails or the heartbeat is lost due to network jitter, the following management node fault-tolerant flow is executed:

(1) the dispatching center monitors a management node fault event and triggers a fault tolerance mechanism;

(2) available scheduling nodes compete for the fault-tolerant lock, the scheduling node which obtains the fault-tolerant lock becomes a fault-tolerant management node, and the fault-tolerant management node broadcasts a fault-tolerant alarm notification and records log information;

(3) the fault-tolerant management node queries the task instances whose call source is the failed node, updates the call source of those instances to Null and generates new task instructions;

(4) releasing the fault-tolerant lock and completing fault tolerance.

Further, the method also comprises the following steps after fault tolerance is complete: the scheduling center performs thread scheduling again, and the new management node takes over by monitoring the different states of newly submitted tasks; for running tasks, the task instance state is monitored; whether a successfully submitted task exists in the task queue is judged: if so, its task instance state is monitored; if not, the task instance is resubmitted.

(III) advantageous effects

The invention provides a distributed scheduling system architecture and a micro-service workflow scheduling method. Distributed locks are used to realize ordered scheduling and fault tolerance of background micro-service workflows; with the same thread-pool size, the concurrency and high availability of the scheduling center are markedly improved.

Drawings

FIG. 1 is a diagram of a distributed scheduling system architecture according to the present invention;

FIG. 2 is a flow chart of an embodiment of distributed scheduling of the present invention;

FIG. 3 is a flow chart of distributed scheduling fault tolerance according to the present invention.

Detailed Description

In order to make the objects, contents and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.

The invention aims to provide a distributed solution for micro-service scheduling management: the control functions of a traditional micro-service gateway are stripped out into an independent distributed control center, achieving decentralization and high availability of the control center. In the distributed scheduling process, control logic is separated from service logic, ordered scheduling and fault tolerance of control nodes are guaranteed by a distributed lock design, and highly flexible scheduling of complex workflow tasks is achieved.

The invention adopts a distributed architecture to design the control center for micro-service scheduling and separates control logic from service logic. Test results show that once the service logic is separated out and tasks are executed asynchronously, the time an execution thread is occupied is greatly shortened and the micro-service response speed is greatly improved. Distributed locks are used to realize ordered scheduling and fault tolerance of background micro-service workflows; with the same thread-pool size, the concurrency and high availability of the scheduling center are markedly improved.

The architecture diagram of the distributed scheduling system proposed by the present invention is shown in fig. 1. The system architecture comprises a micro-service registration center eureka, a scheduling center, an execution node, a scheduling database and a service database. The dispatch center includes a plurality of dispatch nodes.

The whole architecture is built on micro-services: the scheduling nodes and the execution nodes are deployed in a distributed manner as micro-services. The role and API address information of each node are registered with the micro-service registration center eureka and uniformly maintained by it. This enables decentralized service scheduling and high availability of the cluster.

The scheduling center is composed of multiple scheduling nodes. A single scheduling node includes a remote call controller, a callback controller, a management runtime and a core scheduler. The core scheduler is built on quartz; since quartz natively supports clustering, complex task triggering and scheduling can be realized.
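As an illustrative sketch (the patent does not give a concrete configuration), a quartz scheduler is typically switched into clustered mode by pointing every scheduling node at a shared JDBC job store; a `quartz.properties` fragment of this kind shows the idea, with all values being examples:

```properties
# Illustrative clustered-quartz configuration (values are examples)
org.quartz.scheduler.instanceName = DistributedScheduler
org.quartz.scheduler.instanceId = AUTO

# JDBC job store shared by all scheduling nodes (the scheduling database)
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = schedulingDB
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 15000

# Task scheduling thread pool
org.quartz.threadPool.threadCount = 10
```

With `isClustered = true` and a shared job store, the quartz instances coordinate through the database, which matches the multi-scheduling-node design described above.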

the management function is realized through a management runtime component which is arranged in a modularized mode, the functions of job management, monitoring management, log management, configuration management, trigger management, log scheduling and the like are supported, and a Restful interface and web page dynamic display are provided.

The core communication between scheduling nodes and execution nodes comprises remote calls (RMS) and callbacks (Callback): synchronous or asynchronous execution instructions are sent to the execution node through the remote call controller, and job running results returned by the executor of the execution node are received through the callback controller. A complex job flow sequence may be received from the job chain module of an execution node by the job management component.
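The remote call / callback exchange can be modelled with a minimal in-process sketch; plain Python function calls stand in for the http API, and names such as `SchedulingNode`, `Executor` and the instruction fields are illustrative, not taken from the patent:

```python
# Minimal sketch of the remote-call / callback exchange between a
# scheduling node and an execution node (in-process stand-in for HTTP).

class Executor:
    """Execution-node side: runs a job and reports back via callback."""
    def __init__(self, callback):
        self.callback = callback          # callback-controller endpoint

    def execute(self, instruction):
        # Run the job (here trivially "done"), then call back with the result.
        result = {"job_id": instruction["job_id"], "status": "SUCCESS"}
        self.callback(result)

class SchedulingNode:
    """Scheduling-node side: remote-call controller + callback controller."""
    def __init__(self):
        self.results = {}

    def remote_call(self, executor, job_id):
        # Remote-call controller: send an execution instruction.
        executor.execute({"job_id": job_id, "type": "EXECUTE"})

    def on_callback(self, result):
        # Callback controller: receive the job running result.
        self.results[result["job_id"]] = result["status"]

node = SchedulingNode()
executor = Executor(callback=node.on_callback)
node.remote_call(executor, job_id="job-1")
print(node.results["job-1"])   # SUCCESS
```

In the real architecture, `remote_call` and `on_callback` would be http endpoints exposed by the remote call controller and the callback controller respectively.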

The scheduling database is connected with the scheduling center and is used for persistently storing task sequences, monitoring data, log data, configuration data and the like related to scheduling.

The execution node is an execution module embedded in each micro-service and is responsible for accepting the scheduling of the scheduling center; it comprises an executor, a job chain and a service bean. The executor supports both synchronous and asynchronous task execution and returns results to the scheduling center through the callback interface; the job chain combines the execution order and dependencies of tasks to meet complex job scheduling requirements; the service bean is the carrier that embeds the execution node into the micro-service.
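The job chain's combination of execution order and dependencies can be sketched as a small topological-order runner; the data structures here are assumptions for illustration, since the patent does not specify them:

```python
from graphlib import TopologicalSorter

# Sketch of a job chain: tasks plus dependency edges, executed in an
# order that respects the dependencies (illustrative data structures).
def run_job_chain(tasks, deps):
    """tasks: {name: callable}; deps: {name: set of prerequisite names}."""
    order = list(TopologicalSorter(deps).static_order())
    results = []
    for name in order:
        results.append(tasks[name]())   # run each task after its prerequisites
    return results

tasks = {
    "extract": lambda: "extract",
    "transform": lambda: "transform",
    "load": lambda: "load",
}
deps = {"transform": {"extract"}, "load": {"transform"}}
print(run_job_chain(tasks, deps))   # ['extract', 'transform', 'load']
```

A real job chain would also carry per-task error handling and callbacks, but the ordering constraint is the essential point.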

The service database is connected with the execution node and is used for persistently storing the server data of the micro-service application.

The scheduling node and the execution node are separated: the scheduling node is responsible only for scheduling and the execution node only for the service. Communication between the nodes is mainly remote calls and result callbacks through API interfaces over the http (hypertext transfer protocol), so the two parts are fully decoupled and the overall extensibility of the system is enhanced.

The distributed scheduling method proposed by the present invention is shown in fig. 2, and includes the following steps:

and S1, the scheduling node acquires an idle thread from the task scheduling thread pool, and accesses the scheduling database through the new thread to acquire the task. If the task needs to be executed, the process goes to step S2; otherwise, entering a dormant state until the step is restarted after awakening;

and S2, the scheduling node acquires the process lock through competition. The process lock is distributed to the optimal node in the distributed dispatching center, namely, the optimal node is elected to be a management node, and the node which does not acquire the lock blocks until the process lock is acquired.

S3, the management node opens a transaction, takes the first task from the task database, judges the instruction type, and submits the task to the task queue to remotely call the execution node. The management node then deletes the task from the task database, closes the transaction and records log information.

S4, the management node releases the flow lock and then releases the thread resource, and the thread returns to the thread pool for the next task to dispatch.
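Steps S1-S4 can be sketched as a hedged in-process model: `threading.Lock` stands in for the distributed flow lock, a deque stands in for the task table in the scheduling database, and `remote_call` stands in for the asynchronous call to an execution node (all names are illustrative):

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor

flow_lock = threading.Lock()                       # stand-in for the distributed flow lock
task_db = deque(["task-1", "task-2", "task-3"])    # stand-in for the scheduling DB task table
dispatched = []
log = []

def remote_call(task):
    # Stand-in for the asynchronous remote call to an execution node.
    dispatched.append(task)

def schedule_once(_=None):
    # S1: a pool thread checks the scheduling database for work.
    if not task_db:
        return False                    # nothing to do: would sleep until awakened
    # S2: compete for the flow lock; the winner acts as management node.
    with flow_lock:
        if not task_db:                 # re-check after winning the lock
            return False
        # S3: take the first task, dispatch it, delete it from the
        #     task table, and record log information.
        task = task_db.popleft()
        remote_call(task)
        log.append(f"dispatched {task}")
    # S4: lock released by the `with` block; the thread returns to the pool.
    return True

with ThreadPoolExecutor(max_workers=4) as pool:    # task scheduling thread pool
    while any(pool.map(schedule_once, range(4))):
        pass

print(dispatched)   # ['task-1', 'task-2', 'task-3']
```

The transaction open/close of S3 is collapsed into the locked section here; in the real system it would wrap the database operations.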

In the above scenario, the management node typically initiates the remote call to the execution node asynchronously, so the flow lock and the thread can be released without waiting for the execution node's result callback. In this mode the scheduling node calls the execution node in a non-blocking manner, the performance impact of the task's service logic is avoided, and system performance improves.

However, in some stateful application scenarios the tasks have a strict execution order and can only be executed synchronously: only after the execution node finishes the task through the synchronous executor and returns the result to the management node can the management node release the flow lock and the thread resource. If sub-flows are nested inside each other, this can lead to circular waiting among threads and to deadlock. To solve this, a resource-insufficiency instruction type is added: when the instruction acquired by the management node or execution node is "insufficient resources", the blocked thread is suspended so that the thread pool gains a free thread, and the suspended resource-starved flow can be awakened and executed again later.
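The resource-insufficiency mechanism can be sketched as parking a flow instead of letting it block a pool thread; this is a simplified illustration with assumed names, not the patent's actual implementation:

```python
from collections import deque

suspended = deque()   # flows parked by an "insufficient resources" instruction
completed = []

def run_flow(flow_id, instruction):
    # If the instruction says resources are insufficient, park the flow
    # instead of blocking a pool thread; otherwise run it to completion.
    if instruction == "INSUFFICIENT_RESOURCES":
        suspended.append(flow_id)       # the thread is freed back to the pool
        return "SUSPENDED"
    completed.append(flow_id)
    return "DONE"

def awaken_suspended():
    # Once resources are available again, re-awaken the parked flows.
    while suspended:
        run_flow(suspended.popleft(), "EXECUTE")

run_flow("flow-A", "EXECUTE")
run_flow("flow-B", "INSUFFICIENT_RESOURCES")   # parked, not holding a thread
awaken_suspended()
print(completed)   # ['flow-A', 'flow-B']
```

The key property is that a suspended flow consumes no pool thread, which is what breaks the circular-wait condition described above.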

In a scheduling system, a management node may fail or lose its heartbeat due to network jitter. The distributed scheduling system of the invention achieves high cluster availability through fault tolerance; the management node fault-tolerance flow is shown in fig. 3.

(1) The dispatching center monitors the fault event of the management node and triggers a fault tolerance mechanism.

(2) Available scheduling nodes compete for the fault-tolerant lock, the scheduling node which obtains the fault-tolerant lock becomes a fault-tolerant management node, and the fault-tolerant management node broadcasts a fault-tolerant alarm notification and records log information.

(3) The fault-tolerant management node queries the task instances whose call source is the failed node, updates the call source of those instances to Null and generates new task instructions.

(4) Releasing the fault-tolerant lock and completing fault tolerance.

(5) After fault tolerance is complete, the scheduling center performs thread scheduling again and the new management node takes over by monitoring the different states of newly submitted tasks: for running tasks, the task instance state is monitored; whether a successfully submitted task exists in the task queue is judged, and if so, its task instance state is monitored, otherwise the task instance is resubmitted.
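Fault-tolerance steps (2)-(4) can be sketched as follows; `threading.Lock` again stands in for the distributed fault-tolerant lock, and the task-instance records and field names are illustrative:

```python
import threading

fault_tolerant_lock = threading.Lock()   # stand-in for the distributed fault-tolerant lock
task_instances = [
    {"id": 1, "call_source": "node-A"},  # node-A is the failed management node
    {"id": 2, "call_source": "node-B"},
]
new_instructions = []
log = []

def handle_node_failure(failed_node):
    # (2) compete for the fault-tolerant lock; the winner becomes the
    #     fault-tolerant management node and records log information.
    with fault_tolerant_lock:
        log.append(f"fault-tolerance takeover for {failed_node}")
        # (3) find task instances whose call source is the failed node,
        #     clear the call source and generate a new task instruction.
        for inst in task_instances:
            if inst["call_source"] == failed_node:
                inst["call_source"] = None
                new_instructions.append({"task_id": inst["id"], "type": "RESUBMIT"})
    # (4) lock released on exiting the `with` block: fault tolerance complete.

handle_node_failure("node-A")
print(new_instructions)   # [{'task_id': 1, 'type': 'RESUBMIT'}]
```

Instances belonging to healthy nodes are untouched; only orphaned work is re-issued, which is then picked up by the normal scheduling loop in step (5).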

The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
