Time sequence control and synchronization method of distributed system

Document No.: 409946 Publication date: 2021-12-17

Abstract: This technology, a timing control and synchronization method for a distributed system (一种分布式系统的时序控制与同步方法), was devised by 侯正平, 薛垒, 申臻, 魏冬冬, and 于清华 on 2021-09-27. The invention discloses a timing control and synchronization method for a distributed system, comprising the following steps. S100: presetting a timing control and synchronization policy, writing a timing control and synchronization program, and packaging it to form a node control program package, so that the node control program package runs as an independent thread on the corresponding node. S200: based on the node control program package encapsulated in each node, and according to different scene classifications, using the node control program package at the points where the corresponding node sends and receives data to apply different timing controls to the local node and adjacent nodes, including no control, suspension, pause, caching, and recovery, so as to achieve global timing control and synchronization of the distributed system network. The invention has a simple synchronization mechanism and low power consumption, uses data flow to guarantee the dependency relationships between nodes, keeps sending and receiving synchronized regardless of differences in node running speed, and does not need an additional control end.

1. A method for timing control and synchronization in a distributed system, the method comprising:

S100: configuring timing control and synchronization policy attributes at each node in a distributed system network to form a timing description file that serves as the input of a node control program; writing a timing control and synchronization program according to the timing control and synchronization policy, and packaging it to form a node control program package that can be called by a local node or an adjacent node in the distributed system network, so that the node control program package runs as an independent thread on the corresponding node while the distributed system network is running; when the node control program package runs, the specific implementation of the timing control and synchronization policy is determined according to the timing description file and the actual communication model;

S200: based on the node control program package encapsulated in each node, using the node control program package at the sending interface and the receiving interface of the corresponding node to apply, according to different scene classifications, different timing controls to the local node and adjacent nodes, including no control, suspension, pause, caching, and recovery, so as to achieve global timing control and synchronization of the distributed system network.

2. The timing control and synchronization method of the distributed system according to claim 1, wherein the timing control and synchronization policy in step S100 includes the following control rules:

the local node sends a message to the subsequent node: when there is no message to send, the local node is judged not to have reached the preset time; when the message reaches the subsequent node normally and the timing is judged to be within tolerance, neither the local node nor the adjacent nodes are controlled; when the message has still not been sent after the preset time, the local node is judged to be held up by the preceding node, the subsequent node is notified to wait, and the scene of sending a message to the subsequent node is converted into the scene of receiving a message from the preceding node;

the local node receives a message sent by the preceding node: when the message arrives early, the local node judges that the message is early because the local node is slow, and executes suspension control on the local node and pause control on the preceding node; when the message is received normally and the timing is judged to be within tolerance, neither the local node nor the adjacent nodes are controlled; when the message has not arrived after the preset time and the local node is judged to be fast, pause control is executed on the local node; when the message arrives late and the local node is judged to be fast, pause control is executed on the local node; when the local node is in the paused state, neither the local node nor the adjacent nodes are controlled, and the message is only stored in the network buffer; when the local node is in the suspended state, neither the local node nor the adjacent nodes are controlled.

3. The timing control and synchronization method of the distributed system according to claim 1, wherein the step S200 is preceded by writing a timing description file for each node according to the timing control and synchronization policy, and presetting, by the timing description file, relevant attributes of each node including information transceiving time and tolerance to time offset.

4. The timing control and synchronization method for distributed systems according to claim 3, wherein each of the node control program packages in the step S100 includes four control interfaces, respectively a clock count interface for acquiring a current clock count of the local node, a node suspend interface for suspending a node, a node pause interface for pausing a node, and a node resume interface for resuming a node.

5. The timing control and synchronization method of the distributed system as claimed in claim 4, wherein the timing description file describes the transceiving requirements of each node in a formatted manner and serves as a judgment input for whether the timing is correct, while the clock count provides the local time as a further judgment input.

6. The method as claimed in claim 5, wherein the judgment result of the node control program package in each node is generated locally at the corresponding node, and the node control program package acts only on the local node and the adjacent nodes, so that the autonomous control of each node over itself and its adjacent nodes forms global timing control and synchronization.

7. The timing control and synchronization method of the distributed system according to claim 1, wherein in the step S200, a sending function of the node control program package is used at the sending interface of a node, and when the node actively calls the sending function, a judgment is performed first and the message is either sent directly or control is executed according to the time calculation result; and at the receiving interface, when the node detects that data has arrived or the node's received-data processing function is called, control is executed or the data is received directly according to the time calculation result, or an independent daemon thread is used to handle the case where a message does not arrive by the preset time.

Technical Field

The invention relates to the technical field of network control, in particular to a time sequence control and synchronization method of a distributed system.

Background

A distributed system is a networked computing system formed by interconnecting multiple independently running nodes in some form, so that the nodes keep a degree of independence while gaining the ability to exchange information. A distributed system typically has an explicit computing task and splits it into many subtasks executed by different nodes, and there are also remote calls and dependencies between nodes. Key concerns in existing distributed systems include how to ensure that each node completes its interactions at the correct time, and how to correct promptly when information transmission between nodes fails, so that the timing sequence remains correct.

As the technology has developed, the synchronization modes used in distributed systems have diversified. Current distributed systems may use the following control strategies. First, correct timing is ensured by access control on resources, for example with resource locks. The main problem such a system solves is contention for network resources; at the level of the whole system there is no fixed flow with clear ordering requirements, and the processes running on each node need no strict global scheduling. Second, the system imposes only time requirements, not timing control requirements, on the jobs submitted by each part. The problem such a system solves is ensuring the consistency and integrity of the data obtained by each node, as in a database system. Third, a central node or global clock provides a global time service, and the whole system is controlled according to globally uniform time. However, a global time service forces fast devices to suspend and wait frequently, and it introduces nodes that do not exist in the original system design, which complicates the structure.

For a distributed system with multiple devices operating independently but with dependencies among the devices in a definite data processing order, the nodes each have a local clock and a running period. There is a need for a more flexible control mechanism that enables devices to form a global control sequence based on autonomous decisions.

Disclosure of Invention

The embodiment of the application provides a timing control and synchronization method for a distributed system. For an existing distributed network with specific timing requirements among nodes, the method uses the state of data-stream communication at specified times to guarantee the dependency relationships between nodes, so that the synchronization between nodes is not disrupted by differences in node running speed, and it uses clock preemption so that the local clocks globally form a variable-scale clock, thereby providing the synchronization capability.

The embodiment of the application provides a time sequence control and synchronization method of a distributed system, which comprises the following steps:

S100: configuring timing control and synchronization policy attributes at each node in a distributed system network to form a timing description file that serves as the input of a node control program; writing a timing control and synchronization program according to the timing control and synchronization policy, and packaging it to form a node control program package that can be called by a local node or an adjacent node in the distributed system network, so that the node control program package runs as an independent thread on the corresponding node while the distributed system network is running; and when the node control program package runs, the specific implementation of the timing control and synchronization policy is determined according to the timing description file and the actual communication model.

S200: based on the node control program packages encapsulated in each node, different time sequence control including no control, suspension, pause, cache and recovery is carried out on a local node and an adjacent node by using the node control program packages according to different scene classifications at a sending interface and a receiving interface of a corresponding node, so that the global time sequence control and synchronization of the distributed system network are facilitated.

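Further, the timing control and synchronization policy in step S100 includes control rules for two scenes: the local node sending a message to the subsequent node, and the local node receiving a message sent by the preceding node.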

Further, before the step S200, a time sequence description file is compiled for each node according to the time sequence control and synchronization policy, and relevant attributes including information transceiving time and tolerance to time deviation of each node are preset by the time sequence description file.

Further, each node control program package in step S100 includes four control interfaces, which are a clock count interface for acquiring a current clock count of the local node, a node suspend interface for suspending the node, a node pause interface for pausing the node, and a node resume interface for resuming the node.

Furthermore, the timing description file describes the transceiving requirements of each node in a formatted manner and serves as a judgment input for whether the timing is correct, while the clock count provides the local time as a further judgment input.

Furthermore, the judgment result of the node control program packet in each node is locally generated at the corresponding node, and the node control program packet only acts on the local node and the adjacent node, so that the autonomous control of each node and the adjacent node forms global timing control and synchronization.

Further, in the step S200, a sending function of the node control program package is used at the sending interface of the node, and when the node actively calls the sending function, a judgment is performed first and the message is either sent directly or control is executed according to the time calculation result; and at the receiving interface, when the node detects that data has arrived or the node's received-data processing function is called, control is executed or the data is received directly according to the time calculation result, or an independent daemon thread is used to handle the case where a message does not arrive by the preset time.

The technical scheme provided in the embodiment of the application has at least the following technical effects:

1. For an existing distributed network with specific timing requirements among nodes, the state of data-stream communication at specified times is used to guarantee the dependency relationships between nodes, so that sending and receiving stay synchronized and are not disrupted by differences in node running speed.

2. By means of clock preemption, the local clocks globally form a variable-scale clock, which provides the synchronization capability.

3. The method has low time loss and introduces no additional control node into the distributed system network. In particular, in a distributed system that supports hardware simulation, where timing is guaranteed by hardware on the basis of the real system, the technique of the invention can resolve errors caused by differences in the characteristics and performance of the nodes in the simulation system.

4. The method needs no complex global synchronization mechanism, no global clock, and no additional control node. It takes the local clock of each node as the time basis of communication behavior, and it reduces time loss by using a multi-clock, multi-scale control method.

Drawings

FIG. 1 is a flowchart illustrating a timing control and synchronization method for a distributed system according to an embodiment of the present disclosure;

FIG. 2 is a diagram of a distributed system with explicit dependencies, in an embodiment of the present application;

FIG. 3 is a schematic diagram illustrating an implementation process of a timing control and synchronization method according to an embodiment of the present application;

FIG. 4 is a diagram illustrating a transformation of an inter-node clock according to an embodiment of the present application.

Detailed Description

In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.

The embodiment of the application provides a timing control and synchronization method for a distributed system. For a distributed system network with specific timing requirements among nodes, the method guarantees the dependency relationships among nodes by using the state of data-stream communication at specified times, keeps sending and receiving synchronized regardless of differences in node running speed, and uses clock preemption so that the local clocks globally form a variable-scale clock, thereby providing the capability to synchronize the nodes of the distributed system network.

Referring to fig. 1, the timing control and synchronization method of the distributed system in the present embodiment includes the following steps.

Step S100: configuring timing control and synchronization policy attributes at each node in a distributed system network to form a timing description file that serves as the input of a node control program; writing a timing control and synchronization program according to the timing control and synchronization policy, and packaging it to form a node control program package that can be called by a local node or an adjacent node in the distributed system network, so that the node control program package runs as an independent thread on the corresponding node while the distributed system network is running; and when the node control program package runs, the specific implementation of the timing control and synchronization policy is determined according to the timing description file and the actual communication model.

Step S200: based on the node control program packages encapsulated in each node, different time sequence control including no control, suspension, pause, cache and recovery is carried out on the local node and the adjacent nodes by utilizing the node control program packages according to different scene classifications at the sending interface and the receiving interface of the corresponding node, so that the global time sequence control and the synchronization of the distributed system network are facilitated.

In step S100, the distributed system network is represented as a network system in which a plurality of devices independently operate and the devices have information interaction capability after being interconnected. Each device in the network system acts as a network node.

In step S100, the timing control and synchronization policy is all control behavior rules of the node control package for the node, and the timing control and synchronization policy may include the following control rules:

The inter-node scene is obtained first: when a node sends or receives a message, the node control program package analyzes and judges the node scene, and the node is controlled according to the scene that applies.

The local node sends a message to the subsequent node: when there is no message to send, the local node is judged not to have reached the preset time; when the message reaches the subsequent node normally and the timing is judged to be within tolerance, neither the local node nor the adjacent nodes are controlled; when the message has still not been sent after the preset time, the local node is judged to be held up by the preceding node, and after the subsequent node is notified to wait, the scene of sending a message to the subsequent node is converted into the scene of receiving a message from the preceding node.

The local node receives a message sent by the preceding node: when the message arrives early, the local node judges that the message is early because the local node is slow, and executes suspension control on the local node and pause control on the preceding node; when the message is received normally and the timing is judged to be within tolerance, neither the local node nor the adjacent nodes are controlled; when the message has not arrived after the preset time and the local node is judged to be fast, pause control is executed on the local node; when the message arrives late and the local node is judged to be fast, pause control is executed on the local node; when the local node is in the paused state, neither the local node nor the adjacent nodes are controlled, and the message is only stored in the network buffer; when the local node is in the suspended state, neither the local node nor the adjacent nodes are controlled.

See table 1 below for details.

TABLE 1

The timing control and synchronization policy of step S100 generally covers several scene classifications. When the node control program package performs scene analysis, it determines the scene classification that applies and executes the corresponding node control operation (see table 1). As table 1 shows, the node control program package acts when the sending interface sends a message or the receiving interface receives one, based on the sending and receiving state of the message.
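The receive-side scene classification described above can be sketched in Python as follows. This is only an illustrative reading of the rules in the text, not the patented implementation: the names `Action` and `classify_receive`, the use of integer clock counts, and the symmetric tolerance window are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    NO_CONTROL = "no control"
    SUSPEND_LOCAL_PAUSE_PRECEDING = "suspend local node, pause preceding node"
    PAUSE_LOCAL = "pause local node"
    BUFFER_ONLY = "store the message in the network buffer"


def classify_receive(now, scheduled, tolerance, arrived, local_state="running"):
    """Map a receive-side scene to a control action (hypothetical helper).

    now, scheduled and tolerance are local clock counts; arrived says
    whether the expected message has been received yet.
    """
    # A paused node only buffers; a suspended node is left alone.
    if local_state == "paused":
        return Action.BUFFER_ONLY
    if local_state == "suspended":
        return Action.NO_CONTROL
    # Early arrival: the local node is slow relative to its predecessor.
    if arrived and now < scheduled - tolerance:
        return Action.SUSPEND_LOCAL_PAUSE_PRECEDING
    # Late arrival, or still missing after the deadline: the local node is fast.
    if now > scheduled + tolerance:
        return Action.PAUSE_LOCAL
    # On time, or simply not due yet.
    return Action.NO_CONTROL
```

For example, `classify_receive(90, 100, 5, True)` selects suspension of the local node together with a pause of the preceding node, while a message still missing at count 110 selects a pause of the local node.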

In this embodiment, the judgment result of the node control program package of each node is generated locally at the corresponding node, and the node control program package acts only on the local node and the adjacent nodes. In other words, every decision is made at the local node and affects only that node and its neighbors, so that the autonomous control of each node over itself and its adjacent nodes forms a global timing control and synchronization capability.

Further, in step S100, a timing control and synchronization program is written according to the timing control and synchronization policy and is encapsulated to form a node control program package. The purpose of the encapsulation is to provide the node with an easy-to-use control interface; static libraries and dynamic libraries are the usual forms. At the sending interface of the node, the sending function of the node control program package is used: when the node actively calls the sending function, a judgment is performed first, and the message is either sent directly or control is executed according to the time calculation result. At the receiving interface there are two control mechanisms. The first executes control or receives the data directly, according to the time calculation result, when the node detects that data has arrived or the node's received-data processing function is called. The second uses an independent daemon thread to handle the case where a message does not arrive by the preset time.
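A send-side wrapper of the kind described here might look like the following minimal sketch. The function name `timed_send`, its parameters, and the returned labels are hypothetical. The text only states the behavior for a message that is on time or for a message that is still missing after the preset time, so sending any ready message directly is an assumption of this example.

```python
def timed_send(now, scheduled, tolerance, message, send, notify_wait):
    """Wrap a raw send call with the send-side timing judgment.

    message is None when the node has nothing ready to send yet.
    send and notify_wait are callables supplied by the node (assumed names).
    Returns a label for the branch taken.
    """
    overdue = now > scheduled + tolerance
    if message is None:
        if not overdue:
            return "not_due"        # preset time not reached yet
        notify_wait()               # tell the subsequent node to wait
        return "switch_to_receive"  # local node is held up by its predecessor
    send(message)                   # assumption: a ready message goes out directly
    return "sent"
```

In this sketch the node never calls the raw `send` itself; it always goes through the wrapper, which is what lets the control package observe every transmission.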

Each node control program package in step S100 includes four control interfaces: a clock count interface for acquiring the current clock count of the local node, a node suspend interface for suspending the node, a node pause interface for pausing the node, and a node resume interface for resuming the node. In other words, each node in the distributed system network is provided with at least four control interfaces for use by the node control program package: the first obtains the current clock count of the local node, the second suspends the node, the third pauses the node, and the fourth resumes the node. The suspend, pause, and resume interfaces are the basis on which the node control program package implements its control behavior. Suspension differs from pause as follows: when a node is suspended, its communication behavior is blocked but its local clock keeps running; when a node is paused, its clock and its program stop at the same time. The control interfaces are provided by the node control program running on each node. The clock count is the local clock of each node, and the counts of different local clocks are not meaningfully comparable with one another.
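The difference between suspension and pause can be illustrated with a small simulated node. The class below is a sketch under stated assumptions: `NodeControl`, `tick`, and `can_communicate` are names invented for the example, and the clock is a plain integer counter advanced by hand rather than a hardware clock.

```python
class NodeControl:
    """Simulated node exposing the four control interfaces (hypothetical)."""

    def __init__(self):
        self.clock = 0
        self.state = "running"

    def clock_count(self):
        """Clock count interface: current local clock count."""
        return self.clock

    def suspend(self):
        """Suspend interface: block communication, keep the clock running."""
        self.state = "suspended"

    def pause(self):
        """Pause interface: stop both the clock and the program."""
        self.state = "paused"

    def resume(self):
        """Resume interface: return to normal operation."""
        self.state = "running"

    def tick(self):
        """Advance the local clock by one count unless the node is paused."""
        if self.state != "paused":
            self.clock += 1

    def can_communicate(self):
        """Only a running node may send or receive."""
        return self.state == "running"
```

Calling `tick()` on a suspended node still advances `clock_count()`, whereas on a paused node the count stays fixed until `resume()` is called.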

The node control program package in this embodiment exerts control at two positions in the node, namely the sending interface and the receiving interface through which the node performs network communication. The node control program drives the corresponding control interface according to the running state of the sending or receiving interface. Specifically, the clock count interface provides the local node time as a judgment input of the node control program package, and the suspend, pause, and resume interfaces provide its basic control operations. While the distributed system network of this embodiment is running, a node may be suspended, paused, or resumed at any time, so that local control at each node adds up to global timing control and synchronization of the distributed system network. A command to suspend, pause, or resume the local node can come only from the local node itself or from an adjacent node.

Before step S200, a timing description file is written for each node according to the timing control and synchronization policy, and the relevant attributes of each node, including message transceiving times and tolerance to time deviation, are preset through the timing description file. That is, each node in the distributed system network of this embodiment has its own independent timing description file, which sets the times at which the node sends and receives messages, its tolerance to time deviation, and other relevant attributes, and which specifies the dependencies, clock, and period according to which the node should send and receive data.

Further, the timing description file in this embodiment describes the transceiving requirements of each node in a formatted manner and serves as a judgment input for whether the timing is correct, while the clock count provides the local time as a further judgment input. The timing description file and the clock count interface are therefore the basis on which the node control program package performs the corresponding control. The use of the timing description file is not limited to the judgment method above. The timing description file in this embodiment is a file with a specific format that can be understood by a computer program, and each node has its own separate file. The timing description file of each node must contain the following information: the node name, the message types supported by the node, the running period of the local node, and the specified sequence of communication behaviors in each period. The information recorded for each communication behavior includes, but is not limited to: message type, sender ID, receiver ID, scheduled arrival time, tolerance (measured in time) to early or late arrival of the message, and whether the message is sent periodically.
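As an illustration of what such a file could contain, the sketch below parses a timing description written as JSON. The source does not specify a concrete file format, so JSON, the field names, and the helper `load_timing_description` are all assumptions; only the kinds of information recorded follow the list above.

```python
import json
from dataclasses import dataclass


@dataclass
class CommBehavior:
    message_type: str
    sender_id: str
    receiver_id: str
    scheduled_time: int   # scheduled arrival, in local clock counts
    early_tolerance: int  # how early the message may arrive
    late_tolerance: int   # how late the message may arrive
    periodic: bool


@dataclass
class TimingDescription:
    node_name: str
    message_types: list
    period: int
    behaviors: list


def load_timing_description(text):
    """Parse a JSON timing description (format assumed for this example)."""
    raw = json.loads(text)
    behaviors = [CommBehavior(**b) for b in raw["behaviors"]]
    return TimingDescription(
        raw["node_name"], raw["message_types"], raw["period"], behaviors
    )


EXAMPLE = """
{
  "node_name": "B",
  "message_types": ["sensor_frame"],
  "period": 1000,
  "behaviors": [
    {"message_type": "sensor_frame", "sender_id": "A", "receiver_id": "B",
     "scheduled_time": 100, "early_tolerance": 5, "late_tolerance": 10,
     "periodic": true}
  ]
}
"""

desc = load_timing_description(EXAMPLE)
```

Here `desc.behaviors[0]` describes a periodic message from a hypothetical node A to node B that is expected at local count 100, with separate tolerances for early and late arrival.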

In step S200, the node control program package applies different timing controls to the local node and the adjacent nodes, so that timing control and synchronization are achieved globally across the distributed system network. When messages are transmitted between nodes, each node calls its own node control program package, which runs alongside the node as an independent thread and does not affect message transmission itself. Referring to fig. 2, in a distributed system network with explicit dependency relationships between nodes, each node that has a preceding node sends messages to its subsequent node on the basis of the messages received from the preceding node. The dependencies may or may not form a ring; see the dashed arrows in fig. 2.

In this embodiment, fig. 3 is a schematic diagram of node control as executed by the node control program package, which includes the following steps: S211, the main process of each node in the distributed system network is started; S212, each node control program package is initialized and reads the timing description file of its node; S213, the node control program package loads the synchronization policy from the timing description file it has read and performs timing control and synchronization operations on the node according to that policy, including active control of the preceding node, callback-triggered control of the subsequent node, and starting a daemon thread that checks whether messages arrive by the preset time and executes control according to local time.
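The daemon-thread part of step S213 can be sketched with Python's standard `threading` module. This is a simplified illustration, not the described implementation: the class name `NodeControlPackage`, its methods, and the use of a wall-clock timeout in seconds instead of local clock counts are assumptions, and only a single expected message is watched.

```python
import threading


class NodeControlPackage:
    """Watches for one expected message on a daemon thread (hypothetical)."""

    def __init__(self, timeout_s, on_missed):
        self.timeout_s = timeout_s    # assumed: deadline as wall-clock seconds
        self.on_missed = on_missed    # control callback, e.g. pause the node
        self._arrived = threading.Event()
        self._thread = None

    def start(self):
        """Start the watchdog as a daemon thread next to the node's main process."""
        self._thread = threading.Thread(target=self._watch, daemon=True)
        self._thread.start()

    def on_message(self):
        """Receive callback: record that the expected message has arrived."""
        self._arrived.set()

    def _watch(self):
        # Event.wait returns False if the timeout expires before set() is called.
        if not self._arrived.wait(self.timeout_s):
            self.on_missed()

    def join(self):
        """Wait for the watchdog to finish (used here only for demonstration)."""
        self._thread.join()
```

A node would create the package after reading its timing description file, call `start()`, and invoke `on_message()` from its receive path; if the message does not arrive within the timeout, `on_missed` runs on the daemon thread.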

Through each node's control of itself and its adjacent nodes, global timing control and synchronization of the distributed system network are formed. Fig. 4 shows three nodes with dependency relationships forming a data flow within a single period. Each arrow in the figure represents one communication behavior, and the three vertical lines represent the respective time axes of the three devices; the clocks of different nodes are not comparable with one another, and the bold segments mark the clock that is actually in effect. It can be seen that the method of the invention forms a conceptual global clock by jumping between the local clocks of the nodes.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
