Data processing method, data processor, target source component and system

Document No.: 85752 | Published: 2021-10-08

Abstract: This technology, "Data processing method, data processor, target source component and system", was designed by 邹年当 and 朱火庚 on 2021-07-08. The application discloses a data processing method, a data processor, a target source component and a system. The method comprises the following steps: after completing initialization, the data processor receives source data that a data source component has acquired from a data source and sent to it; the data processor processes the source data and sends the processed source data to the target source component, so that the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization, thereby improving data processing efficiency.

1. A data processing method, comprising:

after the initialization is completed, the data processor receives source data which is sent by a data source component and acquired from the data source;

the data processor processes the source data and sends the processed source data to a target source component, so that the target source component writes the processed source data into a corresponding target source based on a target source data exchange protocol generated after initialization.

2. The data processing method of claim 1, wherein the initialization process of the data processor comprises:

the data processor loads target parameters and predefined data processing rules from a configuration file, wherein the target parameters comprise the number of threads or network parameters;

the data processor initializes an algorithm chain according to the predefined data processing rule to obtain an instantiated algorithm chain, and the instantiated algorithm chain is composed of a plurality of instantiated execution units;

correspondingly, the data processor processes the source data, including:

and the data processor sequentially processes the source data through a plurality of execution units in the instantiated algorithm chain based on the target parameters.

3. The data processing method of claim 1, further comprising:

and when data accumulation occurs, the data processor sends a notification message to the data source component, so that the data source component adjusts the sending rate of the source data.

4. The data processing method of claim 1, further comprising:

and the data processor resends the processed source data to the target source component when the data processor does not receive the confirmation message fed back by the target source component within a preset time period after sending the processed source data.

5. A data processing method, comprising:

initializing a target source component to generate a target source data exchange protocol;

after receiving the processed source data sent by the data processor, the target source component writes the processed source data into a corresponding target source based on the target source data exchange protocol;

the processed source data is obtained by the data processor processing source data that a data source component acquired from a data source and sent to the data processor.

6. The data processing method of claim 5, wherein the target source data exchange protocol comprises: write logic, a target source exchange protocol, and a target address;

after receiving the processed source data sent by the data processor, the target source component writes the processed source data into a corresponding target source based on the target source data exchange protocol, including:

after receiving the processed source data sent by the data processor, the target source component puts the processed source data into a buffer area, so that each target source instance in the target source component acquires the processed source data from the buffer area, parses the processed source data into corresponding target source executable logic based on the target source exchange protocol, and writes the processed source data into a corresponding target source according to the write logic and the target address.

7. The data processing method of claim 5, further comprising:

and after the target source component successfully writes the data, sending a confirmation message to the data processor.

8. A data processor, comprising:

the receiving unit is used for receiving source data which is sent by a data source component and acquired from the data source after the initialization is finished;

and the processing unit is used for processing the source data and sending the processed source data to a target source component, so that the target source component writes the processed source data into a corresponding target source based on a target source data exchange protocol generated after initialization.

9. A target source component, comprising:

the initialization unit is used for initializing and generating a target source data exchange protocol;

the writing unit is used for writing the processed source data into a corresponding target source based on the target source data exchange protocol after receiving the processed source data sent by the data processor;

the processed source data is obtained by the data processor processing source data that a data source component acquired from a data source and sent to the data processor.

10. A data processing system, comprising: a data source component, the data processor of claim 8, and the target source component of claim 9;

the data source component and the target source component are respectively in communication connection with the data processor.

Technical Field

The present application relates to the field of computer technologies, and in particular, to a data processing method, a data processor, a target source component, and a system.

Background

In the computer field, developers often build different open source systems for different business data. The programming models of these systems differ greatly, so business logic cannot be migrated seamlessly between them, which results in low data processing efficiency.

Disclosure of Invention

The application provides a data processing method, a data processor, a target source component and a system, which are used for solving the technical problem of low data processing efficiency of the existing open source system.

In view of this, a first aspect of the present application provides a data processing method, including:

after the initialization is completed, the data processor receives source data which is sent by a data source component and acquired from the data source;

the data processor processes the source data and sends the processed source data to a target source component, so that the target source component writes the processed source data into a corresponding target source based on a target source data exchange protocol generated after initialization.

Optionally, the initialization process of the data processor includes:

the data processor loads target parameters and predefined data processing rules from a configuration file, wherein the target parameters comprise the number of threads or network parameters;

the data processor initializes an algorithm chain according to the predefined data processing rule to obtain an instantiated algorithm chain, and the instantiated algorithm chain is composed of a plurality of instantiated execution units;

correspondingly, the data processor processes the source data, including:

and the data processor sequentially processes the source data through a plurality of execution units in the instantiated algorithm chain based on the target parameters.

Optionally, the method further includes:

and when data accumulation occurs, the data processor sends a notification message to the data source component, so that the data source component adjusts the sending rate of the source data.

Optionally, the method further includes:

and the data processor resends the processed source data to the target source component when the data processor does not receive the confirmation message fed back by the target source component within a preset time period after sending the processed source data.

A second aspect of the present application provides a data processing method, including:

initializing a target source component to generate a target source data exchange protocol;

after receiving the processed source data sent by the data processor, the target source component writes the processed source data into a corresponding target source based on the target source data exchange protocol;

the processed source data is obtained by the data processor processing source data that a data source component acquired from a data source and sent to the data processor.

Optionally, the target source data exchange protocol includes: write logic, a target source exchange protocol, and a target address;

after receiving the processed source data sent by the data processor, the target source component writes the processed source data into a corresponding target source based on the target source data exchange protocol, including:

after receiving the processed source data sent by the data processor, the target source component puts the processed source data into a buffer area, so that each target source instance in the target source component acquires the processed source data from the buffer area, parses the processed source data into corresponding target source executable logic based on the target source exchange protocol, and writes the processed source data into a corresponding target source according to the write logic and the target address.

Optionally, the method further includes:

and after the target source component successfully writes the data, sending a confirmation message to the data processor.

A third aspect of the present application provides a data processor comprising:

the receiving unit is used for receiving source data which is sent by a data source component and acquired from the data source after the initialization is finished;

and the processing unit is used for processing the source data and sending the processed source data to a target source component, so that the target source component writes the processed source data into a corresponding target source based on a target source data exchange protocol generated after initialization.

A fourth aspect of the present application provides a target source component comprising:

the initialization unit is used for initializing and generating a target source data exchange protocol;

the writing unit is used for writing the processed source data into a corresponding target source based on the target source data exchange protocol after receiving the processed source data sent by the data processor;

the processed source data is obtained by the data processor processing source data that a data source component acquired from a data source and sent to the data processor.

A fifth aspect of the present application provides a data processing system comprising: a data source component, the data processor of the third aspect and the target source component of the fourth aspect;

the data source component and the target source component are respectively in communication connection with the data processor.

As can be seen from the above technical solutions, the present application has the following advantages:

the application provides a data processing method, which comprises the following steps: after the initialization is completed, the data processor receives source data which is sent by a data source component and acquired from the data source; and the data processor processes the source data and sends the processed source data to the target source component, so that the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization.

In the application, after the data processor is initialized, the data processor receives source data acquired by a data source component from a data source, can uniformly process the source data and then transmit the source data to a target source component, and the target source component can transmit the processed source data to different target sources according to a target source data exchange protocol generated after initialization.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic flowchart of a data processing method applied to a data processor according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a data processing method applied to a target source component according to an embodiment of the present application;

FIG. 3 is a block diagram of a data processing system according to an embodiment of the present application;

FIG. 4 is a schematic diagram illustrating the workflow of various components of the data processing system in accordance with an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

For ease of understanding, referring to FIG. 1, an embodiment of a data processing method provided in the present application includes:

Step 101, after completing initialization, the data processor receives source data that a data source component has acquired from a data source and sent to it.

A data processor (Processor) is a unified abstraction of data processing logic. It provides a friendly programming interface and a configurable way to implement the data processing logic, and implements some advanced features. The data processor encapsulates business processing logic, receives data from upstream (the data source component) and outputs data to downstream (the target source component).

When the data processor is started, initialization is required to be performed, and the initialization process of the data processor comprises the following steps:

the data processor loads target parameters and predefined data processing rules from the configuration file, wherein the target parameters comprise the number of threads or network parameters;

the data processor initializes the algorithm chain according to the predefined data processing rule to obtain an instantiated algorithm chain, and the instantiated algorithm chain is composed of a plurality of instantiated execution units.

Specifically, the data processor can be divided into 3 stages during initialization:

First, the system-level configuration is initialized: the number of threads, the network parameters and the like are loaded from a configuration file that the user has configured in advance.

Second, the logical abstraction of the data transformation is initialized: the predefined data processing rules are loaded from the configuration file. The rules include conversion, replacement, mapping and the like, and correspond to logic units that the system can execute. Each logic unit has its own independent configuration for instantiation, e.g., the configuration of a database or cache it depends on.

Finally, the algorithm chain is initialized. The algorithm chain is formed by connecting a series of logic units in series, so the instantiated algorithm chain is composed of a series of instantiated execution units.
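The three initialization stages can be sketched as follows. This is a minimal illustration only, not the implementation of the present application: the JSON configuration layout, the rule types "replace" and "mapping", and the names init_processor, UNIT_FACTORIES and ProcessorConfig are all assumptions introduced for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# An execution unit takes one record and returns the transformed record.
Unit = Callable[[Dict[str, Any]], Dict[str, Any]]


def make_replace(cfg: Dict[str, Any]) -> Unit:
    """Build a unit that replaces a substring in one field (hypothetical rule type)."""
    field, old, new = cfg["field"], cfg["old"], cfg["new"]

    def unit(record: Dict[str, Any]) -> Dict[str, Any]:
        out = dict(record)
        out[field] = str(out[field]).replace(old, new)
        return out

    return unit


def make_mapping(cfg: Dict[str, Any]) -> Unit:
    """Build a unit that maps field values through a lookup table (hypothetical rule type)."""
    field, table = cfg["field"], cfg["table"]

    def unit(record: Dict[str, Any]) -> Dict[str, Any]:
        out = dict(record)
        out[field] = table.get(out[field], out[field])
        return out

    return unit


# Hypothetical registry: rule type -> factory that instantiates a logic unit.
UNIT_FACTORIES = {"replace": make_replace, "mapping": make_mapping}


@dataclass
class ProcessorConfig:
    threads: int
    network: Dict[str, Any]
    chain: List[Unit]


def init_processor(config_text: str) -> ProcessorConfig:
    raw = json.loads(config_text)
    # Stage 1: system-level configuration (thread count, network parameters).
    threads = int(raw["threads"])
    network = dict(raw.get("network", {}))
    # Stage 2: each rule carries its own configuration and is instantiated from it.
    # Stage 3: the instantiated units, kept in configuration order, form the chain.
    chain = [UNIT_FACTORIES[rule["type"]](rule) for rule in raw["rules"]]
    return ProcessorConfig(threads, network, chain)


EXAMPLE_CONFIG = """
{
  "threads": 4,
  "network": {"port": 9000},
  "rules": [
    {"type": "replace", "field": "name", "old": "_", "new": " "},
    {"type": "mapping", "field": "status", "table": {"0": "inactive", "1": "active"}}
  ]
}
"""
```

Keeping the rule-to-factory lookup in a registry means a new rule type can be added without changing init_processor, which is one plausible way to obtain the configurability described above.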

After the initialization is completed, the data processor receives source data sent by the data source component (Source), wherein the data source component pulls the source data from the data source after its own initialization.

Step 102, the data processor processes the source data and sends the processed source data to the target source component, so that the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization.

Based on the target parameters, the data processor processes the source data sequentially through the plurality of execution units in the instantiated algorithm chain. The source data enters at the first execution unit of the instantiated algorithm chain, is output from the last execution unit, and is then written to each Sink in the target source component. A Sink is a logical abstraction of data landing (persisting data to its destination) and interacts with external systems such as databases, file systems and networks. The target source component can land data in multiple destinations simultaneously.
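As an illustration of chain execution under a thread-count parameter, the sketch below passes each record through every execution unit in order while a thread pool handles several records at once. The helper names run_chain and process_batch and the two sample units are assumptions for the example; the application does not prescribe how threads are used.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def run_chain(chain, record):
    """Pass one record through every execution unit, from the first to the last."""
    return reduce(lambda rec, unit: unit(rec), chain, record)


def process_batch(chain, records, threads):
    """Process several records concurrently; each record still traverses the chain in order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as the input records.
        return list(pool.map(lambda r: run_chain(chain, r), records))


# Hypothetical execution units: trim whitespace, then upper-case the "name" field.
def strip_unit(record):
    return {**record, "name": record["name"].strip()}


def upper_unit(record):
    return {**record, "name": record["name"].upper()}


results = process_batch(
    [strip_unit, upper_unit],
    [{"name": " ann "}, {"name": "bob"}],
    threads=2,
)
```

Here results holds the processed records in input order, ready to be handed to the Sinks.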

When the target source component starts, it initializes a buffer component (MQ) and generates the corresponding target source data exchange protocol. After initialization is completed, it receives the processed source data sent by the data processor and writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization. The target source component contains a plurality of Sinks and places the received processed source data into a buffer, so that each Sink acquires data from the buffer and writes the processed source data into the corresponding target source according to the write logic, the target address and the target source exchange protocol in the target source data exchange protocol. A Sink generally writes the data to a remote target source in client mode. After the write succeeds, the target source component sends an acknowledgement message to the data processor.

Further, the data processor resends the processed source data to the target source component when the data processor does not receive the confirmation message fed back by the target source component within a preset time period after sending the processed source data.
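The resend-on-timeout behaviour can be sketched as below. This is a hedged illustration: the class RetryingSender, its method names and the message-id scheme are assumptions, and the clock is injectable only so that the timeout can be exercised without real waiting.

```python
import time


class RetryingSender:
    """Tracks unacknowledged messages and resends those whose deadline has passed."""

    def __init__(self, send, ack_timeout=1.0, clock=time.monotonic):
        self._send = send            # callable(msg_id, payload) that transmits downstream
        self._timeout = ack_timeout  # the "preset time period", in seconds
        self._clock = clock
        self._pending = {}           # msg_id -> (payload, deadline)

    def send(self, msg_id, payload):
        self._send(msg_id, payload)
        self._pending[msg_id] = (payload, self._clock() + self._timeout)

    def on_ack(self, msg_id):
        # Confirmation received from the target source component: stop tracking.
        self._pending.pop(msg_id, None)

    def resend_expired(self):
        """Resend every message whose confirmation did not arrive in time."""
        now = self._clock()
        resent = []
        for msg_id, (payload, deadline) in list(self._pending.items()):
            if now >= deadline:
                self.send(msg_id, payload)  # also resets the deadline
                resent.append(msg_id)
        return resent
```

A caller would invoke resend_expired periodically. Note that a resend can deliver the same data twice if the first copy was written but its confirmation was lost, so under this sketch the target side would need to tolerate duplicates.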

Further, the data processor sends a notification message to the data source component when data accumulation occurs, causing the data source component to adjust the sending rate of the source data.

In the embodiment of the present application, for data fault tolerance, the data processor sends a corresponding confirmation message to the data source component after it finishes processing data, and a timeout mechanism is provided: if the upstream does not receive the confirmation message within a preset time period, the upstream resends the data, which ensures that data is not lost. Each downstream component can also periodically report its processing capacity; when data accumulation occurs, a backpressure mechanism is passively triggered and a notification message is sent to the upstream, so that the upstream adjusts its data sending rate and the system does not crash.
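One way to picture the backpressure mechanism is a downstream queue with two thresholds: crossing the high threshold sends a "slow down" notification upstream, and draining to the low threshold sends a "recover" notification. The thresholds, the halving and doubling policy, and the names RateLimitedSource and BackpressureMonitor are assumptions made for this sketch; the application only states that the upstream adjusts its sending rate.

```python
from collections import deque


class RateLimitedSource:
    """Upstream side: adjusts its sending rate when notified (hypothetical policy)."""

    def __init__(self, rate, min_rate=1, max_rate=1000):
        self.rate = rate  # records per second
        self.min_rate = min_rate
        self.max_rate = max_rate

    def on_notification(self, congested):
        if congested:
            self.rate = max(self.min_rate, self.rate // 2)
        else:
            self.rate = min(self.max_rate, self.rate * 2)


class BackpressureMonitor:
    """Downstream side: watches its backlog and notifies the upstream on state changes."""

    def __init__(self, upstream, high, low):
        self.upstream = upstream
        self.high = high  # backlog size treated as data accumulation
        self.low = low    # backlog size treated as recovered
        self.queue = deque()
        self.congested = False

    def enqueue(self, item):
        self.queue.append(item)
        self._check()

    def dequeue(self):
        item = self.queue.popleft()
        self._check()
        return item

    def _check(self):
        depth = len(self.queue)
        if not self.congested and depth >= self.high:
            self.congested = True
            self.upstream.on_notification(True)
        elif self.congested and depth <= self.low:
            self.congested = False
            self.upstream.on_notification(False)
```

Using two thresholds rather than one avoids sending a notification on every enqueue or dequeue when the backlog hovers around a single limit.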

In the embodiment of the application, after the data processor is initialized, the data processor receives source data acquired by a data source component from a data source, can uniformly process the source data and then transmit the source data to a target source component, and the target source component can transmit the processed source data to different target sources according to a target source data exchange protocol generated after initialization.

The above is an embodiment of a data processing method applied to a data processor provided by the present application, and the following is an embodiment of a data processing method applied to a target source component provided by the present application.

Referring to FIG. 2, a data processing method provided in an embodiment of the present application includes:

Step 201, the target source component initializes and generates a target source data exchange protocol.

After a target source (Sink) component is started, initialization is required, specifically:

1. Initialize each Sink. The core configuration of a Sink is its Transport (the target source data exchange protocol), which is used to generate information such as the write logic, the target source exchange protocol and the target address of the target source.

2. Initialize the Sink Group to support output to a plurality of target sources. The Sink Group consists of a plurality of Sinks, and when an MQ is used as the buffer component, the consumer group model of the MQ can be used to output data to the target sources in parallel.

3. Initialize the buffer component, which serves as the medium for data buffering and can be direct memory, an MQ component or the like. The buffering time and the persistence strategy are adjusted according to whether data needs to be output to a plurality of target sources.
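The three initialization steps above can be sketched with plain data structures. The field names of Transport, the configuration keys, and the use of an in-memory queue.Queue as the buffer are assumptions for illustration; an MQ component could take the place of the in-memory buffer.

```python
import queue
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Transport:
    """Target source data exchange protocol of one Sink (hypothetical field names)."""
    protocol: str        # target source exchange protocol, e.g. "sql"
    target_address: str  # where the target source lives
    write_logic: str     # e.g. "insert" or "append"


@dataclass
class Sink:
    name: str
    transport: Transport


@dataclass
class SinkGroup:
    sinks: List[Sink]
    buffer: queue.Queue  # in-memory buffer standing in for the buffer component


def init_target_source_component(sink_configs: List[Dict[str, str]],
                                 buffer_size: int = 1024) -> SinkGroup:
    # Step 1: initialize each Sink together with its Transport.
    sinks = [
        Sink(c["name"], Transport(c["protocol"], c["address"], c["write_logic"]))
        for c in sink_configs
    ]
    # Step 2: group the Sinks so that several target sources can be served.
    # Step 3: initialize the buffer that the Sinks will read from.
    return SinkGroup(sinks, queue.Queue(maxsize=buffer_size))


group = init_target_source_component([
    {"name": "Sink1", "protocol": "sql", "address": "db.example:5432/orders", "write_logic": "insert"},
    {"name": "Sink2", "protocol": "file", "address": "/tmp/orders.log", "write_logic": "append"},
])
```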

Step 202, after receiving the processed source data sent by the data processor, the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol, where the processed source data is obtained by the data processor processing source data that the data source component acquired from a data source and sent to the data processor.

After receiving the processed source data sent by the data processor, the target source component puts the processed source data into a buffer area, so that each target source instance in the target source component acquires the processed source data from the buffer area, parses the processed source data into corresponding target source executable logic based on a target source exchange protocol, and writes the processed source data into the corresponding target source according to the write logic and the target address.

The target source component puts the processed source data sent by the data processor into a buffer, and each Sink instance (obtained after initialization) in the target source component acquires data from the buffer. Each Sink instance has a corresponding Transport. According to the target source exchange protocol in its Transport, each Sink instance parses the data into the corresponding target source executable logic, for example by generating SQL statements, and then executes the corresponding write logic against the corresponding target address to write the data into the corresponding target source.
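For a SQL-type target source, the "parse into executable logic, then write" step might look like the sketch below. It uses an in-memory SQLite database as a stand-in target source; the helpers to_sql and drain, the table name and the record layout are assumptions for the example, and the table and column names are assumed to come from trusted configuration because they are interpolated into the statement.

```python
import queue
import sqlite3


def to_sql(table, record):
    """Turn one record into a parameterized INSERT statement plus its values."""
    columns = sorted(record)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, [record[c] for c in columns]


def drain(buffer, conn, table):
    """Take every buffered record, generate its SQL, and execute it on the target."""
    written = 0
    while True:
        try:
            record = buffer.get_nowait()
        except queue.Empty:
            break
        sql, params = to_sql(table, record)
        conn.execute(sql, params)
        written += 1
    conn.commit()
    return written


# Buffer filled with processed source data (sample values are made up).
buffer = queue.Queue()
buffer.put({"id": 1, "name": "ann"})
buffer.put({"id": 2, "name": "bob"})

# Stand-in target source: the ":memory:" address plays the role of the target address.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

count = drain(buffer, conn, "users")
rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

Values are bound through placeholders rather than formatted into the statement, which keeps record contents from being interpreted as SQL.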

Further, the target source component sends an acknowledgement message to the data processor after the data is successfully written.

Further, the target source component sends a notification message to the data processor when data accumulation occurs, so that the data processor adjusts the sending rate of the processed source data.

In the embodiment of the present application, for data fault tolerance, the target source component sends a corresponding acknowledgement message to the data processor after it successfully writes data, and a timeout mechanism is provided: if the upstream does not receive the acknowledgement message within a preset time period, the upstream resends the data, which ensures that data is not lost. Each downstream component can also periodically report its processing capacity; when data accumulation occurs, a backpressure mechanism is passively triggered and a notification message is sent to the upstream, so that the upstream adjusts its data sending rate and the system does not crash.

In the embodiment of the application, after the data processor is initialized, the data processor receives source data acquired by a data source component from a data source, can uniformly process the source data and then transmit the source data to a target source component, and the target source component can transmit the processed source data to different target sources according to a target source data exchange protocol generated after initialization.

The foregoing is an embodiment of a data processing method applied to a target source component provided in the present application, and the following is an embodiment of a data processor provided in the present application.

The data processor in the embodiment of the application comprises:

the receiving unit is used for receiving source data which is sent by a data source component and acquired from the data source after the initialization is finished;

and the processing unit is used for processing the source data and sending the processed source data to the target source component, so that the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization.

As a further improvement, the initialization process of the data processor comprises:

the data processor loads target parameters and predefined data processing rules from the configuration file, wherein the target parameters comprise the number of threads or network parameters;

the data processor initializes the algorithm chain according to the predefined data processing rule to obtain an instantiated algorithm chain, and the instantiated algorithm chain is composed of a plurality of instantiated execution units;

correspondingly, the processing unit is specifically configured to:

based on the target parameters, sequentially processing the source data through a plurality of execution units in an instantiation algorithm chain;

and sending the processed source data to the target source component, so that the target source component writes the processed source data into the corresponding target source based on the target source data exchange protocol generated after initialization.

As a further improvement, the data processor further comprises:

and the first sending unit is used for sending a notification message to the data source component when data accumulation occurs, so that the data source component adjusts the sending rate of the source data.

As a further improvement, the data processor further comprises:

and the second sending unit is used for resending the processed source data to the target source component when the confirmation message fed back by the target source component is not received within a preset time period after sending the processed source data.

The foregoing is an embodiment of a data processor provided herein and the following is an embodiment of a target source component provided herein.

The target source component in the embodiment of the application comprises:

the initialization unit is used for initializing and generating a target source data exchange protocol;

the writing unit is used for writing the processed source data into the corresponding target source based on the target source data exchange protocol after receiving the processed source data sent by the data processor;

the processed source data is obtained by the data processor processing source data that a data source component acquired from a data source and sent to the data processor.

As a further improvement, the target source data exchange protocol comprises: write logic, a target source exchange protocol, and a target address;

correspondingly, the write unit is specifically configured to:

after receiving the processed source data sent by the data processor, the target source component puts the processed source data into a buffer area, so that each target source instance in the target source component acquires the processed source data from the buffer area, parses the processed source data into corresponding target source executable logic based on a target source exchange protocol, and writes the processed source data into the corresponding target source according to the write logic and the target address.

As a further improvement, the target source component further comprises:

and the first sending unit is used for sending a confirmation message to the data processor after the data is successfully written.

As a further improvement, the target source component further comprises:

and the second sending unit is used for sending a notification message to the data processor when data accumulation occurs, so that the data processor adjusts the sending rate of the processed source data.

The embodiment of the present application further provides a data processing system, which includes a data source component, a data processor in the foregoing embodiment, and a target source component in the foregoing embodiment;

the data source component and the target source component are respectively in communication connection with the data processor.

Referring to FIG. 3, the data source component (Source) is responsible for pulling data from the data source. It works in pull mode, which has the advantage that a backpressure mechanism can be applied actively to keep the pressure on the system within an acceptable range. After its initialization is completed, the Source acquires the source data (data1, data2, data3) from the data source and sends it to the initialized data processor (Processor).

The Processor encapsulates the business processing logic, receives data from upstream and outputs data to downstream. After the data processor processes the source data, the processed source data is sent to the corresponding Sink1, Sink2 and Sink3 in the target source component.
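The advantage of pull mode can be shown with a small sketch in which the Source never fetches more than the downstream can currently accept. The class PullSource, the batch_size parameter and the idea of passing the number of free downstream slots are assumptions for this illustration.

```python
from itertools import islice


class PullSource:
    """Pulls records from a data source in batches bounded by downstream capacity."""

    def __init__(self, data_source, batch_size):
        self._records = iter(data_source)
        self.batch_size = batch_size

    def pull(self, downstream_free_slots):
        # Never pull more than the configured batch size or the room left downstream.
        n = max(0, min(self.batch_size, downstream_free_slots))
        return list(islice(self._records, n))


source = PullSource(["data1", "data2", "data3"], batch_size=2)
first = source.pull(downstream_free_slots=5)   # limited by batch_size
paused = source.pull(downstream_free_slots=0)  # downstream is full, nothing is pulled
rest = source.pull(downstream_free_slots=1)    # limited by downstream capacity
```

Because the Source decides how much to read, records that are not pulled simply stay in the data source instead of piling up inside the pipeline.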

A Sink is a logical abstraction of data landing and interacts with external systems such as databases, file systems and networks. Sink1, Sink2 and Sink3 send their respective data to the corresponding target sources dest1, dest2 and dest3 according to their respective initialized target source data exchange protocols. Because the target source component can output data to a plurality of target sources in parallel, data processing efficiency can be improved.
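Parallel output to several target sources can be sketched with one worker thread per destination. In this illustration the target sources dest1, dest2 and dest3 are plain Python lists and every destination receives the same records; both choices, and the helper name write_all, are assumptions rather than details given by the application.

```python
from concurrent.futures import ThreadPoolExecutor


def write_all(destinations, records):
    """Write the records to every destination concurrently; return per-destination counts."""

    def write(item):
        name, dest = item
        dest.extend(records)  # stands in for a real write to a target source
        return name, len(records)

    with ThreadPoolExecutor(max_workers=max(1, len(destinations))) as pool:
        return dict(pool.map(write, destinations.items()))


dests = {"dest1": [], "dest2": [], "dest3": []}
counts = write_all(dests, ["data1", "data2", "data3"])
```

Each worker touches only its own destination, so the writes do not contend with each other; with real target sources the benefit would come from overlapping the network or disk waits.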

Further, referring to FIG. 4, for data fault tolerance, each downstream component in the data processing system may send a corresponding acknowledgement message to its upstream after processing data, and a timeout mechanism is provided: if the upstream does not receive the acknowledgement message within a preset time period, the upstream resends the data, which ensures that data is not lost. Each downstream component can also periodically report its processing capacity; when data accumulation occurs, a backpressure mechanism is passively triggered and a notification message is sent to the upstream, so that the upstream adjusts its data sending rate and the system does not crash.

In the data processing system of the embodiment of the present application, a user can customize the Source, Sink and Processor. The Source, Sink and Processor can all be started and run through configuration, and can cover scenarios such as extraction, transformation and loading.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that, in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division into units is only a logical division, and other divisions may be used in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program code.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
