Data processing method and device

Document No.: 1477019 · Publication date: 2020-02-25

Reading note: This technology, "Data processing method and device" (一种数据处理方法和装置), was designed by Tian Ye (田野) on 2018-08-15. Abstract: The invention discloses a data processing method and device, and relates to the technical field of computers. One embodiment of the method comprises: receiving input service requirement information, wherein the service requirement information comprises a service logic description; performing logic analysis on the service requirement information by using pre-stored service description information to determine a data source corresponding to the service requirement information; and executing the executable code corresponding to the service logic description by using the data in the data source to obtain a data processing result. This embodiment can automatically match the data source according to the service requirement information, makes the work highly quantifiable, and can realize automatic code writing and debugging, so that non-programmers can also develop data processing code. The code style is uniform and easy to modify and maintain, which simplifies and accelerates the data processing workflow, saves time cost, improves data processing efficiency, and facilitates the unification and standardization of data processing.

1. A data processing method, comprising:

receiving input service requirement information, wherein the service requirement information comprises service logic description;

performing logic analysis on the service demand information by using prestored service description information to determine a data source corresponding to the service demand information;

and executing the executable code corresponding to the business logic description by using the data in the data source to obtain a data processing result.

2. The method according to claim 1, wherein the step of performing a logic analysis on the service requirement information by using pre-stored service description information to determine a data source corresponding to the service requirement information comprises:

performing word segmentation on the service requirement information to obtain each word segmentation of the service requirement;

calculating the word frequency weight total value of each participle of the service requirement in each data source by using the service description information pre-stored in each data source;

and determining a data source corresponding to the service demand information according to the word frequency weight total value.

3. The method of claim 2, wherein the total word frequency weight of each participle of the service requirement in a data source is calculated by using the service description information pre-stored in the data source as follows:

obtaining the word frequency of each participle in the data source according to the occurrence frequency of each participle of the service requirement in the participle set of the data source, wherein the participle set of the data source is obtained by participling the service description information pre-stored in the data source;

respectively calculating the weight value of each participle in the data source according to the word frequency of each participle in the data source and the total participle amount of a participle set of the data source;

and obtaining the word frequency weight total value of each participle of the service requirement in the data source according to the weight value of each participle in the data source.

4. The method according to claim 2, wherein the step of determining the data source corresponding to the service requirement information according to the total word frequency weight value comprises:

determining, among the total word-frequency weight values of the participles of the service requirement in the respective data sources, the total word-frequency weight values that are greater than a preset threshold;

and determining the data source corresponding to the maximum word frequency weight total value in the word frequency weight total values larger than the preset threshold value as the data source corresponding to the service demand information.

5. The method according to claim 4, wherein if none of the total word-frequency weight values of the participles of the service requirement in the respective data sources is greater than the preset threshold, the data source corresponding to the service requirement information is determined according to a preset requirement circulation rule.

6. The method according to claim 5, wherein after the step of determining the data source corresponding to the service requirement information according to the preset requirement circulation rule, the method comprises:

determining, through retrospective review analysis, that the service requirement information meets a preset standard, and then storing each participle of the service requirement into the data source determined to correspond to the service requirement information.

7. The method of claim 2, wherein the step of executing the executable code corresponding to the business logic description using the data in the data source is preceded by the step of:

and generating an executable code corresponding to the service logic description according to the service logic description and a message processing template matched with the data source.

8. The method of claim 7, wherein the step of generating executable code corresponding to the service logic description according to the service logic description and the message processing template matched with the data source comprises:

extracting Chinese keywords from the service logic description, and extracting participles belonging to the service logic description from each participle of the service requirement;

and generating the executable code by using the code keywords corresponding to the Chinese keywords and the fields corresponding to the participles belonging to the business logic description in the data source.

9. The method of claim 1, wherein the step of executing the executable code corresponding to the business logic description by using the data in the data source to obtain the data processing result is preceded by the step of:

and confirming that the business logic in the executable code conforms to the business logic description by using the data in the data source.

10. A data processing apparatus, comprising:

the demand receiving module is used for receiving input service demand information, and the service demand information comprises service logic description;

the first data source determining module is used for carrying out logic analysis on the service demand information by utilizing prestored service description information so as to determine a data source corresponding to the service demand information;

and the data processing module executes the executable code corresponding to the business logic description by utilizing the data in the data source so as to obtain a data processing result.

11. The apparatus of claim 10, wherein the first data source determining module is further configured to:

performing word segmentation on the service requirement information to obtain each word segmentation of the service requirement;

calculating the word frequency weight total value of each participle of the service requirement in each data source by using the service description information pre-stored in each data source;

and determining a data source corresponding to the service demand information according to the word frequency weight total value.

12. The apparatus of claim 11, wherein the first data source determining module comprises a word frequency weight total value determining submodule configured to:

and obtaining the word frequency weight total value of each participle of the service requirement in the data source according to the weight value of each participle in the data source.

13. The apparatus of claim 11, wherein the first data source determining module comprises a data source determining sub-module configured to:

determining, among the total word-frequency weight values of the participles of the service requirement in the respective data sources, the total word-frequency weight values that are greater than a preset threshold;

and determining the data source corresponding to the largest of the total word-frequency weight values greater than the preset threshold as the data source corresponding to the service requirement information.

14. The apparatus of claim 13, further comprising a second data source determination module configured to:

if none of the total word-frequency weight values of the participles of the service requirement in the respective data sources is greater than a preset threshold, determining the data source corresponding to the service requirement information according to a preset requirement circulation rule.

15. The apparatus of claim 14, wherein the second data source determination module comprises a retrospective review sub-module configured to:

determining, through retrospective review analysis, that the service requirement information meets a preset standard, and then storing each participle of the service requirement into the data source determined to correspond to the service requirement information.

16. The apparatus of claim 11, further comprising a code generation module to:

and generating an executable code corresponding to the service logic description according to the service logic description and a message processing template matched with the data source.

17. The apparatus of claim 16, wherein the code generation module is further configured to:

extracting Chinese keywords from the service logic description, and extracting participles belonging to the service logic description from each participle of the service requirement;

18. The apparatus of claim 10, further comprising a testing module to:

and confirming that the business logic in the executable code conforms to the business logic description by using the data in the data source.

19. An electronic device, comprising:

one or more processors;

a memory for storing one or more programs,

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-9.

20. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a data processing method and apparatus.

Background

In data processing development, most working hours are spent on acquiring data sources and formulating processing logic. Reducing the development effort in these two areas is a concern for every developer.

In current data processing schemes, developers analyze the requirements, identify a data source for data processing by experience with the assistance of a product manager, access the messages of that data source, and manually write the data processing code. After debugging is complete, the code goes online and performs the data processing.

In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:

the data source is acquired by relying on manual experience, so the work is poorly quantifiable; the code implementing the processing logic must be written and debugged manually, so code styles differ and the code is hard to modify and maintain, which reduces data processing efficiency.

Disclosure of Invention

In view of this, embodiments of the present invention provide a data processing method and apparatus that can automatically match a data source according to service requirement information, make the work highly quantifiable, and implement automatic code writing and debugging, so that non-programmers can also develop data processing code. The code style is uniform and easy to modify and maintain, which simplifies and accelerates the data processing workflow, saves time cost, improves data processing efficiency, and facilitates the unification and standardization of data processing.

To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a data processing method.

A method of data processing, comprising: receiving input service requirement information, wherein the service requirement information comprises service logic description; performing logic analysis on the service demand information by using prestored service description information to determine a data source corresponding to the service demand information; and executing the executable code corresponding to the business logic description by using the data in the data source to obtain a data processing result.

Optionally, the step of performing logic analysis on the service requirement information by using pre-stored service description information to determine a data source corresponding to the service requirement information includes: performing word segmentation on the service requirement information to obtain each word segmentation of the service requirement; calculating the word frequency weight total value of each participle of the service requirement in each data source by using the service description information pre-stored in each data source; and determining a data source corresponding to the service demand information according to the word frequency weight total value.

Optionally, by using the service description information pre-stored in a data source, calculating a total word frequency weight value of each participle of the service requirement in the data source by the following method: obtaining the word frequency of each participle in the data source according to the occurrence frequency of each participle of the service requirement in the participle set of the data source, wherein the participle set of the data source is obtained by participling the service description information pre-stored in the data source; respectively calculating the weight value of each participle in the data source according to the word frequency of each participle in the data source and the total participle amount of a participle set of the data source; and obtaining the word frequency weight total value of each participle of the service requirement in the data source according to the weight value of each participle in the data source.

Optionally, the step of determining a data source corresponding to the service requirement information according to the total word-frequency weight value includes: determining, among the total word-frequency weight values of the participles of the service requirement in the respective data sources, the total word-frequency weight values that are greater than a preset threshold; and determining the data source corresponding to the largest of the total word-frequency weight values greater than the preset threshold as the data source corresponding to the service requirement information.

Optionally, if none of the total word-frequency weight values of the participles of the service requirement in the respective data sources is greater than the preset threshold, the data source corresponding to the service requirement information is determined according to a preset requirement circulation rule.

Optionally, after the step of determining the data source corresponding to the service requirement information according to the preset requirement circulation rule, the method includes: determining, through retrospective review analysis, that the service requirement information meets a preset standard, and then storing each participle of the service requirement into the data source determined to correspond to the service requirement information.

Optionally, before the step of executing the executable code corresponding to the business logic description by using the data in the data source, the method includes: and generating an executable code corresponding to the service logic description according to the service logic description and a message processing template matched with the data source.

Optionally, the step of generating, according to the service logic description and according to a message processing template matched with the data source, an executable code corresponding to the service logic description includes: extracting Chinese keywords from the service logic description, and extracting participles belonging to the service logic description from each participle of the service requirement; and generating the executable code by using the code keywords corresponding to the Chinese keywords and the fields corresponding to the participles belonging to the business logic description in the data source.

Optionally, before the step of executing the executable code corresponding to the business logic description by using the data in the data source to obtain the data processing result, the method includes: and confirming that the business logic in the executable code conforms to the business logic description by using the data in the data source.

According to another aspect of the embodiments of the present invention, there is provided a data processing apparatus.

A data processing apparatus comprising: the demand receiving module is used for receiving input service demand information, and the service demand information comprises service logic description; the first data source determining module is used for carrying out logic analysis on the service demand information by utilizing prestored service description information so as to determine a data source corresponding to the service demand information; and the data processing module executes the executable code corresponding to the business logic description by utilizing the data in the data source so as to obtain a data processing result.

Optionally, the first data source determining module is further configured to: performing word segmentation on the service requirement information to obtain each word segmentation of the service requirement; calculating the word frequency weight total value of each participle of the service requirement in each data source by using the service description information pre-stored in each data source; and determining a data source corresponding to the service demand information according to the word frequency weight total value.

Optionally, the first data source determining module includes a word frequency weight total value determining submodule, configured to: obtaining the word frequency of each participle in the data source according to the occurrence frequency of each participle of the service requirement in the participle set of the data source, wherein the participle set of the data source is obtained by participling the service description information pre-stored in the data source; respectively calculating the weight value of each participle in the data source according to the word frequency of each participle in the data source and the total participle amount of a participle set of the data source; and obtaining the word frequency weight total value of each participle of the service requirement in the data source according to the weight value of each participle in the data source.

Optionally, the first data source determining module includes a data source determining sub-module, configured to: determine, among the total word-frequency weight values of the participles of the service requirement in the respective data sources, the total word-frequency weight values that are greater than a preset threshold; and determine the data source corresponding to the largest of the total word-frequency weight values greater than the preset threshold as the data source corresponding to the service requirement information.

Optionally, the apparatus further comprises a second data source determining module, configured to: if none of the total word-frequency weight values of the participles of the service requirement in the respective data sources is greater than the preset threshold, determine the data source corresponding to the service requirement information according to a preset requirement circulation rule.

Optionally, the second data source determining module includes a retrospective review sub-module, configured to: determine, through retrospective review analysis, that the service requirement information meets a preset standard, and then store each participle of the service requirement into the data source determined to correspond to the service requirement information.

Optionally, the system further comprises a code generation module, configured to: and generating an executable code corresponding to the service logic description according to the service logic description and a message processing template matched with the data source.

Optionally, the code generation module is further configured to: extracting Chinese keywords from the service logic description, and extracting participles belonging to the service logic description from each participle of the service requirement; and generating the executable code by using the code keywords corresponding to the Chinese keywords and the fields corresponding to the participles belonging to the business logic description in the data source.

Optionally, the system further comprises a test module, configured to: and confirming that the business logic in the executable code conforms to the business logic description by using the data in the data source.

According to yet another aspect of an embodiment of the present invention, an electronic device is provided.

An electronic device, comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data processing method provided by the present invention.

According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.

A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method provided by the invention.

One embodiment of the above invention has the following advantages or benefits. Logic analysis is performed on the input service requirement information by using pre-stored service description information to determine the corresponding data source, so the data source can be matched automatically according to the service requirement information and the work becomes highly quantifiable. Executable code corresponding to the service logic description in the service requirement information is generated according to a message processing template matched with that data source, so code can be written and debugged automatically and non-programmers can also develop data processing code. The code style is uniform and easy to modify and maintain, which simplifies and accelerates the data processing workflow, saves time cost, improves data processing efficiency, and facilitates the unification and standardization of data processing.

Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of the main steps of a data processing method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the composition of common attributes and private attributes according to an embodiment of the invention;

FIG. 3 is an exemplary flow diagram of a requirements verification according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a message middleware system message template according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a data processing platform according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of the main blocks of a data processing apparatus according to an embodiment of the present invention;

FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 8 is a schematic block diagram of a computer system suitable for use with a server implementing an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

FIG. 1 is a schematic diagram of the main steps of a data processing method according to an embodiment of the present invention.

As shown in FIG. 1, the data processing method of this embodiment mainly includes the following steps S101 to S103.

Step S101: receiving input service requirement information.

The service requirement information includes a service logic description, for example: "obtain the manifest number of the goods inspection; if the manifest number is undefined, update the manifest number and the goods inspection time to the database". A user (such as a data processing developer) can enter the service requirement information through a visual requirement input interface. The service requirement information may also include a service requirement description, such as: "the time for goods arriving at the station, the station leader to check and accept the goods and the access to check and accept".

Step S102: performing logic analysis on the input service requirement information by using the pre-stored service description information to determine a data source corresponding to the service requirement information.

The service description information is pre-stored in the existing data sources. Each data source is a database table, also called a data source table, such as a goods inspection table or a goods receiving table.

Step S102 may specifically include:

segmenting the service requirement information to obtain each segmentation of the service requirement;

calculating the word frequency weight total value of each participle of the service requirement in each data source by using service description information pre-stored in each data source;

and determining a data source corresponding to the service demand information according to the calculated word frequency weight total value.

The participles of the service requirement are stored in a requirement feature table. After the service requirement information is segmented, the participles may first be checked manually before the step of calculating the total word-frequency weight value is executed. The manual check proceeds as follows: successfully segmented words are highlighted in yellow (the font display color can be controlled through CSS, Cascading Style Sheets); the developer then scans the text a second time and segments any words that the automatic scan missed, that lack annotations, or that lack context; once the words are verified as correct, the developer clicks to save them. After the segmented words and their annotations are recorded, the features of the words can be analyzed, and the database is thereby expanded for use in the next segmentation.

The total word frequency weight value of each participle required by the service in the data source can be calculated in the following way by using the service description information pre-stored in the data source:

obtaining the word frequency of each participle in the data source according to the occurrence frequency of each participle in the service requirement in the participle set of the data source, wherein the participle set of the data source is obtained by participling service description information prestored in the data source;

respectively calculating the weight value of each participle of the service demand in the data source according to the word frequency of each participle of the service demand in the data source and the total participle quantity of the participle set of the data source, wherein specifically, the weight value of a certain participle in the data source is equal to the ratio of the word frequency of the participle in the data source to the total participle quantity of the participle set of the data source;

obtaining the total word-frequency weight value of the participles of the service requirement in the data source according to the weight value of each participle of the service requirement in the data source. Specifically, the participles of the service requirement are deduplicated, and the weight values of the deduplicated participles in the data source are added together to obtain the total word-frequency weight value. In other words, when the weight values are summed, the weight value of a repeated participle is added only once.
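The calculation described above can be illustrated with a minimal Python sketch. It assumes that both the service requirement and the service description of a data source have already been segmented into lists of participles; the function name `total_word_frequency_weight` and the sample participles are hypothetical and do not come from the original text.

```python
from collections import Counter


def total_word_frequency_weight(requirement_tokens, source_tokens):
    """Sum the weights of the distinct requirement participles in one data source.

    weight(participle) = occurrences in the source's participle set
                         / total number of participles in that set
    """
    if not source_tokens:
        return 0.0
    counts = Counter(source_tokens)
    total = len(source_tokens)
    # Deduplicate so that a repeated participle contributes its weight only once.
    return sum(counts[token] / total for token in set(requirement_tokens))


# Hypothetical, already-segmented inputs (the segmenter itself is out of scope).
requirement = ["goods", "inspection", "time", "goods"]
inspection_table = ["goods", "inspection", "waybill", "inspection", "time"]

# Four of the five source participles match, so the total is about 0.8.
score = total_word_frequency_weight(requirement, inspection_table)
```

Because every weight is a fraction of the same participle set and each distinct participle is counted once, the total always falls between 0 and 1.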

The step of determining a data source corresponding to the service requirement information according to the total word-frequency weight value may specifically include:

determining, among the total word-frequency weight values of the participles of the service requirement in the respective data sources, the total word-frequency weight values that are greater than a preset threshold;

and determining the data source corresponding to the largest of the total word-frequency weight values greater than the preset threshold as the data source corresponding to the service requirement information.

The preset threshold may be user-defined. Since the total word-frequency weight value is a number greater than or equal to 0 and less than or equal to 1, any value greater than or equal to 0 and less than 1 may be chosen as the preset threshold as required. For example, when the preset threshold is defined as 0, the data source corresponding to the largest of the total word-frequency weight values of the service requirement across the data sources is determined as the data source corresponding to the service requirement information.
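As a rough illustration of the threshold-based selection, the following Python sketch picks a data source from a mapping of precomputed totals. The function name `select_data_source`, the table names, and the sample values are assumptions made for this example only.

```python
def select_data_source(total_weights, threshold=0.0):
    """Return the data source with the largest total weight above the threshold.

    total_weights maps a data source name to its total word-frequency weight.
    Returns None when no value exceeds the threshold; in that case the
    requirement circulation rule would take over.
    """
    candidates = {name: w for name, w in total_weights.items() if w > threshold}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical totals for two data source tables.
weights = {"goods_inspection_table": 0.8, "goods_receiving_table": 0.2}
chosen = select_data_source(weights)
unmatched = select_data_source({"goods_inspection_table": 0.0})
```

Here `chosen` is `"goods_inspection_table"`, while `unmatched` is `None` because a total of 0 is not greater than the threshold of 0.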

If the data source corresponding to the service requirement information cannot be determined after the input service requirement information is logically analyzed by using the pre-stored service description information, in other words, if none of the total word-frequency weight values of the participles of the service requirement in the respective data sources is greater than the preset threshold, the data source corresponding to the service requirement information is determined according to the preset requirement circulation rule.

The requirement circulation rule may be preset according to the service requirement. For example, the requirement circulation rule may include forwarding, pinning, commenting, scoring, upgrading, accountability, and the like.

Forwarding: the developer can forward the service requirement to relevant personnel such as a product manager or a project manager, who assist in determining the data source corresponding to the service requirement information.

Pinning: the service requirement is pinned to the unresolved data source requirement list, an interface that displays all service requirements whose data sources are unresolved, so that all developers, product managers, project managers, and data producers involved in any stage of data processing can browse and comment on them.

Commenting: after the service requirement is placed on the unresolved data source requirement list, everyone can comment on it, and all comments are fed back to the developers, product managers, and others responsible for the service requirement.

Scoring: the developer browses the feedback in the comments on the service requirement and, on finding a comment that identifies the correct data source, awards points to the person who gave that feedback.

Upgrading: if no data source has been found for the service requirement within a time limit, the service requirement is automatically escalated to the department manager, who coordinates resources to resolve it.

Accountability: if no data source has been found for the service requirement within a time limit (usually longer than the time limit for upgrading), the department manager assigns the requirement to a specific individual, who is held responsible for determining the data source.

The requirement circulation rule can also comprise a reply disk, namely after a certain service requirement is inquired to a data source through links such as forwarding, commenting and upgrading, the relation between the service requirement characteristic and the data source is automatically analyzed, the reason for losing the service requirement information logic analysis is recorded, the requirement characteristic table and the data source table are filled, the analysis and loss processing mechanism is perfected, whether the service requirement information accords with the preset standard can be judged through reply disk analysis, and after the service requirement information accords with the preset standard is determined, all the participles of the service requirement are stored in the data source corresponding to the determined service requirement information, so that a database of the data source is expanded, and the next participle can be directly used. If the service requirement information does not meet the preset specification, a prompt message can be output to the developer so that the developer can correct the input service requirement information.
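The two time-driven rules above (automatic escalation, then accountability after a usually longer limit) can be sketched as a small function. The stage names and the concrete limits of 2 and 5 days are hypothetical; the text only says that both limits exist.

```python
from datetime import timedelta


def circulation_stage(elapsed: timedelta,
                      escalation_limit: timedelta,
                      accountability_limit: timedelta) -> str:
    """Return the time-driven stage of a requirement that has no data source yet."""
    if elapsed >= accountability_limit:
        return "accountability"  # the department manager assigns a responsible person
    if elapsed >= escalation_limit:
        return "escalation"      # the requirement goes to the department manager
    return "open"                # forwarding, pinning, commenting, scoring continue


# Hypothetical limits: escalate after 2 days, accountability after 5 days.
ESCALATION_LIMIT = timedelta(days=2)
ACCOUNTABILITY_LIMIT = timedelta(days=5)
for days in (1, 3, 6):
    stage = circulation_stage(timedelta(days=days), ESCALATION_LIMIT, ACCOUNTABILITY_LIMIT)
    print(days, stage)
```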

Step S103: executing the executable code corresponding to the service logic description by using the data in the data source corresponding to the service requirement information, to obtain a data processing result.

In an embodiment, before step S103, executable code corresponding to the service logic description may be generated according to the service logic description and a message processing template matched with the data source corresponding to the service requirement information.

Specifically, Chinese keywords are extracted from the service logic description, and the word segments belonging to the service logic description are extracted from the word segments of the service requirement. The executable code is then generated by using the code keywords corresponding to the extracted Chinese keywords and the fields that correspond, in the data source corresponding to the service requirement information, to the word segments belonging to the service logic description.

Those skilled in the art will understand that this is a preferred way of generating executable code according to the embodiment of the present invention. In other embodiments, the executable code corresponding to the service logic description in step S103 may, for example, be written manually; combined with steps S101 and S102 described above, such embodiments still achieve the technical effect of automatically matching a data source according to the service requirement information, which makes the work highly quantifiable.

Before step S103, the data in the data source corresponding to the service requirement information may also be used to determine that the service logic in the executable code conforms to the service logic description in the service requirement information.

For example, in the above embodiment, after the executable code corresponding to the service logic description is generated according to the service logic description and the message processing template matched with the data source corresponding to the service requirement information, the data in that data source may be used to automatically test whether the service logic in the executable code conforms to the service logic description in the service requirement information; step S103 is executed only after conformity is confirmed.

The invention also provides an agile data processing platform (data processing platform for short). After the data processing platform is started, it executes the data processing method provided by the embodiment of the invention to process data. The data processing platform of the embodiment of the invention can be developed as independent modules/units. In particular, a visual requirement input interface (in the form of a Web interface) serves as the operation unit for agile data processing, and the development stage of data processing is divided into three functional modules: a data entry module, a logic analysis system, and a code generator. With these three built-in functional modules as basic units, visual development is achieved through data processing driven by actual services.

The agile data processing platform of the embodiment of the invention is described below by taking data processing in the e-commerce field as an example. The embodiment of the invention is not limited to data processing in the e-commerce field; it can be used in any field in which a data source needs to be located, data processing code needs to be written, and the data processing code is executed to process data.

In an e-commerce system, data processing usually involves a third-party service, such as a transportation service. In a data processing scenario involving a third-party service, the service requirement information is entered by the developer of the data processing, and the pre-stored service description information is entered by the third-party service producer. The agile data processing platform of this embodiment therefore supports data entry for two different roles: requirement entry and third-party service information data entry.

For requirement entry, the data processing platform performs a preliminary verification of the entered service requirement information and then passes it to the logic analysis system, which completes the acquisition of the requirement information.

Third-party service information data entry serves the producers of data on third-party service lines. Each producer enters its own service description information, including the service description, the service field descriptions of the producer database, and the like, which the logic analysis system uses for screening, comparison, and filtering. The two kinds of data entry serve different roles and work together to complete the identification of service requirements.

In the data processing flow, the two kinds of data entry need not be performed at the same time (and generally are not); the third-party service information data may be entered and stored in advance and read whenever needed.

In addition to the two data entry functions, the data entry module has a circulation function for maintaining the identification of service requirements. For example, after the service requirement information is entered, the logic analysis system analyzes and compares it; if the service characteristics of the service requirement are not found in the built-in database of the data processing platform, the requirement circulation function is started. This function specifically includes forwarding, pinning to top, commenting, scoring, escalation, accountability, and review, which are described in detail above and are not repeated here. The embodiment of the invention records the relationship between services and data sources as a document that all relevant personnel can consult.

To understand the working process of the data entry module more clearly, the development steps of the data entry module in the agile data processing platform are introduced below. First, a data entry template is defined, and a set of common attributes is abstracted for both requirement entry and third-party service information data entry. The common attributes include: a requirement identifier (service requirement identifier), the entered content (including the service requirement description, the service logic description, and the like), the entry time, the person who made the entry, and the like. The common attribute table supports insertion only and has no update function (the update function is removed in order to keep a history record), and the requirement identifier is used to judge whether multiple entries belong to the same service requirement. Third-party service information data entry additionally needs a private attribute table that describes the fields of the producer database and their meanings. The private attributes include: the business description of the table (that is, what business the table stores; it is associated with the requirement identifier in the common attribute table), the field name, the field description, whether the data is produced on site (on-site production means that the data is generated by a front-line operator; for example, data produced at the front-line delivery end is on-site production data), and the like. After being entered, the private attributes are stored in the word segmentation table, which provides the logic analysis system with a basis for text similarity comparison. The structures of the common attributes and the private attributes are shown schematically in fig. 2.

Then the specific functions of the requirement circulation buttons are developed, such as the buttons for forwarding, pinning to top, commenting, scoring, escalation, accountability, and review.
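The common and private attribute tables described above can be sketched as a small data model. The class and field names below are hypothetical (the actual structure is the one shown in fig. 2); the sketch only illustrates the insert-only behavior and the link through the requirement identifier.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class CommonAttributes:
    requirement_id: str   # service requirement identifier
    content: str          # requirement description, logic description, etc.
    entered_at: datetime
    entered_by: str


@dataclass(frozen=True)
class PrivateAttributes:
    requirement_id: str   # associates the row with CommonAttributes.requirement_id
    table_description: str
    field_name: str
    field_description: str
    produced_on_site: bool


class CommonAttributeTable:
    """Insert-only store: rows are appended and never updated, keeping history."""

    def __init__(self) -> None:
        self._rows: List[CommonAttributes] = []

    def insert(self, row: CommonAttributes) -> None:
        self._rows.append(row)

    def history(self, requirement_id: str) -> List[CommonAttributes]:
        # Entries sharing an identifier belong to the same service requirement.
        return [row for row in self._rows if row.requirement_id == requirement_id]


table = CommonAttributeTable()
table.insert(CommonAttributes("REQ-1", "first wording", datetime(2018, 8, 15), "developer"))
table.insert(CommonAttributes("REQ-1", "revised wording", datetime(2018, 8, 16), "developer"))
print([row.content for row in table.history("REQ-1")])
```

Freezing the dataclasses and exposing only `insert` is one way to mirror the removal of the update function; a database implementation might instead rely on permissions or triggers.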

It should be noted that the data processing platform of the embodiment of the present invention is not limited to data processing scenarios involving a third-party service; it is equally applicable to scenarios that do not involve one. In such a scenario, both the service requirement information and the pre-stored service description information may be entered by the developer of the data processing, and the pre-stored service description information is then the developer's own service information data rather than third-party service information data.

The logic analysis system acquires the requirement from the service requirement information entered through the data entry module, and performs logic analysis on it using the pre-stored service description information to determine the data source corresponding to the service requirement information. Specifically, the data source can be located accurately through the following stages.

The first stage is requirement word segmentation. Word segmentation means splitting a sequence of Chinese characters into individual lexical units. Various word segmentation algorithms can be used to segment the service requirement information in this embodiment. The following takes word (a distributed Chinese word segmentation component implemented in Java) as an example to introduce the specific steps of segmenting the service requirement information:

Import the Maven dependency:

Segment the text:

List<Word> words = WordSegmenter.seg("requirement entry text");

words holds the content after word segmentation. For example, if the requirement entry text (e.g. the service requirement description) is "goods arrive at the station, the station leader checks and accepts the goods, and the time of access check and acceptance", words is displayed as: [ goods, arrival, station leader, acceptance, goods, access, acceptance ].

Manual secondary verification of the requirement:

After word segmentation, the successfully segmented words are highlighted in yellow (the display color of the text can be controlled through CSS (Cascading Style Sheets)). The developer browses the result a second time, adds supplementary entries for words that the automatic scan missed, or segments words that lack context, and, after verifying that everything is correct, clicks Save. The logic analysis system automatically records the segmented words and the supplementary entries and analyzes their characteristics, which expands the database and makes the next word segmentation easier.

In the second stage, keyword word-frequency weight analysis is performed on the word segments of the service requirement against the words in the word segmentation tables. A word segmentation table is a data source (also called a data source table). Its stored content includes the service description information entered by the third-party service producer, including the service description and the service field descriptions of the producer database (such as field names and field descriptions), as well as the word segment set obtained by segmenting the service description information.

The following takes two word segmentation tables, a goods inspection table and a receiving table, as an example to introduce the specific process of keyword word-frequency weight analysis. The goods inspection table (the private attribute structure of the third-party service information site goods inspection table) is shown in Table 1. The receiving table (the private attribute structure of the third-party service information site receiving table) is shown in Table 2.

TABLE 1

TABLE 2

Field name	Field description	Business description of table	Word segment set
receive_id	Primary key	Primary key ID of the site receiving table	[site, receiving, table, primary key, id]
waybill_code	Waybill number	Waybill number; site receiving takes the waybill as its dimension	[site, receiving, waybill, dimension]
create_site_id	Creating site	Which site received the goods	[site, receiving]
create_time	Creation time	Time of receiving	[receiving, time]
create_user_id	Operator ID	The ID records ERP information	[id, record, erp, information]
create_user	Operator name	The name is the ERP name	[erp, name]

Assume that the entered service requirement information is "goods arrive at the site, the site manager inspects the goods, and the goods inspection time is accessed". Word segmentation then yields the following word segments of the service requirement: [ goods, arrival, site, goods inspection, access, goods inspection, time ]. Among these word segments, "site" appears 3 times in the word segment set of the goods inspection table in Table 1, "goods inspection" appears 4 times, "time" appears 1 time, and the other word segments appear 0 times. That is, the word frequencies of "site", "goods inspection", and "time" in the goods inspection table are 3, 4, and 1 respectively, and the word frequencies of the other word segments in the goods inspection table are 0.

If the total number of words in the word segment set of the goods inspection table is 19, the weight value of "site" in the goods inspection table is the ratio of the word frequency of "site" in the goods inspection table to the total number of words in the word segment set of the goods inspection table, that is, 3/19. The weight values of the other word segments of the service requirement in the goods inspection table are calculated in the same way: the weight value of "goods inspection" is 4/19, the weight value of "time" is 1/19, and the weight values of the other word segments are 0.

Because "goods inspection" occurs twice among the word segments of the service requirement, its weight value is counted only once when the total word-frequency weight value of the service requirement in the goods inspection table is calculated. The total word-frequency weight value of the service requirement in the goods inspection table therefore equals the sum of the weight values, in the goods inspection table, of the de-duplicated word segments of the service requirement.

For this example: 3/19 + 4/19 + 1/19 = 8/19.

In the same way, the total word-frequency weight value of the word segments of the service requirement in the receiving table is: 3/19 (weight value of "site" in the receiving table) + 1/19 (weight value of "time" in the receiving table) = 4/19.
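The arithmetic above can be reproduced with a short sketch. The function name `total_weight` is an assumption, and the receiving-table words are one reading of the word segment sets in Table 2 (using "site" for the station/site word). Because the contents of Table 1 are not reproduced in this text, the goods inspection word set is a placeholder built only from the counts stated above (3, 4, and 1 occurrences out of 19 words).

```python
from collections import Counter
from fractions import Fraction
from typing import Iterable, List


def total_weight(requirement_words: Iterable[str], source_words: List[str]) -> Fraction:
    """Sum the weights (frequency / total words) of the de-duplicated requirement words."""
    size = len(source_words)
    if size == 0:
        return Fraction(0)
    counts = Counter(source_words)
    # set() de-duplicates, so a repeated requirement word is weighted only once.
    return sum((Fraction(counts[word], size) for word in set(requirement_words)), Fraction(0))


requirement = ["goods", "arrival", "site", "goods inspection", "access",
               "goods inspection", "time"]

# Receiving table: word segment sets of Table 2, flattened (19 words in total).
receiving_words = [
    "site", "receiving", "table", "primary key", "id",
    "site", "receiving", "waybill", "dimension",
    "site", "receiving",
    "receiving", "time",
    "id", "record", "erp", "information",
    "erp", "name",
]

# Goods inspection table: placeholder set matching only the stated counts.
inspection_words = (["site"] * 3 + ["goods inspection"] * 4 + ["time"]
                    + [f"other_{i}" for i in range(11)])

print(total_weight(requirement, inspection_words))  # 8/19
print(total_weight(requirement, receiving_words))   # 4/19
```

`Fraction` is used instead of floating point so the results compare exactly with the fractions in the text.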

For ease of description, this example uses only two word segmentation tables, the goods inspection table and the receiving table. When determining the data source in this example, the preset threshold is defined as 0. That is, if a total word-frequency weight value greater than 0 exists among the total word-frequency weight values of the service requirement in the data sources (here, the goods inspection table and the receiving table), the data source with the largest total word-frequency weight value, here the goods inspection table (8/19), is determined as the data source corresponding to the service requirement information.

In this example, the logic analysis system has already determined, through logic analysis of the input service requirement information, that the data source corresponding to the service requirement information is the goods inspection table, so the data source does not need to be determined through requirement circulation.

In other embodiments, if the data source corresponding to the service requirement information is not found (or determined) by the above logic analysis, a third stage is required, in which the data source is determined with manual intervention according to the preset requirement circulation rule. Once a data source has been found through the manual requirement circulation function, the requirement review analysis function is started to associate and supplement the reviewed data: it first judges whether the requirement conforms to the specification, and then stores the result of segmenting the service requirement in the word segmentation table, expanding the word segmentation table database so that the next word segmentation can use it directly.

After the data source is determined, the code writing stage begins. No English code needs to be written; a description in Chinese is sufficient, and the code generator automatically converts it into English code.

First, Chinese keywords and production formulas are built into the data processing platform, as shown in Table 3. A production formula reflects the correspondence between a Chinese keyword and a code keyword; for example, the Chinese keyword meaning "if" corresponds to the code keyword "if".

TABLE 3

For example, suppose the service logic description is "obtain the invoice number of the inspected goods; if the invoice number is not undefined, update the invoice number and the goods inspection time to the database". The code generator may first perform requirement verification.

Specifically, the word segments of the service logic description are compared with the built-in Chinese keywords and with the service description information pre-stored in the word segmentation table, and the words that can be recognized are marked. For example, "not" is marked as "!", "undefined" is marked as "null", and "goods inspection time" is marked as "data source table field". Words that cannot be recognized are extracted separately and put into a temporary word segmentation table, and are then manually associated with the Chinese keywords and the service description information in the word segmentation table; for example, an unrecognized word is judged to mean "then", and the association is established. If no associated item is found, a workflow approval is started: the unrecognized word and its description are submitted to a superior, and after approval a new Chinese keyword or a similar-text entry in the word segmentation table is created, which completes the requirement verification. If a word in the temporary word segmentation table is matched by words of service logic descriptions multiple times, it is moved into the formal word segmentation table. An exemplary flow of requirement verification is shown in fig. 3 (steps S301 to S303).
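The marking step of requirement verification can be sketched as follows. Only the three keyword mappings mentioned in the text are used, since the full keyword list of Table 3 is not reproduced here; the field name `check_time` and the function name `mark_words` are hypothetical.

```python
from typing import Dict, List, Tuple

# Keyword mappings taken from the examples in the text (the full list is in Table 3).
KEYWORDS: Dict[str, str] = {"if": "if", "not": "!", "undefined": "null"}
# Hypothetical mapping from a word segment to a data source table field.
FIELDS: Dict[str, str] = {"goods inspection time": "check_time"}

Marked = Tuple[str, str, str]  # (word, kind, replacement)


def mark_words(words: List[str],
               keywords: Dict[str, str] = KEYWORDS,
               fields: Dict[str, str] = FIELDS) -> Tuple[List[Marked], List[str]]:
    """Mark recognizable words; collect the rest for the temporary word table."""
    marked: List[Marked] = []
    unrecognized: List[str] = []
    for word in words:
        if word in keywords:
            marked.append((word, "keyword", keywords[word]))
        elif word in fields:
            marked.append((word, "data source table field", fields[word]))
        else:
            unrecognized.append(word)  # needs manual association or approval
    return marked, unrecognized


marked, temporary = mark_words(["if", "not", "undefined", "goods inspection time", "then"])
print(marked)
print(temporary)
```

Words returned in the second list would go to the temporary word segmentation table, where they wait for manual association or workflow approval.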

After the requirement verification is completed, the generation phase of the executable code begins. Since the data source (data source table) has been determined to be the goods inspection table, the executable code corresponding to the service logic description can be generated according to the service logic description and the message processing template matched with the goods inspection table. Specifically, it is first determined whether the goods inspection table comes from Kafka (a high-throughput distributed publish-subscribe messaging system) messages or from messages of another message middleware system, and a different message processing template is generated for each message source; the message middleware system message template is shown in fig. 4. Mathematical model abstraction then starts: the word segments belonging to the service logic description are replaced with the corresponding fields in the word segmentation table, the Chinese keywords are replaced with Java character sequences, the production formulas are replaced with character sequences that conform to Java conditional syntax, and auxiliary logic such as condition checks and exception handling is filled in. At the same time, a Java file is generated and compiled into a class file.
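The template-filling idea can be sketched as below. Everything concrete here is an assumption for illustration: the two templates stand in for the real message processing templates (fig. 4), the field names `invoice_code` and `check_time` are invented, and the mapping of "is not" to "!=" is a hypothetical production formula. The sketch only produces Java source text; it does not compile it.

```python
from string import Template
from typing import Dict, List

# Hypothetical message processing templates, one per message source.
TEMPLATES: Dict[str, Template] = {
    "kafka": Template(
        "// handler for a Kafka message\n"
        "if ($condition) {\n"
        "    $action;\n"
        "}\n"
    ),
    "other": Template(
        "// handler for another message middleware system\n"
        "if ($condition) {\n"
        "    $action;\n"
        "}\n"
    ),
}


def generate_java(message_source: str, condition_words: List[str],
                  action: str, replacements: Dict[str, str]) -> str:
    """Pick a template by message source and substitute fields and code keywords."""
    template = TEMPLATES.get(message_source, TEMPLATES["other"])
    condition = " ".join(replacements.get(word, word) for word in condition_words)
    return template.substitute(condition=condition, action=action)


# Hypothetical replacements: word segment -> field, Chinese keyword -> code keyword.
REPLACEMENTS = {"invoice number": "invoice_code", "is not": "!=", "undefined": "null"}
print(generate_java("kafka", ["invoice number", "is not", "undefined"],
                    "update(invoice_code, check_time)", REPLACEMENTS))
```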

After the code generator generates the code, a unit test can be performed using the data in the goods inspection table, that is, testing whether the service logic in the generated code conforms to the service logic description in the service requirement information. After the test passes, the generated code is run to perform the data processing.
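The test-then-run order can be sketched as a simple gate. The names `run_if_verified` and `should_update`, the row layout, and the sample rows are all hypothetical; in the platform the logic under test would be the generated code and the samples would come from the goods inspection table.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def run_if_verified(process: Callable[[Row], Any],
                    test_cases: Iterable[Tuple[Row, Any]],
                    data: Iterable[Row]) -> List[Any]:
    """Run `process` over `data` only after every test case passes."""
    for sample, expected in test_cases:
        actual = process(sample)
        if actual != expected:
            raise ValueError(
                f"logic failed on {sample!r}: got {actual!r}, expected {expected!r}")
    return [process(row) for row in data]


def should_update(row: Row) -> bool:
    # Stand-in for generated logic: update only when the invoice number is defined.
    return row.get("invoice_code") is not None


TEST_CASES = [({"invoice_code": "A1"}, True), ({"invoice_code": None}, False)]
DATA = [{"invoice_code": "B2"}, {"invoice_code": None}]
print(run_if_verified(should_update, TEST_CASES, DATA))
```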

The data processing platform may further include a monitoring module for printing (outputting) a system log of the data processing platform.

Based on the above description of the data processing platform according to the embodiment of the present invention, a schematic configuration diagram of the data processing platform according to the embodiment of the present invention can be seen in fig. 5.

Integrating the modules into a single platform helps unify the management standard of the data processing flow. Moreover, the agile data processing platform of this embodiment automates the two key links of data processing (data source determination and code generation): developers do not need to write a single line of code and only need to describe the business logic in Chinese on a visual interface, and the platform automatically locates the data source and writes the processing program. In addition, a visual testing module is provided to automatically test the code generated by the code generator, so that the data processing flow is completed quickly, time is saved, and the unification and standardization of data processing are facilitated.

Fig. 6 is a schematic diagram of main blocks of a data processing apparatus according to an embodiment of the present invention.

As shown in fig. 6, a data processing apparatus 600 according to an embodiment of the present invention mainly includes: a requirement receiving module 601, a first data source determining module 602, and a data processing module 603.

The requirement receiving module 601 is configured to receive input service requirement information, where the service requirement information includes service logic description.

The first data source determining module 602 is configured to perform logic analysis on the service requirement information by using pre-stored service description information to determine a data source corresponding to the service requirement information.

The first data source determination module 602 may be specifically configured to:

segmenting the service requirement information to obtain the word segments of the service requirement;

calculating, by using the service description information pre-stored in each data source, the total word-frequency weight value of the word segments of the service requirement in each data source;

and determining the data source corresponding to the service requirement information according to the calculated total word-frequency weight values.

The first data source determination module 602 may include a total word-frequency weight value determination sub-module configured to:

obtaining the word frequency of each word segment of the service requirement in the data source according to the number of times that word segment occurs in the word segment set of the data source, wherein the word segment set of the data source is obtained by segmenting the service description information pre-stored in the data source;

and obtaining the total word-frequency weight value of the word segments of the service requirement in the data source according to the weight value of each word segment of the service requirement in the data source.

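The first data source determination module 602 may further include a data source determination sub-module for: determining, as the data source corresponding to the service requirement information, the data source with the largest total word-frequency weight value among the total word-frequency weight values greater than a preset threshold.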

The data processing apparatus 600 may further comprise a second data source determining module for: if none of the total word-frequency weight values of the word segments of the service requirement in the data sources is greater than a preset threshold, determining the data source corresponding to the service requirement information according to a preset requirement circulation rule.

The second data source determination module may include a review analysis sub-module for: determining through review analysis that the service requirement information conforms to the preset specification, and then storing the word segments of the service requirement in the determined data source corresponding to the service requirement information.

The data processing module 603 is configured to execute an executable code corresponding to the service logic description by using data in the data source corresponding to the service requirement information, so as to obtain a data processing result.

The data processing apparatus 600 may further comprise a code generation module for: and generating an executable code corresponding to the service logic description according to the service logic description and a message processing template matched with the data source corresponding to the service requirement information.

The code generation module is specifically configured to: extract Chinese keywords from the service logic description, and extract the word segments belonging to the service logic description from the word segments of the service requirement; and generate executable code by using the code keywords corresponding to the Chinese keywords and the fields that correspond, in the determined data source, to the word segments belonging to the service logic description.

The data processing apparatus 600 may further comprise a testing module for: judging, by using the data in the data source corresponding to the service requirement information, whether the service logic in the executable code conforms to the service logic description. If so, the data processing module 603 executes the executable code corresponding to the service logic description by using the data in the data source corresponding to the service requirement information to obtain a data processing result; otherwise, the executable code is corrected so that its service logic conforms to the service logic description, and the data processing is then executed.

The data processing apparatus 600 of the embodiment of the present invention may be implemented in the agile data processing platform, and a module of the agile data processing platform that has the same function as a module of the data processing apparatus 600 may serve as that module. For example, the visual requirement input interface and the data entry module together may serve as the requirement receiving module 601, the logic analysis system may implement the functions of the first data source determining module 602 and the second data source determining module, and the code generator may implement the functions of the code generation module and the testing module. The data processing module 603 is not shown in the agile data processing platform of fig. 5. However, as the introduction of the agile data processing platform explains, after the code generated by the code generator has been tested, the generated code can be run to perform data processing. A person skilled in the art can therefore readily understand that the agile data processing platform can implement the function of the data processing module 603 by running the generated code.

In addition, the detailed implementation of the data processing apparatus in the embodiment of the present invention has been described in detail in the above data processing method, and therefore, the repeated content will not be described again.

Fig. 7 shows an exemplary system architecture 700 of a data processing method or data processing apparatus to which embodiments of the present invention may be applied.

As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. Network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may have installed thereon various communication client applications, such as a shopping-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).

The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 705 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by users using the terminal devices 701, 702, 703. The background management server may analyze and otherwise process received data such as a product information query request, and feed back the processing result (for example, target push information or product information, by way of example only) to the terminal device.

It should be noted that the data processing method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the data processing apparatus is generally disposed in the server 705.

It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 8, a block diagram is shown of a computer system 800 suitable for implementing a server according to embodiments of the present application. The server shown in FIG. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in FIG. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the system 800. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read from it can be installed into the storage section 808 as necessary.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the main step diagrams may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the main step diagrams. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. When executed by the Central Processing Unit (CPU) 801, the computer program performs the above-described functions defined in the system of the present application.

It should be noted that the computer readable medium described in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The main step diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the main step diagrams or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or main step diagrams, and combinations of blocks in the block diagrams or main step diagrams, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
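As an illustration of the point that two blocks shown in succession may run substantially concurrently, the following minimal sketch runs two independent steps first in order and then in parallel. The step functions `load_orders` and `load_refunds` and their data are hypothetical placeholders, not steps defined by the embodiment.

```python
from concurrent.futures import ThreadPoolExecutor


def load_orders():
    # Hypothetical first block: returns order amounts.
    return [3, 7]


def load_refunds():
    # Hypothetical second block: returns refund amounts; independent of load_orders.
    return [1]


def run_in_order():
    # The two blocks run one after the other, as drawn in a diagram.
    orders = load_orders()
    refunds = load_refunds()
    return sum(orders) - sum(refunds)


def run_concurrently():
    # Both blocks are submitted before either result is awaited.
    with ThreadPoolExecutor(max_workers=2) as pool:
        orders = pool.submit(load_orders)
        refunds = pool.submit(load_refunds)
        return sum(orders.result()) - sum(refunds.result())


in_order_result = run_in_order()
concurrent_result = run_concurrently()
```

Because neither step depends on the other's output, both orderings produce the same result; this would not hold if the second block consumed the first block's result.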

The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including a requirement receiving module 601, a first data source determining module 602, and a data processing module 603. In some cases the names of these modules do not limit the modules themselves; for example, the requirement receiving module 601 may also be described as a "module for receiving input service requirement information".
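A software realization of this module division might look roughly like the sketch below. It is illustrative only: the class names mirror modules 601 to 603, but the keyword-matching rule, the in-memory data sources, the description format, and the `sum`/`count` operations are all assumptions introduced for the example and are not specified by the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Requirement:
    """Service requirement information, including a service logic description."""
    text: str   # free-text requirement, e.g. "total order amount"
    logic: str  # service logic description; here simply an operation name (assumed)


class RequirementReceivingModule:  # corresponds to module 601
    def receive(self, text: str, logic: str) -> Requirement:
        return Requirement(text=text, logic=logic)


class DataSourceDeterminingModule:  # corresponds to module 602
    def __init__(self, descriptions: Dict[str, List[str]]):
        # Pre-stored service description information.
        # Assumed format: data source name -> keywords describing it.
        self.descriptions = descriptions

    def determine(self, req: Requirement) -> str:
        words = req.text.lower().split()
        for source, keywords in self.descriptions.items():
            if any(keyword in words for keyword in keywords):
                return source
        raise LookupError("no data source matches the requirement")


class DataProcessingModule:  # corresponds to module 603
    # Assumed mapping from a logic description to executable code.
    OPERATIONS: Dict[str, Callable[[List[float]], float]] = {"sum": sum, "count": len}

    def __init__(self, sources: Dict[str, List[float]]):
        self.sources = sources

    def process(self, req: Requirement, source: str) -> float:
        return self.OPERATIONS[req.logic](self.sources[source])


# Hypothetical pre-stored descriptions and data.
descriptions = {"orders": ["order", "amount"], "users": ["user", "login"]}
sources = {"orders": [10.0, 20.5, 4.5], "users": [1.0, 2.0, 3.0]}

receiver = RequirementReceivingModule()
matcher = DataSourceDeterminingModule(descriptions)
processor = DataProcessingModule(sources)

requirement = receiver.receive("total order amount", "sum")
matched_source = matcher.determine(requirement)          # "orders"
result = processor.process(requirement, matched_source)  # 35.0
```

Keeping the three responsibilities in separate classes follows the module division in the text; a real matcher would presumably use richer logic analysis than keyword lookup.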

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform: receiving input service requirement information, wherein the service requirement information comprises a service logic description; performing logic analysis on the service requirement information by using pre-stored service description information to determine a data source corresponding to the service requirement information; and executing the executable code corresponding to the service logic description by using the data in the data source to obtain a data processing result.

According to the technical solution of the embodiment of the invention, the input service requirement information is logically analyzed using the pre-stored service description information to determine the corresponding data source, so that the data source can be matched automatically according to the service requirement information and the work is highly quantifiable. Executable code corresponding to the service logic description in the service requirement information is generated from the message processing template matched to that data source, so that code writing and debugging can be automated and non-programmers can develop data processing code. The resulting code has a uniform style and is easy to modify and maintain. This simplifies and accelerates the data processing operation, saves time, improves data processing efficiency, and facilitates the unification and standardization of data processing.
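Generating executable code from a template matched to the data source could be sketched as follows. This is a minimal illustration under stated assumptions: the template text, the `"list"` data source type, the operation whitelist, and the function names `generate_code` and `execute` are hypothetical and do not come from the embodiment.

```python
from string import Template

# Hypothetical processing templates, keyed by data source type (assumed).
TEMPLATES = {
    "list": Template(
        "def process(rows):\n"
        "    return ${func}(row['${field}'] for row in rows)\n"
    ),
}

# Only these operations may be generated, because the result is executed.
ALLOWED_FUNCS = {"sum", "max", "min"}


def generate_code(source_type: str, func: str, field: str) -> str:
    """Fill the template matched to the data source with the logic parameters."""
    if func not in ALLOWED_FUNCS:
        raise ValueError(f"unsupported operation: {func}")
    if not field.isidentifier():
        raise ValueError(f"invalid field name: {field}")
    return TEMPLATES[source_type].substitute(func=func, field=field)


def execute(code: str, rows):
    """Run the generated code against data taken from the data source."""
    namespace = {}
    exec(code, namespace)  # defines process() inside namespace
    return namespace["process"](rows)


rows = [{"amount": 3}, {"amount": 7}]        # hypothetical data source contents
code = generate_code("list", "sum", "amount")
result = execute(code, rows)                 # 10
```

Since every generated function comes from the same template, the output has a uniform style, and changing the template changes all generated code in one place. Validating the parameters before substitution matters here because the generated text is passed to `exec`.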

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
