Data processing method, device, equipment and medium based on deep learning model

Document No.: 1889427 Publication date: 2021-11-26

Reading note: this technology, Data processing method, device, equipment and medium based on deep learning model (基于深度学习模型的数据处理方法、装置、设备及介质), was created by 蒋佳峻成杰峰 on 2021-08-31. Abstract: The embodiment of the invention relates to the field of artificial intelligence and discloses a data processing method, apparatus, device, and medium based on a deep learning model. The method comprises: defining data processing parameters and generating a pip package according to the logic parameters corresponding to the definition; receiving an acquisition request for a data set, the request carrying an interface identifier and a plurality of data identifiers; calling the pip package through the interface corresponding to the interface identifier to acquire the training data corresponding to the data identifiers, and preprocessing the training data to obtain a data set; splitting the data set to obtain a training data set, and inputting the training data set into a deep learning model for training to obtain a data recommendation model; and inputting the target interface identifier and the data identifier of the data to be processed, both carried in a data recommendation request, into the data recommendation model to obtain recommendation result data, thereby improving the efficiency and accuracy of data recommendation. The invention also relates to blockchain technology; for example, the data set may be written into a blockchain for use in scenarios such as data forensics.

1. A data processing method based on a deep learning model is characterized by comprising the following steps:

according to a preset compiling rule, defining data processing parameters by using a specified coding language, acquiring logic parameters corresponding to the definition, compiling operator codes according to the logic parameters to generate a pip packet generation script, and generating a pip packet according to the pip packet generation script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data;

receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers;

determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set;

splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model;

acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed.

2. The method according to claim 1, wherein the defining data processing parameters using a specified coding language according to the preset compilation rule comprises:

obtaining a custom operator compiling rule in the preset deep learning model;

and compiling a general operator definition by using the specified coding language according to the custom operator compiling rule so as to realize the definition of the data processing parameters.

3. The method according to claim 1, wherein the calling the pip packet through the interface to obtain the training data corresponding to the data identifier from the database includes:

determining a data acquisition rule for acquiring data from the database according to the definition of the data processing parameters in the pip packet;

and acquiring training data corresponding to the data identification from the database according to the data acquisition rule.

4. The method of claim 1, wherein preprocessing the training data to obtain a data set comprises:

determining a preprocessing rule for preprocessing data according to the definition of the data processing parameters in the pip packet;

and preprocessing the training data corresponding to the data identification acquired from the database according to the preprocessing rule.

5. The method of claim 1, wherein the splitting the data set to obtain a training data set comprises:

splitting the data set according to a preset proportion to obtain the training data set;

wherein the inputting the training data set into a preset deep learning model for training to obtain a data recommendation model comprises:

carrying out batch splitting processing on the training data set according to a preset batch splitting rule to obtain a plurality of sub-training data sets;

and inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model.

6. The method according to claim 5, wherein the inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model comprises:

inputting the plurality of sub-training data sets into the preset deep learning model to obtain a loss function value;

when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep learning model according to the loss function value, and inputting the plurality of sub-training data sets into the deep learning model with the adjusted model parameters for iterative training;

and when the loss function value obtained by iterative training meets the preset condition, determining to obtain the data recommendation model.

7. The method according to claim 6, wherein the inputting the target interface identifier and the data identifier of the to-be-processed data into the data recommendation model to obtain recommendation result data of the to-be-processed data comprises:

inputting the target interface identification and the data identification of the data to be processed into the data recommendation model, and calling a pip packet in the data recommendation model through a target interface corresponding to the target interface identification;

acquiring data to be processed corresponding to the data identification of the data to be processed from the database according to the definition of the data processing parameters in the pip packet;

and preprocessing the data to be processed according to the definition of the data processing parameters in the pip packet to obtain target data, and inputting the target data into the data recommendation model to obtain recommendation result data corresponding to the target data.

8. A data processing apparatus based on a deep learning model, comprising:

the generating unit is used for defining data processing parameters by using a specified coding language according to a preset compiling rule, acquiring logic parameters corresponding to the definition, compiling operator codes according to the logic parameters to generate a pip packet generating script, and generating a pip packet according to the pip packet generating script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data;

the receiving unit is used for receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers;

the processing unit is used for determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set;

the training unit is used for splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model;

and the recommending unit is used for acquiring a data recommending request, wherein the data recommending request carries a target interface identifier and a data identifier of the data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommending model to obtain recommending result data of the data to be processed.

9. A computer device comprising a processor and a memory, wherein the memory is configured to store a computer program and the processor is configured to invoke the computer program to perform the method of any of claims 1-7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1-7.

Technical Field

The invention relates to the field of artificial intelligence, in particular to a data processing method, a data processing device, data processing equipment and data processing media based on a deep learning model.

Background

In recommendation systems for advertisement recommendation, commodity recommendation, and the like, recommendation models have traditionally been built with machine learning, and deep learning is now used more and more. The first step in training a deep learning model is data preprocessing: the raw training data (such as information about users and commodities) must first be acquired from a database, and the acquired data must then be preprocessed into a form that the deep learning model can accept as input.

At present, data is usually acquired from a database by manual coding: the required data is screened, the corresponding preprocessing is performed after screening, and the result is finally persisted as a file. The content of the file is the tensors that the deep learning model accepts during training, so that the model can be trained. However, writing the code for acquiring, screening, and preprocessing the data takes a lot of time, different models require different coding work, and the whole process is long, so the efficiency of data recommendation is low and its accuracy is low. How to improve the efficiency and accuracy of data recommendation has therefore become an important issue.

Disclosure of Invention

The embodiment of the invention provides a data processing method, a data processing device, data processing equipment and a data processing medium based on a deep learning model, which can reduce coding work required by data acquisition, data screening, data preprocessing and the like, and improve the efficiency and accuracy of data recommendation.

In a first aspect, an embodiment of the present invention provides a data processing method based on a deep learning model, including:

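according to a preset compiling rule, defining data processing parameters by using a specified coding language, acquiring logic parameters corresponding to the definition, compiling operator codes according to the logic parameters to generate a pip packet generation script, and generating a pip packet according to the pip packet generation script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data;

receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers;

determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set;

splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model;

acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed.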

Further, the defining the data processing parameters by using a specified coding language according to the preset compiling rule includes:

obtaining a custom operator compiling rule in the preset deep learning model;

and compiling a general operator definition by using the specified coding language according to the custom operator compiling rule so as to realize the definition of the data processing parameters.

Further, the calling the pip packet through the interface to acquire the training data corresponding to the data identifier from the database includes:

determining a data acquisition rule for acquiring data from the database according to the definition of the data processing parameters in the pip packet;

and acquiring training data corresponding to the data identification from the database according to the data acquisition rule.

Further, the preprocessing the training data to obtain a data set includes:

determining a preprocessing rule for preprocessing data according to the definition of the data processing parameters in the pip packet;

and preprocessing the training data corresponding to the data identification acquired from the database according to the preprocessing rule.

Further, the splitting the data set to obtain a training data set includes:

splitting the data set according to a preset proportion to obtain the training data set;

the inputting the training data set into a preset deep learning model for training to obtain a data recommendation model includes:

carrying out batch splitting processing on the training data set according to a preset batch splitting rule to obtain a plurality of sub-training data sets;

and inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model.

Further, the inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model includes:

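inputting the plurality of sub-training data sets into the preset deep learning model to obtain a loss function value;

when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep learning model according to the loss function value, and inputting the plurality of sub-training data sets into the deep learning model with the adjusted model parameters for iterative training;

and when the loss function value obtained by iterative training meets the preset condition, determining to obtain the data recommendation model.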

Further, the inputting the target interface identifier and the data identifier of the to-be-processed data into the data recommendation model to obtain recommendation result data of the to-be-processed data includes:

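inputting the target interface identification and the data identification of the data to be processed into the data recommendation model, and calling a pip packet in the data recommendation model through a target interface corresponding to the target interface identification;

acquiring data to be processed corresponding to the data identification of the data to be processed from the database according to the definition of the data processing parameters in the pip packet;

and preprocessing the data to be processed according to the definition of the data processing parameters in the pip packet to obtain target data, and inputting the target data into the data recommendation model to obtain recommendation result data corresponding to the target data.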

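In a second aspect, an embodiment of the present invention provides a data processing apparatus based on a deep learning model, including a generating unit, a receiving unit, a processing unit, a training unit, and a recommending unit, which are respectively used for performing the steps of the method of the first aspect.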

In a third aspect, an embodiment of the present invention provides a computer device, including a processor and a memory, where the memory is used to store a computer program and the processor is configured to call the computer program to execute the method of the first aspect.

In a fourth aspect, the present invention provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the method of the first aspect.

According to the embodiment of the invention, a data processing parameter can be defined by using a specified coding language according to a preset compiling rule, a logic parameter corresponding to the definition is obtained, an operator code is compiled according to the logic parameter to generate a pip packet generation script, and the pip packet is generated according to the pip packet generation script, wherein the data processing parameter is used for indicating to acquire data from a database and/or preprocess the acquired data; receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers; determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set; splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model; acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed. By the implementation mode, coding work required by data acquisition, data screening, data preprocessing and the like can be reduced, and the efficiency and accuracy of data recommendation are improved.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic flow chart of a data processing method based on a deep learning model according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of a data processing apparatus based on a deep learning model according to an embodiment of the present invention;

FIG. 3 is a schematic block diagram of a computer device provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The data processing method based on the deep learning model provided by the embodiment of the invention can be applied to a data processing device based on the deep learning model, and in some embodiments, the data processing device based on the deep learning model is arranged in computer equipment. In certain embodiments, the computer device includes, but is not limited to, one or more of a smartphone, tablet, laptop, and the like.

In some embodiments, pip refers to the general-purpose Python package management tool, which provides functions such as finding, downloading, installing, and uninstalling Python packages; the pip packet referred to herein is a Python package that can be installed with pip.

According to the embodiment of the invention, the data acquisition, data screening and preprocessing processes are abstracted into one process of deep learning model training, and the data acquisition, data screening and preprocessing logics are packaged into the deep learning model to be trained to obtain the data recommendation model, so that the coding work required by database connection, data acquisition, data screening, data preprocessing, data set splitting, batch conversion and the like can be reduced; the data recommendation result data of the data to be processed is obtained by inputting the data identification and the target interface identification of the data to be processed into the data recommendation model, so that the efficiency and the accuracy of data recommendation are further improved.

The embodiment of the application can acquire and process related data (such as a data set) based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.

The embodiment of the application can be applied to various fields, such as: the field of medical data recommendation, the field of financial data recommendation, and the like.

In one possible implementation, in the field of medical data recommendation, the data may be medical data associated with a medical treatment, such as examination data, assay data, and the like associated with a medical treatment.

The following describes schematically a data processing method based on a deep learning model according to an embodiment of the present invention with reference to fig. 1.

Referring to fig. 1, fig. 1 is a schematic flow chart of a deep learning model-based data processing method according to an embodiment of the present invention, and as shown in fig. 1, the method may be executed by a deep learning model-based data processing apparatus, which is disposed in a computer device. Specifically, the method of the embodiment of the present invention includes the following steps.

S101: defining data processing parameters by using a specified coding language according to a preset compiling rule, acquiring logic parameters corresponding to the definition, compiling operator codes according to the logic parameters to generate a pip packet generation script, and generating a pip packet according to the pip packet generation script.

In the embodiment of the invention, the data processing device based on the deep learning model can define data processing parameters by using a specified coding language according to a preset compiling rule, acquire the logic parameters corresponding to the definition, compile operator codes according to the logic parameters to generate a pip packet generation script, and generate the pip packet according to the pip packet generation script.

In certain embodiments, the specified coding language includes, but is not limited to, C++, and the data processing parameters include input parameters and output parameters.

In one embodiment, when the data processing device based on the deep learning model defines data processing parameters by using a specified coding language according to a preset compiling rule, the data processing device based on the deep learning model can acquire a self-defined operator compiling rule in the preset deep learning model; and compiling a general operator definition by using the specified coding language according to the custom operator compiling rule so as to realize the definition of the data processing parameters.

For example, when defining the data acquisition from a database and normalization, a general operator definition can be written as follows:

REGISTER_OP("RecReadDb")

REGISTER_OP("RecNormalization")

REGISTER_OP("RecRelationExtra")

Here, OP denotes an operator, which is a basic unit of neural network computation; examples include, but are not limited to, convolution (conv), pooling, activation, and normalization.

In one embodiment, when the data processing apparatus based on the deep learning model acquires the logic parameters corresponding to the definition, the logic parameters required to be introduced for implementing the definition may be acquired according to the definition of the data processing parameters, and specifically, the logic parameters required to be introduced may be determined according to the operator definition.

For example, assuming that the operator is defined as the operator RecReadDb, and the logic of the operator is to read data from the database, the logic parameters to be transmitted may be determined to include: database type, connection configuration, SQL which needs to be executed for reading data, fields and types which need to output data, and the like.
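The logic parameters listed above for RecReadDb can be pictured as a small configuration object. The following is a minimal Python sketch under stated assumptions: the class name `ReadDbParams`, its field names, and the sample values are hypothetical and are not taken from the patent; only the four kinds of parameter (database type, connection configuration, SQL, output fields and types) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReadDbParams:
    """Hypothetical container for the logic parameters of the RecReadDb operator."""

    db_type: str                  # database type, e.g. "mysql"
    connection: Dict[str, str]    # connection configuration (host, port, ...)
    sql: str                      # SQL to execute when reading data
    output_fields: Dict[str, str] = field(default_factory=dict)  # field name -> type

    def validate(self) -> None:
        """Raise ValueError if a required logic parameter is missing."""
        if not self.db_type:
            raise ValueError("db_type is required")
        if not self.sql.strip():
            raise ValueError("sql is required")
        if not self.output_fields:
            raise ValueError("at least one output field is required")


# Illustrative values only.
params = ReadDbParams(
    db_type="mysql",
    connection={"host": "localhost", "port": "3306"},
    sql="SELECT user_id, age FROM users",
    output_fields={"user_id": "int64", "age": "int32"},
)
params.validate()
```

Validating the parameters before any operator code is generated is a design choice of this sketch; it surfaces a missing SQL statement or output field early rather than at training time.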

In one embodiment, when compiling the operator code according to the logic parameters to generate the pip package generation script, the deep learning model-based data processing apparatus may write a Python wrapper file according to the logic parameters, so that the operator code can be compiled and the pip package generation script generated.

In an embodiment, when compiling the operator code according to the logic parameters to generate the pip package generation script, the deep learning model-based data processing apparatus may also compile the operator code according to the logic parameters to generate a shared library file.

In one embodiment, the shared library file may be a .so file. When writing the Python wrapper file and compiling the operator code to generate the shared library file and/or the pip package generation script, the deep learning model-based data processing apparatus may write the Python wrapper file according to the operator standard of the deep learning model and compile the operator code to generate the .so file and/or the pip package generation script.
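One way to picture the pip package generation script is as a setuptools `setup.py` that ships the compiled shared library next to the Python wrapper. The sketch below only renders the text of such a script; the package name `rec_ops`, the library name `_rec_ops.so`, and the helper `render_setup_script` are hypothetical, and the patent does not state which packaging tool is actually used.

```python
from string import Template

# Template of a minimal setuptools script. `name`, `version`, `packages` and
# `package_data` are real setuptools.setup() keywords; the values substituted
# below are illustrative assumptions.
SETUP_TEMPLATE = Template('''\
from setuptools import setup

setup(
    name="$package_name",
    version="$version",
    packages=["$package_name"],
    # Ship the compiled operator library alongside the Python wrapper.
    package_data={"$package_name": ["$shared_lib"]},
)
''')


def render_setup_script(package_name: str, version: str, shared_lib: str) -> str:
    """Return the text of a setup.py that bundles the compiled operator library."""
    return SETUP_TEMPLATE.substitute(
        package_name=package_name, version=version, shared_lib=shared_lib
    )


script = render_setup_script("rec_ops", "0.1.0", "_rec_ops.so")
print(script)
```

Under these assumptions, the rendered text could be saved as `setup.py` next to a `rec_ops/` directory containing the wrapper and the .so file, and a wheel could then be built with a command such as `pip wheel .`.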

S102: receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers.

In the embodiment of the invention, a data processing device based on a deep learning model can receive an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers.

In some embodiments, the interface identifier indicates a Python interface, and each data identifier identifies data in a database; the database includes, but is not limited to, MySQL, PostgreSQL, Oracle, and the like.

S103: and determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set.

In the embodiment of the invention, the data processing device based on the deep learning model can determine the interface corresponding to the interface identifier, call the pip packet through the interface to acquire the training data corresponding to the data identifier from the database, and preprocess the training data to obtain the data set.

In one embodiment, when the deep learning model-based data processing apparatus calls the pip packet through the interface to acquire training data corresponding to the data identifier from the database, the data acquisition rule for acquiring data from the database may be determined according to the definition of the data processing parameter in the pip packet; and acquiring training data corresponding to the data identification from the database according to the data acquisition rule.
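The acquisition step can be sketched as a parameterized query keyed by the data identifiers. The example below is an illustration only: it uses an in-memory SQLite database from the Python standard library as a stand-in for the MySQL, PostgreSQL, or Oracle databases mentioned above, and the table `user_behavior`, its columns, and the function `fetch_training_data` are hypothetical.

```python
import sqlite3
from typing import List, Sequence, Tuple


def fetch_training_data(
    conn: sqlite3.Connection, table: str, data_ids: Sequence[int]
) -> List[Tuple]:
    """Fetch the rows whose id is among data_ids, ordered by id."""
    if not data_ids:
        return []
    # One "?" placeholder per identifier keeps the identifiers parameterized.
    placeholders = ", ".join("?" for _ in data_ids)
    # The table name is assumed to come from trusted configuration (the data
    # acquisition rule), never from the request itself.
    query = (
        f"SELECT id, age, clicks FROM {table} "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(query, list(data_ids)).fetchall()


# In-memory stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_behavior (id INTEGER PRIMARY KEY, age INTEGER, clicks INTEGER)"
)
conn.executemany(
    "INSERT INTO user_behavior VALUES (?, ?, ?)",
    [(1, 23, 5), (2, 41, 0), (3, 35, 12), (4, 19, 7)],
)
rows = fetch_training_data(conn, "user_behavior", [1, 3, 4])
```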

In one embodiment, when the deep learning model-based data processing apparatus preprocesses the training data to obtain a data set, a preprocessing rule for preprocessing the data may be determined according to the definition of the data processing parameters in the pip packet; and preprocessing the training data corresponding to the data identification acquired from the database according to the preprocessing rule.

In one embodiment, the pre-processing rules include, but are not limited to, screening rules. In one example, assuming that the training data includes user information, user behavior data, and product information, the deep learning model-based data processing apparatus may filter the user information, the user behavior data, and the product information according to the determined filtering rule.

For example, the data processing apparatus based on the deep learning model may filter user information corresponding to a user age group specified in the filtering rule from among the user information; the user behavior data can also be screened according to the behavior data of a certain product browsed or clicked by the user specified in the screening rule; product information may also be filtered according to product categories, prices, labels, etc. specified in the filtering rules.
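The three screening rules in this example can be sketched as plain filter functions. The record layouts, field names, and thresholds below are hypothetical; the patent names only the screening criteria (user age group, browse or click behavior on a product, and product category, price, or label).

```python
from typing import Dict, Iterable, List, Set


def filter_users(users: Iterable[Dict], min_age: int, max_age: int) -> List[Dict]:
    """Keep users whose age falls within the age group named by the rule."""
    return [u for u in users if min_age <= u["age"] <= max_age]


def filter_behaviors(
    behaviors: Iterable[Dict], product_id: int, actions: Set[str]
) -> List[Dict]:
    """Keep behavior records of the given kinds that concern the given product."""
    return [
        b for b in behaviors
        if b["product_id"] == product_id and b["action"] in actions
    ]


def filter_products(
    products: Iterable[Dict], categories: Set[str], max_price: float
) -> List[Dict]:
    """Keep products in the allowed categories at or below the price ceiling."""
    return [
        p for p in products
        if p["category"] in categories and p["price"] <= max_price
    ]


users = [{"id": 1, "age": 17}, {"id": 2, "age": 25}, {"id": 3, "age": 34}]
behaviors = [
    {"user_id": 2, "product_id": 10, "action": "click"},
    {"user_id": 2, "product_id": 11, "action": "browse"},
    {"user_id": 3, "product_id": 10, "action": "purchase"},
]
products = [
    {"id": 10, "category": "books", "price": 12.5},
    {"id": 11, "category": "toys", "price": 30.0},
    {"id": 12, "category": "books", "price": 80.0},
]

selected_users = filter_users(users, min_age=18, max_age=30)
selected_behaviors = filter_behaviors(behaviors, product_id=10, actions={"browse", "click"})
selected_products = filter_products(products, categories={"books"}, max_price=50.0)
```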

Defining the data preprocessing parameters in the pip package makes data preprocessing automated and intelligent, improves its efficiency, and avoids the low efficiency of preprocessing data by manual coding in the prior art.

S104: and splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model.

In the embodiment of the invention, the data processing device based on the deep learning model can split the data set to obtain the training data set, and the training data set is input into the preset deep learning model to be trained to obtain the data recommendation model.

In an embodiment, when the data processing apparatus based on the deep learning model splits the data set to obtain a training data set, the data processing apparatus may split the data set according to a preset ratio to obtain the training data set.

In some embodiments, when the data set is split according to a preset ratio, the deep learning model-based data processing apparatus may split the data set into several sets, such as a training data set, a testing data set, and a verification data set, according to the preset ratio.

For example, if the data set has 100 pieces of data, and the data set is sequentially split according to the preset ratio of 0.7/0.15/0.15, the training data set has 70 pieces, the test data set has 15 pieces, and the verification data set has 15 pieces.

In some embodiments, the splitting manner may be random splitting, or may be extraction splitting according to distribution of some important features in the data.
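A minimal sketch of the proportional split follows, covering the sequential case from the 100-item example and, optionally, a random split. The function name `split_dataset` and its `seed` parameter are hypothetical, and splitting by the distribution of important features (stratified extraction) is not shown.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_dataset(
    data: Sequence,
    ratios: Tuple[float, float, float] = (0.7, 0.15, 0.15),
    seed: Optional[int] = None,
) -> Tuple[List, List, List]:
    """Split data into (training, test, verification) sets by the given ratios.

    With seed=None the split is sequential; with a seed the items are
    shuffled first, which gives a reproducible random split.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    items = list(data)
    if seed is not None:
        random.Random(seed).shuffle(items)
    n_train = round(len(items) * ratios[0])
    n_test = round(len(items) * ratios[1])
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    verification = items[n_train + n_test:]  # the remainder
    return train, test, verification


train_set, test_set, verification_set = split_dataset(range(100))
print(len(train_set), len(test_set), len(verification_set))  # 70 15 15
```

Assigning the remainder to the last set, rather than rounding all three sizes independently, guarantees that every item lands in exactly one set.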

In an embodiment, when the deep learning model-based data processing apparatus inputs the training data set into a preset deep learning model for training to obtain a data recommendation model, the deep learning model-based data processing apparatus may perform batch splitting processing on the training data set according to a preset batch splitting rule to obtain a plurality of sub-training data sets; and inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model.

For example, assuming that the training data set contains 1000 pieces of data and the preset batch splitting rule is batch_size = 50, the number of batches is 1000/50 = 20. That is, splitting yields 20 sub-training data sets, each containing 50 pieces of data.
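The batch splitting in this example can be sketched as follows. This is a minimal illustration; the function name `split_into_batches` is an assumption, and the behavior when the total is not divisible by the batch size (a shorter final batch) is a choice of this sketch that the embodiment does not specify.

```python
def split_into_batches(training_set, batch_size):
    """Cut the training set into consecutive sub-training sets of batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        training_set[start:start + batch_size]
        for start in range(0, len(training_set), batch_size)
    ]


training_set = list(range(1000))  # stand-in for 1000 pieces of training data
batches = split_into_batches(training_set, batch_size=50)
print(len(batches), len(batches[0]))  # 20 50
```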

In an embodiment, when the data recommendation model is obtained by inputting the plurality of sub-training data sets into the preset deep learning model for training, the deep learning model-based data processing apparatus may input the plurality of sub-training data sets into the preset deep learning model to obtain the loss function value; when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep learning model according to the loss function value, and inputting the plurality of sub-training data sets into the deep learning model with the adjusted model parameters for iterative training; and when the loss function value obtained by iterative training meets the preset condition, determining to obtain the data recommendation model.
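The iterative training described above can be illustrated with a deliberately small sketch. It assumes a one-parameter linear model, a mean squared error loss, gradient descent as the parameter adjustment, and "loss at or below a threshold" as the preset condition. None of these choices is stated in the embodiment, which leaves the model, the loss function, and the condition open.

```python
def mean_squared_loss(w, batches):
    """Loss function value of the model y = w * x over all sub-training sets."""
    errors = [(w * x - y) ** 2 for batch in batches for x, y in batch]
    return sum(errors) / len(errors)


def train_until_converged(batches, lr=0.05, loss_threshold=1e-6, max_iters=1000):
    w = 0.0  # the single model parameter of this toy model
    for _ in range(max_iters):
        loss = mean_squared_loss(w, batches)
        if loss <= loss_threshold:  # preset condition met: training is finished
            return w, loss
        for batch in batches:  # otherwise adjust the parameter batch by batch
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w, mean_squared_loss(w, batches)


# Hypothetical training data generated from y = 3x, split into batches of 5.
samples = [(x / 10, 3.0 * x / 10) for x in range(1, 21)]
batches = [samples[i:i + 5] for i in range(0, len(samples), 5)]
w, loss = train_until_converged(batches)
print(round(w, 2))
```

The `max_iters` bound is added only so that the sketch terminates even if the preset condition is never met.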

S105: acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed.

In the embodiment of the invention, a data processing device based on a deep learning model can obtain a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and the target interface identifier and the data identifier of the data to be processed are input into the data recommendation model to obtain recommendation result data of the data to be processed.

In one embodiment, when the data processing apparatus based on the deep learning model inputs the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed, the data processing apparatus may input the target interface identifier and the data identifier of the data to be processed into the data recommendation model, and call a pip packet in the data recommendation model through a target interface corresponding to the target interface identifier; acquiring data to be processed corresponding to the data identification of the data to be processed from the database according to the definition of the data processing parameters in the pip packet; and preprocessing the data to be processed according to the definition of the data processing parameters in the pip packet to obtain target data, and inputting the target data into the data recommendation model to obtain recommendation result data corresponding to the target data.
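The recommendation flow above (look up the target interface, acquire the data by its identifier, preprocess it, then feed the target data to the model) can be sketched as follows. Everything named here is hypothetical: the in-memory `DATABASE`, the `INTERFACES` registry that stands in for the packaged acquisition and preprocessing logic, and the rule-based `toy_model` that stands in for a trained data recommendation model.

```python
# Hypothetical in-memory stand-in for the database.
DATABASE = {
    "d1": {"age": 30, "clicks": 12},
    "d2": {"age": 45, "clicks": 3},
}


def fetch(data_id):
    """Acquire the data to be processed by its data identifier."""
    return DATABASE[data_id]


def preprocess(record):
    """Turn a raw record into the target data (a scaled feature list)."""
    return [record["age"] / 100, record["clicks"] / 100]


# Hypothetical registry: interface identifier -> packaged acquisition/preprocessing logic.
INTERFACES = {"iface_v1": (fetch, preprocess)}


def toy_model(features):
    """Placeholder for the trained model; a fixed rule, not a learned one."""
    return "recommend" if features[1] >= 0.1 else "skip"


def recommend(interface_id, data_id, model=toy_model):
    fetch_fn, preprocess_fn = INTERFACES[interface_id]  # resolve the target interface
    raw = fetch_fn(data_id)                             # acquire data to be processed
    target = preprocess_fn(raw)                         # preprocess into target data
    return model(target)                                # recommendation result data


print(recommend("iface_v1", "d1"))  # recommend
print(recommend("iface_v1", "d2"))  # skip
```

The caller supplies only the two identifiers; the acquisition and preprocessing logic stays behind the interface.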

In the embodiment of the invention, a data processing device based on a deep learning model can define data processing parameters by using a specified coding language according to a preset compiling rule, acquire logic parameters corresponding to the definition, compile operator codes according to the logic parameters to generate a pip packet generation script, and generate a pip packet according to the pip packet generation script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data; receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers; determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set; splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model; acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed. 
According to the embodiment of the invention, data acquisition, data screening, and preprocessing are abstracted into a single stage of deep learning model training, and the corresponding logic is packaged into the deep learning model that is trained to obtain the data recommendation model. This reduces the coding work required for database connection, data acquisition, data screening, data preprocessing, data set splitting, batch conversion, and the like. The recommendation result data of the data to be processed is obtained by inputting its data identifier and the target interface identifier into the data recommendation model, which further improves the efficiency and accuracy of data recommendation.

The embodiment of the invention also provides a deep learning model-based data processing apparatus, which includes units for executing the method of any one of the foregoing embodiments. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of a data processing apparatus based on a deep learning model according to an embodiment of the present invention. The data processing apparatus based on the deep learning model of this embodiment comprises: a generating unit 201, a receiving unit 202, a processing unit 203, a training unit 204 and a recommending unit 205.

The generating unit 201 is configured to define a data processing parameter by using a specified coding language according to a preset compiling rule, acquire a logic parameter corresponding to the definition, compile an operator code according to the logic parameter to generate a pip packet generating script, and generate a pip packet according to the pip packet generating script, where the data processing parameter is used to instruct to acquire data from a database and/or to preprocess the acquired data;

a receiving unit 202, configured to receive an acquisition request of a data set, where the acquisition request of the data set carries an interface identifier and multiple data identifiers;

the processing unit 203 is configured to determine an interface corresponding to the interface identifier, call the pip packet through the interface to obtain training data corresponding to the data identifier from the database, and pre-process the training data to obtain a data set;

the training unit 204 is configured to split the data set to obtain a training data set, and input the training data set into a preset deep learning model for training to obtain a data recommendation model;

the recommending unit 205 is configured to obtain a data recommending request, where the data recommending request carries a target interface identifier and a data identifier of data to be processed, and input the target interface identifier and the data identifier of the data to be processed into the data recommending model to obtain recommending result data of the data to be processed.

Further, when the generating unit 201 defines the data processing parameter by using a specified coding language according to a preset compiling rule, the generating unit is specifically configured to:

obtaining a custom operator compiling rule in the preset deep learning model;

and compiling a general operator definition by using the specified coding language according to the custom operator compiling rule so as to realize the definition of the data processing parameters.

Further, when the processing unit 203 calls the pip packet through the interface to obtain the training data corresponding to the data identifier from the database, the processing unit is specifically configured to:

determining a data acquisition rule for acquiring data from the database according to the definition of the data processing parameters in the pip packet;

and acquiring training data corresponding to the data identification from the database according to the data acquisition rule.

Further, when the processing unit 203 preprocesses the training data to obtain a data set, it is specifically configured to:

determining a preprocessing rule for preprocessing data according to the definition of the data processing parameters in the pip packet;

and preprocessing the training data corresponding to the data identification acquired from the database according to the preprocessing rule.

Further, when the training unit 204 splits the data set to obtain a training data set, it is specifically configured to:

splitting the data set according to a preset proportion to obtain the training data set;

when the training unit 204 inputs the training data set into a preset deep learning model for training to obtain a data recommendation model, the training unit is specifically configured to:

carrying out batch splitting processing on the training data set according to a preset batch splitting rule to obtain a plurality of sub-training data sets;

and inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model.

Further, when the training unit 204 inputs the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model, the training unit is specifically configured to:

inputting the plurality of sub-training data sets into the preset deep learning model to obtain a loss function value;

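when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep learning model according to the loss function value, and inputting the plurality of sub-training data sets into the deep learning model with the adjusted model parameters for iterative training;

and when the loss function value obtained by iterative training meets the preset condition, determining to obtain the data recommendation model.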

Further, when the recommending unit 205 inputs the target interface identifier and the data identifier of the to-be-processed data into the data recommending model to obtain recommendation result data of the to-be-processed data, the recommending unit is specifically configured to:

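inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model, and calling a pip packet in the data recommendation model through a target interface corresponding to the target interface identifier;

acquiring data to be processed corresponding to the data identification of the data to be processed from the database according to the definition of the data processing parameters in the pip packet;

and preprocessing the data to be processed according to the definition of the data processing parameters in the pip packet to obtain target data, and inputting the target data into the data recommendation model to obtain recommendation result data corresponding to the target data.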

Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device provided in an embodiment of the present invention. In some embodiments, the computer device shown in fig. 3 may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, the input device 302, the output device 303, and the memory 304 are connected by a bus 305. The memory 304 is used for storing a computer program, and the processor 301 is used for executing the program stored in the memory 304. The processor 301 is configured to invoke the program to perform:

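according to a preset compiling rule, defining data processing parameters by using a specified coding language, acquiring logic parameters corresponding to the definition, compiling operator codes according to the logic parameters to generate a pip packet generation script, and generating a pip packet according to the pip packet generation script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data;

receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers;

determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set;

splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model;

acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed.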

Further, when the processor 301 defines the data processing parameters by using the specified coding language according to the preset compiling rule, the processor is specifically configured to:

obtaining a custom operator compiling rule in the preset deep learning model;

and compiling a general operator definition by using the specified coding language according to the custom operator compiling rule so as to realize the definition of the data processing parameters.

Further, when the processor 301 calls the pip packet through the interface to obtain the training data corresponding to the data identifier from the database, the processor is specifically configured to:

determining a data acquisition rule for acquiring data from the database according to the definition of the data processing parameters in the pip packet;

and acquiring training data corresponding to the data identification from the database according to the data acquisition rule.

Further, when the processor 301 preprocesses the training data to obtain a data set, the processor is specifically configured to:

determining a preprocessing rule for preprocessing data according to the definition of the data processing parameters in the pip packet;

and preprocessing the training data corresponding to the data identification acquired from the database according to the preprocessing rule.

Further, when the processor 301 splits the data set to obtain a training data set, it is specifically configured to:

splitting the data set according to a preset proportion to obtain the training data set;

when the processor 301 inputs the training data set into a preset deep learning model for training to obtain a data recommendation model, the processor is specifically configured to:

carrying out batch splitting processing on the training data set according to a preset batch splitting rule to obtain a plurality of sub-training data sets;

and inputting the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model.

Further, when the processor 301 inputs the plurality of sub-training data sets into the preset deep learning model for training to obtain the data recommendation model, the processor is specifically configured to:

inputting the plurality of sub-training data sets into the preset deep learning model to obtain a loss function value;

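when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep learning model according to the loss function value, and inputting the plurality of sub-training data sets into the deep learning model with the adjusted model parameters for iterative training;

and when the loss function value obtained by iterative training meets the preset condition, determining to obtain the data recommendation model.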

Further, when the processor 301 inputs the target interface identifier and the data identifier of the to-be-processed data into the data recommendation model to obtain recommendation result data of the to-be-processed data, the processor is specifically configured to:

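inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model, and calling a pip packet in the data recommendation model through a target interface corresponding to the target interface identifier;

acquiring data to be processed corresponding to the data identification of the data to be processed from the database according to the definition of the data processing parameters in the pip packet;

and preprocessing the data to be processed according to the definition of the data processing parameters in the pip packet to obtain target data, and inputting the target data into the data recommendation model to obtain recommendation result data corresponding to the target data.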

In the embodiment of the invention, computer equipment can define data processing parameters by using a specified coding language according to a preset compiling rule, acquire logic parameters corresponding to the definition, compile operator codes according to the logic parameters to generate a pip packet generation script, and generate a pip packet according to the pip packet generation script, wherein the data processing parameters are used for indicating to acquire data from a database and/or preprocess the acquired data; receiving an acquisition request of a data set, wherein the acquisition request of the data set carries an interface identifier and a plurality of data identifiers; determining an interface corresponding to the interface identifier, calling the pip packet through the interface to acquire training data corresponding to the data identifier from the database, and preprocessing the training data to obtain a data set; splitting the data set to obtain a training data set, inputting the training data set into a preset deep learning model for training to obtain a data recommendation model; acquiring a data recommendation request, wherein the data recommendation request carries a target interface identifier and a data identifier of data to be processed, and inputting the target interface identifier and the data identifier of the data to be processed into the data recommendation model to obtain recommendation result data of the data to be processed. 
According to the embodiment of the invention, data acquisition, data screening, and preprocessing are abstracted into a single stage of deep learning model training, and the corresponding logic is packaged into the deep learning model that is trained to obtain the data recommendation model. This reduces the coding work required for database connection, data acquisition, data screening, data preprocessing, data set splitting, batch conversion, and the like. The recommendation result data of the data to be processed is obtained by inputting its data identifier and the target interface identifier into the data recommendation model, which further improves the efficiency and accuracy of data recommendation.

It should be understood that, in the embodiment of the present invention, the processor 301 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor or any conventional processor.

The input device 302 may include a touch pad, a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.

The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.

In specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation described in the method embodiment shown in fig. 1 provided in this embodiment of the present invention, and may also execute the implementation of the data processing apparatus based on the deep learning model described in fig. 2 in this embodiment of the present invention, which is not described herein again.

The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for processing data based on a deep learning model described in the embodiment corresponding to fig. 1 may be implemented, or the data processing apparatus based on a deep learning model described in the embodiment corresponding to fig. 2 may also be implemented, which is not described herein again.

The computer-readable storage medium may be an internal storage unit of the deep learning model-based data processing apparatus according to any of the foregoing embodiments, for example, a hard disk or a memory of the apparatus. The computer-readable storage medium may also be an external storage device of the deep learning model-based data processing apparatus, such as a plug-in hard disk provided on the apparatus, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the deep learning model-based data processing apparatus. The computer-readable storage medium is used for storing the computer program and other programs and data required by the deep learning model-based data processing apparatus. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product stored in a computer-readable storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned computer-readable storage media comprise: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and various other media capable of storing program code. The computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.

It is emphasized that the data may also be stored in a node of a blockchain in order to further ensure the privacy and security of the data. A blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains information about a batch of network transactions that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
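To illustrate only the idea of data blocks linked by a cryptographic method, the following sketch chains blocks by storing in each block the SHA-256 hash of the previous one, so that altering stored data invalidates the chain. It is a simplified assumption for illustration: it omits distributed storage, point-to-point transmission, and consensus, and the block layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor hash for the first block


def make_block(data, prev_hash):
    """Create a block whose hash covers both its data and the previous hash."""
    payload = json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"data": data, "prev_hash": prev_hash, "hash": digest}


def build_chain(items):
    chain, prev_hash = [], GENESIS_HASH
    for item in items:
        block = make_block(item, prev_hash)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = make_block(block["data"], block["prev_hash"])["hash"]
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["data set v1", "data set v2", "data set v3"])
print(is_valid(chain))  # True
```

Changing the `data` field of any block after the chain is built makes `is_valid` return `False`, which is the property that makes such a structure useful for data forensics.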

The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.
