Text data manufacturing method and device based on deep learning, terminal and storage medium

Document No.: 1378964    Publication date: 2020-08-14

Reading note: this technology, "Text data manufacturing method and device based on deep learning, terminal and storage medium" (基于深度学习的文本数据制造方法、装置、终端及存储介质), was created by Zhou Kangming and Hu Wei on 2020-04-29. Its main content is as follows: the application provides a text data manufacturing method, device, terminal, and storage medium based on deep learning, comprising: preprocessing original character data to generate corresponding character text; performing image processing on the character text to generate a corresponding text image; and constructing a generative adversarial network model for manufacturing sample images, and adding a spatial transformation network as a constraint condition to the constructed model so that the manufactured sample images learn the spatial position information of distorted text images. On the basis of manufacturing data samples with traditional image processing, the invention adds the spatial transformation network to the generative adversarial network, so that generated samples better learn information such as the spatial position of real samples and fit distorted, rotated, and jittered samples more closely. Meanwhile, a penalty factor q is introduced when computing the network loss value, so that the distribution of real-sample features in the generated samples can be adjusted manually according to the distribution of the generated samples.

1. A text data manufacturing method based on deep learning is characterized by comprising the following steps:

preprocessing original character data to generate corresponding character texts;

performing image processing on the character text to generate a corresponding text image;

and constructing a generative adversarial network model for manufacturing sample images, and adding a spatial transformation network as a constraint condition to the constructed model, so that the manufactured sample images learn the spatial position information of distorted text images.

2. The method of claim 1, wherein preprocessing the original character data to generate corresponding character text comprises:

generating a character file in which a correspondence table between Chinese characters and English is recorded;

expanding the character file to generate character files of different language versions;

collecting character files of different font styles for each language version.

3. The method of claim 1, wherein performing image processing on the character text to generate a corresponding text image comprises:

defining input parameters; the input parameters include any one or a combination of: input and output directories, character and font file directories, image size, rotation angle, and generation quantity ratio;

reading the character file with a Chinese character generation function, and presetting font format parameters to generate sample data;

and performing data enhancement processing on the sample data.

4. The method of claim 3, wherein the data enhancement processing of the sample data comprises:

adding any one or more of random noise, dilation and erosion, and channel variation.

5. The method of claim 1, wherein constructing a generative adversarial network model for manufacturing the sample image, and adding a spatial transformation network as a constraint condition to the constructed model so that the manufactured sample image learns spatial position information of the distorted text image, comprises:

constructing a discriminator network for discriminating whether a sample is real or fake;

constructing a generator network for outputting generated samples;

constructing a generative adversarial network model comprising the generator network and the discriminator network;

adding a spatial transformation network as a conditional constraint to the generative adversarial network model;

constructing a loss function for the generative adversarial network model, the loss function comprising a loss term for discrimination on real images and a loss term for discrimination on generated images;

introducing a penalty strength parameter into the loss function, so as to increase the weight of the real-image loss term when that term is high, and to increase the weight of the generated-image loss term when that term is high.

6. The method of claim 5, wherein the generative adversarial network model comprises:

a generator network that takes a three-dimensional noise vector as input and outputs a corresponding fake image;

a discriminator network that takes as inputs a real image and the fake image output by the generator network, respectively;

a spatial transformation network;

wherein the generative adversarial network model outputs, based on the discriminator network and the spatial transformation network, a prediction value indicating whether an image is real or fake.

7. The method of claim 5, comprising:

and processing the input image with the spatial transformation network to compute an affine transformation matrix, and outputting the transformed image based on the affine transformation matrix.

8. A text data manufacturing apparatus based on deep learning, characterized by comprising:

the preprocessing module is used for preprocessing the original character data to generate a corresponding character text;

the image processing module is used for performing image processing on the character text to generate a corresponding text image;

and the model construction module is used for constructing a generative adversarial network model for manufacturing the sample image, and adding a spatial transformation network as a constraint condition to the constructed model, so that the manufactured sample image learns the spatial position information of the distorted text image.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the deep learning-based text data manufacturing method according to any one of claims 1 to 7.

10. An electronic terminal, comprising: a processor and a memory;

the memory is used for storing a computer program;

the processor is configured to execute the computer program stored in the memory to cause the terminal to execute the text data manufacturing method based on deep learning according to any one of claims 1 to 7.

Technical Field

The present application relates to the field of data manufacturing technologies, and in particular, to a text data manufacturing method and apparatus based on deep learning, a terminal, and a storage medium.

Background

With the widespread application of computer vision technology, deep learning has become the foundation of this field. Compared with traditional image processing, deep learning offers faster speed, larger-scale data processing, and stronger generality, which provides a sound basis for introducing deep learning techniques into the manufacturing of data samples and the training upon them.

Deep learning image processing techniques are now widely applied in fields such as object localization, classification and retrieval, and face recognition. Applying the ideas of deep learning to data set production alleviates problems such as data shortage and narrow data distribution that restrict a model's generality, improves the applicability of deep-learning recognition in natural scenes, and is a hot research direction in the field.

However, the conventional method for manufacturing text data is mainly based on traditional image processing. Although deep-learning-based methods are also used for sample manufacturing, their application is not yet comprehensive and is basically confined to specific fields such as face recognition and certain fixed scenes. While traditional image processing can meet the requirements of uniform distribution and large quantity of created data, it often cannot create samples that truly capture the style of a real scene, so much of the created data is redundant and repetitive.

Therefore, how to apply a generative adversarial network, in combination with traditional image processing, to create data closer to the real style from fewer data samples, making the data distribution more uniform and improving the usability of the created data, is a technical problem to be solved in the art.

Summary of the Application

In view of the above-mentioned drawbacks of the prior art, it is an object of the present application to provide a text data manufacturing method, apparatus, terminal and storage medium based on deep learning, which solve the problems in the prior art.

To achieve the above and other related objects, a first aspect of the present application provides a text data manufacturing method based on deep learning, including: preprocessing original character data to generate corresponding character text; performing image processing on the character text to generate a corresponding text image; and constructing a generative adversarial network model for manufacturing sample images, and adding a spatial transformation network as a constraint condition to the constructed model, so that the manufactured sample images learn the spatial position information of distorted text images.

In some embodiments of the first aspect of the present application, preprocessing the original character data to generate corresponding character text includes: generating a character file in which a correspondence table between Chinese characters and English is recorded; expanding the character file to generate character files of different language versions; and collecting character files of different font styles for each language version.
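For illustration only, a minimal preprocessing sketch in Python is given below; the file names, the tab-separated layout, and the sample correspondence entries are assumptions made for the sketch, not details specified by the application.

from pathlib import Path

# Hypothetical correspondence table between Chinese characters and English
# labels; the entries here are illustrative placeholders only.
CORRESPONDENCE = {
    "身": "body",
    "份": "part",
    "证": "certificate",
}

def write_character_file(path: str) -> None:
    """Write a character file recording the Chinese-English correspondence table."""
    with open(path, "w", encoding="utf-8") as f:
        for zh, en in CORRESPONDENCE.items():
            f.write(f"{zh}\t{en}\n")

def collect_font_files(font_dir: str) -> list:
    """Collect font files of different font styles for one language version."""
    return sorted(str(p) for p in Path(font_dir).glob("*.ttf"))

if __name__ == "__main__":
    write_character_file("chars_zh_en.txt")   # assumed file name
    print(collect_font_files("fonts/zh"))     # assumed directory layout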

In some embodiments of the first aspect of the present application, performing image processing on the character text to generate a corresponding text image includes: defining input parameters, where the input parameters include any one or a combination of input and output directories, character and font file directories, image size, rotation angle, and generation quantity ratio; reading the character file with a Chinese character generation function and presetting font format parameters to generate sample data; and performing data enhancement processing on the sample data.
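A hedged sketch of the rendering step, assuming the Pillow library is available; the function name, font size, and layout margins are illustrative choices rather than the application's own parameters.

from PIL import Image, ImageDraw, ImageFont

def render_text_image(text: str, font_path: str, image_size=(256, 64),
                      rotation_angle: float = 0.0) -> Image.Image:
    """Render one line of character text to an image with preset font parameters."""
    font = ImageFont.truetype(font_path, size=40)  # preset font format parameter
    img = Image.new("RGB", image_size, color="white")
    draw = ImageDraw.Draw(img)
    draw.text((10, 10), text, font=font, fill="black")
    if rotation_angle:                             # rotation-angle input parameter
        img = img.rotate(rotation_angle, expand=True, fillcolor="white")
    return img

# Usage: read each line of the character file and render the Chinese column.
# for line in open("chars_zh_en.txt", encoding="utf-8"):
#     render_text_image(line.split("\t")[0], "fonts/zh/example.ttf")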

In some embodiments of the first aspect of the present application, the data enhancement processing on the sample data includes adding any one or more of random noise, dilation and erosion, and channel variation.
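A possible form of this enhancement step, sketched with OpenCV and NumPy; the noise level, kernel size, and the use of a random channel permutation for "channel variation" are assumptions.

import cv2
import numpy as np

def augment(img: np.ndarray) -> np.ndarray:
    """Apply the enhancement operations named above to an HxWx3 uint8 image."""
    # 1) Random Gaussian noise.
    noise = np.random.normal(0.0, 8.0, img.shape).astype(np.float32)
    img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    # 2) Dilation followed by erosion (morphological operations).
    kernel = np.ones((2, 2), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # 3) Channel variation: randomly permute the color channels.
    return img[:, :, np.random.permutation(3)]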

In some embodiments of the first aspect of the present application, constructing a generative adversarial network model for manufacturing the sample image, and adding a spatial transformation network as a constraint condition to the constructed model so that the manufactured sample image learns spatial position information of the distorted text image, includes: constructing a discriminator network for discriminating whether a sample is real or fake; constructing a generator network for outputting generated samples; constructing a generative adversarial network model comprising the generator network and the discriminator network; adding a spatial transformation network as a conditional constraint to the generative adversarial network model; constructing a loss function for the generative adversarial network model, the loss function comprising a loss term for discrimination on real images and a loss term for discrimination on generated images; and introducing a penalty strength parameter into the loss function, so as to increase the weight of the real-image loss term when that term is high, and to increase the weight of the generated-image loss term when that term is high.
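The application does not give the exact form of the penalized loss. The PyTorch sketch below is one plausible reading, in which the penalty strength q up-weights whichever discriminator loss term is currently high; the binary cross-entropy base loss, the threshold, and the weighting scheme are all assumptions.

import torch
import torch.nn.functional as F

def discriminator_loss(d_real: torch.Tensor, d_fake: torch.Tensor,
                       q: float = 2.0, threshold: float = 0.7) -> torch.Tensor:
    """Adversarial loss with a penalty strength parameter q (assumed form).

    d_real and d_fake are discriminator outputs in (0, 1) for real and
    generated images respectively.
    """
    loss_real = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    loss_fake = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    # Up-weight a loss term by q whenever its current value is high.
    w_real = q if loss_real.item() > threshold else 1.0
    w_fake = q if loss_fake.item() > threshold else 1.0
    return w_real * loss_real + w_fake * loss_fake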

In some embodiments of the first aspect of the present application, the generative adversarial network model comprises: a generator network that takes a three-dimensional noise vector as input and outputs a corresponding fake image; a discriminator network that takes as inputs a real image and the fake image output by the generator network, respectively; and a spatial transformation network; wherein the generative adversarial network model outputs, based on the discriminator network and the spatial transformation network, a prediction value indicating whether an image is real or fake.
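One way to wire the three components together, as a non-authoritative sketch: here the spatial transformation network is placed in front of the discriminator, so that real and generated images are both discriminated after spatial alignment. That exact placement is an assumption made for illustration.

import torch
import torch.nn as nn

class GANWithSTN(nn.Module):
    """Compose generator, spatial transformation network, and discriminator."""

    def __init__(self, generator: nn.Module, stn: nn.Module,
                 discriminator: nn.Module):
        super().__init__()
        self.generator = generator          # noise vector -> fake image
        self.stn = stn                      # image -> spatially transformed image
        self.discriminator = discriminator  # image -> real/fake prediction value

    def forward(self, noise: torch.Tensor, real: torch.Tensor):
        fake = self.generator(noise)
        # The STN acts as the constraint: both branches pass through it
        # before the real/fake prediction is made.
        d_real = self.discriminator(self.stn(real))
        d_fake = self.discriminator(self.stn(fake))
        return d_real, d_fake, fake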

In some embodiments of the first aspect of the present application, the method comprises: processing the input image with the spatial transformation network to compute an affine transformation matrix, and outputting the transformed image based on the affine transformation matrix.
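A minimal PyTorch spatial transformer consistent with this description: a localization network regresses the six parameters of a 2x3 affine matrix, a sampling grid is built from that matrix, and the transformed image is produced by resampling. The layer sizes of the localization network are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Localization net -> 2x3 affine matrix -> grid sampling."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(channels, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(10, 6),
        )
        # Initialize the regression layer to the identity transform.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        theta = self.loc(x).view(-1, 2, 3)                  # affine matrix
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)  # transformed image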

To achieve the above and other related objects, a second aspect of the present application provides a text data manufacturing apparatus based on deep learning, comprising: a preprocessing module for preprocessing original character data to generate corresponding character text; an image processing module for performing image processing on the character text to generate a corresponding text image; and a model construction module for constructing a generative adversarial network model for manufacturing sample images and adding a spatial transformation network as a constraint condition to the constructed model, so that the manufactured sample images learn the spatial position information of distorted text images.

To achieve the above and other related objects, a third aspect of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the deep learning-based text data manufacturing method.

To achieve the above and other related objects, a fourth aspect of the present application provides an electronic terminal comprising: a processor and a memory; the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the text data manufacturing method based on deep learning.

As described above, the text data manufacturing method, apparatus, terminal, and storage medium based on deep learning according to the present application have the following advantages: on the basis of manufacturing data samples with traditional image processing, the application adds a spatial transformation network to the generative adversarial network, so that manufactured samples better learn information such as the spatial position of real samples, fit distorted, rotated, and jittered samples more closely, and come closer to real samples. Meanwhile, a penalty factor q is introduced when computing the network loss value, so that the distribution of real-sample features in generated samples can be adjusted manually and dynamically according to the distribution of the generated samples.

Drawings

Fig. 1 is a flowchart illustrating a text data manufacturing method based on deep learning according to an embodiment of the present application.

Fig. 2 is a schematic flow chart illustrating a process of constructing a generative adversarial network model according to an embodiment of the present application.

Fig. 3 is a schematic structural diagram of a generative adversarial network model according to an embodiment of the present application.

Fig. 4A is a schematic structural diagram of a spatial transformation network according to an embodiment of the present application.

Fig. 4B is a schematic diagram illustrating a transformation effect of a spatial transformation network according to an embodiment of the present application.

Fig. 5 is a schematic structural diagram of an apparatus for generating text data based on deep learning according to an embodiment of the present application.

Fig. 6 is a schematic structural diagram of an electronic terminal according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application is provided by way of specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure herein. The present application is capable of other and different embodiments and its several details are capable of modifications and/or changes in various respects, all without departing from the spirit of the present application. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It is noted that in the following description, reference is made to the accompanying drawings, which illustrate several embodiments of the present application. It is to be understood that other embodiments may be utilized and that mechanical, structural, electrical, and operational changes may be made without departing from the spirit and scope of the present application. The following detailed description is not to be taken in a limiting sense, and the scope of embodiments of the present application is defined only by the claims of the issued patent. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. Spatially relative terms, such as "upper," "lower," "left," "right," "above," "below," and the like, may be used herein to facilitate describing one element or feature's relationship to another element or feature as illustrated in the figures.

In this application, unless expressly stated or limited otherwise, the terms "mounted," "connected," "secured," "retained," and the like are to be construed broadly and can, for example, denote a fixed connection, a detachable connection, or an integral connection; a mechanical or electrical connection; a direct connection or an indirect connection through intervening media; or an internal communication between two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.

Also, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, or operations is inherently mutually exclusive in some way.

The existing method for manufacturing text data is mainly based on traditional image processing. Although deep-learning-based methods are used for sample manufacturing, their application is not yet comprehensive and is basically confined to specific fields such as face recognition and certain fixed scenes. While traditional image processing can meet the requirements of uniform distribution and large quantity of created data, it often cannot create samples that truly capture the style of a real scene, so much of the created data is redundant and repetitive.

In view of this, the present invention provides a text data manufacturing method, apparatus, terminal, and storage medium based on deep learning. It should be understood that the method mainly applies deep learning techniques to the manufacturing of text data samples, for example text data in natural scenes such as identity cards and insurance policies that are common in daily life. Because data collected in real life suffer from factors such as narrow sample distribution, uniform data, and deficient types, recognition models learned from such data are not robust enough and do not apply well across scenes.

Therefore, on the basis of manufacturing data samples with traditional image processing, the invention adds a spatial transformation network to the generative adversarial network, so that manufactured samples better learn information such as the spatial position of real samples and fit distorted, rotated, and jittered samples more closely, with a learning effect closer to real samples. Meanwhile, a penalty factor q is introduced when computing the network loss value, so that the distribution of real-sample features in generated samples can be adjusted manually and dynamically according to the distribution of the generated samples.

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention are further described in detail by the following embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
