Training method for a character generation model, character generation method, apparatus, device, and medium

Document No.: 1964122 | Publication date: 2021-12-14

Note: This invention, "Training method of character generation model, character generation method, device, equipment and medium", was created by 唐礼承 and 刘家铭 on 2021-09-09. Its main content is as follows: the disclosure provides a training method for a character generation model, a character generation method, an apparatus, and a device, relating to the field of artificial intelligence and specifically to computer vision and deep learning. The scheme is: input a source domain sample character and target domain style characters into the character generation model to obtain a target domain generated character; input the target domain generated character and a target domain sample character into a pre-trained character classification model and calculate the feature loss of the character generation model; and adjust the parameters of the character generation model according to the feature loss. Embodiments of the disclosure can improve the accuracy of the fonts generated by the character generation model.

1. A training method for a character generation model, comprising:

inputting a source domain sample character and target domain style characters into a character generation model to obtain a target domain generated character;

inputting the target domain generated character and a target domain sample character into a pre-trained character classification model, and calculating a feature loss of the character generation model;

and adjusting parameters of the character generation model according to the feature loss.

2. The method of claim 1, wherein the inputting the target domain generated character and the target domain sample character into a pre-trained character classification model, and calculating a feature loss of the character generation model, comprises:

inputting the target domain generated character into the character classification model to obtain a generated feature map output by at least one feature layer of the character classification model;

inputting the target domain sample character into the character classification model to obtain a sample feature map output by the at least one feature layer of the character classification model;

and calculating the feature loss of the character generation model according to the difference between the generated feature map and the sample feature map of the at least one feature layer.

3. The method of claim 2, wherein the calculating the feature loss of the character generation model comprises:

for each feature layer in the at least one feature layer, calculating a pixel difference between the generated feature map and the sample feature map of the feature layer to obtain a pixel loss of the feature layer;

and calculating the feature loss of the character generation model according to the pixel loss of the at least one feature layer.

4. The method of claim 3, wherein the calculating the pixel difference between the generated feature map and the sample feature map of the feature layer comprises:

for the pixel at each position in the generated feature map of the feature layer, calculating the absolute value of the difference between the pixel value of that pixel and the pixel value of the pixel at the corresponding position in the sample feature map, to obtain a difference for the pixel at each position;

and determining the pixel difference between the generated feature map and the sample feature map of the feature layer according to the differences of the pixels at the plurality of positions.

5. The method of claim 1, further comprising:

inputting the target domain style characters into the character generation model to obtain first style feature vectors of the target domain style characters;

inputting the target domain generated character into the character generation model to obtain a second style feature vector of the target domain generated character;

inputting the second style feature vector and the first style feature vectors into a component classification model, and calculating a component classification loss;

inputting the target domain sample character and the target domain generated character into a discriminator model, and calculating a character adversarial loss and a style adversarial loss;

inputting the target domain generated character into the character classification model, and calculating an incorrect-character loss;

and adjusting parameters of the character generation model according to the component classification loss, the character adversarial loss, the style adversarial loss, and the incorrect-character loss.

6. The method of any of claims 1-5, wherein the source domain sample character is an image having a source domain font style and the target domain sample character is an image having a target domain font style.

7. A character generation method, comprising:

acquiring a source domain input character and a corresponding target domain input character;

inputting the source domain input character and the target domain input character into a character generation model to obtain a new target domain character; wherein the character generation model is trained according to the training method of the character generation model of any one of claims 1 to 6.

8. A training apparatus for a character generation model, comprising:

a target domain generated character acquisition module, configured to input a source domain sample character and target domain style characters into a character generation model to obtain a target domain generated character;

a feature loss calculation module, configured to input the target domain generated character and a target domain sample character into a pre-trained character classification model and calculate a feature loss of the character generation model;

and a first loss adjustment module, configured to adjust parameters of the character generation model according to the feature loss.

9. The apparatus of claim 8, wherein the feature loss calculation module comprises:

a first feature map generation unit, configured to input the target domain generated character into the character classification model to obtain a generated feature map output by at least one feature layer of the character classification model;

a second feature map generation unit, configured to input the target domain sample character into the character classification model to obtain a sample feature map output by the at least one feature layer of the character classification model;

and a feature loss calculation unit, configured to calculate the feature loss of the character generation model according to the difference between the generated feature map and the sample feature map of the at least one feature layer.

10. The apparatus of claim 9, wherein the feature loss calculation unit comprises:

a pixel loss calculation subunit, configured to calculate, for each feature layer in the at least one feature layer, a pixel difference between the generated feature map and the sample feature map of the feature layer to obtain a pixel loss of the feature layer;

and a feature loss calculation subunit, configured to calculate the feature loss of the character generation model according to the pixel loss of the at least one feature layer.

11. The apparatus of claim 10, wherein the pixel loss calculation subunit is configured to: for the pixel at each position in the generated feature map of the feature layer, calculate the absolute value of the difference between the pixel value of that pixel and the pixel value of the pixel at the corresponding position in the sample feature map, to obtain a difference for the pixel at each position; and determine the pixel difference between the generated feature map and the sample feature map of the feature layer according to the differences of the pixels at the plurality of positions.

12. The apparatus of claim 8, further comprising:

a first feature vector calculation module, configured to input the target domain style characters into the character generation model to obtain first style feature vectors of the target domain style characters;

a second feature vector calculation module, configured to input the target domain generated character into the character generation model to obtain a second style feature vector of the target domain generated character;

a component classification loss calculation module, configured to input the second style feature vector and the first style feature vectors into a component classification model and calculate a component classification loss;

an adversarial loss calculation module, configured to input the target domain sample character and the target domain generated character into a discriminator model and calculate a character adversarial loss and a style adversarial loss;

an incorrect-character loss calculation module, configured to input the target domain generated character into the character classification model and calculate an incorrect-character loss;

and a second loss adjustment module, configured to adjust parameters of the character generation model according to the component classification loss, the character adversarial loss, the style adversarial loss, and the incorrect-character loss.

13. The apparatus of any of claims 8 to 12, wherein the source domain sample character is an image having a source domain font style and the target domain sample character is an image having a target domain font style.

14. A character generation apparatus, comprising:

an input character acquisition module, configured to acquire a source domain input character and a corresponding target domain input character;

and a character generation module, configured to input the source domain input character and the target domain input character into a character generation model to obtain a new target domain character; wherein the character generation model is trained according to the training method of the character generation model of any one of claims 1 to 6.

15. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of a character generation model according to any one of claims 1 to 6 or the character generation method according to claim 7.

16. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the training method of a character generation model according to any one of claims 1 to 6 or the character generation method of claim 7.

17. A computer program product comprising a computer program which, when executed by a processor, implements the training method of a character generation model according to any one of claims 1-6, or performs the character generation method according to claim 7.

Technical Field

The present disclosure relates to the field of artificial intelligence technologies, in particular to computer vision and deep learning, and more particularly to a training method for a character generation model, a character generation method, an apparatus, a device, and a medium.

Background

Image processing is a practical technology with great social and economic benefits, and is widely applied across industries and in people's daily lives.

Style transfer of an image means keeping the content of one image unchanged while transferring the style of another image onto it, forming a new artistic image.

Disclosure of Invention

The present disclosure provides a training method for a character generation model, a character generation method, an apparatus, a device, and a medium.

According to an aspect of the present disclosure, there is provided a training method for a character generation model, including:

inputting a source domain sample character and target domain style characters into a character generation model to obtain a target domain generated character;

inputting the target domain generated character and a target domain sample character into a pre-trained character classification model, and calculating a feature loss of the character generation model;

and adjusting parameters of the character generation model according to the feature loss.

According to another aspect of the present disclosure, there is provided a character generation method, including:

acquiring a source domain input character and a corresponding target domain input character;

inputting the source domain input character and the target domain input character into a character generation model to obtain a new target domain character; wherein the character generation model is trained according to the method of any embodiment of the present disclosure.

According to another aspect of the present disclosure, there is provided a training apparatus for a character generation model, including:

a target domain generated character acquisition module, configured to input a source domain sample character and target domain style characters into a character generation model to obtain a target domain generated character;

a feature loss calculation module, configured to input the target domain generated character and a target domain sample character into a pre-trained character classification model and calculate a feature loss of the character generation model;

and a first loss adjustment module, configured to adjust parameters of the character generation model according to the feature loss.

According to another aspect of the present disclosure, there is provided a character generation apparatus, including:

an input character acquisition module, configured to acquire a source domain input character and a corresponding target domain input character;

and a character generation module, configured to input the source domain input character and the target domain input character into a character generation model to obtain a new target domain character; wherein the character generation model is trained according to the training method of the character generation model of any embodiment of the present disclosure.

According to another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of a character generation model according to any embodiment of the present disclosure or the character generation method according to any embodiment of the present disclosure.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a training method of a character generation model according to any one of the embodiments of the present disclosure or perform a character generation method according to any one of the embodiments of the present disclosure.

According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the training method of the character generation model according to any of the embodiments of the present disclosure, or performs the character generation method according to any of the embodiments of the present disclosure.

Embodiments of the present disclosure can improve the accuracy of the fonts generated by the character generation model.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a schematic diagram of a training method for a character generation model according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of a character generation model provided in accordance with an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of another training method for a character generation model provided in accordance with an embodiment of the present disclosure;

FIG. 4 is a visualization of a character generation model constrained using the feature loss, according to an embodiment of the present disclosure;

FIG. 5 is a visualization of a character generation model constrained using the feature loss, according to another embodiment of the present disclosure;

FIG. 6 is a comparison of the generation results of a character generation model with and without the feature loss constraint, according to an embodiment of the present disclosure;

FIG. 7 is an effect diagram of characters generated by a character generation model according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of another training method for a character generation model provided in accordance with an embodiment of the present disclosure;

FIG. 9 is a scene diagram of a training method of a character generation model according to an embodiment of the present disclosure;

FIG. 10 is a schematic diagram of a character generation method provided in accordance with an embodiment of the present disclosure;

FIG. 11 is a schematic diagram of a training apparatus for a character generation model according to an embodiment of the present disclosure;

FIG. 12 is a schematic diagram of a character generation apparatus provided in accordance with an embodiment of the present disclosure;

FIG. 13 is a block diagram of an electronic device for implementing the training method of a character generation model or the character generation method of an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 is a flowchart of a training method for a character generation model according to an embodiment of the present disclosure. The method may be applied to training a character generation model that converts characters of a source domain style into characters of a target domain style. The method of this embodiment may be executed by a training apparatus for a character generation model, which may be implemented in software and/or hardware and configured in an electronic device with a certain data processing capability. The electronic device may be a client device or a server device; the client device may be, for example, a mobile phone, a tablet computer, a vehicle-mounted terminal, or a desktop computer.

S101, inputting a source domain sample character and target domain style characters into a character generation model to obtain a target domain generated character.

The source domain sample character may be an image having a source domain font style. The source domain font style may be a conventional, printed font, for example regular script (kai), Song, or black (hei) typefaces for Chinese characters, and Times New Roman or Calibri for Western characters; the characters may also include numeric characters. The Western characters may include English, German, Russian, Italian, and the like, without particular limitation. The target domain style character is an image having the target domain font style, and the target domain generated character is likewise an image having the target domain font style; the target domain font style may be a user's handwritten font style or another artistic font style. It should be noted that the "words" in the embodiments of the present disclosure are actually characters. The source domain sample character and the target domain generated character have the same image content but different styles. The target domain style characters and the source domain sample character have partially the same image content and different styles, while the target domain style characters and the target domain generated character have partially the same image content and the same style. A character may be composed of at least one component, and having partially the same image content means having the same components; in fact, the style characters, the source domain sample character, and the target domain generated character share at least one component. A component may be a radical of a Chinese character, a root of an English word, and the like. For example, "你" (you) may consist of the components 亻 and 尔; "做" (do) may be composed of the components 亻 and 故, or of the components 亻, 古, and 攵; and "衣" (garment) may consist of the single component 衣.

Specifically, a pre-collected set of characters in the target domain font style may be obtained, and characters that include at least one of these components may be looked up in the set according to the components and determined as the target domain style characters, as sketched below.
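
As an illustration only, the lookup can be reduced to a set-intersection test over a component table. The decomposition table and function below are hypothetical stand-ins, not part of the disclosure:

```python
# Hypothetical sketch: pick target-domain style characters that share at
# least one component with the source-domain sample character. The
# decomposition table below is illustrative only.
COMPONENTS = {
    "你": ["亻", "尔"],
    "称": ["禾", "尔"],
    "佳": ["亻", "圭"],
}

def find_style_characters(sample_char, style_set):
    """Return the style characters sharing a component with sample_char."""
    sample_parts = set(COMPONENTS.get(sample_char, []))
    return [c for c in style_set if sample_parts & set(COMPONENTS.get(c, []))]

print(find_style_characters("你", ["称", "佳"]))  # -> ['称', '佳']
```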

In a specific example, the source domain sample character is an image generated from the regular-script character "你" (you), and the target domain generated character is an image of the handwritten "你" generated by the model. "你" can be split into the components 亻 and 尔. The target domain style characters are an image generated from the real handwritten character "称" (call) and an image generated from the real handwritten character "佳" (good). "称" includes the component 尔, which is the same as the component 尔 in "你"; "佳" includes the component 亻, which is the same as the component 亻 in "你".

The character generation model is used to convert the source domain sample character into the target domain generated character. For example, an image containing the regular-script character "你" is input to the character generation model, which can output an image containing the handwritten character "你".

S102, inputting the target domain generated character and the target domain sample character into a pre-trained character classification model, and calculating the feature loss of the character generation model.

The target domain sample character is the ground truth corresponding to the source domain sample character. For example, if the target domain generated character is an image generated by the character generation model containing the handwritten character "做" (do), then the target domain sample character is a real image containing the handwritten "做", which may be generated from a character actually handwritten by the user. Likewise, as noted above, the target domain style characters are images generated from characters actually handwritten by the user. It should be noted that the target domain style characters and target domain sample characters, i.e., the images in the user's handwriting style, may come from a public data set or may be generated from the user's real handwriting obtained with the user's authorization.

The character classification model is used to judge whether the target domain generated character and the target domain sample character are incorrect characters. The character classification model may have a ResNet-18 (Residual Network) structure, which includes 17 convolutional layers and 1 fully connected layer. For example, the training samples may be a data set of 500 fonts with 6763 characters per font; experiments show that the trained character classification model reaches a classification accuracy of 97% on this data set.
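
A minimal sketch of such a classifier, assuming a standard torchvision ResNet-18 backbone and the 6763-character label set mentioned above (the disclosure does not specify this exact API):

```python
import torchvision

# Assumed sketch: a ResNet-18 classifier over 6763 character classes.
NUM_CHARACTERS = 6763
classifier = torchvision.models.resnet18(num_classes=NUM_CHARACTERS)

# When training the character generation model, the classifier stays frozen.
classifier.eval()
for p in classifier.parameters():
    p.requires_grad_(False)
```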

The character classification model may include multiple feature layers (e.g., 90 feature layers). Inputting the target domain generated character into the character classification model yields the generated feature map output by each layer, and inputting the target domain sample character into the character classification model yields the sample feature map output by each layer.

Based on the difference between the generated feature map and the sample feature map output by each feature layer, the feature loss of that layer can be determined. For example, the sum of the feature losses of at least one preset layer (for example, the 41st and 42nd layers) of the multiple feature layers may be taken as the overall feature loss. In a specific example, the feature loss of an intermediate layer (e.g., the 45th layer) of the multiple feature layers may be taken as the overall feature loss.
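
One way to realize this, sketched under the assumption that the frozen classifier above is a PyTorch module whose intermediate layers can be tapped with forward hooks; the layer names are placeholders:

```python
import torch

def capture_features(model, layer_names, image):
    """Run `image` through `model`, recording the outputs of `layer_names`."""
    feats, handles = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(
                lambda mod, inp, out, name=name: feats.__setitem__(name, out)))
    model(image)
    for h in handles:
        h.remove()
    return feats

def feature_loss(model, layer_names, generated, sample):
    """Sum of per-layer L1 differences between generated and sample maps."""
    gen = capture_features(model, layer_names, generated)
    ref = capture_features(model, layer_names, sample)
    return sum((gen[n] - ref[n]).abs().sum() for n in layer_names)
```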

S103, adjusting parameters of the character generation model according to the feature loss.

The parameters of the character generation model are adjusted according to the feature loss to obtain an updated character generation model. Then, for the next source domain sample character, the corresponding target domain style characters are determined, the updated character generation model is used, and the procedure returns to operation S101. Training is repeated in this way until a preset stopping condition is reached, at which point adjustment of the parameters stops and the trained character generation model is obtained. The stopping condition may include convergence of the feature loss, the number of iterations reaching a set threshold, or the like; a minimal training-loop sketch follows.
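
Putting S101-S103 together, a minimal sketch: `generator` and `loader` are assumed stand-ins for the character generation model and a data source yielding (source character, style characters, target sample) batches, and `feature_loss` is the helper sketched above.

```python
import torch

def train_character_generator(generator, classifier, loader,
                              layer_names=("layer3",), max_steps=100_000):
    """Illustrative outer loop for S101-S103; all names are placeholders."""
    optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)
    for step, (source, styles, target) in enumerate(loader):
        generated = generator(source, styles)        # S101
        loss = feature_loss(classifier, layer_names, generated, target)  # S102
        optimizer.zero_grad()
        loss.backward()                              # S103: adjust parameters
        optimizer.step()
        if step >= max_steps:                        # or: feature loss converged
            break
    return generator
```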

In the technical scheme above, the character generation model generates the target domain generated character based on the source domain sample character and the target domain style characters, enabling font generation in a variety of styles; and the character classification model introduces the feature loss, which makes the character generation model focus on the features that differ most between the target domain generated character and the target domain sample character. The model therefore learns more font details, its ability to learn font features improves, and the accuracy of the target domain style characters it generates increases.

FIG. 2 is a schematic diagram of a character generation model provided according to an embodiment of the present disclosure. As shown in fig. 2, the character generation model 204 includes a style encoder 205, a content encoder 206, and a decoder 207. The style encoder 205 encodes the target domain style characters 202 and the content encoder 206 encodes the source domain sample character 201; the two encoding results are fused and input to the decoder 207 to obtain the target domain generated character 203, where the target domain style characters 202 are determined according to the source domain sample character 201. A minimal sketch of this architecture follows.
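
The sketch below assumes grayscale character images and placeholder layer sizes; the patent does not disclose the exact network configuration, so this is an illustrative reading of Fig. 2 only.

```python
import torch
from torch import nn

class CharacterGenerator(nn.Module):
    """Style encoder + content encoder + decoder, as in Fig. 2 (assumed sizes)."""
    def __init__(self, dim=64):
        super().__init__()
        self.style_encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU())
        self.content_encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(dim, 1, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, source, styles):
        # source: (b, 1, H, W); styles: (b, k, 1, H, W) - k style characters.
        b, k = styles.shape[:2]
        style_feats = self.style_encoder(styles.flatten(0, 1))
        style_feats = style_feats.view(b, k, *style_feats.shape[1:]).mean(dim=1)
        content_feats = self.content_encoder(source)
        return self.decoder(content_feats + style_feats)  # fuse, then decode
```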

Fig. 3 is a flowchart of another training method for a character generation model according to an embodiment of the present disclosure, further optimized and expanded on the basis of the above technical scheme; it can be combined with the optional embodiments above. Here, inputting the target domain generated character and the target domain sample character into a pre-trained character classification model and calculating the feature loss of the character generation model is refined as: inputting the target domain generated character into the character classification model to obtain a generated feature map output by at least one feature layer of the character classification model; inputting the target domain sample character into the character classification model to obtain a sample feature map output by the at least one feature layer; and calculating the feature loss of the character generation model according to the difference between the generated feature map and the sample feature map of the at least one feature layer.

S301, inputting the source domain sample character and the target domain style characters into the character generation model to obtain the target domain generated character.

S302, inputting the target domain generated character into the character classification model to obtain a generated feature map output by at least one feature layer of the character classification model.

A generated feature map is a feature map output by a feature layer of the character classification model when the target domain generated character is its input. Each feature layer of the character classification model outputs a corresponding feature map. The target domain generated character, i.e., the image of the character in the target domain font style generated by the character generation model, is fed to the character classification model as the input. In the character classification model, the 1st feature layer processes the target domain generated character to obtain the output feature map of the 1st feature layer, and the i-th feature layer (i greater than 1) processes the output feature map of the (i-1)-th feature layer to obtain the output feature map of the i-th feature layer.

S303, inputting the target domain sample character into the character classification model to obtain a sample feature map output by the at least one feature layer of the character classification model.

A sample feature map is a feature map output by a feature layer of the character classification model when the target domain sample character is its input. Each feature layer of the character classification model outputs a corresponding feature map. The target domain sample character, i.e., the image of the really handwritten character in the target domain font style, is fed to the character classification model as the input. The 1st feature layer of the character classification model processes the target domain sample character to obtain the output feature map of the 1st feature layer, and the i-th feature layer (i greater than 1) processes the output feature map of the (i-1)-th feature layer to obtain the output feature map of the i-th feature layer.

Optionally, the source domain sample character is an image having the source domain font style, and the target domain sample character is an image having the target domain font style.

The source domain sample character is an image generated from a character having the source domain font style, and the target domain sample character is an image generated from a character having the target domain font style; the two font styles differ. Illustratively, the source domain font style is a printed font, for example Song, regular script, black, or clerical script for Chinese characters, while the target domain font style is an artistic font style such as the user's real handwriting style.

By configuring the source domain sample characters as images with the source domain font style and the target domain sample characters as images with the target domain font style, conversion between different font styles can be realized and the number of fonts with new styles can be increased.

S304, calculating the feature loss of the character generation model according to the difference between the generated feature map and the sample feature map of the at least one feature layer.

The character classification model includes at least one feature layer, from which at least one layer may be selected; for any selected feature layer, the difference between its generated feature map and its sample feature map is calculated. The difference describes the degree to which the generated feature map and the sample feature map disagree, and thereby evaluates how similar the model's generated character is to the really handwritten sample character. Calculating the feature loss from this difference describes, in the feature dimension and in more detail, whether the model's generated character differs from the really handwritten sample character.

According to the embodiments of the present disclosure, the feature loss can be used to constrain how similar the target domain generated character output by the character generation model is to the target domain sample character, thereby improving the accuracy of the model's style transfer.

The selected feature layers may be set as needed. For example, the differences between the generated feature maps and sample feature maps of the median feature layers of the multiple feature layers may be selected for calculating the feature loss; with 90 feature layers in total, the median layers are the 45th and 46th feature layers. If one feature layer is selected, the difference between its generated feature map and its sample feature map can directly serve as the feature loss; if at least two feature layers are selected, the differences of the multiple layers can be combined numerically, for example by summation, multiplication, or weighted averaging, to obtain the feature loss.

Optionally, the calculating the feature loss of the character generation model includes: for each feature layer in the at least one feature layer, calculating a pixel difference between the generated feature map and the sample feature map of the feature layer to obtain a pixel loss of the feature layer; and calculating the feature loss of the character generation model according to the pixel loss of the at least one feature layer.

Feature maps output by the same feature layer have the same size, so pixel differences can be computed over the pixels that make up the feature maps; the difference between the images is thus computed in the pixel dimension and serves as the pixel loss of that feature layer. Calculating the feature loss from the pixel losses may specifically be: if there is one feature layer, its pixel loss is taken as the feature loss; if there are at least two feature layers, the sum of their pixel losses is taken as the feature loss.

Illustratively, the pixel loss of each feature layer may be calculated with an L1-norm loss function, i.e., as the sum of the absolute differences between co-located pixels of the real character's feature map and the generated character's feature map, for example:
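
For a single feature layer this is a one-liner; the sketch assumes PyTorch tensors of identical shape.

```python
import torch

def pixel_loss(generated_fm: torch.Tensor, sample_fm: torch.Tensor) -> torch.Tensor:
    """L1 pixel loss of one feature layer: sum of co-located absolute differences."""
    return (generated_fm - sample_fm).abs().sum()

# Equivalent built-in formulation:
# torch.nn.functional.l1_loss(generated_fm, sample_fm, reduction="sum")
```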

Taking the pixel difference between the generated feature map and the sample feature map as their difference, calculating the pixel loss, and determining the feature loss from it allows the feature loss to be computed in the pixel dimension, controls the granularity of the feature-loss calculation, and describes at the pixel level whether the model's generated character differs from the really handwritten sample character. Adjusting the parameters of the character generation model with this feature loss makes the model learn finer font-style details of the sample characters and improves the accuracy of the characters it generates.

Optionally, the calculating the pixel difference between the generated feature map and the sample feature map of the feature layer includes: for the pixel at each position in the generated feature map of the feature layer, calculating the absolute value of the difference between the pixel value of that pixel and the pixel value of the pixel at the corresponding position in the sample feature map, to obtain a difference for the pixel at each position; and determining the pixel difference between the generated feature map and the sample feature map of the feature layer according to the differences of the pixels at the plurality of positions.

For each position, the absolute value of the difference between the pixel value in the generated feature map and the pixel value at the same position in the sample feature map is calculated and taken as the difference of the pixel at that position. The generated feature map and the sample feature map have the same size and thus the same number of pixels, i.e., the same set of positions, and the sum of the differences of the pixels over the plurality of positions is determined as the pixel difference between the generated feature map and the sample feature map of the feature layer. The plurality of positions may be all positions in the feature map or only part of them.

In a specific example, the generated feature map and the sample feature map are each of size 64 x 64 and include 4096 positions. For each position, the absolute value of the difference between the pixel value of the generated feature map and that of the sample feature map may be calculated, giving 4096 absolute differences, and the sum of these 4096 values is the pixel difference between the generated feature map and the sample feature map of the feature layer. In effect, this pixel difference is an L1-norm loss whose i-th element is the pixel value at the i-th position of a feature map.

Calculating the absolute difference between the pixel values of corresponding pixels of the two feature maps at each position, determining the pixel difference of the feature layer from the absolute values over the plurality of positions, and thus treating co-located pixel values as the elements of an L1-norm loss function can improve the robustness of the character generation model.

S305, adjusting parameters of the character generation model according to the feature loss.

Adjusting the parameters of the character generation model according to the feature loss computed from these differences lets the model learn more font details of the really handwritten sample characters. Illustratively, the parameters may be adjusted in the manner of minimizing an L1-norm loss, i.e., until the sum of the absolute differences between the real character and the generated character is minimized.

In the technical scheme above, the feature loss is determined by calculating the difference between the generated feature map and the sample feature map of at least one feature layer of the character classification model. This describes, in the feature dimension and in more detail, whether the model's generated character differs from the really handwritten sample character. Adjusting the parameters of the character generation model according to feature losses computed at different levels lets the model learn more font details of the really handwritten sample characters, so that its generated characters ultimately resemble the real handwriting, improving the accuracy of the generated characters.

Fig. 4 is a visualization of a character generation model constrained using the feature loss, according to an embodiment provided by the present disclosure. As shown in fig. 4, the target domain sample character 401 is a real image containing the handwritten character "god", i.e., the character in 401 is really handwritten by the user, and the target domain generated character 402 is an image containing the handwritten "god" generated by the character generation model. The target domain sample character 404 is a real image containing the handwritten character "jacket", i.e., really handwritten by the user, and the target domain generated character 405 is an image containing the handwritten "jacket" generated by the character generation model. The sizes of 401, 402, 404, and 405 are all 256 x 256. These images are input to the character classification model, which outputs a sample feature map and a generated feature map at a first preset layer (e.g., the 30th feature layer); both feature maps have size 64 x 64. After the pixel difference of the two 64 x 64 maps is calculated, heat maps 403 and 406 representing the differences are obtained. The heat maps 403 and 406 are also 64 x 64 images: the darker a region in heat map 403, the greater the difference between the target domain sample character 401 and the target domain generated character 402, and likewise for heat map 406 with respect to 404 and 405. The character generation model can therefore focus on learning the features in the darker parts of the heat maps 403 and 406, improving its ability to learn features.
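
The heat maps of Figs. 4 and 5 can be reproduced, under the assumption that the two feature maps are available as tensors, by normalizing the per-position absolute difference:

```python
import torch

def difference_heatmap(generated_fm: torch.Tensor, sample_fm: torch.Tensor) -> torch.Tensor:
    """Per-position |difference| of two same-size feature maps, scaled to [0, 1]."""
    heat = (generated_fm - sample_fm).abs()
    return heat / heat.max().clamp(min=1e-8)  # larger/darker where the maps disagree
```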

Fig. 5 is a visualization of a character generation model constrained using the feature loss, according to another embodiment provided by the present disclosure. As shown in fig. 5, the target domain sample character 501, target domain generated character 502, target domain sample character 504, and target domain generated character 505 are input to the character classification model, which outputs a sample feature map and a generated feature map at a second preset layer (e.g., the 31st feature layer); both feature maps have size 32 x 32. After the pixel difference of the two 32 x 32 maps is calculated, heat maps 503 and 506 representing the differences are obtained. The heat maps 503 and 506 are also 32 x 32 images: the darker a region in heat map 503, the greater the difference between the target domain sample character 501 and the target domain generated character 502, and likewise for heat map 506 with respect to 504 and 505. The character generation model can therefore focus on learning the features in the darker parts of the heat maps 503 and 506, improving its ability to learn features.

It can be understood that, by combining the heat maps 403 and 503, the character generation model can be made to learn the features with large differences between the target domain sample character 401 and the target domain generated character 402 as well as those between 501 and 502; and by combining the heat maps 406 and 506, the features with large differences between 404 and 405 as well as between 504 and 505. The model's ability to learn features is thereby improved.

FIG. 6 compares the effects of using the feature loss, according to an embodiment of the present disclosure. As shown in FIG. 6, image 601 is a real image containing the handwritten character "Tong", i.e., the "Tong" in image 601 is really handwritten by the user. Image 602 contains the handwritten "Tong" generated by a character generation model trained without the feature loss constraint, and image 603 contains the handwritten "Tong" generated by a character generation model trained with it. Experiments show that the "Tong" in image 603 has learned more characteristics of the user's real handwritten "Tong" (i.e., the "Tong" in image 601) than the "Tong" in image 602, and is more similar to the user's real handwriting.

FIG. 7 shows characters generated by a trained character generation model constrained with the feature loss, according to an embodiment of the present disclosure. The characters inside the boxes are real handwritten characters, and the characters outside the boxes are generated by the character generation model. It can be seen that the font style of the generated characters is essentially consistent with that of the real handwriting.

Fig. 8 is a flowchart of another training method for a character generation model according to an embodiment of the present disclosure, further optimized and expanded on the basis of the above technical scheme; it can be combined with the optional embodiments above. The training method is optimized as follows: inputting the target domain style characters into the character generation model to obtain first style feature vectors of the target domain style characters; inputting the target domain generated character into the character generation model to obtain a second style feature vector of the target domain generated character; inputting the second style feature vector and the first style feature vectors into a component classification model, and calculating a component classification loss; inputting the target domain sample character and the target domain generated character into a discriminator model, and calculating a character adversarial loss and a style adversarial loss; inputting the target domain generated character into the character classification model, and calculating an incorrect-character loss; and adjusting parameters of the character generation model according to the component classification loss, the character adversarial loss, the style adversarial loss, and the incorrect-character loss.

S801, inputting the source domain sample character and the target domain style characters into the character generation model to obtain the target domain generated character and the first style feature vectors of the target domain style characters.

A first style feature vector of a target domain style character is the feature vector obtained by encoding that style character with the style encoder.

Inputting the source domain sample character and the target domain style characters into the character generation model specifically means sending the source domain sample character to the content encoder to obtain a content feature vector, and sending the target domain style characters to the style encoder to obtain the first style feature vectors. With multiple target domain style characters, multiple first style feature vectors are obtained; these are fused into a fused style feature vector, which is then fused with the content feature vector into a target feature vector that is sent to the decoder to obtain the target domain generated character. Fusing the multiple first style feature vectors into the fused style feature vector may be done by averaging: for each position, the values of the vector elements at that position are summed and averaged to give the element value at that position, and the fused style feature vector is determined from the element values at all positions. Fusing the fused style feature vector with the content feature vector into the target feature vector may be done by element-wise addition: for each position, the element value of the fused style feature vector is added to the element value of the content feature vector at the corresponding position, and the target feature vector is determined from the element values at all positions. A small sketch of this fusion follows.
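
A sketch of the two fusion steps on flat feature vectors (the shapes are assumptions; the actual tensors may be spatial feature maps):

```python
import torch

def fuse(style_vectors: torch.Tensor, content_vector: torch.Tensor) -> torch.Tensor:
    """style_vectors: (k, d) first style feature vectors; content_vector: (d,)."""
    fused_style = style_vectors.mean(dim=0)  # element-wise average over the k vectors
    return fused_style + content_vector      # element-wise sum with the content vector
```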

S802, inputting the target domain generated character and the target domain sample character into the pre-trained character classification model, and calculating the feature loss of the character generation model.

S803, inputting the target domain generated character into the character generation model to obtain the second style feature vector of the target domain generated character.

The second style feature vector of the target domain generated character is the feature vector obtained by encoding the generated character with the style encoder; that is, inputting the target domain generated character into the character generation model means inputting it into the style encoder to obtain its second style feature vector.

S804, inputting the second style feature vector and the first style feature vectors into the component classification model, and calculating the component classification loss.

The component classification model detects whether the character corresponding to a style feature vector contains the same components as the source domain sample character. The second style feature vector and the first style feature vectors are input into the component classification model and the component classification loss is calculated. The component classification loss constrains the accuracy of the components contained in the target domain generated character output by the character generation model; in particular, it judges whether the components of a character are correct. In effect, the component classification loss is the difference between the components identified for a character and the correct components that the character includes.

Illustratively, the first style feature vector may be written as $\vec{a} = (a_0, a_1, \ldots, a_m)$ and the second style feature vector as $\vec{b} = (b_0, b_1, \ldots, b_m)$, where each element corresponds to a component in a component table and m+1 is the number of components in the table. For example, if the component table has 100 components (for Chinese characters, the components are radicals, so the table has 100 radicals), then m may equal 99. For example, if the target domain style character is "佳" (good), which may be composed of the components 亻 and 圭 located at the 2nd and 3rd positions of the component table, then the first style feature vector of "佳" may be expressed as $\vec{a} = (0, 1, 1, 0, \ldots, 0)$. For another example, if the target domain generated character is "你" (you), which may be composed of the components 亻 and 尔 located at the 2nd and 5th positions of the component table, then the second style feature vector of "你" may be expressed as $\vec{b} = (0, 1, 0, 0, 1, 0, \ldots, 0)$.

A target first style feature vector $\hat{\vec{a}} = (\hat{a}_0, \hat{a}_1, \ldots, \hat{a}_m)$ is preset for the target domain style character, and a target second style feature vector $\hat{\vec{b}} = (\hat{b}_0, \hat{b}_1, \ldots, \hat{b}_m)$ is preset for the target domain generated character; again each element corresponds to a component in the component table. The target first style feature vector represents the vector that the component classification model should output when the target domain style character is input. For example, for the target domain style character "佳" (good), composed of the components 亻 and 圭 at the 2nd and 3rd positions of the component table, the target first style feature vector may be expressed as $\hat{\vec{a}} = (0, 1, 1, 0, \ldots, 0)$. Accordingly, the target second style feature vector represents the vector that the component classification model should output when the target domain generated character is input. For example, for the target domain generated character "你" (you), composed of the components 亻 and 尔 at the 2nd and 5th positions of the component table, the target second style feature vector may be expressed as $\hat{\vec{b}} = (0, 1, 0, 0, 1, 0, \ldots, 0)$.

Based on the cross entropy between the first style feature vector $\vec{a}$ of the target domain style character and the target first style feature vector $\hat{\vec{a}}$, the first component classification loss can be determined, expressed by the following equation (1):

$$L_{cls1} = -\sum_{i=0}^{m} \hat{a}_i \log a_i \qquad (1)$$

where $L_{cls1}$ represents the first component classification loss, $a_i$ represents the element with index i in the first style feature vector, $\hat{a}_i$ represents the element with index i in the target first style feature vector, i is an integer greater than or equal to 0 and less than or equal to m, and m+1 is the number of elements in the first style feature vector and in the target first style feature vector.

Based on the cross entropy between the second style feature vector $\vec{b}$ of the target domain generated character and the target second style feature vector $\hat{\vec{b}}$, the second component classification loss can be determined, expressed by the following equation (2):

$$L_{cls2} = -\sum_{i=0}^{m} \hat{b}_i \log b_i \qquad (2)$$

where $L_{cls2}$ represents the second component classification loss, $b_i$ represents the element with index i in the second style feature vector, $\hat{b}_i$ represents the element with index i in the target second style feature vector, i is an integer greater than or equal to 0 and less than or equal to m, and m+1 is the number of elements in the second style feature vector and in the target second style feature vector.

A component classification loss of the character generation model may be determined based on the first component classification loss and the second component classification loss, for example as their sum. The component classification loss of the character generation model can be expressed by the following equation (3):

$$L_{cls} = L_{cls1} + L_{cls2} \qquad (3)$$

wherein $L_{cls}$ represents the component classification loss of the character generation model.

According to embodiments of the present disclosure, the component classification loss can be used to constrain the components contained in the target domain generated word output by the character generation model, thereby reducing the probability that the character generation model generates words composed of incorrect components.
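As a concrete reading of equations (1)-(3), the following sketch computes the component classification loss from the four vectors above. The helper name, the clamping constant, and the plain-sum combination in (3) are assumptions; PyTorch is used for illustration.

```python
import torch

def cross_entropy(pred, target):
    """-sum_i target_i * log(pred_i), as in equations (1) and (2);
    pred is clamped away from zero for numerical safety."""
    return -(target * torch.log(pred.clamp_min(1e-8))).sum()

def component_classification_loss(a, a_star, b, b_star):
    l_cls1 = cross_entropy(a, a_star)  # equation (1): target domain style word
    l_cls2 = cross_entropy(b, b_star)  # equation (2): target domain generated word
    return l_cls1 + l_cls2             # equation (3), assuming a plain sum
```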

S805, inputting the target domain sample word and the target domain generated word into the identification model, and calculating the character confrontation loss and the style confrontation loss.

The target domain sample word is an image of a real handwritten word, while the target domain generated word is an image generated by the model, which may be referred to as a fake handwritten word image. During training, the target domain sample word may be labeled as real (e.g., value 1), and the target domain generated word may be labeled as fake (e.g., value 0). Detecting whether the target domain sample word and the target domain generated word are real handwritten words is, in effect, detecting whether they are model-generated words; when a word generated by the character generation model is judged real by the identification model, the generated word is so similar to a handwritten word that it can pass for a genuine one.

The identification model is used for detecting whether the target domain sample word and the target domain generated word are real handwritten words, for classifying character types, and for classifying style types, i.e., for detecting whether the target domain generated word is the target domain sample word expected to be generated. The character confrontation loss constrains both the character classification of a word and whether the word is a real handwritten word: it is the difference between the character classification of a word and the correct character type of the word, together with the difference between the word and a real handwritten word. The style confrontation loss constrains both the style classification of a word and whether the word is a real handwritten word: it is the difference between the style classification of a word and the correct style type of the word, together with the difference between the word and a real handwritten word.

The identification model detects whether the target domain sample word and the target domain generated word are real handwritten words and classifies their character types. The target domain sample word is input into the identification model to obtain a first character confrontation vector of the target domain sample word, and the target domain generated word is input into the identification model to obtain a second character confrontation vector of the target domain generated word.

Illustratively, each element in the first character confrontation vector $C = [c_0, c_1, \dots, c_j]$ may represent a character in a character table, and each element in the second character confrontation vector $D = [d_0, d_1, \dots, d_j]$ may likewise represent a character in the character table, the elements being indexed from 0 to $j$. For example, if the character table has 6000 characters, where for Chinese the table includes 6000 Chinese characters, then $j$ may be equal to 5999. An element equal to 1 means that the corresponding word is a real handwritten word, and an element equal to -1 means that the corresponding word is a model-generated word. For example, if the target domain sample word is the word "你", located 1st in the character table, and the target domain sample word is a real handwritten word, then the 1st element takes the value 1, and the first character confrontation vector of "你" is expressed as $[1, 0, 0, \dots, 0]$. For another example, if the target domain generated word is the word "好", located 2nd in the character table, and the target domain generated word is a model-generated word, then the 2nd element takes the value -1, and the second character confrontation vector of "好" can be expressed as $[0, -1, 0, \dots, 0]$.

For the target domain sample word, a target first character confrontation vector $C^* = [c_0^*, c_1^*, \dots, c_j^*]$ is preset, wherein each element in $C^*$ may represent a character in the character table. For the target domain generated word, a target second character confrontation vector $D^* = [d_0^*, d_1^*, \dots, d_j^*]$ is preset, wherein each element in $D^*$ may represent a character in the character table. The target first character confrontation vector $C^*$ represents the vector that the identification model should output when the target domain sample word is input to it. For example, if the target domain sample word is the word "你", located 1st in the character table, and the target domain sample word is a real handwritten word so that the 1st element takes the value 1, then the target first character confrontation vector of "你" is expressed as $[1, 0, 0, \dots, 0]$. Accordingly, the target second character confrontation vector $D^*$ represents the vector that the identification model should output when the target domain generated word is input to it. For example, if the target domain generated word is the word "好", located 2nd in the character table, and the target domain generated word is a model-generated word so that the 2nd element takes the value -1, then the target second character confrontation vector of "好" can be expressed as $[0, -1, 0, \dots, 0]$.

According to the cross entropy between the first character confrontation vector $C$ of the target domain sample word and the target first character confrontation vector $C^*$, the first character confrontation loss can be determined. The first character confrontation loss can be expressed by the following equation (4):

$$L_{char1} = -\sum_{i=0}^{j} c_i^* \log(c_i) \qquad (4)$$

wherein $L_{char1}$ represents the first character confrontation loss, $c_i$ represents the element with index $i$ in the first character confrontation vector, $c_i^*$ represents the element with index $i$ in the target first character confrontation vector, $i$ is an integer greater than or equal to 0 and less than or equal to $j$, and the first character confrontation vector and the target first character confrontation vector each have $j+1$ elements.

According to the cross entropy between the second character confrontation vector $D$ of the target domain generated word and the target second character confrontation vector $D^*$, the second character confrontation loss can be determined. The second character confrontation loss can be expressed by the following equation (5):

$$L_{char2} = -\sum_{i=0}^{j} d_i^* \log(d_i) \qquad (5)$$

wherein $L_{char2}$ represents the second character confrontation loss, $d_i$ represents the element with index $i$ in the second character confrontation vector, $d_i^*$ represents the element with index $i$ in the target second character confrontation vector, $i$ is an integer greater than or equal to 0 and less than or equal to $j$, and the second character confrontation vector and the target second character confrontation vector each have $j+1$ elements.

The character confrontation loss of the character generation model may be determined based on the first character confrontation loss and the second character confrontation loss, for example as their sum. The character confrontation loss of the character generation model can be expressed by the following equation (6):

$$L_{char} = L_{char1} + L_{char2} \qquad (6)$$

wherein $L_{char}$ represents the character confrontation loss of the character generation model.
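The sketch below assembles the confrontation vectors and evaluates equations (4)-(6) literally, reusing the `cross_entropy` helper from the component-loss sketch. The signed one-hot construction, the 0-indexing, and the uniform stand-in identification-model outputs are assumptions for illustration.

```python
import torch

def confrontation_vector(size, position, real):
    """Signed one-hot vector: +1 at the word's character-table index for a
    real handwritten word, -1 for a model-generated word (0-indexed here)."""
    vec = torch.zeros(size)
    vec[position] = 1.0 if real else -1.0
    return vec

j_plus_1 = 6000
c_star = confrontation_vector(j_plus_1, 0, real=True)   # "你", 1st in the table, real
d_star = confrontation_vector(j_plus_1, 1, real=False)  # "好", 2nd in the table, generated

# Uniform stand-ins for the identification-model output vectors:
c = torch.full((j_plus_1,), 1.0 / j_plus_1)
d = torch.full((j_plus_1,), 1.0 / j_plus_1)

char_loss = cross_entropy(c, c_star) + cross_entropy(d, d_star)  # equations (4)-(6)
```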

The identification model is further used for detecting whether the target domain sample word and the target domain generated word are real handwritten words and for classifying style types. The target domain sample word is input into the identification model to obtain a first style confrontation vector of the target domain sample word, and the target domain generated word is input into the identification model to obtain a second style confrontation vector of the target domain generated word.

Illustratively, each element in the first style confrontation vector $E = [e_0, e_1, \dots, e_k]$ may represent a style type in a style sheet, and each element in the second style confrontation vector $F = [f_0, f_1, \dots, f_k]$ may likewise represent a style type in the style sheet, the elements being indexed from 0 to $k$. For example, if the style sheet has 1000 style types, where for handwriting the style sheet includes 1000 handwritten fonts, then $k$ may be equal to 999. An element equal to 1 means that the corresponding word is a real handwritten word, and an element equal to -1 means that the corresponding word is a model-generated word. For example, if the target domain sample word is the word "你", whose style type is located 998th in the style sheet, and the target domain sample word is a real handwritten word, then the 998th element takes the value 1, and the first style confrontation vector of "你" is expressed as $[0, \dots, 0, 1, 0, 0]$. For another example, if the target domain generated word is the word "Jia", whose style type is located 999th in the style sheet, and the target domain generated word is a model-generated word, then the 999th element takes the value -1, and the second style confrontation vector of "Jia" can be expressed as $[0, \dots, 0, -1, 0]$.

A target first style confrontation vector $E^* = [e_0^*, e_1^*, \dots, e_k^*]$ is preset for the target domain sample word, wherein each element in $E^*$ may represent a style type in the style sheet. For the target domain generated word, a target second style confrontation vector $F^* = [f_0^*, f_1^*, \dots, f_k^*]$ is preset, wherein each element in $F^*$ may represent a style type in the style sheet. The target first style confrontation vector $E^*$ represents the vector that the identification model should output when the target domain sample word is input to it. For example, if the target domain sample word is the word "你", whose style type is located 998th in the style sheet, and the target domain sample word is a real handwritten word so that the 998th element takes the value 1, then the target first style confrontation vector of "你" is expressed as $[0, \dots, 0, 1, 0, 0]$. Accordingly, the target second style confrontation vector $F^*$ represents the vector that the identification model should output when the target domain generated word is input to it. For example, if the target domain generated word is the word "Jia", whose style type is located 999th in the style sheet, and the target domain generated word is a model-generated word so that the 999th element takes the value -1, then the target second style confrontation vector of "Jia" can be expressed as $[0, \dots, 0, -1, 0]$.

According to the cross entropy between the first style confrontation vector $E$ of the target domain sample word and the target first style confrontation vector $E^*$, the first style confrontation loss can be determined. The first style confrontation loss can be expressed by the following equation (7):

$$L_{style1} = -\sum_{i=0}^{k} e_i^* \log(e_i) \qquad (7)$$

wherein $L_{style1}$ represents the first style confrontation loss, $e_i$ represents the element with index $i$ in the first style confrontation vector, $e_i^*$ represents the element with index $i$ in the target first style confrontation vector, $i$ is an integer greater than or equal to 0 and less than or equal to $k$, and the first style confrontation vector and the target first style confrontation vector each have $k+1$ elements.

According to the cross entropy between the second style confrontation vector $F$ of the target domain generated word and the target second style confrontation vector $F^*$, the second style confrontation loss can be determined. The second style confrontation loss can be expressed by the following equation (8):

$$L_{style2} = -\sum_{i=0}^{k} f_i^* \log(f_i) \qquad (8)$$

wherein $L_{style2}$ represents the second style confrontation loss, $f_i$ represents the element with index $i$ in the second style confrontation vector, $f_i^*$ represents the element with index $i$ in the target second style confrontation vector, $i$ is an integer greater than or equal to 0 and less than or equal to $k$, and the second style confrontation vector and the target second style confrontation vector each have $k+1$ elements.

A style confrontation loss of the character generation model may be determined based on the first style confrontation loss and the second style confrontation loss, for example as their sum. The style confrontation loss of the character generation model can be expressed by the following equation (9):

$$L_{style} = L_{style1} + L_{style2} \qquad (9)$$

wherein $L_{style}$ represents the style confrontation loss of the character generation model.
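The style losses (7)-(9) have the same shape as the character losses, only over the $k+1$ style types instead of the $j+1$ characters, so the helpers above can be reused. A brief sketch, again with uniform stand-in outputs and 0-indexed positions as assumptions:

```python
import torch

k_plus_1 = 1000
e_star = confrontation_vector(k_plus_1, 997, real=True)   # "你": style type 998, real
f_star = confrontation_vector(k_plus_1, 998, real=False)  # generated word: style type 999

e = torch.full((k_plus_1,), 1.0 / k_plus_1)  # stand-in identification-model outputs
f = torch.full((k_plus_1,), 1.0 / k_plus_1)

style_loss = cross_entropy(e, e_star) + cross_entropy(f, f_star)  # equations (7)-(9)
```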

The identification model is also used to detect whether the target domain generated word is the target domain sample word expected to be generated. The target domain sample word and the target domain generated word are input into the identification model to obtain the consistency loss.

To ensure that the target domain generated word obtained by inputting the source domain sample word into the character generation model is only a style conversion of that word, with the content unchanged, a cycle-consistency loss can be added to the character generation model. This loss may be calculated from the difference between the target domain sample word and the target domain generated word. For example, the difference between the pixel values of each pair of corresponding pixel points in the two images of the target domain sample word and the target domain generated word is calculated and its absolute value taken to obtain the difference of each pixel point, and the differences of all pixel points are summed to obtain the cycle-consistency loss of the character generation model, which may be denoted as $L_{1A2B}$.
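In code, this loss is a plain L1 distance over pixels. A minimal sketch, assuming the two words are given as image tensors of the same shape:

```python
import torch

def cycle_consistency_loss(sample_img, generated_img):
    """Sum of absolute per-pixel differences between the target domain sample
    word image and the target domain generated word image (an L1 loss)."""
    return (sample_img - generated_img).abs().sum()

# e.g. two 256x256 single-channel word images:
loss = cycle_consistency_loss(torch.rand(1, 256, 256), torch.rand(1, 256, 256))
```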

Optionally, the training method of the character generation model further includes: inputting the target domain sample word and the target domain generated word into the identification model, and calculating the consistency loss; and adjusting parameters of the character generation model according to the consistency loss.

S806, inputting the target domain generated word into the character classification model, and calculating the wrong word loss.

The character classification model is used for detecting whether the target domain generated word is a wrong word. The wrong word loss is used for constraining the wrong-word rate of the target domain generated words output by the character generation model, and specifically refers to the difference between a generated word and the correct word.

The target domain generated word is input into the character classification model to obtain a generated character vector $X = [x_0, x_1, \dots, x_n]$ of the target domain generated word, wherein each element in $X$ may represent a character in the training sample, the elements being indexed from 0 to $n$; for example, if the training sample has 6761 words, then $n$ may be equal to 6760. For the target domain generated word, a standard character vector $Y = [y_0, y_1, \dots, y_n]$ is preset, wherein each element in $Y$ may likewise represent a character in the training sample.

The standard character vector $Y$ represents the vector that the character classification model should output when the target domain generated word is input to it. For example, if the target domain generated word is the word "do", located 1st among the words in the training sample, the standard character vector of the "do" word may be expressed as $[1, 0, 0, \dots, 0]$. According to the cross entropy between the generated character vector $X$ of the target domain generated word and the standard character vector $Y$, the wrong word loss can be determined. The wrong word loss can be expressed by the following equation (10):

$$L_C = -\sum_{i=0}^{n} y_i \log(x_i) \qquad (10)$$

wherein $L_C$ represents the wrong word loss, $x_i$ represents the element with index $i$ in the generated character vector, $y_i$ represents the element with index $i$ in the standard character vector, $i$ is an integer greater than or equal to 0 and less than or equal to $n$, and the generated character vector and the standard character vector each have $n+1$ elements.

According to embodiments of the present disclosure, the wrong word loss can be used to constrain the wrong-word rate of the target domain generated words output by the character generation model, thereby reducing the probability that the character generation model generates wrong words.
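A short sketch of equation (10), reusing the `cross_entropy` helper defined earlier; the one-hot position and the softmax stand-in for the classifier output are assumptions:

```python
import torch

n_plus_1 = 6761
y = torch.zeros(n_plus_1)
y[0] = 1.0                                   # standard vector: 1st word of the sample
x = torch.softmax(torch.randn(n_plus_1), 0)  # stand-in classifier output
wrong_word_loss = cross_entropy(x, y)        # equation (10)
```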

It should be noted that the identification model and the component classification model may be trained together with the character generation model; in later application, only the trained character generation model need be used to implement style migration of images.

S807, adjusting parameters of the character generation model according to the feature loss, the component classification loss, the character confrontation loss, the style confrontation loss and the wrong word loss.

According to the technical solution of this embodiment, the character generation model generates the target domain generated word based on the source domain sample word, enabling font generation in multiple styles. Introducing the component classification loss through the component classification model enlarges the learning range of the font style and improves the migration accuracy of the font style; introducing the character confrontation loss and the style confrontation loss through the identification model improves the ability of the character generation model to learn correct fonts and font styles; and introducing the wrong word loss and the feature loss through the character classification model improves the ability of the character generation model to learn font features and reduces the probability of generating wrong words.
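Putting step S807 into code, one training step might look like the following sketch. The equal weighting of the five losses is an assumption; the disclosure does not state how they are balanced.

```python
import torch

def training_step(optimizer, feature_loss, component_cls_loss,
                  char_adv_loss, style_adv_loss, wrong_word_loss):
    """One parameter update of the character generation model per step S807.
    Equal weights for the five losses are an illustrative assumption."""
    total = (feature_loss + component_cls_loss + char_adv_loss
             + style_adv_loss + wrong_word_loss)
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return float(total.detach())
```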

Fig. 9 is a scene diagram of a training method of a character generation model according to an embodiment of the present disclosure. As shown in fig. 9, the character generation model includes a style encoder 910, a content encoder 911 and a decoder 912. The source domain sample word 901 is sent to the content encoder 911 to obtain a content feature vector; target domain style words 902 are determined according to the source domain sample word 901, and the target domain style words 902 are sent to the style encoder 910 to obtain first style feature vectors. The target domain style words 902 are multiple, so the first style feature vectors are correspondingly multiple; the multiple first style feature vectors are fused to obtain a fusion style feature vector, the fusion style feature vector is fused with the content feature vector to obtain a target feature vector, and the target feature vector is sent to the decoder 912 for decoding to obtain a target domain generated word 903. The target domain generated word 903 is input to the style encoder 910 to obtain a second style feature vector of the target domain generated word 903. The second style feature vector and the first style feature vectors are input into the component classification model 913, and the component classification loss 905 is calculated. The target domain sample word 904 and the target domain generated word 903 are input into the identification model 914, and the character confrontation loss 906 and the style confrontation loss 907 are calculated. The target domain generated word 903 and the target domain sample word 904 are input to the pre-trained character classification model 915, and the feature loss 909 of the character generation model is calculated. The target domain generated word 903 is input to the character classification model 915, and the wrong word loss 908 is calculated.
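The generator path of Fig. 9 can be sketched as follows. Mean-pooling for fusing the first style feature vectors and additive fusion with the content feature vector are illustrative assumptions, since the disclosure does not specify the fusion operations; the encoder and decoder modules are stand-ins.

```python
import torch

def generate_word(style_encoder, content_encoder, decoder,
                  source_word, style_words):
    """Forward pass of Fig. 9: encode content, encode and fuse the style
    words, combine, then decode into the target domain generated word."""
    content = content_encoder(source_word)                 # content feature vector
    styles = torch.stack([style_encoder(w) for w in style_words])
    fused_style = styles.mean(dim=0)                       # fusion style feature vector
    target_feature = content + fused_style                 # target feature vector (assumed additive)
    return decoder(target_feature)                         # target domain generated word
```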

Fig. 10 is a flowchart of a character generation method disclosed in an embodiment of the present disclosure, which may be applied to the case where a source domain style word is converted into a target domain style word according to a trained character generation model to generate a new character. The method of this embodiment may be executed by a character generation apparatus, which may be implemented in software and/or hardware and is specifically configured in an electronic device with certain data operation capability; the electronic device may be a client device or a server device, and the client device may be, for example, a mobile phone, a tablet computer, a vehicle-mounted terminal, a desktop computer, and the like.

S1001, a source domain input word and a corresponding target domain input word are obtained.

The source domain input word may refer to an image formed of a word that needs to be converted to the target domain font style, and the target domain input word may refer to an image formed of a word in the target domain font style. Component splitting is performed on the source domain input word to determine at least one component forming the source domain input word, and target domain input words corresponding to the source domain input word are screened from a pre-generated set of target domain input words according to each component. The number of target domain input words is at least one.

Images formed of words in the target domain font style may be obtained in advance to form the set of target domain input words. The set consists of pre-acquired images of words in the target domain font style that cover all components. Illustratively, for Chinese characters, the target domain font style is a user handwritten font style, and images of handwritten words authorized by the user may be obtained in advance to generate the set of target domain input words. More specifically, 100 words covering all radicals may be pre-configured, and the user may be prompted to authorize the provision of handwritten versions of these 100 words, generating the set of target domain input words. A sketch of the component-based screening follows.
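The following sketch illustrates the screening step, assuming a hypothetical `split_components` routine and a set that maps each collected handwritten word to its component list; matching on any shared component is one plausible reading of the selection rule.

```python
def select_target_inputs(source_word, target_set, split_components):
    """Pick, from the pre-generated set, the target domain input words that
    share at least one component with the source domain input word."""
    needed = set(split_components(source_word))
    return [word for word, comps in target_set.items() if needed & set(comps)]

# e.g. for "好" = {女, 子}, pick handwritten words containing 女 or 子:
targets = select_target_inputs("好", {"妈": ["女", "马"], "你": ["亻", "尔"]},
                               lambda w: ["女", "子"])
print(targets)  # ['妈']
```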

S1002, inputting the source domain input word and the target domain input word into a character generation model to obtain a target domain new word; the character generation model is obtained by training according to the training method of the character generation model according to any embodiment of the present disclosure.

The character generation model is obtained by training according to the training method of the character generation model described above. The target domain new word may refer to a word image in the target domain font style whose content corresponds to the source domain input word. For example, if the source domain input word is a regular-script word image and the target domain new word is a handwritten word image, the regular-script word image is input into the character generation model to obtain the handwritten word image, i.e., the target domain new word.
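As a usage sketch of S1002, with a stand-in model object and random tensors in place of real word images (purely illustrative):

```python
import torch

def generate_new_word(model, source_input, target_inputs):
    """S1002: feed the source domain input word and its selected target
    domain input words to the trained character generation model."""
    return model(source_input, target_inputs)

# Stand-ins for the trained model and the word images:
model = lambda src, tgts: src
new_word = generate_new_word(model, torch.rand(1, 256, 256),
                             [torch.rand(1, 256, 256) for _ in range(3)])
```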

After target domain new words are obtained, a word stock may be established based on them. For example, the new characters generated by the character generation model are stored to establish a word stock with a handwritten font style. The word stock can be applied to an input method: based on it, a user can directly obtain characters with the handwritten font style, which meets diversified user needs and improves user experience.

By acquiring the source domain input word and the corresponding target domain input word and inputting them into the character generation model to obtain the target domain new word, the source domain input word can be accurately converted into the target domain new word, which improves the generation accuracy and efficiency of target domain new words and reduces the labor cost of generating them.

Fig. 11 is a block diagram of a training apparatus for a character generation model in an embodiment of the present disclosure, and the embodiment of the present disclosure is applied to training the character generation model, where the character generation model is used for converting a source domain style word into a target domain style word. The device is realized by software and/or hardware and is specifically configured in electronic equipment with certain data operation capacity.

An apparatus 1100 for training a character generation model shown in fig. 11 includes: a target domain generating word obtaining module 1101, a feature loss calculating module 1102 and a first loss adjusting module 1103; wherein:

a target domain generating word obtaining module 1101, configured to input the source domain sample word and the target domain style word into the character generating model, so as to obtain a target domain generating word;

a feature loss calculation module 1102, configured to input the target domain generating words and the target domain sample words into a pre-trained character classification model, and calculate a feature loss of the character generation model;

a first loss adjusting module 1103, configured to adjust parameters of the character generation model according to the feature loss.

According to this technical solution, the character generation model generates the target domain generated word based on the source domain sample word and the target domain style word, enabling font generation in multiple styles. Introducing the feature loss through the character classification model makes the character generation model learn the features that differ most between the target domain generated word and the target domain sample word, so that it learns more font details; this improves the model's ability to learn font features and the accuracy of the target domain font style words it generates.

Further, the feature loss calculating module 1102 includes: the first feature map generating unit is used for inputting the target domain generating words into the character classification model to obtain a generated feature map output by at least one feature layer of the character classification model; the second feature map generation unit is used for inputting the target domain sample words into the character classification model to obtain a sample feature map output by the at least one feature layer of the character classification model; and the characteristic loss calculation unit is used for calculating the characteristic loss of the character generation model according to the difference between the generated characteristic diagram and the sample characteristic diagram of the at least one characteristic layer.

Further, the feature loss calculation unit includes: the pixel loss calculation subunit is used for calculating the pixel difference between the generated feature map and the sample feature map of the feature layer aiming at each feature layer in the at least one feature layer to obtain the pixel loss of the feature layer; and the characteristic loss calculation subunit is used for calculating the characteristic loss of the character generation model according to the pixel loss of the at least one characteristic layer.

Further, the pixel loss calculating subunit is configured to: calculating the absolute value of the difference between the pixel value of the pixel point and the pixel value of the pixel point at the corresponding position in the sample characteristic diagram aiming at the pixel point at each position in the generated characteristic diagram of the characteristic layer to obtain the difference of the pixel point at each position; and determining the pixel difference between the generated feature map and the sample feature map of the feature layer according to the difference of the pixel points of the plurality of positions.

Further, the training device for the character generation model further includes: the first feature vector calculation module is used for inputting the target domain style words into a character generation model to obtain first style feature vectors of the target domain style words; the second feature vector calculation module is used for inputting the target domain generated words into the character generation model to obtain second style feature vectors of the target domain generated words; the component classification loss calculation module is used for inputting the second style feature vector and the first style feature vector into a component classification model and calculating component classification loss; the confrontation loss calculation module is used for inputting the target domain sample word and the target domain generating word into an identification model and calculating character confrontation loss and style confrontation loss; the wrong character loss calculation module is used for inputting the target domain generated characters into the character classification model and calculating the wrong character loss; and the second loss adjusting module is used for adjusting parameters of the character generation model according to the component classification loss, the character confrontation loss, the style confrontation loss and the wrong character loss.

Further, the source domain sample word is an image with a source domain font style, and the target domain sample word is an image with a target domain font style.

The training device for the character generation model can execute the training method for the character generation model provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the training method for the character generation model.

Fig. 12 is a block diagram of a character generation apparatus in an embodiment of the present disclosure, applied to the case where a word in the source domain style is converted into a word in the target domain style according to a trained character generation model to generate a new character. The device is implemented in software and/or hardware and is specifically configured in an electronic device with certain data operation capability.

A character generating apparatus 1200 shown in fig. 12 includes: an input word acquisition module 1201 and a character generation module 1202; wherein:

an input word obtaining module 1201, configured to obtain a source domain input word and a corresponding target domain input word;

a character generation module 1202, configured to input the source domain input word and the target domain input word into a character generation model to obtain a target domain new word; the character generation model is obtained by training according to the training method of the character generation model according to any embodiment of the present disclosure.

By acquiring the source domain input word and the corresponding target domain input word and inputting them into the character generation model to obtain the target domain new word, the source domain input word can be accurately converted into the target domain new word, which improves the generation accuracy and efficiency of target domain new words and reduces the labor cost of generating them.

The character generation device can execute the character generation method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the character generation method.

In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the personal information of the related user are all in accordance with the regulations of related laws and regulations and do not violate the good customs of the public order.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

Fig. 13 illustrates a schematic block diagram of an example electronic device 1300 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 13, the apparatus 1300 includes a computing unit 1301 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1302 or a computer program loaded from a storage unit 1308 into a Random Access Memory (RAM) 1303. In the RAM 1303, various programs and data necessary for the operation of the device 1300 can also be stored. The computing unit 1301, the ROM 1302, and the RAM 1303 are connected to each other via a bus 1304. An input/output (I/O) interface 1305 is also connected to the bus 1304.

A number of components in the device 1300 are connected to the I/O interface 1305, including: an input unit 1306 such as a keyboard, a mouse, or the like; an output unit 1307 such as various types of displays, speakers, and the like; a storage unit 1308 such as a magnetic disk, an optical disk, or the like; and a communication unit 1309 such as a network card, modem, wireless communication transceiver, etc. The communication unit 1309 allows the device 1300 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

Computing unit 1301 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of computing unit 1301 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The calculation unit 1301 performs the respective methods and processes described above, such as a training method of a character generation model or a character generation method. For example, in some embodiments, the training method of the character generation model or the character generation method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 1308. In some embodiments, some or all of the computer program may be loaded onto and/or installed onto device 1300 via ROM 1302 and/or communications unit 1309. When the computer program is loaded into the RAM 1303 and executed by the computing unit 1301, one or more steps of the training method of the character generation model or the character generation method described above may be performed. Alternatively, in other embodiments, the computing unit 1301 may be configured in any other suitable way (e.g., by means of firmware) to perform a training method or character generation method of a character generation model.

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
