Text translation method and related device

Document No.: 1938186 Publication date: 2021-12-07

Reading note: This technology, "Text translation method and related device" (一种文本翻译方法和相关装置), was designed by 刘宜进, 徐杨一帆, 孟凡东 and 徐金安 on 2021-05-21. The embodiments of the present application disclose a text translation method and a related device. To improve translation quality, a processing device can use the translation mapping relationship between corresponding word segments of the source language and the target language to adjust the parameters used for training an initial translation model, and then train the initial translation model with AI technology based on the adjusted parameters. The trained translation model can thus translate source-language text based on the translation mapping relationship, so that the translated text in the target language fits the meaning of the source-language text more closely and the translation effect improves. In addition, the trained translation model and the association parameters determined in the process can be stored on a blockchain, which facilitates subsequent training of other models and translation applications.

1. A text translation method, characterized by comprising: obtaining a translation text training set, wherein the translation text training set comprises a plurality of text sample pairs, each text sample pair comprises a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language;

determining an association parameter, within the text sample pair to which it belongs, of each word segment included in the second text, wherein a target text sample pair is any one of the plurality of text sample pairs, and the association parameter embodies a translation mapping relationship between a target word segment in the second text of the target text sample pair and the word segments in the first text of the target text sample pair;

determining a model translation text in the target language through an initial translation model according to the first text in the target text sample pair;

determining, based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text, respectively;

training the initial translation model according to the word loss parameters to obtain a translation model;

and translating a text to be processed in the source language into a translated text in the target language through the translation model.

2. The method of claim 1, wherein determining, based on the corresponding association parameters, the word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text respectively comprises:

determining, at the granularity of the word segments included in the second text of the target text sample pair, word differences between those word segments and the corresponding words of the model translation text, respectively;

determining loss weights according to the association parameters respectively corresponding to the word segments included in the second text of the target text sample pair;

determining the word loss parameters according to the word differences and the corresponding loss weights, wherein the value of a loss weight is inversely related to the complexity of the translation mapping relationship identified by the corresponding association parameter.

3. The method according to claim 2, wherein determining the loss weights according to the association parameters respectively corresponding to the word segments included in the second text of the target text sample pair comprises:

determining the loss weight according to a first hyperparameter, a second hyperparameter and the association parameter, wherein the first hyperparameter is used for scaling the association parameter, and the second hyperparameter is used for determining a lower limit of the loss weight.

4. The method of claim 1, wherein a first text of the target text sample pair includes n word segments, a second text includes m word segments, and the target word segment is a jth word segment of the m word segments;

for the jth word segment in the target text sample pair, determining the association parameter of the word segment included in the second text within the target text sample pair comprises:

determining, in the plurality of text sample pairs, co-occurrence frequency parameters of the segment pairs formed by the jth word segment with each of the n word segments;

determining first word frequency parameters of the n word segments, respectively, in the plurality of text sample pairs;

and determining the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameters and the first word frequency parameters.

5. The method of claim 4, further comprising:

determining a second word frequency parameter of the jth word segment in the plurality of text sample pairs;

determining the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter and the first word frequency parameter comprises:

determining the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter, the first word frequency parameter and the second word frequency parameter.

6. The method of claim 5, wherein the co-occurrence frequency parameter identifies the number of text sample pairs, among the plurality of text sample pairs, in which the segment pair co-occurs;

for an ith word segment of the n word segments, the first word frequency parameter identifies the number of texts in the plurality of text sample pairs in which the ith word segment occurs;

for the jth word segment, the second word frequency parameter identifies the number of texts in the plurality of text sample pairs in which the jth word segment occurs.

7. The method of claim 5, wherein the co-occurrence frequency parameter identifies the number of times the segment pair co-occurs in the plurality of text sample pairs;

for an ith word segment of the n word segments, the first word frequency parameter identifies the number of times the ith word segment occurs in the plurality of text sample pairs;

for the jth word segment, the second word frequency parameter identifies the number of times the jth word segment occurs in the plurality of text sample pairs.

8. The method of any of claims 1-7, wherein training the initial translation model according to the word loss parameter comprises:

if the association parameters include a target association parameter whose value is smaller than a threshold, ignoring the word loss parameter determined based on the target association parameter when training the initial translation model according to the word loss parameters.

9. A text translation device, characterized by comprising an acquiring unit, a first determining unit, a second determining unit, a third determining unit, a training unit and a translation unit:

the acquiring unit is configured to acquire a translation text training set, where the translation text training set includes a plurality of text sample pairs, each text sample pair includes a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language;

the first determining unit is configured to determine an association parameter, within the text sample pair to which it belongs, of each word segment included in the second text, where a target text sample pair is any one of the plurality of text sample pairs, and the association parameter embodies a translation mapping relationship between a target word segment in the second text of the target text sample pair and the word segments in the first text of the target text sample pair;

the second determining unit is configured to determine, according to the first text in the target text sample pair, a model translation text in the target language through an initial translation model;

the third determining unit is configured to determine, based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text, respectively;

the training unit is configured to train the initial translation model according to the word loss parameters to obtain a translation model;

the translation unit is configured to translate a text to be processed in the source language into a translated text in the target language through the translation model.

10. The apparatus according to claim 9, wherein the fourth determining unit is specifically configured to:

determine, at the granularity of the word segments included in the second text of the target text sample pair, word differences between those word segments and the corresponding words of the model translation text, respectively;

determine loss weights according to the association parameters respectively corresponding to the word segments included in the second text of the target text sample pair;

determine the word loss parameters according to the word differences and the corresponding loss weights, wherein the value of a loss weight is inversely related to the complexity of the translation mapping relationship identified by the corresponding association parameter.

11. The apparatus according to claim 10, wherein the fourth determining unit is specifically configured to:

determine the loss weight according to a first hyperparameter, a second hyperparameter and the association parameter, wherein the first hyperparameter is used for scaling the association parameter, and the second hyperparameter is used for determining a lower limit of the loss weight.

12. The apparatus of claim 9, wherein a first text of the target text sample pair comprises n word segments, a second text comprises m word segments, and the target word segment is a jth word segment of the m word segments;

for the jth word segment in the target text sample pair, the first determining unit is specifically configured to:

determine, in the plurality of text sample pairs, co-occurrence frequency parameters of the segment pairs formed by the jth word segment with each of the n word segments;

determine first word frequency parameters of the n word segments, respectively, in the plurality of text sample pairs;

and determine the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameters and the first word frequency parameters.

13. The apparatus according to claim 12, characterized in that the apparatus further comprises a fifth determining unit:

the fifth determining unit is configured to determine a second word frequency parameter of the jth word segment in the plurality of text sample pairs;

the first determining unit is specifically configured to:

determine the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter, the first word frequency parameter and the second word frequency parameter.

14. A computer device, the device comprising a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to execute the text translation method according to any one of claims 1 to 8 according to instructions in the program code.

15. A computer-readable storage medium for storing a computer program for executing the text translation method according to any one of claims 1 to 8.

Technical Field

The present application relates to the field of translation technologies, and in particular, to a text translation method and a related apparatus.

Background

With the rapid development of AI technology, more AI technologies are applied to text translation, for example, a first text in a source language can be translated into a second text in a target language through a translation model.

In the related art, a translation model is trained on text pairs, where each text pair usually includes a source-language text as the model input and a target-language text as the training label. The training effect of this approach is poor, so when the resulting translation model is used to translate source-language text, the translation results are unsatisfactory.

Disclosure of Invention

To solve the above technical problem, in the text translation method provided in the embodiments of the present application, the processing device can use the translation mapping relationship between corresponding word segments of the source language and the target language to adjust the parameters used for training the initial translation model. The trained translation model can then translate source-language text based on the translation mapping relationship, so that the translated text in the target language fits the meaning of the source-language text more closely and the translation effect improves.

The embodiment of the application discloses the following technical scheme:

In a first aspect, an embodiment of the present application discloses a text translation method, including: obtaining a translation text training set, where the translation text training set includes a plurality of text sample pairs, each text sample pair includes a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language;

determining an association parameter, within the text sample pair to which it belongs, of each word segment included in the second text, where a target text sample pair is any one of the plurality of text sample pairs, and the association parameter embodies a translation mapping relationship between a target word segment in the second text of the target text sample pair and the word segments in the first text of the target text sample pair;

determining a model translation text in the target language through an initial translation model according to the first text in the target text sample pair;

determining, based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text, respectively;

training the initial translation model according to the word loss parameters to obtain a translation model;

and translating a text to be processed in the source language into a translated text in the target language through the translation model.

In a second aspect, an embodiment of the present application discloses a text translation apparatus, which includes an acquiring unit, a first determining unit, a second determining unit, a third determining unit, a training unit, and a translation unit:

the acquiring unit is configured to acquire a translation text training set, where the translation text training set includes a plurality of text sample pairs, each text sample pair includes a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language;

the first determining unit is configured to determine an association parameter, within the text sample pair to which it belongs, of each word segment included in the second text, where a target text sample pair is any one of the plurality of text sample pairs, and the association parameter embodies a translation mapping relationship between a target word segment in the second text of the target text sample pair and the word segments in the first text of the target text sample pair;

the second determining unit is configured to determine, according to the first text in the target text sample pair, a model translation text in the target language through an initial translation model;

the third determining unit is configured to determine, based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text, respectively;

the training unit is configured to train the initial translation model according to the word loss parameters to obtain a translation model;

the translation unit is configured to translate a text to be processed in the source language into a translated text in the target language through the translation model.

In a third aspect, an embodiment of the present application discloses a computer device, where the device includes a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to perform the text translation method of the first aspect according to instructions in the program code.

In a fourth aspect, an embodiment of the present application discloses a computer-readable storage medium, where the computer-readable storage medium is configured to store a computer program, and the computer program is configured to execute the text translation method described in the first aspect.

According to the above technical solutions, to improve translation quality, a translation text training set can be obtained, where the translation text training set includes a plurality of text sample pairs, each text sample pair includes a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language. An association parameter can then be determined for each word segment included in the second text within the text sample pair to which it belongs, where a target text sample pair is any one of the plurality of text sample pairs, and the association parameter embodies the translation mapping relationship between a target word segment in the second text of the target text sample pair and the word segments in the first text of that pair. During model training, a model translation text in the target language can be determined through an initial translation model according to the first text in the target text sample pair; the difference between the model translation text and the second text in the target text sample pair reflects the translation error of the initial translation model. Based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text can be determined. Because the translation mapping relationship makes it possible to analyze how accurately word segments are translated between the two languages, the word loss parameters, by incorporating the translation mapping relationship, reflect how strongly each word segment in the second text influences translation accuracy. The initial translation model can then be trained according to the word loss parameters, so that it learns word segments with different degrees of influence on translation accuracy at different learning strengths, yielding a translation model. A text to be processed in the source language can be translated into a translated text in the target language through the translation model; with more accurate word translation, the translated text fits the meaning of the text to be processed more closely, and translation accuracy improves.

Drawings

To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a text translation method in an actual application scenario according to an embodiment of the present application;

Fig. 2 is a flowchart of a text translation method according to an embodiment of the present application;

Fig. 3 is a chart of experimental results provided in an embodiment of the present application;

Fig. 4 is a chart of experimental results provided in an embodiment of the present application;

Fig. 5 is a diagram of an initial translation model provided by an embodiment of the present application;

Fig. 6 is a block diagram illustrating a structure of a text translation apparatus according to an embodiment of the present application;

Fig. 7 is a block diagram of a computer device according to an embodiment of the present application;

Fig. 8 is a block diagram of a server according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings.

Language translation is one of the popular applications of AI technology. In the related art, when a translation model is trained, the training weights of words are adjusted by referring only to word frequency information of the target language, so that the target-language text produced by the translation model conforms to the language characteristics of the target language. However, with this training method the translation model may fail to learn the language features of the source language, so the translation result may not accurately reflect the meaning of the source-language text, and the translation effect is poor.

To solve the above technical problem, in the text translation method provided in the embodiments of the present application, the processing device can use the translation mapping relationship between corresponding word segments of the source language and the target language to adjust the parameters used for training the initial translation model. The trained translation model can then translate source-language text based on the translation mapping relationship, so that the translated text in the target language fits the meaning of the source-language text more closely and the translation effect improves.

It is understood that the method may be applied to a processing device having a data processing function, for example, a terminal device or a server. The method may be executed independently by the terminal device or by the server, or it may be applied in a network scenario in which the terminal device and the server communicate and execute the method cooperatively. The terminal device may be a mobile phone, a desktop computer, a Personal Digital Assistant (PDA), a tablet computer, or the like. The server may be an application server or a Web server; in actual deployment, it may be an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers. The terminal device and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the present application.

Blockchain technology can be applied in the embodiments of the present application. For example, in the text translation method disclosed herein, the determined association parameters and the trained translation model can be stored on a blockchain, so that relevant personnel or devices can conveniently obtain the parameters or the model for operations such as text translation and model training.

In addition, the present application relates to Artificial Intelligence (AI) technology. Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence is a comprehensive discipline that covers a broad range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. The present application mainly relates to natural language processing and machine learning.

Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

Machine Learning (ML) is a multi-disciplinary field that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specifically studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

For example, in the embodiments of the present application, natural language processing technology enables the initial translation model to recognize the meaning of the first text and determine the corresponding model translation text. Through machine learning technology, the processing device can train a more accurate translation model, so that the translated text it produces is more accurate and the translation effect improves.

In order to facilitate understanding of the technical solution provided by the embodiment of the present application, a text translation method provided by the embodiment of the present application will be introduced in combination with an actual application scenario.

Referring to Fig. 1, Fig. 1 is a schematic diagram of a text translation method in an actual application scenario provided in the embodiment of the present application, where the processing device is a server 101.

The server 101 may first obtain a translation text training set that includes a plurality of text sample pairs, for example text sample pair 1 to text sample pair n. Each text sample pair includes a first text in a source language and a second text in a target language. Taking text sample pair 1 as an example, its second text is a translation of its first text in the target language.
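The structure of such a training set can be sketched as follows. This is a minimal illustration only: the class name `TextSamplePair`, the helper `build_training_set`, whitespace tokenization, and the placeholder sentences are all assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TextSamplePair:
    """One training example (hypothetical container, not from the application)."""
    first_text: List[str]   # word segments of the source-language text
    second_text: List[str]  # word segments of the target-language translation


def build_training_set(raw_pairs: List[Tuple[str, str]]) -> List[TextSamplePair]:
    """Split each (source, target) string pair on whitespace.

    Whitespace splitting stands in for a real tokenizer, which is an assumption.
    """
    return [TextSamplePair(src.split(), tgt.split()) for src, tgt in raw_pairs]


# Placeholder sentences: "src_*" tokens stand in for source-language word segments.
training_set = build_training_set([
    ("src_i src_am src_happy", "i am happy"),
    ("src_i src_am src_here", "i am here"),
])
```

Each element plays the role of one text sample pair; any one of them can serve as the target text sample pair.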

To improve the translation effect of the translation model, the server 101 may use the translation mapping relationship between the source language and the target language as a reference factor for model training, so that the trained translation model can, based on the translation mapping relationship, produce output that fits the meaning of the source-language text more closely. The server 101 may take text sample pair 1 as the target text sample pair and determine the association parameter, within that pair, of a target word segment included in the second text. The association parameter embodies the translation mapping relationship between the target word segment and the word segments of the first text of the target text sample pair, and this translation mapping relationship reflects how diverse the mappings between the target word segment and the word segments of the first text are.

In the training process, the server 101 may determine a model translation text in the target language through the initial translation model according to the first text in the target text sample pair. It can be understood that, if the target word segment has a more diverse mapping relationship with the word segments in the first text, then even when the corresponding word segment in the model translation text differs from the target word segment, it is likely to have the same meaning as the target word segment, so the mismatch has little effect on conveying the meaning of the first text. When the target word segment has a single mapping relationship with the word segments in the first text, a model translation text that does not include the target word segment most likely means that the initial translation model translated that word segment inaccurately, which has a large effect on conveying the meaning of the first text. Therefore, the server 101 may determine, based on the corresponding association parameters, word loss parameters between the word segments included in the second text of the target text sample pair and the model translation text, so that word segments with different translation mapping relationships can be learned in a targeted manner through the word loss parameters.

The server 101 may train the initial translation model according to the word loss parameters, so that the initial translation model adjusts its learning strength for different word segments based on the translation mapping relationship. For example, the learning strength may be increased for word segments with a single translation mapping relationship and, provided the translation effect is maintained, reduced for word segments with multiple translation mapping relationships, so as to obtain the translation model. The server 101 can then translate a text to be processed in the source language into a translated text in the target language through the translation model. Because the translation model takes the translation mapping relationship into account, the translated text conveys the meaning of the text to be processed more accurately, and the translation effect improves.
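The idea of learning different word segments at different strengths can be sketched as a weighted per-segment loss. The sketch below rests on several assumptions that are not stated in this part of the application: the word difference is taken to be the negative log-likelihood of each reference word segment, the loss weight is a linear function of the association parameter with a scaling hyperparameter (`scale`) and a lower-limit hyperparameter (`floor`), and a larger association value is taken to indicate a simpler, more one-to-one translation mapping. The function names are hypothetical.

```python
import math


def loss_weight(association, scale=1.0, floor=0.2):
    """Map an association value to a loss weight that never drops below `floor`.

    The linear form floor + scale * association is an assumed example,
    not the formula of the application.
    """
    return floor + scale * max(association, 0.0)


def weighted_sentence_loss(ref_token_probs, associations, scale=1.0, floor=0.2):
    """Sum the per-segment losses, each scaled by its loss weight.

    ref_token_probs[j] is the probability the model assigns to the j-th
    reference word segment; associations[j] is that segment's association value.
    """
    total = 0.0
    for prob, assoc in zip(ref_token_probs, associations):
        word_difference = -math.log(prob)  # assumed: per-segment cross-entropy
        total += loss_weight(assoc, scale, floor) * word_difference
    return total


# Two equally mispredicted segments: the one with the higher association value
# (assumed simpler mapping) contributes more to the training loss.
loss = weighted_sentence_loss([0.5, 0.5], [1.0, 0.0])
```

Under these assumptions, segments with a single mapping relationship receive a larger weight, while segments with diverse mappings still contribute at least the lower limit.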

Next, a text translation method provided by an embodiment of the present application will be described with reference to the drawings.

Referring to fig. 2, fig. 2 is a flowchart of a text translation method provided in an embodiment of the present application, where the method includes:

S201: Acquire a translation text training set.

It is understood that the translation mapping relationship between the word segments (tokens) of two languages can be used to analyze the accuracy of a translation result and to measure how difficult a word segment is for the model to learn. A word segment is a segment composed of one or more words. The translation mapping relationship is the mapping between a word segment in a source language text and a word segment in the corresponding target language text, where the target language text is obtained by translating the source language text. For example, "happy" in an English text can be translated into more than one Chinese word segment with that meaning, and a translation mapping relationship exists between "happy" and each of those Chinese word segments.

Because a word may have several meanings, among other reasons, some word segments have a relatively complex translation mapping relationship between the source language and the target language. For example, "happy" can be translated into several different Chinese word segments; word segments that share a translation mapping relationship in this way are usually interchangeable, so the choice among them has little effect on how well the translation result conveys the meaning of the source language text. Other word segments, such as highly technical terms in the source language, may need to be translated into a fixed word segment in the target language to convey their meaning accurately. For example, the Chinese term for "bearing" is most accurately translated into "bearing" in English, and the translation may be inaccurate if the translation result contains any other word segment. Therefore, based on the translation mapping relationship, the processing device can determine how strongly different word segments affect the meaning of the source language text, and can train the translation model that outputs the translation result in a targeted manner based on this degree of influence, so that the trained translation model restores the text meaning of the source language text as far as possible.

First, the processing device may obtain a translation text training set, where the translation text training set includes a plurality of text sample pairs, each text sample pair includes a first text in a source language and a second text in a target language, and the second text is a translation of the first text in the target language. From the first text and the second text, the processing device can derive the translation mapping relationship between the word segments of the two languages. Model training can then take into account not only the language characteristics of the target language but also the translation mapping relationship between the source language and the target language, so that the translation result output by the final translation model restores the text meaning of the source language text as far as possible.

S202: Determine the association parameters of the word segments included in the second text in the text sample pair to which the second text belongs.

The target text sample pair can be any one of a plurality of sample pairs, the association parameter is used for reflecting a translation mapping relation between a target word segment in a second text of the target text sample pair and a word segment in a first text of the target text sample pair, and the numerical value of the association parameter can reflect the complexity of the translation mapping relation to a certain extent.

If the word segment in the first text can be translated into word segments other than the target word segment, or if other word segments in the source language can also be translated into the target word segment, the target word segment and the word segment in the first text may have a relatively complex translation mapping relationship. If the word segment in the first text and the target word segment have a close one-to-one mapping relationship across the texts in the translation text training set, the translation mapping relationship between them may be relatively simple.

It can be understood that, since the second text in the same text sample pair is a translated text of the first text in the target language, a word segment having a translation mapping relationship with a word segment included in the second text exists in the first text. Based on this, when determining the association parameter, the processing device may more accurately embody the complexity of the translation mapping relationship between the word segment included in the second text and the word segment included in the first text by determining the association parameter of the word segment included in the second text in the text sample pair to which the second text belongs. If the associated parameters of the word segments included in the second text in the whole translation text training set are determined instead of being determined according to the affiliated text sample pairs, the determined associated parameters are difficult to accurately reflect the complexity of the translation mapping relationship due to lack of pertinence on the word segments with the translation mapping relationship, and further, when the associated parameters for model training are subsequently determined, the accuracy of the parameters is low, and the model training effect is poor.

It is to be understood that the training set determining actions and the determining actions of the associated parameters performed in S201 to S202 are not actions that have to be performed before the model is trained. For example, after an initial model training, the translation text training set used for the model training and the determined association parameters can also be used for training other initial translation models, without acquiring a new translation text training set and re-determining the association parameters, thereby improving the convenience of model training.

S203: Determine a model translation text in the target language through the initial translation model according to the first text in the target text sample pair.

In the training process, the processing device may first obtain an initial translation model, and then translate the first text in the target text sample pair through the initial translation model to obtain a model translation text in the target language.

S204: Determine, based on the corresponding association parameters, word loss parameters of the word segments included in the second text of the target text sample pair with respect to the model translation text.

As mentioned above, through the translation mapping relationship between the word segments, the processing device may determine the influence degree of different word segments on the source language text meaning, and the association parameter may embody the translation mapping relationship between the word segments, so that the processing device may train the initial translation model based on the association parameter, so that the translation result output by the initial translation model fits the text meaning of the source language text more.

The processing device may determine, based on the corresponding associated parameters, word loss parameters corresponding to the word segments included in the second text of the target text sample pair and the model translation text, respectively, where the word loss parameters are used to adjust the learning strength of the initial translation model. Therefore, the processing equipment can learn the word segments with different degrees of influence with different degrees of strength through the word loss parameter.

If the translation mapping relationship reflected by the corresponding association parameter is relatively complex, then either several word segments in the source language have a translation mapping relationship with the word segment, or the word segment is one of several word segments that can be obtained by translating a word segment included in the first text. Because the translation mapping relationship is complex, several word segments may be valid translation results of the word segment included in the first text. In this case, if the word segment included in the second text differs from the corresponding word segment in the model translation text, the effect on conveying the meaning of the first text is generally small. For example, "happy" may correspond to one Chinese word segment in the second text and to a different Chinese word segment with the same meaning in the model translation text, and this difference has little effect on the meaning. The processing device may therefore set a low learning strength for such a word segment through the word loss parameter, that is, indicate to the initial translation model that a low learning strength for this word segment has little effect on the accuracy of the final output, thereby reducing the time and effort required to train the initial translation model.

If the translation mapping relationship reflected by the corresponding association parameter is single, the word segment corresponds to a single word segment in the source language; that is, under normal conditions, translating that source language word segment yields the word segment in the second text. Therefore, if the word segment in the second text does not appear in the model translation text, a translation error has occurred with high probability. For example, the Chinese word segment meaning "bearing" is usually translated into "bearing" in English; if the first text includes that Chinese word segment but the corresponding model translation text does not include "bearing", the initial translation model has, with high probability, produced an incorrect translation result. In this case, the processing device may set a larger learning strength for the word segment in the second text through the word loss parameter, so that the initial translation model learns how to translate the word segment with greater strength, which improves the accuracy with which the translation model translates it.

S205: Train the initial translation model according to the word loss parameters to obtain a translation model.

Through the word loss parameters determined in the steps, the processing equipment can enable the initial translation model to learn the word segments with different influence degrees in different degrees, so that the learning speed of the word segments with lower influence degrees can be improved, the learning precision of the word segments with higher influence degrees can be improved, and the expression accuracy of the source language text meaning can be improved as much as possible on the premise of ensuring the translation efficiency of the obtained translation model.

S206: Translate the text to be processed in the source language into a translated text in the target language through the translation model.

Because the translation model is obtained by training based on the translation text training set consisting of the first text of the source language and the second text of the target language, after the translation model is obtained, the text to be processed of the source language can be used as the model input of the translation model in practical application, the translation text in the target language can be obtained more accurately, and the text to be processed can be any text comprising the source language word fragments.

According to the technical scheme, in order to improve translation quality, the association parameters of the word segments included in the second text in the text sample pair can be determined, and the association parameters are used for reflecting the translation mapping relation between the target word segments in the second text of the target text sample pair and the word segments in the first text of the target text sample pair. During model training, a model translation text in a target language can be determined through an initial translation model according to a first text in the target text sample pair, and a difference of the initial translation model during translation can be embodied through the model translation text and a second text in the target text sample pair. Based on the corresponding associated parameters, word loss parameters corresponding to the word segments included in the second text of the target text sample pair and the model translation text respectively can be determined. Because the translation accuracy of the word segments between the two languages can be analyzed through the translation mapping relationship, the degree of influence of the word segments included in the second text on the translation accuracy can be reflected by the word loss parameter on the basis of combining the translation mapping relationship. The initial translation model can be trained according to the word loss parameters, so that the initial translation model can learn word segments with different influence degrees on translation accuracy by adopting different learning strengths, the translated text can be attached to corresponding meanings of the text to be processed in the source language through more accurate word translation, and translation accuracy is improved.

In the related art, when a translation model is trained, the training weight of a word segment is usually determined only from the word frequency (token frequency) of that word segment in the target language, in order to address the unbalanced distribution of word segments in translation results. However, this approach does not consider the language features of the source language. Moreover, word segments with similar word frequencies may have very different translation mapping relationships with the source language; if such word segments are assigned the same training weight, the training result may be poor and the text meaning of the source language text is hard to reflect accurately. As shown in fig. 3, fig. 3 is an experimental result chart provided in this application, in which the abscissa is the mutual information (Mutual Information) value. The mutual information value reflects the translation mapping relationship between a target language word segment and a source language word segment: the higher the mutual information value, the closer the translation mapping relationship is to a single mapping. As can be seen from the chart, word segments with the same word frequency may correspond to a wide range of mutual information values. Therefore, a training weight determined from word frequency alone cannot effectively reflect the translation mapping relationship, the resulting translation cannot be mapped accurately to the source language text, and the translation effect is poor.

As mentioned above, the processing device can learn the word segments with different translation mapping relationships with different degrees of strength by using the word loss parameter. Specifically, in the training process, the processing device may use word segments included in the second text of the target text sample pair as the granularity, respectively determine word differences between the word segments included in the second text of the target text sample pair and the translation texts of the model, and then determine loss weights according to associated parameters respectively corresponding to the word segments included in the second text of the target text sample pair, where the loss weights are used to adjust learning strength of the model when learning the word segments, and the larger the value of the loss weights is, the larger the learning strength of the model when learning the word segments is, that is, to improve the accuracy of obtaining the word segments by translation.

From the above, the more complex the translation mapping relationship corresponding to a word segment is, the lower the influence degree on the meaning of the text embodying the source language is when the translation corresponding to the word segment is wrong. Meanwhile, the more complex the translation mapping relationship is, the more diversified the mapping relationship between the word segment included in the second text and the word segment included in the first text is, for example, the word segment included in the first text may have the translation results of a plurality of word segments including the word segment included in the second text, and the difficulty of learning whether the word segment is translated accurately is higher for the model.

As shown in fig. 4, fig. 4 is a graph whose abscissa is the average Bilingual Mutual Information (BMI) value and whose ordinates are the Measure of Textual Lexical Diversity (MTLD) value and the Bilingual Evaluation Understudy (BLEU) value. The BMI value represents the translation mapping relationship between word segments: the higher the BMI value, the closer the translation mapping relationship is to a single mapping. The average BMI value is the mean of the BMI values of the word segments included in the second text of a text sample pair, and reflects the translation mapping relationship between the second text and the first text in that pair. The solid line represents the BLEU value; the higher the BLEU value, the more accurate the translation result, that is, the easier the text sample pair is for the translation model to learn. The dotted line represents the MTLD value; the higher the MTLD value, the higher the lexical diversity of the translation result. The experimental results in the graph show that when the average BMI value of the second text is higher, that is, when the translation mapping relationship between the second text and the first text is closer to a single mapping, the lexical diversity of the translation result is lower and the translation result is more accurate, so the text sample pair is easier for the translation model to learn. Therefore, the corresponding association parameter reflects, to a certain extent, how difficult a word segment is for the translation model to learn.

Based on this, in order to improve the learning effect of the initial translation model, reduce the overall learning difficulty of the initial translation model, and improve the learning efficiency, when determining the loss weight based on the association parameter, the numerical value of the loss weight may be inversely related to the complexity of the identified translation mapping relationship.

The processing equipment can determine the word loss parameters according to the word difference and the corresponding loss weight, so that when model training is carried out based on the word loss parameters, the word segments with single corresponding translation mapping relation can be studied with high strength, and the word segments with complex corresponding translation mapping relation can be studied with low strength. Because the influence degree of the word segment with a single translation mapping relation on the meaning of the source language text is larger when the translation error occurs, and the difficulty of learning the word segment is lower, the lower tolerance can be set for the translation error of the word segment by setting higher loss weight, so that the accuracy of the translation result on the meaning of the source language text can be improved; because the influence degree of the word segment with complex translation mapping relation on the meaning of the source language text is relatively small when a translation error occurs, and the difficulty of learning the word segment is relatively high, the higher tolerance can be set for the translation error of the word segment by setting a lower loss weight, so that the learning difficulty of the model can be reduced to a certain extent on the premise of ensuring the accuracy of the translation result on the meaning of the source language text, and a relatively good improvement effect is achieved in the two aspects of the learning accuracy and the learning difficulty.

In order to determine the loss weight more accurately, the processing device may further introduce hyperparameters to adjust the association parameter, and determine the loss weight based on the adjusted association parameter. In one possible implementation, the processing device may determine the loss weight according to the association parameter, a first hyperparameter used to scale the association parameter, and a second hyperparameter used to determine the lower limit of the loss weight. Through the first hyperparameter and the second hyperparameter, the processing device can adjust the value of the loss weight more precisely, so that the initial translation model learns the word segments more reasonably.

For example, the loss weight may be determined by the following formula:

$$w(y_j) = S \cdot \mathrm{BMI}(X, y_j) + B$$

where $w(y_j)$ is the loss weight corresponding to word segment $y_j$; $\mathrm{BMI}(X, y_j)$ is the association parameter of word segment $y_j$ in the text sample pair to which it belongs; $X$ is the first text in that text sample pair; $S$ is the first hyperparameter, used to scale the BMI value; and $B$ is the second hyperparameter, used to determine the lower limit of $w(y_j)$.
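The loss weight formula can be illustrated with a minimal Python sketch. The function name `loss_weight`, the parameter names `scale` and `floor` (standing for S and B), and the example values are illustrative assumptions and are not taken from the embodiment. Note that `floor` acts as a lower limit only when `scale` and the BMI value are non-negative.

```python
def loss_weight(bmi: float, scale: float, floor: float) -> float:
    """Return w(y_j) = S * BMI(X, y_j) + B for one target word segment."""
    return scale * bmi + floor


# Hypothetical values: one segment with a single mapping (high BMI) and one
# with a complex mapping (low BMI), both weighted with S = 0.5 and B = 1.0.
single_mapping_weight = loss_weight(bmi=2.0, scale=0.5, floor=1.0)
complex_mapping_weight = loss_weight(bmi=0.0, scale=0.5, floor=1.0)
```

With these hypothetical values, the segment with the single mapping receives the larger weight, so its translation errors are penalised more heavily during training.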

For a word segment with a larger BMI value, the translation mapping relationship of the word segment can be regarded as relatively single, so its loss is amplified through a larger loss weight and the initial translation model learns it with greater strength. For a word segment with a smaller BMI value, the mapping relationship can be regarded as more complex and the learning difficulty as higher, so its loss is reduced through a smaller loss weight, which avoids over-learning that leads to a local optimum. The processing device may determine the loss parameter by the following equation:

$$\mathcal{L} = -\frac{1}{m} \sum_{j=1}^{m} w(y_j) \cdot \log p(y_j \mid y_{<j}, X)$$

where $\mathcal{L}$ is the loss parameter corresponding to the second text in a text sample pair; the second text includes $m$ word segments; $w(y_j) \cdot \log p(y_j \mid y_{<j}, X)$ represents the word loss parameter corresponding to the $j$th word segment $y_j$; and $\log p(y_j \mid y_{<j}, X)$ is the logarithm of the probability, determined by the processing device from the translation results for the word segments preceding the $j$th word segment, that the $j$th word segment is translated accurately. This probability reflects the word difference between the $j$th word segment and the model translation text. Through the loss parameter, the initial translation model can determine the learning strength for each word segment in the second text, so that the translation relationship between the second text and the first text is learned as a whole and the translation precision for the first text is improved.
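The weighted loss can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: it takes the loss parameter to be a weighted negative log-likelihood averaged over the m word segments of the second text (the sign and the averaging are conventional choices assumed here), and the function name and the probabilities are hypothetical.

```python
import math
from typing import Sequence


def weighted_text_loss(token_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Average of -w(y_j) * log p(y_j | y_<j, X) over the m target segments."""
    if not token_probs or len(token_probs) != len(weights):
        raise ValueError("token_probs and weights must be non-empty and equal in length")
    m = len(token_probs)
    return -sum(w * math.log(p) for w, p in zip(weights, token_probs)) / m


# Hypothetical model probabilities for a second text with two word segments.
probs = [0.9, 0.5]
uniform = weighted_text_loss(probs, [1.0, 1.0])
# Raising the weight of the poorly predicted second segment raises the loss.
emphasised = weighted_text_loss(probs, [1.0, 2.0])
```

A larger weight on a poorly predicted word segment increases its share of the loss, which is how the loss weight steers the learning strength.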

The above description focuses on how to determine the word loss parameters; the following focuses on how to determine the association parameters corresponding to the word segments. As mentioned above, the association parameter reflects the translation mapping relationship between the target word segment in the second text of the target text sample pair and the word segments in the first text of the target text sample pair. The more accurate the association parameter, the more accurately it reflects the translation mapping relationship, the more reasonable the loss weight and other parameters derived from it, and the more accurate the translation model obtained through training. Based on this, in order to improve the accuracy of model training, when determining the association parameter the processing device may combine the co-occurrence frequency parameters of the target word segment and the word segments in the first text with the word frequency parameters corresponding to those word segments. The co-occurrence frequency parameter represents how often two word segments appear in the same text sample pair, and the word frequency parameter represents how often a word segment appears in the translation text training set. Together, these parameters show how far the frequency with which a word segment appears on its own differs from the frequency with which it co-occurs with another word segment, and thereby reflect the translation mapping relationship between the two word segments.

In one possible implementation, the first text of the target text sample pair may include n word segments, the second text may include m word segments, and the target word segment may be a jth word segment of the m word segments. For the jth word segment in the target text sample pair, when determining the association parameter of the word segment included in the second text in the text sample pair, the processing device may determine a co-occurrence frequency parameter of a segment pair, in a plurality of text sample pairs, of the jth word segment and the n word segments, respectively, where the co-occurrence frequency parameter is used to represent the occurrence frequency of the segment pair in the plurality of text sample pairs. Meanwhile, the processing device may further determine a first word frequency parameter of the n word segments in the plurality of text sample pairs, respectively, where the first word frequency parameter is used to represent the occurrence frequencies of the n word segments in the plurality of text sample pairs, respectively.

Based on the co-occurrence frequency parameter and the first word frequency parameter, the processing device may determine an association parameter of the jth word segment in the target text sample pair. It can be understood that, if the first word frequency parameter is closer to the co-occurrence frequency parameter, it indicates that there is a single translation mapping relationship between a word segment and the jth word segment in n word segments included in the first text, that is, when the word segment occurs in the first text included in the text sample pair, there is a higher probability that the jth word segment also occurs in the second text included in the text sample pair, and the jth word segment is a translation result mainly corresponding to the word segment. If the difference between the first word frequency parameter and the co-occurrence frequency parameter is large, it indicates that n word segments included in the first text and the jth word segment have a relatively complex translation mapping relationship, that is, the jth word segment may be only one of a plurality of translation results corresponding to a word segment among the n word segments included in the first text, so that the first word frequency parameter is larger than the co-occurrence frequency parameter. Therefore, the translation mapping relation between the jth word segment and the word segment included in the first text in the target text sample pair can be determined through the co-occurrence frequency parameter and the first word frequency parameter.

As can be seen from the above, the first word frequency parameter and the co-occurrence frequency parameter can accurately reflect whether the jth word segment is one of several translation results of a word segment in the first text. In order to reflect the translation mapping relationship along an additional dimension and further improve its accuracy, in a possible implementation the processing device may also determine whether several word segments in the source language can take the jth word segment as their translation result, so that whether the jth word segment has a complex translation mapping relationship can be analyzed from another dimension.

In one possible implementation, the processing device may determine a second word frequency parameter of the jth word segment in the plurality of text sample pairs, where the second word frequency parameter is used to represent an occurrence frequency of the jth word segment in the plurality of text sample pairs. The processing device may determine an association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter, the first word frequency parameter, and the second word frequency parameter.

With the second word frequency parameter taken into account, if the difference between the second word frequency parameter and the co-occurrence frequency parameter is small, this indicates to some extent that one of the n word segments is the main word segment in the source language that takes the jth word segment as its translation result, that is, the jth word segment has a relatively single translation mapping relationship with that word segment. If the difference between the second word frequency parameter and the co-occurrence frequency parameter is large, this indicates that word segments in the source language other than those among the n word segments can also take the jth word segment as a translation result, that is, the jth word segment has a relatively complex translation mapping relationship with the word segments among the n word segments. Therefore, combining the first word frequency parameter, the second word frequency parameter, and the co-occurrence frequency parameter reflects both whether the n word segments have several translation results in the target language and whether several word segments in the source language can take the jth word segment as a translation result. The translation mapping relationship of the jth word segment in the target text sample pair is thus reflected along two dimensions, which further improves the accuracy of the association parameter, makes model training more reasonable, and yields a more accurate translation model.
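The two comparisons can be made concrete with a small Python sketch. The ratio form used here is only an illustrative assumption for comparing the counts; it is not the association parameter itself, and the function name and the counts are hypothetical.

```python
from typing import Tuple


def mapping_ratios(co_count: int, src_count: int, tgt_count: int) -> Tuple[float, float]:
    """Return (f(x_i, y_j) / f(x_i), f(x_i, y_j) / f(y_j)).

    A first ratio near 1 suggests x_i is almost always translated as y_j;
    a second ratio near 1 suggests y_j almost always comes from x_i.
    """
    return co_count / src_count, co_count / tgt_count


# Hypothetical counts: a near one-to-one pair and a many-to-many pair.
single = mapping_ratios(co_count=9, src_count=10, tgt_count=9)
complex_ = mapping_ratios(co_count=3, src_count=10, tgt_count=30)
```

Both ratios are close to 1 for the first pair and far below 1 for the second, which corresponds to the single and complex mapping cases discussed above.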

It is understood that when the jth word segment and the word segments in the n word segments are not in a one-to-one translation mapping relationship and are not unique translation results, the jth word segment may have different association parameters in different text sample pairs. For example, in a certain text sample pair, a word segment included in a first text may have multiple translation results in a target language, and the jth word segment included in a second text is only one of the multiple translation results, so that when determining that the jth word segment corresponds to the associated parameter of the text sample pair, the co-occurrence frequency parameter corresponding to the segment pair formed by the jth word segment and the word segment is much smaller than the first word frequency parameter; in another text sample pair, the jth word segment may be a unique translation result of a certain word segment in the first text in the target language, and when determining the associated parameter corresponding to the jth word segment in the text sample pair, the co-occurrence frequency parameter corresponding to the segment pair formed by the jth word segment and the word segment is closer to the first word frequency parameter corresponding to the word segment. It can be seen that the same jth word segment may correspond to different association parameters in different text sample pairs. Based on the method, model training is carried out on the associated parameters of the text sample pairs, so that the initial translation model can carry out targeted learning based on the translation mapping conditions of the same jth word segment and different first text word segments, and the translated text determined by the final translation model can be more fit with the text meaning of the source language text.

The following formula determines the association parameter, which may be a mutual information parameter, BMI:

where BMI(X, y_j) is the association parameter of the jth word segment in the text sample pair to which it belongs, n is the number of word segments included in the first text of that text sample pair, f(x_i, y_j) is the co-occurrence frequency parameter of the segment pair formed by word segment y_j and word segment x_i, f(x_i) is the first word frequency parameter corresponding to word segment x_i, f(y_j) is the second word frequency parameter corresponding to word segment y_j, and K is the total number of text sample pairs in the translation text training set. Besides the BMI value, other parameters that can represent the translation mapping relationship between word segments may also be used as the association parameter. For example, the association parameter may be determined according to the confidence between a first-text word segment and a second-text word segment, where the confidence identifies the probability of obtaining the second-text word segment when translating the first-text word segment.
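As an illustration of how the symbols defined above could fit together, the following minimal Python sketch computes an association parameter for one target word segment. The formula itself is not reproduced in this text, so the pointwise-mutual-information form used here, a sum over the n source word segments of log(f(x_i, y_j) · K / (f(x_i) · f(y_j))), is an assumption rather than the formula of the application. The function name `association_parameter` and the dictionary-based inputs are likewise hypothetical.

```python
import math


def association_parameter(source_segments, target_segment,
                          pair_count, src_count, tgt_count, total_pairs):
    """Assumed PMI-style form: sum_i log(f(x_i, y_j) * K / (f(x_i) * f(y_j))).

    pair_count  -- f(x_i, y_j), keyed by (source segment, target segment)
    src_count   -- f(x_i), the first word frequency parameters
    tgt_count   -- f(y_j), the second word frequency parameters
    total_pairs -- K, the number of text sample pairs in the training set
    """
    score = 0.0
    for x in source_segments:
        co = pair_count.get((x, target_segment), 0)
        if co == 0:
            # Assumption: pairs never observed together are skipped,
            # because log(0) is undefined.
            continue
        score += math.log(co * total_pairs
                          / (src_count[x] * tgt_count[target_segment]))
    return score


# Toy counts (hypothetical): "chat" co-occurs with "cat" whenever it appears,
# while "le" appears in every sample pair and so carries no information.
pair_count = {("chat", "cat"): 2, ("le", "cat"): 2}
src_count = {"chat": 2, "le": 4}
tgt_count = {"cat": 2}
bmi = association_parameter(["le", "chat"], "cat",
                            pair_count, src_count, tgt_count, total_pairs=4)
```

In this toy case only the segment pair ("chat", "cat") contributes, which matches the intuition that a near one-to-one mapping yields a higher association parameter.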

In addition, the co-occurrence frequency parameter and the word frequency parameters may express how often segment pairs and word segments occur in several ways. Because the determination involves several parameters, the processing device may, to make the association parameter more reasonable, determine all of the parameters used in one determination under the same measurement standard. In one possible implementation, the co-occurrence frequency parameter identifies the number of text sample pairs, among the plurality of text sample pairs, in which the segment pair occurs together; for the ith word segment among the n word segments, the first word frequency parameter identifies the number of texts, among the plurality of text sample pairs, in which the ith word segment occurs; and for the jth word segment, the second word frequency parameter identifies the number of texts, among the plurality of text sample pairs, in which the jth word segment occurs. All three parameters thus use the number of texts as the measurement standard, which makes the association parameter more reasonable.

In another possible implementation, the co-occurrence frequency parameter identifies the number of times the segment pair occurs together in the plurality of text sample pairs; for the ith word segment among the n word segments, the first word frequency parameter identifies the number of times the ith word segment occurs in the plurality of text sample pairs; and for the jth word segment, the second word frequency parameter identifies the number of times the jth word segment occurs in the plurality of text sample pairs. In this way, word segments repeated within the same text pair are counted in finer detail, so that the resulting association parameter reflects the translation mapping relationship more accurately and comprehensively. Using the number of occurrences of word segments as the measurement standard for all parameters also keeps them in the same dimension, which ensures that the determined association parameter is reasonable. Depending on which way the parameters are determined, the processing device may also adaptively adjust the loss weight determined from the association parameter, so that the loss weight is more reasonable.
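The two counting modes can be sketched in Python as follows. This is only an illustration: the function name `count_frequencies` and the `per_occurrence` flag are hypothetical, and in the occurrence-based mode the co-occurrence count of a segment pair within one sample pair is assumed to be the product of the two segments' occurrence counts, which the text does not specify.

```python
from collections import Counter


def count_frequencies(sample_pairs, per_occurrence=False):
    """Return (first word frequencies, second word frequencies, pair frequencies).

    sample_pairs   -- list of (source_segments, target_segments) tuples
    per_occurrence -- False: count the texts / sample pairs containing a segment;
                      True: count every occurrence of a segment
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in sample_pairs:
        src_c, tgt_c = Counter(src), Counter(tgt)
        if not per_occurrence:
            # Collapse repeats so each text contributes at most 1 per segment.
            src_c = Counter(dict.fromkeys(src_c, 1))
            tgt_c = Counter(dict.fromkeys(tgt_c, 1))
        src_freq.update(src_c)
        tgt_freq.update(tgt_c)
        for x, cx in src_c.items():
            for y, cy in tgt_c.items():
                # Assumption: co-occurrences multiply in per-occurrence mode.
                pair_freq[(x, y)] += cx * cy
    return src_freq, tgt_freq, pair_freq


pairs = [(["le", "chat", "le"], ["the", "cat"]),
         (["le", "chien"], ["the", "dog"])]
by_text = count_frequencies(pairs)
by_occurrence = count_frequencies(pairs, per_occurrence=True)
```

Whichever mode is chosen, all three parameters come from the same pass over the data and therefore share one measurement standard.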

It can be understood that the translation text training set may contain some word segments whose translation mapping relationships are too complex, that is, a word segment may correspond to a large number of word segments in the other language. For the initial translation model, such word segments may be too difficult to learn; if the initial translation model spends too much effort on them, learning efficiency may suffer and a good learning effect is hard to achieve. As mentioned above, the more complex the translation mapping relationship, the less the word segment affects the accuracy of the translation result, so word segments with overly complex translation mapping relationships also have little influence on the translation result.

If the association parameters include a target association parameter whose value is smaller than a threshold, the word loss parameter determined based on the target association parameter may be ignored when the initial translation model is trained according to the word loss parameters. For example, the word loss parameter determined from the target association parameter may be set to 0. This prevents the initial translation model from losing learning efficiency by over-learning such word segments and makes the training more reasonable.
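A minimal Python sketch of this thresholding step follows. The function name `mask_word_losses` and the list-based representation are hypothetical; the only behaviour taken from the text is that a word loss parameter is set to 0 when its association parameter is smaller than the threshold.

```python
def mask_word_losses(word_losses, association_params, threshold):
    """Zero out the loss of every word segment whose association parameter
    is smaller than the threshold; keep the other losses unchanged."""
    return [0.0 if assoc < threshold else loss
            for loss, assoc in zip(word_losses, association_params)]


# The second word segment falls below the threshold and is ignored.
masked = mask_word_losses([1.2, 0.5, 2.0], [0.9, 0.3, 0.4], threshold=0.4)
```

A segment whose association parameter equals the threshold is kept, since the condition in the text is "smaller than".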

For example, in a specific model training process, as shown in fig. 5, which is a schematic diagram of an initial translation model comprising an encoder (ENCODERS) and a decoder (DECODERS), the processing device may input the French text of a text sample pair into the encoder as the first text and obtain the translated English text from the decoder as the model translation text. First, to improve training efficiency, the processing device may train the model for a certain number of steps, for example 100,000 steps, by minimizing the following cross-entropy loss function:

where m is the number of word segments in the second text corresponding to the first text, and log p(y_j | Y_<j, X) represents the word difference between the jth word segment of the second text and the corresponding word segment of the model translation text. By minimizing the cross-entropy loss function, the processing device trains the initial translation model on the differences between word segments, so that the initial translation model learns to bring its translation result close to the second text corresponding to the input first text. Subsequently, to improve training precision, the processing device may determine the word loss parameters according to the association parameters of the word segments included in the second text, and train the initial translation model for a further 100,000 steps based on the word loss parameters. In this second training stage, the processing device may set to 0 the loss weight of any word segment whose association parameter is lower than 0.4, so that the initial translation model does not spend a large amount of time on word segments whose translation mapping relationships are too complex, which further improves training efficiency.
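For reference, the cross-entropy loss described above can be written out from the symbols the text defines. The form below is the standard token-level cross-entropy for sequence generation; the leading minus sign, the absence of a 1/m normalization, and the symbol name for the loss are assumptions, since the formula itself is not reproduced in this text.

```latex
\mathcal{L}_{\mathrm{CE}} = -\sum_{j=1}^{m} \log p\left(y_j \mid Y_{<j},\, X\right)
```

Here X is the first text, y_j is the jth word segment of the second text, and Y_{<j} denotes the word segments of the second text that precede y_j.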

Based on the text translation method provided by the foregoing embodiment, an embodiment of the present application further provides a text translation apparatus, referring to fig. 6, fig. 6 is a block diagram of a structure of a text translation apparatus 600 provided by the embodiment of the present application, where the apparatus 600 includes an obtaining unit 601, a first determining unit 602, a second determining unit 603, a third determining unit 604, a training unit 605, and a translating unit 606:

an obtaining unit 601, configured to obtain a translation text training set, where the translation text training set includes a plurality of text sample pairs, where the text sample pairs include a first text in a source language and a second text in a target language, and the second text is a translation text of the first text in the target language;

a first determining unit 602, configured to determine an association parameter of a word segment included in the second text in a text sample pair to which the second text belongs, where a target text sample pair is any one of the text sample pairs, and the association parameter is used to represent a translation mapping relationship between a target word segment in the second text of the target text sample pair and a word segment in the first text of the target text sample pair;

a second determining unit 603, configured to determine, according to the first text in the target text sample pair, a model translation text in the target language through an initial translation model;

a third determining unit 604, configured to determine, based on the corresponding association parameters, the word loss parameters of the word segments included in the second text of the target text sample pair relative to the model translation text;

a training unit 605, configured to train the initial translation model according to the word loss parameter, so as to obtain a translation model;

a translating unit 606, configured to translate the text to be processed in the source language into the translated text in the target language through the translation model.

In a possible implementation manner, the third determining unit 604 is specifically configured to:

respectively determining word differences between the word segments included in the second text of the target text sample pair and the corresponding word of the model translation text by taking the word segments included in the second text of the target text sample pair as granularity;

determining loss weights according to the associated parameters respectively corresponding to the word segments included in the second text of the target text sample pair;

determining the word loss parameter as a function of the word difference and the corresponding loss weight, wherein the value of the loss weight is inversely related to the complexity of the translation mapping relationship identified by the association parameter.
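The three steps above can be sketched in Python as follows. The text only says that the word loss parameter is a function of the word difference and the loss weight; taking it to be their product, and taking the word difference to be the negative log-probability the model assigns to the reference word segment, are assumptions made for this illustration. The function name `word_loss_parameters` is hypothetical.

```python
import math


def word_loss_parameters(target_probs, loss_weights):
    """Per-word-segment loss at word-segment granularity.

    target_probs -- probability the model assigned to each reference
                    word segment of the second text
    loss_weights -- one loss weight per reference word segment
    """
    # Assumption: word difference = -log p, word loss = weight * difference.
    return [-weight * math.log(prob)
            for prob, weight in zip(target_probs, loss_weights)]


# A confident prediction with full weight and a less confident prediction
# with half weight happen to contribute the same loss in this toy case.
losses = word_loss_parameters([0.5, 0.25], [1.0, 0.5])
total_loss = sum(losses)
```

Because the weight multiplies each term separately, word segments with a more complex translation mapping relationship, and hence a smaller weight, pull less on the total loss.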

In a possible implementation manner, the third determining unit 604 is specifically configured to:

and determining the loss weight according to a first hyperparameter, a second hyperparameter and the associated parameter, wherein the first hyperparameter is used for scaling the associated parameter, and the second hyperparameter is used for determining a lower limit value of the loss weight.
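One way the two hyperparameters could combine with the association parameter is a linear map, sketched below in Python. The linear form, the function name `loss_weight`, and the example hyperparameter values are all assumptions; the text only states that the first hyperparameter scales the association parameter and the second determines the lower limit of the loss weight. The sketch also assumes the association parameter is non-negative (for example, after normalization), so that the second hyperparameter really is the smallest attainable weight.

```python
def loss_weight(association_param, scale, lower_limit):
    """Assumed linear form: weight = scale * association_param + lower_limit.

    scale       -- first hyperparameter, scales the association parameter
    lower_limit -- second hyperparameter, the weight when the association
                   parameter is 0 (its assumed minimum)
    """
    return scale * association_param + lower_limit


# Hypothetical hyperparameter values, for illustration only.
weak = loss_weight(0.0, scale=0.15, lower_limit=0.8)
strong = loss_weight(1.0, scale=0.15, lower_limit=0.8)
```

Under this form a larger association parameter, which corresponds to a simpler translation mapping relationship, gives a larger loss weight, in line with the inverse relation between loss weight and mapping complexity.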

In one possible implementation, the first text of the target text sample pair includes n word segments, the second text includes m word segments, and the target word segment is the jth word segment of the m word segments;

for the jth word segment in the target text sample pair, the first determining unit 602 is specifically configured to:

determining a co-occurrence frequency parameter of segment pairs formed by the jth word segment and the n word segments respectively in the plurality of text sample pairs;

determining a first word frequency parameter of the n word segments in the plurality of text sample pairs respectively;

and determining the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter and the first word frequency parameter.

In a possible implementation manner, the apparatus 600 further includes a fifth determining unit:

a fifth determining unit, configured to determine a second word frequency parameter of the jth word segment in the plurality of text sample pairs;

the first determining unit 602 is specifically configured to:

and determining the association parameter of the jth word segment in the target text sample pair according to the co-occurrence frequency parameter, the first word frequency parameter and the second word frequency parameter.

In one possible implementation, the co-occurrence frequency parameter is used to identify a number of pairs of text samples in the plurality of pairs of text samples in which the pair of segments occurs together;

for an ith word segment of the n word segments, the first word frequency parameter is used to identify a number of texts in the plurality of text pairs in which the ith word segment occurs;

for the jth word segment, the second word frequency parameter is used to identify a number of texts in the plurality of text pairs in which the jth word segment occurs.

In one possible implementation, the co-occurrence frequency parameter is used to identify a number of times the segment pairs co-occur in the plurality of text sample pairs;

for an ith word segment of the n word segments, the first word frequency parameter is used to identify a number of times the ith word segment occurs in the plurality of text pairs;

for the jth word segment, the second word frequency parameter is used to identify a number of times the jth word segment occurs in each of the plurality of text pairs.

In one possible implementation, the training unit 605 is specifically configured to:

if the association parameters include a target association parameter whose value is smaller than a threshold, ignoring the word loss parameter determined based on the target association parameter in the process of training the initial translation model according to the word loss parameters.

An embodiment of the present application further provides a computer device, which is described below with reference to the accompanying drawings. Referring to fig. 7, an embodiment of the present application provides a device that may be a terminal device. The terminal device may be any intelligent terminal, including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a Point of Sales (POS) terminal, or a vehicle-mounted computer. The following takes a mobile phone as an example of the terminal device:

fig. 7 is a block diagram illustrating a partial structure of a mobile phone related to a terminal device provided in an embodiment of the present application. Referring to fig. 7, the handset includes: radio Frequency (RF) circuit 710, memory 720, input unit 730, display unit 740, sensor 750, audio circuit 760, wireless fidelity (WiFi) module 770, processor 780, and power supply 790. Those skilled in the art will appreciate that the handset configuration shown in fig. 7 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

The following describes each component of the mobile phone in detail with reference to fig. 7:

the RF circuit 710 may be used for receiving and transmitting signals during information transmission and reception or during a call. In particular, it receives downlink information from a base station and passes it to the processor 780 for processing, and it transmits uplink data to the base station. In general, the RF circuit 710 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 710 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.

The memory 720 may be used to store software programs and modules, and the processor 780 may execute various functional applications and data processing of the cellular phone by operating the software programs and modules stored in the memory 720. The memory 720 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 720 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.

The input unit 730 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 730 may include a touch panel 731 and other input devices 732. The touch panel 731, also referred to as a touch screen, can collect touch operations of a user on or near it (e.g. operations performed on or near the touch panel 731 with any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 731 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 780, and can receive and execute commands from the processor 780. In addition, the touch panel 731 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 730 may include other input devices 732 in addition to the touch panel 731. In particular, the other input devices 732 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

The display unit 740 may be used to display information input by the user or information provided to the user and various menus of the mobile phone. The Display unit 740 may include a Display panel 741, and optionally, the Display panel 741 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 731 can cover the display panel 741, and when the touch panel 731 detects a touch operation on or near the touch panel 731, the touch operation is transmitted to the processor 780 to determine the type of the touch event, and then the processor 780 provides a corresponding visual output on the display panel 741 according to the type of the touch event. Although the touch panel 731 and the display panel 741 are two independent components in fig. 7 to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 731 and the display panel 741 may be integrated to implement the input and output functions of the mobile phone.

The handset may also include at least one sensor 750, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that adjusts the brightness of the display panel 741 according to the brightness of ambient light, and a proximity sensor that turns off the display panel 741 and/or a backlight when the mobile phone is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.

The audio circuit 760, speaker 761, and microphone 762 may provide an audio interface between the user and the mobile phone. The audio circuit 760 can transmit the electrical signal converted from received audio data to the speaker 761, which converts it into a sound signal for output; conversely, the microphone 762 converts a collected sound signal into an electrical signal, which the audio circuit 760 receives and converts into audio data. The audio data is then output to the processor 780 for processing and sent through the RF circuit 710 to, for example, another mobile phone, or output to the memory 720 for further processing.

WiFi belongs to short-distance wireless transmission technology, and the mobile phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 770, and provides wireless broadband Internet access for the user. Although fig. 7 shows the WiFi module 770, it is understood that it does not belong to the essential constitution of the handset, and can be omitted entirely as needed within the scope not changing the essence of the invention.

The processor 780 is a control center of the mobile phone, connects various parts of the entire mobile phone by using various interfaces and lines, and performs various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 720 and calling data stored in the memory 720, thereby integrally monitoring the mobile phone. Optionally, processor 780 may include one or more processing units; preferably, the processor 780 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 780.

The handset also includes a power supply 790 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 780 via a power management system, so that the power management system may be used to manage charging, discharging, and power consumption.

Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.

In this embodiment, the processor 780 included in the terminal device further has the following functions:

acquiring a translation text training set, wherein the translation text training set comprises a plurality of text sample pairs, the text sample pairs comprise a first text in a source language and a second text in a target language, and the second text is a translation text of the first text in the target language;

determining an association parameter of a word segment included in the second text in a text sample pair, wherein a target text sample pair is any one of the text sample pairs, and the association parameter is used for embodying a translation mapping relationship between a target word segment in the second text of the target text sample pair and a word segment in the first text of the target text sample pair;

determining a model translation text in the target language through an initial translation model according to a first text in the target text sample pair;

determining word loss parameters corresponding to the model translation texts respectively for word segments included in the second text of the target text sample pair based on the corresponding associated parameters;

training the initial translation model according to the word loss parameters to obtain a translation model;

and translating the text to be processed in the source language into a translated text in the target language through the translation model.

Referring to fig. 8, fig. 8 is a block diagram of a server 800 provided in this embodiment. The server 800 may vary considerably depending on its configuration or performance, and may include one or more Central Processing Units (CPUs) 822 (e.g., one or more processors), a memory 832, and one or more storage media 830 (e.g., one or more mass storage devices) storing an application 842 or data 844. The memory 832 and the storage medium 830 may be transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processor 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.

The server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input-output interfaces 858, and/or one or more operating systems 841, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.

The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 8.

The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute any one implementation manner of the text translation method described in the foregoing embodiments.

Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium may be at least one of the following media: various media that can store program codes, such as read-only memory (ROM), RAM, magnetic disk, or optical disk.

It should be noted that, in the present specification, all the embodiments are described in a progressive manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
