One-key auxiliary translation method for multi-language conference and equipment with same

Document No.: 105614 | Publication date: 2021-10-15

Note: This invention, 一种一键式多语言会议辅助翻译方法及具有该方法的设备 (One-key auxiliary translation method for multi-language conference and equipment with same), was created by 孟强祥, 田俊麟 and 宋昱 on 2021-05-28. The invention discloses a one-key auxiliary translation method for multi-language conferences and equipment using the method, in the technical field of language translation. The translation method comprises the following steps: recording and storing information on multiple languages, acquiring the possible languages, acquiring the speech to be translated, obtaining the language judgment result, translating the source language into the target language, and outputting the speech and text of the target language. The translation equipment comprises a voice receiving module, a language setting module, a language judgment module, a language translation module and an output module. By establishing a multilingual database and a language judgment model, the invention judges the language a user speaks after receiving the user's voice input, and translates the source language into the preset target language. During a conference, real-time auxiliary translation is performed automatically, so that translation proceeds directly without user operation, greatly improving the user experience.

1. A one-key auxiliary translation method for a multi-language conference, characterized by comprising the following steps:

step one, recording and storing information on multiple languages, including synchronized comparison data of speech and text, establishing a corresponding multilingual database, and building a language judgment model for language identification;

step two, acquiring the possible languages: selecting two or more languages according to the main languages used in the conference, taking the selected languages as comparison languages, and using the comparison languages as the candidates against which the language judgment model compares;

step three, acquiring the speech to be translated: while the user speaks normally, detecting and extracting the speech to be translated in real time, and inputting the speech information to be translated into the language judgment model;

step four, obtaining the language judgment result: comparing the acquired speech to be translated with each previously entered comparison language and scoring the similarity through the language judgment model; the comparison language with the highest score is the language to which the input speech belongs;

step five, performing translation to convert the source language into the target language: after the judgment result for the language to be translated is obtained, selecting the corresponding translation engine to translate the language to be translated into the target language;

and step six, outputting the speech and text of the target language: extracting the speech and text information corresponding to the target language from the translation result, and then outputting the speech and text synchronously.

2. The one-key multi-language conference auxiliary translation method according to claim 1, wherein: the language judgment model established in step one comprises an n-gram model, a probabilistic language model based on an (n-1)-order Markov chain that infers the structure of a sentence from the occurrence probabilities of sequences of n words. The language judgment module may first convert the speech into text, or use the speech directly without conversion, depending on how the model was pre-trained. When the source speech is input into the judgment model, the result is produced by scoring. The scoring model may employ Bayesian inference:

P(H | E) = P(E | H) · P(H) / P(E)

wherein: | denotes conditioning, i.e. that a certain event is taken as given;

H denotes the hypothesis (here, that the input belongs to a given comparison language);

E denotes the evidence (the observed speech);

P(H) is the prior probability, the probability of H before E is observed;

P(H | E) is the posterior probability, the probability of H given the evidence E;

P(E | H) is the likelihood, the probability of observing E assuming that H holds;

P(E) is the marginal likelihood.

3. The one-key multi-language conference auxiliary translation method according to claim 1, wherein: during the conference, steps three to six are repeated, continuously performing input, language judgment, translation and output until the conference conversation ends, whereupon voice acquisition stops.

4. A one-key auxiliary translation device for a multi-language conference, comprising a voice receiving module for acquiring the source speech to be translated, characterized in that: the voice receiving module is connected with a language setting module, which records the languages to be judged; the language setting module is connected in series with a language judgment module, which mainly consists of the language judgment model and stores and executes the language judgment model or a corresponding computer program; the language judgment module is connected with a language translation module, which translates the source language into the target language; and the language translation module is connected with an output module, which outputs the translation result in the form of text and speech.

5. The one-key multi-language conference auxiliary translation device according to claim 4, wherein: the language setting module is entered and maintained through a visual user interface.

6. The one-key multi-language conference auxiliary translation device according to claim 4, wherein: the source speech refers primarily to an audio signal containing the speaker's voice, which is transmitted digitally or digitized through analog-to-digital conversion.

7. The one-key multi-language conference auxiliary translation device according to claim 4, wherein: the device further comprises a translation result recording module; each translated language is identified, the languages are classified according to the identification, and the text and speech are named with a timestamp and stored in the translation result recording module.

8. The one-key multi-language conference auxiliary translation device according to claim 7, wherein: the translation result recording module comprises a language classification module, which is connected with the language translation module and classifies the translated target language; the language classification module is connected with a text storage module and a voice storage module, which respectively store the classified translation languages and keep text and voice information separately.

Technical Field

The invention relates to the technical field of language translation, in particular to a one-key auxiliary translation method for a multi-language conference and equipment with the method.

Background

With the rapid improvement of computer performance, the wide application of the mobile internet and the rapid advance of AI technology, machine translation products are widely used in industries such as tourism, conferencing, education and self-media. As international communication intensifies, translation products are being applied more widely and deeply in the conference field, and various mobile and desktop machine translators and translation earphones are in common use. However, these translation products are limited and cumbersome to use: only one fixed language can be set per party, and a mismatch between the spoken language and the setting leads to incorrect recognition results. For example, a translator may have two trigger keys: key A is preset to Chinese and key B to English. To translate from Chinese to English, key A is pressed before speaking, and after the speech finishes the Chinese is translated into English, and vice versa. This is inconvenient in actual use, and once the wrong key is selected for the language spoken, the translation fails entirely.

Disclosure of Invention

The invention aims to provide a one-key auxiliary translation method for a multi-language conference and equipment with the method, so as to overcome the defects in the prior art.

In order to achieve the above purpose, the invention provides the following technical scheme: a one-key auxiliary translation method for a multi-language conference comprises the following steps:

step one, recording and storing information on multiple languages, including synchronized comparison data of speech and text, establishing a corresponding multilingual database, and building a language judgment model for language identification;

step two, acquiring the possible languages: selecting two or more languages according to the main languages used in the conference, taking the selected languages as comparison languages, and using the comparison languages as the candidates against which the language judgment model compares;

step three, acquiring the speech to be translated: while the user speaks normally, detecting and extracting the speech to be translated in real time, and inputting the speech information to be translated into the language judgment model;

step four, obtaining the language judgment result: comparing the acquired speech to be translated with each previously entered comparison language and scoring the similarity through the language judgment model; the comparison language with the highest score is the language to which the input speech belongs;

step five, performing translation to convert the source language into the target language: after the judgment result for the language to be translated is obtained, selecting the corresponding translation engine to translate the language to be translated into the target language;

and step six, outputting the speech and text of the target language: extracting the speech and text information corresponding to the target language from the translation result, and then outputting the speech and text synchronously.
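The six steps above can be sketched as a minimal program. This is an illustrative toy, not the patented implementation: the marker-word "database", the placeholder scoring and the stub translation engine are all assumptions standing in for the multilingual database, the language judgment model and a real translation engine.

```python
# Hypothetical marker-word "database" per comparison language (steps one/two).
LANGUAGE_MARKERS = {
    "en": {"the", "and", "is", "hello"},
    "fr": {"le", "et", "est", "bonjour"},
}

def detect_language(text, candidates=LANGUAGE_MARKERS):
    """Step four: score the input against each comparison language;
    the highest-scoring candidate is taken as the source language."""
    def score(lang):
        return sum(1 for token in text.lower().split() if token in candidates[lang])
    return max(candidates, key=score)

def translate(text, source, target):
    """Step five: stub standing in for the selected translation engine."""
    return f"[{source}->{target}] {text}"

def one_key_translate(text, target="fr"):
    """Steps three to six: detect the language, translate, and return the
    output text (a real device would also synthesize and output speech)."""
    source = detect_language(text)
    return translate(text, source, target)

print(one_key_translate("hello the meeting is starting"))
```

In a real device the marker-word lookup would be replaced by the statistical scoring model described below, but the control flow is the same: detect, translate, output.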

Preferably, the language judgment model established in step one comprises an n-gram model, a probabilistic language model based on an (n-1)-order Markov chain that infers the structure of a sentence from the occurrence probabilities of sequences of n words. The language judgment module may first convert the speech into text, or use the speech directly without conversion, depending on how the model was pre-trained. When the source speech is input into the judgment model, the result is produced by scoring. The scoring model may employ Bayesian inference:

P(H | E) = P(E | H) · P(H) / P(E)

wherein: | denotes conditioning, i.e. that a certain event is taken as given;

H denotes the hypothesis (here, that the input belongs to a given comparison language);

E denotes the evidence (the observed speech);

P(H) is the prior probability, the probability of H before E is observed;

P(H | E) is the posterior probability, the probability of H given the evidence E;

P(E | H) is the likelihood, the probability of observing E assuming that H holds;

P(E) is the marginal likelihood.
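Under the n-gram description above, language judgment can be sketched as scoring the input under one character-bigram model per comparison language and taking the highest score. The toy training corpora and add-one smoothing below are illustrative assumptions, not the patented model:

```python
import math
from collections import Counter

def ngrams(text, n=2):
    """All length-n character slices of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

class NgramLanguageModel:
    """Toy character-bigram model: log P(text | language), add-one smoothed."""
    def __init__(self, training_text, n=2):
        self.n = n
        self.counts = Counter(ngrams(training_text, n))
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # +1 reserves mass for unseen bigrams

    def log_prob(self, text):
        return sum(
            math.log((self.counts[g] + 1) / (self.total + self.vocab))
            for g in ngrams(text, self.n)
        )

# One model per comparison language, trained on tiny assumed corpora.
models = {
    "en": NgramLanguageModel("the meeting will start now and the agenda is ready"),
    "de": NgramLanguageModel("die sitzung beginnt jetzt und die tagesordnung ist fertig"),
}

def judge_language(text):
    """Score the input under every comparison-language model; highest wins."""
    return max(models, key=lambda lang: models[lang].log_prob(text))

print(judge_language("the agenda"))
```

Taking the highest log-probability is equivalent to taking the highest Bayesian posterior when the priors over the comparison languages are equal, since P(E) is the same for every candidate.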

Preferably, during the conference, steps three to six are repeated, continuously performing input, language judgment, translation and output until the conference conversation ends, whereupon voice acquisition stops.

A one-key multi-language conference auxiliary translation device comprises a voice receiving module for acquiring the source speech to be translated. The voice receiving module is connected with a language setting module, which records the languages to be judged. The language setting module is connected in series with a language judgment module, which mainly consists of the language judgment model and stores and executes the language judgment model or a corresponding computer program. The language judgment module is connected with a language translation module, which translates the source language into the target language. The language translation module is connected with an output module, which outputs the translation result in the form of text and speech.

Preferably, the language setting module is entered and maintained through a visual user interface.

Preferably, the source speech refers primarily to an audio signal containing the speaker's voice, which is transmitted digitally or digitized through analog-to-digital conversion.

Preferably, the device further comprises a translation result recording module: each translated language is identified, the languages are classified according to the identification, and the text and speech are named with a timestamp and stored in the translation result recording module.

Preferably, the translation result recording module includes a language classification module, which is connected to the language translation module and classifies the translated target language; the language classification module is connected to a text storage module and a voice storage module, which respectively store the classified translation languages and keep text and voice information separately.

In the technical scheme, the invention provides the following technical effects and advantages:

By establishing a multilingual database and a language judgment model, the auxiliary translation device is started with one key during a conference, with the comparison languages selected in advance. The speech to be translated is acquired through the voice receiving module; the language judgment model of the language judgment module compares the acquired speech with each previously entered comparison language and scores the similarity, and the comparison language with the highest score is the language to which the input speech belongs. Once the judgment result for the language to be translated is obtained, the corresponding translation engine is selected to translate it into the target language; the speech and text information corresponding to the target language are then extracted from the translation result and output synchronously. The whole process repeats in a loop: during the conference, after a single key-press to start and the selection of the comparison languages, no further operation is needed and real-time auxiliary translation is carried out. Translation thus proceeds directly without user operation, greatly improving the user experience.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; other drawings can be obtained by those skilled in the art from these drawings without creative effort.

FIG. 1 is a flow chart of the present invention.

FIG. 2 is a system diagram of the speech-aided translation apparatus of the present invention.

FIG. 3 is a system diagram of the translation result recording module according to the present invention.

Detailed Description

In order that those skilled in the art may better understand the technical solutions of the present invention, the present invention is described in further detail below with reference to the accompanying drawings.

The invention provides a one-key auxiliary translation method for a multi-language conference, which comprises the following steps:

step one, recording and storing information on multiple languages, including synchronized comparison data of speech and text, establishing a corresponding multilingual database, and building a language judgment model for language identification;

step two, acquiring the possible languages: selecting two or more languages according to the main languages used in the conference, taking the selected languages as comparison languages, and using the comparison languages as the candidates against which the language judgment model compares;

step three, acquiring the speech to be translated: while the user speaks normally, detecting and extracting the speech to be translated in real time, and inputting the speech information to be translated into the language judgment model;

step four, obtaining the language judgment result: comparing the acquired speech to be translated with each previously entered comparison language and scoring the similarity through the language judgment model; the comparison language with the highest score is the language to which the input speech belongs;

step five, performing translation to convert the source language into the target language: after the judgment result for the language to be translated is obtained, selecting the corresponding translation engine to translate the language to be translated into the target language;

and step six, outputting the speech and text of the target language: extracting the speech and text information corresponding to the target language from the translation result, and then outputting the speech and text synchronously.

Further, in the above technical solution, the language judgment model established in step one comprises an n-gram model, a probabilistic language model based on an (n-1)-order Markov chain that infers the structure of a sentence from the occurrence probabilities of sequences of n words. The language judgment module may first convert the speech into text, or use the speech directly without conversion, depending on how the model was pre-trained. When the source speech is input into the judgment model, the result is produced by scoring. The scoring model may employ Bayesian inference:

P(H | E) = P(E | H) · P(H) / P(E)

wherein: | denotes conditioning, i.e. that a certain event is taken as given;

H denotes the hypothesis (here, that the input belongs to a given comparison language);

E denotes the evidence (the observed speech);

P(H) is the prior probability, the probability of H before E is observed;

P(H | E) is the posterior probability, the probability of H given the evidence E;

P(E | H) is the likelihood, the probability of observing E assuming that H holds;

P(E) is the marginal likelihood.

The result may be returned in JSON format, where confidence is the scoring result and languages are identified by ISO 639-1 two-letter codes (en for English, ja for Japanese). If, for example, the English score of 39.18 is higher than the Japanese score of 22.04, the input speech is judged to be English, and Japanese can simultaneously be determined as the target language.
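A hypothetical payload in that format, together with the selection of the highest-confidence language, might look as follows. The field names ("results", "language", "confidence") are assumptions for illustration; the source does not specify the exact schema:

```python
import json

# Assumed JSON result: one confidence score per comparison language.
result_json = """
{
  "results": [
    {"language": "en", "confidence": 39.18},
    {"language": "ja", "confidence": 22.04}
  ]
}
"""

results = json.loads(result_json)["results"]
best = max(results, key=lambda r: r["confidence"])
source_language = best["language"]
# With two comparison languages, the other one becomes the translation target.
target_language = next(r["language"] for r in results
                       if r["language"] != source_language)
print(source_language, "->", target_language)
```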

Further, in the above technical solution, during the conference, steps three to six are repeated, continuously performing input, language judgment, translation and output until the conference session ends, whereupon voice acquisition stops.

A one-key multi-language conference auxiliary translation device comprises a voice receiving module for acquiring the source speech to be translated. The voice receiving module is connected with a language setting module, which records the languages to be judged. The language setting module is connected in series with a language judgment module, which mainly consists of the language judgment model and stores and executes the language judgment model or a corresponding computer program. The language judgment module is connected with a language translation module, which translates the source language into the target language. The language translation module is connected with an output module, which outputs the translation result in the form of text and speech.
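The module chain just described can be sketched as plain classes wired in the same order. The class names follow the description; the internals are toy stand-ins, not the claimed implementation:

```python
class VoiceReceivingModule:
    def receive(self, audio):
        # A real module would digitize and preprocess the audio signal.
        return audio

class LanguageSettingModule:
    """Records the comparison languages and the target language."""
    def __init__(self, comparison_languages, target):
        self.comparison_languages = comparison_languages
        self.target = target

class LanguageJudgmentModule:
    def judge(self, speech, candidates):
        # Placeholder rule standing in for the language judgment model:
        # ASCII input is treated as English, otherwise the other candidate.
        if speech.isascii():
            return "en"
        return next(c for c in candidates if c != "en")

class LanguageTranslationModule:
    def translate(self, speech, source, target):
        return f"[{source}->{target}] {speech}"  # stub translation engine

class OutputModule:
    def emit(self, text):
        # Outputs text and (here, a mock of) synthesized speech together.
        return {"text": text, "speech": f"<tts:{text}>"}

def run_device(audio, settings):
    speech = VoiceReceivingModule().receive(audio)
    source = LanguageJudgmentModule().judge(speech, settings.comparison_languages)
    translated = LanguageTranslationModule().translate(speech, source, settings.target)
    return OutputModule().emit(translated)

settings = LanguageSettingModule(["en", "ja"], target="ja")
print(run_device("good morning everyone", settings))
```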

Further, in the above technical solution, the language setting module is entered and maintained through a visual user interface.

Further, in the above technical solution, the source speech mainly refers to an audio signal containing the speaker's voice, which is transmitted digitally or digitized through analog-to-digital conversion. When the voice receiving module receives sound and determines that the signal contains speech information, the signal is preprocessed, including but not limited to noise reduction, voice enhancement and echo cancellation.
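The "determine whether the signal contains speech, then preprocess" step can be illustrated with a crude energy gate over the digitized samples. This is a teaching sketch only; a real device would use a proper voice-activity detector together with the noise reduction, voice enhancement and echo cancellation mentioned above:

```python
def rms_energy(samples):
    """Root-mean-square energy of a frame of normalized samples."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def contains_speech(samples, threshold=0.02):
    """Crude gate: treat a frame above the RMS threshold as speech.
    The threshold value is an illustrative assumption."""
    return rms_energy(samples) > threshold

silence = [0.001] * 160                       # near-silent frame
speech_like = [0.1, -0.12, 0.09, -0.11] * 40  # higher-energy frame

print(contains_speech(silence))
print(contains_speech(speech_like))
```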

Furthermore, in the above technical solution, the auxiliary translation device further comprises a translation result recording module, which identifies each translated language, classifies the languages according to the identification, and names and stores the text and speech in the translation result recording module after adding a timestamp.
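A possible naming scheme for the recording module, tagging each result with its language code and a timestamp and keeping text and speech in separate stores, can be sketched as follows. The naming pattern and store layout are assumptions; the source only specifies language identification, timestamping and separate storage:

```python
from datetime import datetime, timezone

def record_result(store, language, text, speech_bytes, now=None):
    """Name a translation result as <language>_<timestamp> and file the
    text and speech under separate keys, as the recording module requires."""
    ts = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%S")
    name = f"{language}_{ts}"
    store.setdefault("text", {})[name] = text
    store.setdefault("speech", {})[name] = speech_bytes
    return name

store = {}
name = record_result(store, "ja", "おはようございます", b"<audio>",
                     now=datetime(2021, 5, 28, 9, 0, 0, tzinfo=timezone.utc))
print(name)
```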

Further, in the above technical solution, the translation result recording module includes a language classification module, which is connected to the language translation module and classifies the translated target language; the language classification module is connected to the text storage module and the voice storage module, which respectively store the classified translation languages and keep text and voice information separately.

the implementation mode is specifically as follows: by establishing a multilingual database and a language judgment model, in the conference process, an auxiliary translation device is started by one key, comparison languages are selected in advance, voices to be translated are obtained through a voice receiving module, the obtained voices to be translated and each comparison language input in advance are subjected to similarity comparison and scoring through the language judgment model of the language judgment module, the obtained language with the highest score is the language to which the input language belongs, when a judgment result of the languages to be translated is obtained, a corresponding translation engine is selected to translate the languages to be translated into a target language, the translated target language is extracted, voice and character information corresponding to the corresponding language is extracted, then the voice and the characters are synchronously output, the whole process is repeatedly and circularly carried out, only one key is started and the operation is not needed after the comparison language is selected in the conference process, the real-time auxiliary translation can be carried out, and the direct translation can be realized without the operation of a user, user experience is greatly improved.

While certain exemplary embodiments of the present invention have been described above by way of illustration only, it will be apparent to those of ordinary skill in the art that the described embodiments may be modified in various different ways without departing from the spirit and scope of the invention. Accordingly, the drawings and description are illustrative in nature and should not be construed as limiting the scope of the invention.
