Auxiliary communication system based on intelligent wearable equipment

Document No. 1556964, published 2020-01-21.

Note: This invention, "Auxiliary communication system based on intelligent wearable equipment" (基于智能可穿戴设备的辅助交流系统), was designed by 李凌, 辜嘉 and 张斌 on 2019-10-12. Abstract: The invention discloses an auxiliary communication system based on intelligent wearable equipment, a biomedical assistive communication device. The system comprises an intelligent wearable device interaction interface, a translation and display module, and an intelligent wearable device integration and application demonstration terminal module; the translation and display module is arranged on the interaction interface, through which the interface exchanges information with the terminal module. The invention realizes a translator that converts external speech and sign-language gestures into text, enabling a computer to understand human sign language and render it as text that hearing people can read, thereby easing communication between deaf-mute and hearing people and reducing the barriers between them.

1. An auxiliary communication system based on an intelligent wearable device, characterized by comprising an intelligent wearable device interaction interface (1), a translation and display module (2) and an intelligent wearable device integration and application demonstration terminal module (3), wherein the translation and display module (2) is arranged on the intelligent wearable device interaction interface (1), and the intelligent wearable device interaction interface (1) exchanges information with the intelligent wearable device integration and application demonstration terminal module (3) through the translation and display module (2).

2. The auxiliary communication system based on the intelligent wearable device as claimed in claim 1, wherein the intelligent wearable device interaction interface (1) projects a virtual image onto AR glasses through a display, virtualizing an operation interface that does not physically exist, and records the user's interaction with the virtual image through a camera to serve as the input of the device;

the translation and display module (2) provides real-time conversion of auditory and visual information as well as entry, recognition and text conversion of sign language; the real-time conversion of auditory and visual information proceeds as follows: (1) source-language speech is input, an online/offline speech recognition API is called through a server for system decoding, and source-language sentences are output; if the hearing-impaired person uses the same language family, these sentences are projected directly onto the screen of the AR glasses; (2) if the hearing-impaired person does not use the same language family, a translation API is called through the server to output target-language sentences, and the sentences the hearing-impaired person can understand are finally projected onto the screen of the AR glasses, completing the information exchange.

3. The auxiliary communication system based on the intelligent wearable device as claimed in claim 2, wherein the entry, recognition and text conversion of sign language comprises the following steps:

(1) acquiring three-dimensional coordinates of human skeleton joints as raw data, using the camera of up-to-date wearable glasses;

(2) sign language recognition data processing: a coordinate system is first established, and a sign language template is then generated by converting the raw coordinates, digitizing and serializing the sign language; this makes it convenient to enter the coordinates corresponding to a sign, forming a sign language template stored digitally in a file, templating the sign language and fully preparing for sign language recognition;

(3) adopting an improved DTW recognition algorithm: the algorithm uses dynamic programming, originally to handle the mismatch between the length of a speech template and the length of the speech to be recognized; the same mismatch between the template sequence and the sequence to be recognized arises in sign language recognition, and the improved DTW algorithm resolves it well;

(4) 3D character animation: the system's character animation is produced mainly with MikuMikuDance and rendered into high-quality AVI files; the character movements are finely designed, the character is lively and appealing, and the picture is clear, making communication with deaf-mute users convenient;

(5) AVI video lossless compression: on startup the system loads the existing sign language database, and an oversized database file would crash the program; the system therefore decodes each video file and re-encodes the original sign language footage with Xvid, so that one sign language video occupies no more than 1 MB;

(6) system integration: the program, written in C# on the Visual Studio platform, performs sign language entry and recognition, speech recognition and conversion to text, text entry, and video calling.

4. The auxiliary communication system based on the intelligent wearable device as claimed in claim 1, wherein the intelligent wearable device integration and application demonstration terminal module (3) adopts a single terminal that integrates voice acquisition, system decoding, data uploading/downloading and text output; a hearing-impaired user needs only the wearable glasses and one small terminal: the source language is passed to the terminal, and after the terminal analyzes it, the target-language sentence is displayed on the AR glasses, so that by wearing the glasses the hearing-impaired user can "hear" external sound.

Technical Field

The invention relates to a biomedical auxiliary communication appliance, in particular to an auxiliary communication system based on intelligent wearable equipment.

Background

Sign language is the main communication tool of deaf-mute people, conveying information through movements of the hands and body; because deaf-mute people cannot hear and most hearing people cannot understand sign language, communication between the two groups is very difficult.

The cochlear implant is an electronic device: an external speech processor converts sound into an encoded electrical signal, which directly stimulates the auditory nerve through an electrode array implanted in the body to restore or rebuild the hearing of a deaf person; cochlear implantation is the conventional treatment for severe-to-total deafness, but it is expensive and best suited to patients who became deaf at a young age.

By contrast, the auxiliary communication system based on the intelligent wearable device proposed here has a simple structure and low cost.

Disclosure of Invention

To address the defects of the prior art, the invention aims to provide an auxiliary communication system based on intelligent wearable equipment that realizes a translator converting external speech and sign-language gestures into text, so that a computer can understand human sign language and render it as text that hearing people can read, easing communication between deaf-mute and hearing people and reducing the barriers between them.

To achieve this purpose, the invention is realized by the following technical scheme: the auxiliary communication system based on the intelligent wearable device comprises an intelligent wearable device interaction interface, a translation and display module, and an intelligent wearable device integration and application demonstration terminal module, wherein the translation and display module is arranged on the interaction interface, and the interaction interface exchanges information with the terminal module through the translation and display module.

The intelligent wearable device interaction interface 1 projects a virtual image onto AR glasses through a display, virtualizing an operation interface that does not physically exist, and records the user's interaction with the virtual image through a camera to serve as the input of the device.

The translation and display module 2 provides real-time conversion of auditory and visual information as well as entry, recognition and text conversion of sign language; the real-time conversion of auditory and visual information proceeds as follows: 1. source-language speech is input, an online/offline speech recognition API is called through the server for system decoding, and source-language sentences are output; if the hearing-impaired person uses the same language family, these sentences are projected directly onto the screen of the AR glasses; 2. if the hearing-impaired person does not use the same language family, a translation API is called through the server to output target-language sentences, and the sentences the hearing-impaired person can understand are finally projected onto the screen of the AR glasses, completing the information exchange.

The entry, recognition and text conversion of sign language comprises the following steps:

1. acquiring three-dimensional coordinates of human skeleton joints as raw data, using the camera of up-to-date wearable glasses;

2. sign language recognition data processing: a coordinate system is first established, and a sign language template is then generated by converting the raw coordinates, digitizing and serializing the sign language; this makes it convenient to enter the coordinates corresponding to a sign, forming a sign language template stored digitally in a file, templating the sign language and fully preparing for sign language recognition;

3. adopting an improved DTW recognition algorithm: the algorithm uses dynamic programming, originally to handle the mismatch between the length of a speech template and the length of the speech to be recognized; the same mismatch between the template sequence and the sequence to be recognized arises in sign language recognition, and the improved DTW algorithm resolves it well;

4. 3D character animation: the system's character animation is produced mainly with MikuMikuDance and rendered into high-quality AVI files; the character movements are finely designed, the character is lively and appealing, and the picture is clear, making communication with deaf-mute users convenient;

5. AVI video lossless compression: on startup the system loads the existing sign language database, and an oversized database file would crash the program; the system therefore decodes each video file and re-encodes the original sign language footage with Xvid, so that one sign language video occupies no more than 1 MB;

6. system integration: the program, written in C# on the Visual Studio platform, performs sign language entry and recognition, speech recognition and conversion to text, text entry, video calling and so on.

The intelligent wearable device integration and application demonstration terminal module 3 adopts a single terminal that integrates voice acquisition, system decoding, data uploading/downloading and text output; a hearing-impaired user needs only the wearable glasses and one small terminal: the source language is passed to the terminal, and after the terminal analyzes it, the target-language sentence is displayed on the AR glasses, so that by wearing the glasses the hearing-impaired user can "hear" external sound.

The invention has the following beneficial effects:

the system widens the hearing-impaired population's channels for acquiring information, improves their learning ability and quality of life, and helps them integrate into normal work and life; compared with surgical cochlear implantation, the system is simple in structure and low in cost; by improving the communication ability of hearing-impaired people, it can effectively reduce the cost of social communication and serve the broader goal of national health; in addition, applying real-time speech translation and sign language to human-computer interaction has great academic value and wide market potential.

Drawings

The invention is described in detail below with reference to the drawings and the detailed description;

FIG. 1 is a system architecture diagram of the present invention;

FIG. 2 is a binocular disparity explanatory diagram of the present invention;

FIG. 3 is a technical framework diagram of the audio visual information transformation system of the present invention;

FIG. 4 is a schematic view of an application scenario of the assistive communication device for hearing-impaired people of the present invention.

Detailed Description

To make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further explained below in combination with specific embodiments;

referring to fig. 1 to 4, the present embodiment adopts the following technical scheme: the auxiliary communication system based on the intelligent wearable device comprises an intelligent wearable device interaction interface 1, a translation and display module 2 and an intelligent wearable device integration and application demonstration terminal module 3, wherein the translation and display module 2 is arranged on the intelligent wearable device interaction interface 1, and the interaction interface 1 exchanges information with the terminal module 3 through the translation and display module 2.

Because the display of current intelligent wearable equipment is generally small and cannot cover the full field of view, existing devices can show the virtual image only in part of the field of view to assist human-computer interaction; as display technology (such as OLED) develops, larger field-of-view coverage will be achieved, and with it a better virtual reality effect.

The layered sense of distance when human eyes view objects comes from the parallax between the images seen by the left and right eyes: fig. 2(a) shows the image seen by the left eye, fig. 2(b) the image seen by the right eye, and fig. 2(c) the degree of displacement; nearby objects are displaced more, and a large displacement likewise produces a sense of nearness, which is the basic principle of current 3D film imaging; dual cameras and dual displays can therefore better compute and simulate binocular parallax.
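The inverse relation between displacement and distance described above is the standard stereo triangulation formula Z = f·B/d; a minimal sketch follows, where the focal length, baseline and disparity values are illustrative assumptions rather than parameters from this system.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by a calibrated stereo camera pair.

    Nearer objects shift more between the left and right images,
    so depth is inversely proportional to disparity: Z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A point displaced 84 px is twice as close as one displaced 42 px:
near = depth_from_disparity(700.0, 0.06, 84.0)  # 0.5 m
far = depth_from_disparity(700.0, 0.06, 42.0)   # 1.0 m
```

This is why dual cameras allow the glasses to recover depth, and dual displays to reproduce it.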

If the virtual image is projected onto the glasses through the display, an operation interface that does not physically exist is virtualized, and the user's interaction with the virtual image is recorded through the camera as the input of the equipment, making the wearable equipment more convenient.

Based on the current state of wearable equipment, the work focuses on image interaction with a single camera, extends to the three-dimensional space of dual cameras, and applies dual displays, so that augmented reality can run on intelligent wearable equipment to accomplish human-computer interaction.

The premise of real-time communication for hearing-impaired people is converting the other party's language into a form they can accept; the scheme adopted here converts source-language speech into source-language sentences, which suffices if the hearing-impaired person uses the same language family; if not, the source-language sentences are further converted into target-language sentences; finally, sentences the hearing-impaired person can understand are projected onto the AR screen, completing the information exchange, as illustrated in fig. 3.
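The routing just described (show the source text when the languages match, otherwise translate first) can be sketched as follows; `recognize` and `translate` are hypothetical stand-ins for the server-side speech-recognition and translation APIs, not a specific vendor SDK.

```python
def caption_for_listener(audio, listener_lang, recognize, translate):
    """Produce the caption to project onto the AR glasses.

    recognize(audio) -> (source_lang, source_text)   # speech to text
    translate(text, src, dst) -> translated text     # text to text
    Both are injected so any online/offline API can be plugged in.
    """
    source_lang, source_text = recognize(audio)
    if source_lang == listener_lang:
        return source_text  # same language: display the source sentence as-is
    return translate(source_text, source_lang, listener_lang)
```

In the fig. 3 pipeline this decision sits between the server-side decoding step and the AR display step.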

For the real-time conversion of auditory information into visual information for hearing-impaired people, a translator API is programmed with offline/cloud technology to implement a translation module from language-A speech to language-A text, and from language-A text to language-B text; after display computation, the A-language or B-language text the hearing-impaired person can read is shown in real time on the wearable smart glasses.

Sign language entry, recognition and text conversion are programmed in C# on the Visual Studio platform; the program performs sign language entry and recognition, speech entry, speech recognition and conversion to text, text entry, video calling and so on; the basic design of the system comprises a translation mode and a communication mode; the translation mode shows how a single word is translated from its signed form into written form: the gesture information is captured by the camera carried on the wearable glasses, converted into codes, and the codes are converted into Chinese text by the C# program; in the communication mode, the system uses a virtual character to play sign language on behalf of the hearing person, acting as that person's agent; finally, the two modes are combined into one feature-rich system; the specific technical scheme is as follows:

1. the system uses the camera carried by up-to-date wearable glasses to acquire three-dimensional coordinates of human skeleton joints as raw data, and completes sign language recognition with a matching algorithm; marker-free recognition by an intelligent algorithm breaks with the history of glove-based recognition, making recognition more convenient and natural; the wearable glasses carry their own light source, which compensates for the influence of illumination intensity on sign language recognition, reduces algorithmic complexity, and improves the stability and reliability of recognition;

2. sign language recognition data processing: a coordinate system is first established, and a sign language template is then generated by converting the raw coordinates, digitizing and serializing the sign language; this makes it convenient to enter the coordinates corresponding to a sign, forming a sign language template stored digitally in a file, templating the sign language and fully preparing for sign language recognition;
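The digitization step above (raw joint coordinates converted into a stored template) might be sketched as a simple normalization; the anchor-joint and scaling choices are illustrative assumptions, since the exact coordinate transform is not specified here.

```python
import numpy as np

def make_template(frames):
    """Turn raw 3-D skeleton coordinates into a normalized sign template.

    frames: array-like of shape (n_frames, n_joints, 3).
    Translating to an anchor joint and scaling by the largest offset
    makes templates comparable across signers and camera distances.
    """
    frames = np.asarray(frames, dtype=float)
    centered = frames - frames[0, 0]  # anchor: first joint of the first frame
    scale = np.abs(centered).max()
    return centered / scale if scale > 0 else centered
```

The resulting array can be saved to a file as-is, which is the "template stored digitally in a file" that recognition later matches against.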

3. improved DTW recognition algorithm: the algorithm uses dynamic programming, originally to handle the mismatch between the length of a speech template and the length of the speech to be recognized; the same mismatch between the template sequence and the sequence to be recognized arises in sign language recognition, and the improved DTW algorithm resolves it well;
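A minimal version of the dynamic-programming alignment described above (classic DTW, without the unspecified improvements) might look like this:

```python
import numpy as np

def dtw_distance(template, sample):
    """DTW distance between two sequences of feature vectors that may
    differ in length, e.g. a stored sign template and a freshly
    captured sign."""
    n, m = len(template), len(sample)
    cost = np.full((n + 1, m + 1), np.inf)  # cumulative alignment cost
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(np.asarray(template[i - 1]) - np.asarray(sample[j - 1]))
            cost[i, j] = d + min(cost[i - 1, j],      # skip a template frame
                                 cost[i, j - 1],      # skip a sample frame
                                 cost[i - 1, j - 1])  # align the two frames
    return cost[n, m]

def classify(sample, templates):
    """Label of the stored template closest to the captured sequence."""
    return min(templates, key=lambda label: dtw_distance(templates[label], sample))
```

Because the warping path may skip or repeat frames, a sign performed slowly still aligns with a template recorded at normal speed.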

4. 3D character animation: the system's character animation is produced mainly with MikuMikuDance and rendered into high-quality AVI files; the character movements are finely designed, the character is lively and appealing, and the picture is clear, making communication with deaf-mute users convenient;

5. AVI video lossless compression: on startup the system loads the existing sign language database, and an oversized database file would crash the program; the system therefore decodes each video file and re-encodes the original sign language footage with Xvid, so that one sign language video occupies no more than 1 MB;

6. system integration: the program, written in C# on the Visual Studio platform, performs sign language entry and recognition, speech recognition and conversion to text, text entry, video calling and so on.

In practical application the wearable glasses need to be comfortable, stable and light; many current schemes use a mobile phone as the terminal: speech is received and processed on the phone, and the target-language text is transmitted over Bluetooth or another link to the smart-glasses terminal and projected onto the AR glasses; this adds an extra terminal device and is inconvenient to use, and the many intermediate links make the display delay of the target language serious.

The present scheme instead uses a single terminal integrating voice acquisition, system decoding, data uploading/downloading and text output; a hearing-impaired user needs only the wearable glasses and one small terminal: the source language is passed to the terminal, and after the terminal analyzes it, the target-language sentence is displayed on the AR glasses, so that by wearing the glasses the hearing-impaired user can "hear" external sound; the usage scenario is shown in fig. 4.

This embodiment uses artificial intelligence to widen the hearing-impaired population's channels for acquiring information, improve their learning ability and quality of life, and help them integrate into normal work and life; compared with surgical cochlear implantation, the system is simple in structure and low in cost.

The foregoing shows and describes the general principles, main features and advantages of the present invention; those skilled in the art will understand that the invention is not limited to the embodiments described above, which merely illustrate its principles, and that various changes and modifications may be made without departing from the spirit and scope of the invention; the scope of the invention is defined by the appended claims and their equivalents.
