An SOS system for helping deaf-mute people

Document No.: 1775477    Publication date: 2019-12-03

Note: This invention, an SOS system for helping deaf-mute people, was designed and created by 穆斯塔法, 艾扎达, 李业芃, 姚凯 and 阿克巴 on 2019-07-10. Its main content is as follows: an SOS system for helping deaf-mute people comprises two monochrome infrared cameras, three infrared LEDs, a high-definition condenser microphone for speech recognition, a GPRS real-time voice call system, a display screen, and a processor. The processor is connected to the two monochrome infrared cameras, the three infrared LEDs, the microphone, the GPRS real-time voice call system, and the display screen. In the processor, the user's signs are converted into audio for the responder, and the responder's voice is simultaneously rendered as sign language for the caller as a real-time service. The invention helps deaf-mute people call for help quickly and without barriers.

1. An SOS system for helping deaf-mute people, characterized in that it comprises two monochrome infrared cameras, three infrared LEDs, a high-definition condenser microphone for speech recognition, a GPRS real-time voice call system, a display screen, and a processor; the processor is connected to the two monochrome infrared cameras, the three infrared LEDs, the high-definition condenser microphone for speech recognition, the GPRS real-time voice call system, and the display screen; in the processor, the user's signs are converted into audio for the responder, and the responder's voice is simultaneously rendered as sign language for the caller as a real-time service.

2. The SOS system for helping deaf-mute people according to claim 1, characterized in that the SOS system is a mobile communication terminal compatible with the Android and iOS operating systems, and shaking the terminal a set number of times and then shaking it once more starts an SOS call.

3. The SOS system for helping deaf-mute people according to claim 2, characterized in that the mobile communication terminal further includes a module that uses a pre-trained model to convert between sign language and audio.

4. The SOS system for helping deaf-mute people according to claim 3, characterized in that the model is built in the following steps:

1. build a 3D animation data set;

2. perform analog-to-digital conversion;

3. train an unsupervised learning model;

4. repeat step 3 until a local minimum error is reached;

5. forward the internal representation to the supervised learning model;

6. compare the supervised model's estimate with the actual output;

7. repeat step 5 until accurate and efficient output is obtained;

8. generate the translated sign language as a labeled animated 3D character;

9. export the trained model to the standalone device and to the mobile communication terminal app;

After the sign language movements for the different languages have been generated, they are stored in memory.

5. The SOS system for helping deaf-mute people according to claim 4, characterized in that in step 1, the two monochrome infrared cameras and three infrared LEDs are used to capture all the alphabet sets of the different sign languages, which are stored in memory;

In step 2, an analog-to-digital converter using 8-bit pulse-length modulation produces digital input to a computer for training, testing, and validation;

In step 3, the audio bit stream is forwarded to the input layer of an autoencoder neural network and unsupervised learning begins: the network encodes the input content and then decodes it to reproduce it, thereby generating an internal representation;

The autoencoder is a three-layer neural network with an input, a hidden, and an output layer. The input layer updates its weights using the instar learning rule, as shown in formula (1):

dw = lr * a * (p' - w)   (1)

The output layer uses the outstar learning rule, as shown in formula (2):

dw = lr * (a - w) * p'   (2)

where dw is the weight-change matrix, lr is the learning rate (an adjustable non-negative factor), a is the output vector of this layer, w is the weight matrix, and p' is the input vector;

In step 4, step 3 is repeated until a local minimum error is reached by reducing the mean absolute error MAE, as shown in formula (3):

error = output vector - input vector   (3)

where error is the mean absolute error MAE, output vector is the autoencoder's output vector, and input vector is its input vector;

In step 5, after unsupervised learning is complete, the internal representation of the autoencoder is forwarded to a quantum neural network for supervised learning;

This network is a three-layer quantum neural network that processes the internal representation of the autoencoder using formula (4),

where U_N is the unitary matrix of a qubit, σi for i ∈ {1, 2, 3} are the Pauli matrices, σ0 is the 2 × 2 identity matrix, and aj is the actual output;

The Pauli matrices, also called Pauli spin matrices, are the complex matrices that arise in quantum mechanics when treating spin; they are defined by equation (5):

σ0 = ((1, 0), (0, 1)),  σ1 = ((0, 1), (1, 0)),  σ2 = ((0, -i), (i, 0)),  σ3 = ((1, 0), (0, -1))   (5)

where σ0 is the 2 × 2 identity matrix and σi for i ∈ {1, 2, 3} are the Pauli matrices;

In step 6, the desired output is the 3D-animated sign language for the given input audio content; equations (6) and (7) compare it with the actual output using the gradient descent algorithm:

Δw = -lr * ∂C/∂w   (6)

where Δw is the weight update, lr is the learning rate in formula (2), aj is the actual output value, and ∂C/∂w is the partial derivative of the cost function with respect to the weights;

C = (1/2) * Σj (aj - dj)²   (7)

where C is the cost function, defined using the mean squared error, aj is the actual output, and dj is the desired output;

In step 7, step 5 is repeated until the smallest global error is reached by reducing the mean squared error of equation (7);

In step 8, the trained model is ready to associate each sign language, and the sign language is stored as a database together with its related input audio content;

In step 9, the user can customize his or her own 3D character, which is the 3D animation object exported by the application of the standalone device or the mobile communication terminal.

6. The SOS system for helping deaf-mute people according to any one of claims 1 to 4, characterized in that the processor is a Raspberry Pi 3B+.

Technical field

The present invention relates to an SOS system that uses artificial intelligence to convert any given audio content into the sign language of a customizable 3D animation model, so as to help deaf-mute people call for help quickly and without barriers.

Background art

Deaf-mute people can neither speak nor hear, a condition usually caused by a speech disorder or by surgery, and the resulting inconvenience makes them more reluctant to communicate in certain social situations. Tracheal intubation, tracheostomy, or damage to the vocal cords or trachea from disease or injury can leave patients deeply frustrated. According to statistics, 8 out of every 10,000 people are born deaf-mute, but there is no precise figure for how many deaf-mute people there are in the world today. Muteness is often caused by injury to, or problems with, Broca's area of the brain.

A person is considered to have hearing loss when the hearing threshold in both ears is 25 dB or higher. "Hearing impaired" refers to people whose hearing loss ranges from mild to severe; most deaf people have severe hearing loss, meaning they can hear almost nothing. About 466 million people worldwide live with hearing loss, more than 5% of the world's population. It is estimated that by 2050 more than 900 million people, or one in every ten, will have disabling hearing loss and will have to communicate using sign language.

The problem therefore arises when a hearing-impaired person urgently needs to call the fire brigade, the police, or an ambulance. In an emergency every second counts, and it is sometimes a matter of life and death. Many people with hearing or speech impairments (deaf-mutes) find themselves unable to communicate effectively under immense pressure and panic. There should therefore be a life-saving translation service such as an SOS system. In Britain it has been said that deaf-mute people who cannot communicate verbally can send text messages and contact an SOS service centre via eSMS; but the emergency text service website itself states: "You (the deaf-mute) need about two minutes to send your emergency message. If the other side does not reply within three minutes, we suggest you send another message."

Although the average answering time of an SOS voice call is only about 7 seconds, a text service cannot be mentioned in the same breath as a voice-based service. Deaf-mute people, or anyone with asthma or breathing difficulty, need help immediately when they call an emergency centre for the police, an ambulance, or the fire service.

An SOS signal, for its part, is a continuous Morse code string of three dots, three dashes, and three dots, with no spaces or full stops between them (... --- ...). Since in international Morse code three dots denote "S" and three dashes denote "O", the signal is called "SOS" for convenience.

As a large and important part of the community, deaf-mute people need special services that translate sign language into audio and audio into sign language, to help them understand what is happening around them, especially in public places with audio content such as police stations, hospitals, and fire stations; in various emergencies; in streets and crowded places; or anywhere else an emergency occurs and needs immediate attention.

Summary of the invention

To overcome the shortcoming of the prior art that deaf-mute people cannot make an SOS call in time, the present invention provides an SOS system for helping deaf-mute people that assists them in making an SOS call promptly.

The technical solution adopted by the present invention to solve this technical problem is as follows:

An SOS system for helping deaf-mute people comprises two monochrome infrared cameras, three infrared LEDs, a high-definition condenser microphone for speech recognition, a GPRS real-time voice call system, a display screen, and a processor. The processor is connected to the two monochrome infrared cameras, the three infrared LEDs, the high-definition condenser microphone for speech recognition, the GPRS real-time voice call system, and the display screen. In the processor, the user's signs are converted into audio for the responder, and the responder's voice is simultaneously rendered as sign language for the caller as a real-time service.

Further, the SOS system is a mobile communication terminal compatible with the Android and iOS operating systems; shaking the terminal a set number of times and then shaking it once more starts an SOS call.

Further, the mobile communication terminal also includes a module that uses a pre-trained model to convert between sign language and audio.

The model is built in the following steps:

1. build a 3D animation data set;

2. perform analog-to-digital conversion;

3. train an unsupervised learning model;

4. repeat step 3 until a local minimum error is reached;

5. forward the internal representation to the supervised learning model;

6. compare the supervised model's estimate with the actual output;

7. repeat step 5 until accurate and efficient output is obtained;

8. generate the translated sign language as a labeled animated 3D character;

9. export the trained model to the standalone device and to the mobile communication terminal app.

After the sign language movements for the different languages have been generated, they are stored in memory; the pre-trained model described above can convert any input audio content into sign language.

In step 1, the two monochrome infrared cameras and three infrared LEDs are used to capture all the alphabet sets of different sign languages (such as Arabic, Chinese, English, and Russian), which are stored in memory.

In step 2, an analog-to-digital converter using 8-bit pulse-length modulation produces digital input to a computer for training, testing, and validation; a minimal sketch of this digitization step is given below.
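The patent gives no code for this step; the following is a minimal Python sketch, assuming the microphone signal has already been read as floating-point samples in [-1, 1] (numpy and the 16 kHz test tone are illustrative assumptions):

```python
import numpy as np

def quantize_8bit(samples):
    """Quantize audio samples in [-1.0, 1.0] to 8-bit unsigned integers."""
    clipped = np.clip(samples, -1.0, 1.0)
    # Map [-1, 1] onto the 256 levels an 8-bit converter can represent.
    return np.round((clipped + 1.0) * 127.5).astype(np.uint8)

# Example: digitize one second of a 440 Hz test tone sampled at 16 kHz.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
digital_input = quantize_8bit(np.sin(2.0 * np.pi * 440.0 * t))
```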

In step 3, the audio bit stream is forwarded to the input layer of an autoencoder neural network and unsupervised learning begins: the network encodes the input content and then decodes it to reproduce it, thereby generating an internal representation;

The autoencoder is a three-layer neural network with an input, a hidden, and an output layer. The input layer updates its weights using the instar learning rule, as shown in formula (1):

dw = lr * a * (p' - w)   (1)

The output layer uses the outstar learning rule, as shown in formula (2):

dw = lr * (a - w) * p'   (2)

where dw is the weight-change matrix, lr is the learning rate (an adjustable non-negative factor), a is the output vector of this layer, w is the weight matrix, and p' is the input vector; the two update rules are sketched below.
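As a minimal numpy sketch of formulas (1) and (2), with assumed layer sizes (the patent specifies only the rules, not an implementation):

```python
import numpy as np

def instar_update(w, p, a, lr=0.1):
    # Formula (1): dw[i, j] = lr * a[i] * (p'[j] - w[i, j]).
    return lr * a[:, None] * (p[None, :] - w)

def outstar_update(w, p, a, lr=0.1):
    # Formula (2): dw[i, j] = lr * (a[i] - w[i, j]) * p'[j].
    return lr * (a[:, None] - w) * p[None, :]

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))   # 8 units, 16-dimensional input (assumed sizes)
p = rng.normal(size=16)        # input vector p'
a = np.tanh(w @ p)             # output vector a of this layer
w += instar_update(w, p, a)    # one unsupervised weight update
```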

In step 4, step 3 is repeated until a local minimum error is reached by reducing the mean absolute error MAE, as shown in formula (3):

error = output vector - input vector   (3)

where error is the mean absolute error MAE, output vector is the autoencoder's output vector, and input vector is its input vector;

In step 5, after unsupervised learning is complete, the internal representation of the autoencoder is forwarded to a quantum neural network for supervised learning;

This network is a three-layer quantum neural network that processes the internal representation of the autoencoder using formula (4),

where U_N is the unitary matrix of a qubit, σi for i ∈ {1, 2, 3} are the Pauli matrices, σ0 is the 2 × 2 identity matrix, and aj is the actual output;

The Pauli matrices, also called Pauli spin matrices, are the complex matrices that arise in quantum mechanics when treating spin; they are defined by equation (5):

σ0 = ((1, 0), (0, 1)),  σ1 = ((0, 1), (1, 0)),  σ2 = ((0, -i), (i, 0)),  σ3 = ((1, 0), (0, -1))   (5)

where σ0 is the 2 × 2 identity matrix and σi for i ∈ {1, 2, 3} are the Pauli matrices; a numerical sketch of these matrices is given below.
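For concreteness, a numerical sketch of equation (5), together with one standard way such matrices generate a qubit unitary; the exponential form of U_N is an assumption, since formula (4) itself is not reproduced in the source:

```python
import numpy as np
from scipy.linalg import expm

# The four matrices of equation (5): sigma_0 (identity) plus the Pauli matrices.
sigma0 = np.eye(2, dtype=complex)
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

def qubit_unitary(theta1, theta2, theta3):
    # One standard single-qubit unitary: U = exp(i * (t1*s1 + t2*s2 + t3*s3)).
    h = theta1 * sigma1 + theta2 * sigma2 + theta3 * sigma3  # Hermitian generator
    return expm(1j * h)

U = qubit_unitary(0.3, 0.1, 0.7)
assert np.allclose(U @ U.conj().T, sigma0)  # unitarity check: U U† = I
```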

In step 6, the desired output is the 3D-animated sign language for the given input audio content; equations (6) and (7) compare it with the actual output using the gradient descent algorithm:

Δw = -lr * ∂C/∂w   (6)

where Δw is the weight update, lr is the learning rate in formula (2), aj is the actual output value, and ∂C/∂w is the partial derivative of the cost function with respect to the weights;

C = (1/2) * Σj (aj - dj)²   (7)

where C is the cost function, defined using the mean squared error, aj is the actual output, and dj is the desired output;

In step 7, step 5 is repeated until the smallest global error is reached by reducing the mean squared error of equation (7); a sketch of this descent loop is given below.
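A minimal sketch of the descent loop of equations (6) and (7); the quantum layer is stood in for by a plain linear map, purely so the gradient can be written explicitly:

```python
import numpy as np

def mse_cost(a, d):
    # Equation (7): C = 0.5 * sum_j (a_j - d_j)^2.
    return 0.5 * np.sum((a - d) ** 2)

def train_step(w, x, d, lr=0.05):
    # Equation (6) for a linear output layer a = w @ x: dC/dw = (a - d) x^T.
    a = w @ x                      # actual output a_j
    grad = np.outer(a - d, x)      # partial derivative of the cost
    return w - lr * grad, mse_cost(a, d)

rng = np.random.default_rng(1)
w = rng.normal(size=(4, 10))   # 4 outputs fed by a 10-dim internal representation
x = rng.normal(size=10)        # internal representation from the autoencoder
d = rng.normal(size=4)         # desired output
for _ in range(200):           # step 7: repeat until the error is small
    w, cost = train_step(w, x, d)
```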

In step 8, the trained model is ready to associate each sign language, and the sign language is stored as a database together with its related input audio content;

In step 9, the user can customize his or her own 3D character (skin colour, body shape, outfit, facial expression, style, and so on), which is the 3D animation object exported by the application of the standalone device or the mobile communication terminal.

Further, the processor is a Raspberry Pi 3B+.

The beneficial effect of the present invention is mainly that it helps deaf-mute people call for help quickly and without barriers.

Description of the drawings

Fig. 1 is a schematic diagram of the autoencoder neural network (the unsupervised learning model).

Fig. 2 is a schematic diagram of the quantum neural network (the supervised learning model).

Specific embodiment

The invention will be further described below in conjunction with the accompanying drawings.

Referring to Figures 1 and 2, an SOS system for helping deaf-mute people comprises two monochrome infrared cameras, three infrared LEDs, a high-definition condenser microphone for speech recognition, a GPRS real-time voice call system, a display screen, and a processor. The processor is connected to the two monochrome infrared cameras, the three infrared LEDs, the high-definition condenser microphone for speech recognition, the GPRS real-time voice call system, and the display screen. In the processor, the user's signs are converted into audio for the responder, and the responder's voice is simultaneously rendered as sign language for the caller as a real-time service.

Further, the SOS system is a mobile communication terminal compatible with the Android and iOS operating systems; shaking the terminal a set number of times and then shaking it once more starts an SOS call.

Further, the mobile communication terminal also includes a module that uses a pre-trained model to convert between sign language and audio.

The model is built in the following steps:

1. build a 3D animation data set;

2. perform analog-to-digital conversion;

3. train an unsupervised learning model;

4. repeat step 3 until a local minimum error is reached;

5. forward the internal representation to the supervised learning model;

6. compare the supervised model's estimate with the actual output;

7. repeat step 5 until accurate and efficient output is obtained;

8. generate the translated sign language as a labeled animated 3D character;

9. export the trained model to the standalone device and to the mobile communication terminal app.

After the sign language movements for the different languages have been generated, they are stored in memory; the pre-trained model described above can convert any input audio content into sign language.

In step 1, the two monochrome infrared cameras and three infrared LEDs are used to capture all the alphabet sets of different sign languages (such as Arabic, Chinese, English, and Russian), which are stored in memory.

In step 2, an analog-to-digital converter using 8-bit pulse-length modulation produces digital input to a computer for training, testing, and validation.

In step 3, the audio bit stream is forwarded to the input layer of the autoencoder neural network (Fig. 1) and unsupervised learning begins: the network encodes the input content and then decodes it to reproduce it, thereby generating an internal representation;

The autoencoder is a three-layer neural network with an input, a hidden, and an output layer. The input layer updates its weights using the instar learning rule, as shown in formula (1):

dw = lr * a * (p' - w)   (1)

The output layer uses the outstar learning rule, as shown in formula (2):

dw = lr * (a - w) * p'   (2)

where dw is the weight-change matrix, lr is the learning rate (an adjustable non-negative factor), a is the output vector of this layer, w is the weight matrix, and p' is the input vector;

In step 4, step 3 is repeated until a local minimum error is reached by reducing the mean absolute error MAE, as shown in formula (3):

error = output vector - input vector   (3)

where error is the mean absolute error MAE, output vector is the autoencoder's output vector, and input vector is its input vector;

In step 5, after unsupervised learning is complete, the internal representation of the autoencoder is forwarded to the quantum neural network (Fig. 2) for supervised learning;

This network is a three-layer quantum neural network that processes the internal representation of the autoencoder using formula (4),

where U_N is the unitary matrix of a qubit, σi for i ∈ {1, 2, 3} are the Pauli matrices, σ0 is the 2 × 2 identity matrix, and aj is the actual output;

The Pauli matrices, also called Pauli spin matrices, are the complex matrices that arise in quantum mechanics when treating spin; they are defined by equation (5):

σ0 = ((1, 0), (0, 1)),  σ1 = ((0, 1), (1, 0)),  σ2 = ((0, -i), (i, 0)),  σ3 = ((1, 0), (0, -1))   (5)

where σ0 is the 2 × 2 identity matrix and σi for i ∈ {1, 2, 3} are the Pauli matrices;

In step 6, the desired output is the 3D-animated sign language for the given input audio content; equations (6) and (7) compare it with the actual output using the gradient descent algorithm:

Δw = -lr * ∂C/∂w   (6)

where Δw is the weight update, lr is the learning rate in formula (2), aj is the actual output value, and ∂C/∂w is the partial derivative of the cost function with respect to the weights;

C = (1/2) * Σj (aj - dj)²   (7)

where C is the cost function, defined using the mean squared error, aj is the actual output, and dj is the desired output;

In step 7, step 5 is repeated until the smallest global error is reached by reducing the mean squared error of equation (7);

In step 8, the trained model is ready to associate each sign language, and the sign language is stored as a database together with its related input audio content;

In step 9, the user can customize his or her own 3D character (skin colour, body shape, outfit, facial expression, style, and so on), which is the 3D animation object exported by the application of the standalone device or the mobile communication terminal.

Further, the processor is a Raspberry Pi 3B+.

In this embodiment, the model is trained with the same program and steps as the previous model, except that steps 2 and 8 are adjusted as follows:

Step 2 becomes image processing for object detection:

2.1 use the two monochrome infrared cameras and three infrared LEDs to compensate for background objects (such as the head) and ambient illumination;

2.2 match the tracking-layer data to extract tracking information, such as the positions of the fingers and hands;

2.3 generate the input data as a vector; a sketch of this step is given below.
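A minimal sketch of step 2.3, assuming the tracking layer delivers fingertip and palm coordinates; the shapes and the helper function are hypothetical, since the patent does not define the tracking data format:

```python
import numpy as np

def frame_to_feature_vector(finger_tips, palm_center):
    """Flatten one tracked hand frame into an input vector (step 2.3).

    finger_tips: (5, 3) fingertip coordinates from the stereo IR cameras;
    palm_center: (3,) palm position. Both shapes are assumptions.
    """
    relative = finger_tips - palm_center  # translation-invariant geometry
    return np.concatenate([relative.ravel(), palm_center])

# Hypothetical tracked frame, coordinates in metres.
tips = np.array([[0.02, 0.09, 0.30], [0.04, 0.10, 0.31], [0.05, 0.10, 0.30],
                 [0.06, 0.09, 0.30], [0.07, 0.07, 0.29]])
x = frame_to_feature_vector(tips, np.array([0.04, 0.02, 0.31]))  # shape (18,)
```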

Step 8 becomes digital-to-analog conversion:

8.1 use the supervised output as the label for each input letter;

8.2 generate the output audio using a text-to-speech API; a sketch is given below.
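A minimal sketch of step 8.2 using pyttsx3, one freely available offline text-to-speech library; the patent names only a text-to-speech API, not a specific one:

```python
import pyttsx3  # offline text-to-speech engine; one possible API choice

def speak_label(label):
    """Voice the recognized sign label (step 8.2) for the hearing responder."""
    engine = pyttsx3.init()
    engine.say(label)
    engine.runAndWait()

speak_label("HELP")  # e.g. the word recovered from the signed letters H-E-L-P
```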

In the present embodiment, the SOS system for being used to help deaf-mute can be a independent equipment, be obtained using solar energy Clean energy resource provides gesture identification function by two monochromatic infrared cameras and three infrared LEDs.It also has knows for voice Other high definition Electret Condencer Microphone.The machine has GPRS real-time voice call system and 7 inches of high-definition display screens.Internal processor (raspberry pi 3B+) connects all terminal parts and handles input/output signal.Hardware platform can be " anchor type " and " movable type " (static and packaged type).Mobile model has GPS to obtain the accurate coordinates of user.The hardware platform can will be used The mark at family is converted to the audio of respondent, and the voice of respondent can be used as real time service while be caller's signature.

The device is also realized as a mobile communication terminal application compatible with the Android and iOS operating systems. To use it, the phone is shaken three times (this number can be changed) and then shaken again. The app uses the smartphone's own integrated camera, microphone, GPS, GPRS, display, and power supply, and it requires permission to use these components. The application also uses the pre-trained model for sign language and audio conversion. A sketch of the shake trigger is given below.
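A minimal sketch of the shake-to-call trigger logic; the accelerometer feed, the threshold value, and the call itself are platform-specific assumptions:

```python
import math

SHAKE_G = 2.5       # acceleration magnitude, in g, counted as one shake (assumed)
SHAKES_TO_ARM = 3   # the configurable "shake three times" of the patent

class ShakeTrigger:
    """Counts shakes and fires the SOS call on the extra confirming shake."""

    def __init__(self):
        self.count = 0

    def on_sample(self, ax, ay, az):
        # One accelerometer sample in g; the feed itself is platform-specific.
        if math.sqrt(ax * ax + ay * ay + az * az) >= SHAKE_G:
            self.count += 1
            if self.count > SHAKES_TO_ARM:  # the "shake again" confirmation
                self.count = 0
                return True                 # caller should start the SOS call
        return False

trigger = ShakeTrigger()
if trigger.on_sample(0.1, 3.2, 0.4):  # a strong jolt registers as one shake
    pass  # start_sos_call() would go here
```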
