Voice signal processing method and device and electronic device

Document No.: 172457 Publication date: 2021-10-29

Reading note: This technology, Voice signal processing method and device and electronic device (语音信号的处理方法和装置及电子装置), was designed by Sun Zongming (孙宗明) on 2021-06-29. Main content: The invention discloses a method and a device for processing a voice signal, and an electronic device. The method comprises: receiving a voice instruction signal; judging whether the voice instruction signal includes security-sensitive information and, in the case that it does, judging whether the voice instruction signal is legal; and responding to the voice instruction signal in the case that it is judged to be legal. This technical solution solves the problem in the related art that voice instruction signal interaction carries security risks, avoids the hazards caused by executing illegal voice instructions, and improves the reliability and security of voice instruction signal interaction.

1. A method for processing a voice signal, comprising:

receiving a voice instruction signal;

judging whether the voice instruction signal includes security-sensitive information, and, in the case that the voice instruction signal includes the security-sensitive information, judging whether the voice instruction signal is legal; and

responding to the voice instruction signal in the case that the voice instruction signal is judged to be legal.

2. The method for processing a voice signal according to claim 1, wherein after judging whether the voice instruction signal includes security-sensitive information, the method further comprises:

responding to the voice instruction signal in the case that the voice instruction signal is judged not to include security-sensitive information.

3. The method for processing a voice signal according to claim 1, wherein judging whether the voice instruction signal includes security-sensitive information comprises:

in the case that the voice instruction signal includes at least one of entertainment information, weather information, and time information, judging that the voice instruction signal does not include the security-sensitive information.

4. The method for processing a voice signal according to claim 1, wherein judging whether the voice instruction signal is legal comprises:

acquiring image information of a requester who initiates the voice instruction signal, and judging whether the image information is consistent with pre-stored image information of a legal requester; and

in the case that the image information is consistent with the pre-stored image information of the legal requester, judging that the voice instruction signal is legal.

5. The method for processing a voice signal according to claim 4, wherein judging whether the voice instruction signal is legal comprises:

in the case that the image information is inconsistent with the pre-stored image information of the legal requester, receiving verification information and judging whether the verification information is consistent with pre-stored verification information;

in the case that the verification information is consistent with the pre-stored verification information, judging that the voice instruction signal is legal; and

in the case that the verification information is inconsistent with the pre-stored verification information, judging that the voice instruction signal is illegal.

6. The method for processing a voice signal according to claim 5, wherein judging whether the voice instruction signal is legal comprises:

judging whether the image information is consistent with the pre-stored image information of the legal requester, judging that the voice instruction signal is illegal in the case that the image information is inconsistent with the pre-stored image information of the legal requester, and judging that the voice instruction signal is legal in the case that the image information is consistent with the pre-stored image information of the legal requester; or

judging whether the verification information is consistent with the pre-stored verification information, judging that the voice instruction signal is illegal in the case that the verification information is inconsistent with the pre-stored verification information, and judging that the voice instruction signal is legal in the case that the verification information is consistent with the pre-stored verification information.

7. The method for processing a voice signal according to claim 1, wherein before judging whether the voice instruction signal includes security-sensitive information, the method further comprises:

judging whether the voice instruction signal is known voiceprint information, and, in the case that the voice instruction signal is the known voiceprint information, judging whether the voice instruction signal includes security-sensitive information; and

directly refusing to respond to the voice instruction signal in the case that the voice instruction signal is judged not to be the known voiceprint information.

8. The method for processing a voice signal according to any one of claims 1 to 7, wherein the security-sensitive information comprises at least one of: financial security control information, spatial security control information, and equipment security control information.

9. An apparatus for processing a voice signal, comprising:

a receiving module configured to receive a voice instruction signal;

a judging module configured to judge whether the voice instruction signal includes security-sensitive information, and, in the case that the voice instruction signal includes the security-sensitive information, to judge whether the voice instruction signal is legal; and

a response module configured to respond to the voice instruction signal in the case that the voice instruction signal is judged to be legal.

10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 8 by means of the computer program.

Technical Field

The present invention relates to the field of communications, and in particular, to a method and an apparatus for processing a voice signal, and an electronic apparatus.

Background

In the prior art, voiceprint recognition makes its determination by analyzing the input audio alone. It cannot determine whether the operation is actually being performed by the user, so a certain security risk exists. Voiceprint recognition requires the user to first input some audio information at a terminal, from which the system extracts voiceprint features. After this entry is completed, when the user next interacts, the background system compares the voiceprint features of the newly input audio with those already stored in the system and thereby determines the user information.

In the prior art, comparing the stored voiceprint features with those of the newly input audio can identify the information carried by the input audio, but it cannot determine whether the audio was actually spoken by the user or was simulated and input by other means. Smart home devices are therefore exposed to security risks.

Disclosure of Invention

The embodiment of the invention provides a method and a device for processing a voice signal and an electronic device, which are used for at least solving the problems of safety risks existing in voice signal interaction in the related art.

According to an embodiment of the present invention, there is provided a method for processing a voice signal, including: receiving a voice instruction signal; judging whether the voice instruction signal includes security-sensitive information, and, in the case that it does, judging whether the voice instruction signal is legal; and responding to the voice instruction signal in the case that the voice instruction signal is judged to be legal.

In an exemplary embodiment, after determining whether the voice instruction signal includes security sensitive information, the method further comprises: responding to the voice command signal if it is determined that the voice command signal does not include security sensitive information.

In one exemplary embodiment, determining whether the voice instruction signal includes security-sensitive information comprises: in the case that the voice instruction signal includes at least one of entertainment information, weather information, and time information, judging that the voice instruction signal does not include the security-sensitive information.

In an exemplary embodiment, determining whether the voice instruction signal is legitimate includes: acquiring image information of a requester who initiates the voice instruction signal, and judging whether the image information is consistent with pre-stored legal requester image information or not according to the image information; and under the condition that the image information is consistent with the pre-stored legal requester image information, judging that the voice instruction signal is legal.

In an exemplary embodiment, determining whether the voice instruction signal is legitimate includes: receiving verification information under the condition that the image information is inconsistent with the image information of a pre-stored legal requester, and judging whether the verification information is consistent with the pre-stored verification information or not; judging that the voice instruction signal is legal under the condition that the verification information is consistent with prestored verification information; and under the condition that the verification information is inconsistent with the pre-stored verification information, judging that the voice instruction signal is illegal.

In an exemplary embodiment, determining whether the voice instruction signal is legal includes: judging whether the image information is consistent with the pre-stored image information of a legal requester, judging that the voice instruction signal is illegal in the case that the image information is inconsistent with the pre-stored image information of the legal requester, and judging that the voice instruction signal is legal in the case that the image information is consistent with the pre-stored image information of the legal requester; or, judging whether verification information is consistent with pre-stored verification information, judging that the voice instruction signal is illegal in the case that the verification information is inconsistent with the pre-stored verification information, and judging that the voice instruction signal is legal in the case that the verification information is consistent with the pre-stored verification information.

In one exemplary embodiment, prior to determining whether the voice instruction signal includes security-sensitive information, the method further comprises: judging whether the voice instruction signal is known voiceprint information, and, in the case that it is, judging whether the voice instruction signal includes security-sensitive information; and directly refusing to respond to the voice instruction signal in the case that the voice instruction signal is judged not to be the known voiceprint information.

In one exemplary embodiment, the security-sensitive information includes at least one of: financial security control information, spatial security control information, equipment security control information.

According to another embodiment of the present invention, there is also provided an apparatus for processing a voice signal, including: a receiving module configured to receive a voice instruction signal; a judging module configured to judge whether the voice instruction signal includes security-sensitive information, and, in the case that it does, to judge whether the voice instruction signal is legal; and a response module configured to respond to the voice instruction signal in the case that the voice instruction signal is judged to be legal.

According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the processing method of the voice signal through the computer program.

In the embodiments of the invention, a voice instruction signal is received; whether the voice instruction signal includes security-sensitive information is judged, and, in the case that it does, whether the voice instruction signal is legal is judged; and the voice instruction signal is responded to in the case that it is judged to be legal. This technical solution solves the problem in the related art that voice instruction signal interaction carries security risks, avoids the hazards caused by executing illegal voice instructions, and improves the reliability and security of voice instruction signal interaction.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a block diagram of the hardware configuration of a computer terminal for a method for processing a voice signal according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for processing a voice signal according to an embodiment of the present invention;

FIG. 3 is a flowchart of a method for judging the legality of a voice signal according to an embodiment of the present invention;

FIG. 4 is a flowchart of a method for determining the voiceprint information of a voice signal according to an embodiment of the present invention;

FIG. 5 is a block diagram of an apparatus for processing a voice signal according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The method provided by the embodiments of the present application can be executed in a computer terminal or a similar computing device. Taking execution on a computer terminal as an example, fig. 1 is a block diagram of the hardware structure of a computer terminal for a method for processing a voice signal according to an embodiment of the present invention. As shown in fig. 1, the computer terminal may include one or more processors 102 (only one is shown in fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a programmable logic device such as an FPGA) and a memory 104 for storing data. In an exemplary embodiment, it may also include a transmission device 106 for communication functions and an input/output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the computer terminal. For example, the computer terminal may include more or fewer components than shown in fig. 1, or have a different configuration with functionality equivalent to or greater than that shown in fig. 1.

The memory 104 can be used to store computer programs, for example, software programs and modules of application software, such as a computer program corresponding to the method for processing a voice signal in the embodiments of the present invention. The processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, thereby implementing the above method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal. In one example, the transmission device 106 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the internet wirelessly.

In this embodiment, a method for processing a voice signal is provided, which is applied to the computer terminal, and fig. 2 is a flowchart of the method for processing a voice signal according to the embodiment of the present invention, where the flowchart includes the following steps:

Step S102, receiving a voice instruction signal;

Step S104, judging whether the voice instruction signal includes security-sensitive information, and, in the case that the voice instruction signal includes the security-sensitive information, judging whether the voice instruction signal is legal;

Step S106, responding to the voice instruction signal in the case that the voice instruction signal is judged to be legal.

In the method for processing a voice signal, a voice instruction signal is first received; next, whether the voice instruction signal includes security-sensitive information is judged, and, in the case that it does, whether the voice instruction signal is legal is judged; finally, the voice instruction signal is responded to in the case that it is judged to be legal. This technical solution solves the problem in the related art that voice instruction signal interaction carries security risks, avoids the hazards caused by executing illegal voice instructions, and improves the reliability and security of voice instruction signal interaction.
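The three steps above can be sketched as a small gating function. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `handle_command`, `is_sensitive`, and `is_legitimate` are hypothetical, the command is represented as already-recognized text, and the two checks are passed in as callables because the text leaves their internals to later embodiments. Responding directly to a command that is not security-sensitive follows the exemplary embodiment stated in the disclosure.

```python
from typing import Callable


def handle_command(
    command: str,
    is_sensitive: Callable[[str], bool],
    is_legitimate: Callable[[str], bool],
) -> str:
    """Return "respond" or "refuse" for a received command (step S102)."""
    # Step S104: only security-sensitive commands need a legality check.
    if not is_sensitive(command):
        return "respond"
    # Step S106: respond only if the sensitive command is judged legal.
    if is_legitimate(command):
        return "respond"
    return "refuse"


# Hypothetical predicates, for demonstration only.
SENSITIVE_WORDS = {"pay", "unlock"}


def demo_sensitive(text: str) -> bool:
    return any(word in text for word in SENSITIVE_WORDS)


result_weather = handle_command("what is the weather", demo_sensitive, lambda t: False)
result_pay_denied = handle_command("pay the bill", demo_sensitive, lambda t: False)
result_pay_allowed = handle_command("pay the bill", demo_sensitive, lambda t: True)
```

In this sketch the legality check is never called for a non-sensitive command, which mirrors the ordering of steps S104 and S106.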

In step S102, after detecting that a user's voice audio has been input, the smart home system uploads the voice audio file to the smart home operating system, which analyzes the file and extracts the voiceprint features, as shown in fig. 4. The smart home system has a plurality of terminals, and the user's voice audio may also be input from other terminals. The smart home system includes smart household appliances, which add AI functions to the functions of traditional household appliances so that they support voice interaction, lifestyle entertainment, and appliance control. Taking a smart refrigerator as an example, in addition to improving the food storage environment as a traditional refrigerator does, it can provide weather inquiry, alarm clock reminders, audio and video entertainment, and the like. Further, with the application and spread of the Internet of Things, a mobile phone can operate and control smart home devices connected to the Internet of Things and has become the main operation and control terminal for family life, providing control input, data output, and similar functions. For example, a user's voice audio can be recorded by the mobile phone and uploaded to the smart home operating system. Smart household appliances and smart home devices mainly include smart refrigerators, smartphones, smart speakers, smart door controls, and the like.

Data generated and received by all the smart household appliances are uploaded through the Internet of Things to a server of the smart home operating system, which analyzes and processes the data. A user can log in to the server over the network from a mobile phone to acquire and view the data of the smart household appliances. Together, the smart household appliances form a multi-terminal, multi-modal smart home system.

In step S104, judging whether the voice instruction signal includes security-sensitive information may be understood as judging whether it includes information related to the user's property security or personal privacy, that is, information for which a response could create a security risk. When it does, whether the voice instruction signal is legal is judged. This may be understood as checking whether information in the voice instruction signal, or other signals produced by the requester of the voice instruction signal, matches pre-stored information, thereby verifying whether the requester has the legal right to use the device.

In one exemplary embodiment, after determining whether the voice instruction signal includes security sensitive information, the method further comprises: and responding to the voice instruction signal under the condition that the result of judging whether the voice instruction signal comprises the security sensitive information is negative.

That is to say, when the voice instruction signal is judged not to include security-sensitive information, responding to it poses no security risk to the user, and the smart home operating system sends an execution command to the terminal to respond to the voice instruction signal. Before judging whether the voice instruction signal includes security-sensitive information, the smart home operating system performs speech recognition and semantic understanding on the recorded voice audio file to determine the user's interaction intention.

In one exemplary embodiment, the security-sensitive information includes at least one of payment information, door-opening information, and gas-opening information.

It should be noted that the security-sensitive information may also include personal privacy information and information that may cause damage to a smart home device. For example, when a smart microwave oven heats an article in sealed packaging, an explosion may occur and damage the oven, so the control information that triggers this operation is security-sensitive information. Payment information may be understood as information in the voice instruction signal whose response involves a payment security risk. Door-opening information may be understood as a request in the voice instruction signal to open the entrance door of the smart home, the door of a bedroom, or the door of a safe.

In one exemplary embodiment, determining whether the voice instruction signal includes security-sensitive information comprises: in the case that the voice instruction signal includes at least one of entertainment information, weather information, and time information, determining that the voice instruction signal does not include security-sensitive information. A voice instruction signal that involves only music information, weather information, or calendar information is therefore responded to directly.

Specifically, the entertainment information may include music information, image information, and the time information may include alarm clock information, calendar information.
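The category check described above can be sketched as a lookup against the three categories the text names. This is a minimal sketch: the intent keys, the `INTENT_CATEGORY` table, and the function name `is_non_sensitive` are hypothetical, and it assumes that semantic understanding has already reduced the voice instruction to a single intent label. Treating an unknown intent as not covered by the list is also an assumption of this sketch, not something the text specifies.

```python
# Categories the text lists as not security-sensitive.
NON_SENSITIVE = {"entertainment", "weather", "time"}

# Hypothetical mapping from intent labels to those categories,
# using the examples given in the text.
INTENT_CATEGORY = {
    "music": "entertainment",
    "image": "entertainment",
    "weather": "weather",
    "alarm_clock": "time",
    "calendar": "time",
}


def is_non_sensitive(intent: str) -> bool:
    """True if the intent falls in a category listed as non-sensitive."""
    # Unknown intents map to None, which is not in the set.
    return INTENT_CATEGORY.get(intent) in NON_SENSITIVE
```

For example, `is_non_sensitive("calendar")` is true, while an intent such as `"payment"` is absent from the table and so is not treated as safe to answer directly.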

In one exemplary embodiment, determining whether the voice instruction signal is legitimate includes: acquiring image information of a requester who initiates a voice instruction signal, and judging whether the image information is consistent with pre-stored legal requester image information or not according to the image information; and under the condition that the image information is consistent with the pre-stored legal requester image information, judging that the voice instruction signal is legal.

In other words, to judge whether the voice instruction signal is legal, the acquired image information of the requester of the voice instruction signal is compared with the pre-stored image information. Only when the acquired image information is consistent with the pre-stored image information of the legal requester is the voice instruction signal judged to be legal and responded to. If the acquired image information is inconsistent with the pre-stored image information of the legal requester, the response to the voice instruction signal is refused and a denial of access is prompted. In this embodiment, the pre-stored image information of the legal requester is entered in advance and stored in the smart home system by a legal user of the smart home device. The image information includes human body recognition information.

Further, once the user has input voiceprint information, the smart home operating system comprehensively judges the legality of the user interaction using the human body recognition function, other image recognition functions, and the APP (application) terminal recognition function of the smart household appliances under the user account. If the interaction is legal, it is executed; if not, execution is refused and the conversation is ended. The method for processing a voice signal thus establishes a multi-terminal, multi-modal secure voice interaction system and ensures the security of voice interaction. It should be noted that human body recognition mainly uses an infrared sensor to sense a human body, and combining it with image recognition and other technologies can increase recognition accuracy.

It should be noted that the image information of the requester who initiates the voice instruction signal may also be obtained from other terminals of the smart home system, which makes operation more convenient for the user. For example, it is inconvenient for a smart gas stove to collect image information of the requester, so the image comparison can be performed through another device, such as a smart range hood, to verify the legality of the interaction. Furthermore, the smartphone, as the main operation and control terminal of the operating system, can record the voice audio file, record an image file of the requester who initiates the voice instruction signal, and upload both files to the server of the smart home operating system.

In one exemplary embodiment, determining whether the voice instruction signal is legal includes: in the case that the image information is inconsistent with the pre-stored image information of the legal requester, receiving verification information and judging whether the verification information is consistent with the pre-stored verification information; judging that the voice instruction signal is legal in the case that the verification information is consistent with the pre-stored verification information; and judging that the voice instruction signal is illegal in the case that the verification information is inconsistent with the pre-stored verification information.

Further, in the case that the image information is inconsistent with the pre-stored image information of the legal requester, the smart home system sends out verification information. The requester of the voice instruction signal can view the verification information at a terminal of the smart home system and feed it back to the smart home system. The terminal includes, but is not limited to, a smartphone: the requester views the verification information on the phone and uploads it over the network to the server of the smart home operating system for verification. The smart home system compares the verification information fed back by the requester with the pre-stored verification information. If the two are consistent, the voice instruction signal is judged to be legal and the smart home system responds to it; if they are inconsistent, the voice instruction signal is judged to be illegal, the smart home system refuses to respond to it, and a denial of access is prompted.
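The image check with a verification fallback described above can be sketched as follows. This is a hedged illustration, not the embodiment's implementation: the function name `is_legitimate`, the `request_code` callback, and the representation of verification information as a text code are all assumptions. The image comparison itself is reduced to a boolean input because the text does not specify how it is performed.

```python
import hmac
from typing import Callable, Optional


def is_legitimate(
    image_matches: bool,
    stored_code: str,
    request_code: Callable[[], Optional[str]],
) -> bool:
    """Image comparison first; fall back to verification on a mismatch."""
    if image_matches:
        return True
    # The image did not match: ask the requester for verification
    # information (for example, entered on a phone; hypothetical callback).
    supplied = request_code()
    if supplied is None:
        return False
    # A constant-time comparison is a design choice of this sketch.
    return hmac.compare_digest(supplied.encode(), stored_code.encode())
```

The callback is only invoked when the image comparison fails, so a requester whose image matches is never prompted for verification information.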

In one exemplary embodiment, determining whether the voice instruction signal is legal includes: judging whether the image information is consistent with the pre-stored image information of a legal requester, judging that the voice instruction signal is illegal in the case that the image information is inconsistent with the pre-stored image information of the legal requester, and judging that the voice instruction signal is legal in the case that the image information is consistent with the pre-stored image information of the legal requester; or, judging whether the verification information is consistent with the pre-stored verification information, judging that the voice instruction signal is illegal in the case that the verification information is inconsistent with the pre-stored verification information, and judging that the voice instruction signal is legal in the case that the verification information is consistent with the pre-stored verification information.

Judging whether the voice instruction signal includes known voiceprint information can be understood as follows: the smart home operating system analyzes the voice instruction signal, which is input at a terminal and stored in a server of the smart home operating system, and extracts voiceprint information from it. The voiceprint features in the extracted voiceprint information are compared with the voiceprint features in the known voiceprint information. If the two are consistent, the voice instruction signal is judged to include the known voiceprint information; if the two are not consistent, the voice instruction signal is judged not to include the known voiceprint information, and the response to the voice instruction signal is directly refused.
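The voiceprint comparison can be illustrated with a short sketch. It assumes that voiceprint features are fixed-length numeric vectors and that "consistent" means a cosine similarity at or above a threshold; neither the feature representation nor the threshold value 0.8 comes from the description, and the function names are hypothetical.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def includes_known_voiceprint(
    features: Sequence[float],
    known_voiceprints: Iterable[Sequence[float]],
    threshold: float = 0.8,  # assumed value, for illustration only
) -> bool:
    """True if the extracted features match any pre-stored voiceprint."""
    return any(
        cosine_similarity(features, known) >= threshold
        for known in known_voiceprints
    )


known = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(includes_known_voiceprint([0.9, 0.1, 0.0], known))  # close to the first entry
print(includes_known_voiceprint([0.0, 0.0, 1.0], known))  # matches no entry
```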

Specifically, the financial security control information may include payment information, the spatial security control information may include door opening information, and the device security control information may include gas range switching information.
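The three categories of security sensitive information can be represented as a simple lookup, sketched below. The sketch assumes that semantic understanding has already reduced the voice instruction signal to an intent label; the labels `payment`, `open_door`, and `gas_range_switch` are hypothetical names chosen to mirror the examples above.

```python
from enum import Enum
from typing import Optional


class SensitiveCategory(Enum):
    FINANCIAL = "financial security control information"
    SPATIAL = "spatial security control information"
    DEVICE = "device security control information"


# Hypothetical intent labels; only the three categories come from the description.
INTENT_CATEGORIES = {
    "payment": SensitiveCategory.FINANCIAL,
    "open_door": SensitiveCategory.SPATIAL,
    "gas_range_switch": SensitiveCategory.DEVICE,
}


def classify_intent(intent: str) -> Optional[SensitiveCategory]:
    """Return the sensitive category of an intent, or None if it is not sensitive."""
    return INTENT_CATEGORIES.get(intent)


print(classify_intent("payment"))        # SensitiveCategory.FINANCIAL
print(classify_intent("check_weather"))  # None
```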

To better understand the method for processing the voice instruction signal, the following describes an implementation flow with reference to an alternative embodiment; the flow is not intended to limit the technical solution of the embodiments of the present invention.

In this embodiment, a method for processing a voice signal is provided. Fig. 3 is a schematic diagram of the method for processing a voice signal according to an embodiment of the present invention. As shown in Fig. 3, the following steps are performed:

Step 301: receiving a voice instruction signal;

Step 302: judging whether the voice instruction signal includes known voiceprint information. If not, go to step 303; if yes, go to step 304;

Step 303: prompting the unknown user and refusing to respond;

Step 304: judging whether the voice instruction signal includes security sensitive information. If not, go to step 305; if yes, go to step 306;

In other words, the smart home operating system completes voice recognition and semantic understanding on the voice instruction signal obtained by analyzing the recorded audio file, and analyzes the interaction intention of the requester who initiated the voice instruction signal. Functions that do not involve security risks, such as listening to songs, checking the weather, or checking the calendar, can be executed directly; if the intention involves property or personal safety risks, such as payment, opening a door, or turning on the gas, the next judgment is carried out.

Step 305: responding to the voice instruction signal normally;

Step 306: acquiring image information of the requester who initiated the voice instruction signal;

Step 307: judging whether the image information is consistent with the image information of the pre-stored legal requester. If yes, go to step 308; if not, go to step 309;

Step 308: responding to the voice instruction signal normally;

Step 309: receiving verification information;

The verification information is sent by the smart home system, checked by the requester who initiated the voice instruction signal, and fed back to the server of the smart home operating system for verification.

Step 310: judging whether the verification information is consistent with the pre-stored verification information. If yes, go to step 311; if not, go to step 312;

Step 311: responding to the voice instruction signal normally;

Step 312: prompting the unknown user and refusing to respond;

Further, when step 312 is executed in 3 consecutive cycles, the smart home system may perform a locking operation and issue an inquiry ticket to the designated terminal.
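The locking behavior after three consecutive refusals can be sketched with a small counter. This is an illustrative sketch only: the class name `RefusalLockout` is hypothetical, and the assumption that a successful response resets the counter is not stated in the description.

```python
class RefusalLockout:
    """Tracks consecutive refusals and locks once a limit is reached."""

    def __init__(self, max_consecutive: int = 3):
        self.max_consecutive = max_consecutive
        self.consecutive_refusals = 0
        self.locked = False

    def record(self, refused: bool) -> bool:
        """Record one cycle's outcome; return True if the system is now locked."""
        if self.locked:
            return True
        if refused:
            self.consecutive_refusals += 1
            if self.consecutive_refusals >= self.max_consecutive:
                self.locked = True
        else:
            # Assumption: a normal response breaks the run of refusals.
            self.consecutive_refusals = 0
        return self.locked


lockout = RefusalLockout()
for refused in (True, True, True):
    state = lockout.record(refused)
print(state)  # True: three consecutive refusals trigger the lock
```

Issuing the inquiry ticket to a terminal is left out of the sketch because the description does not say how it is delivered.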

In the above method for processing a voice signal, it is first judged whether the voice instruction signal includes known voiceprint information, which ensures that the user is legally logged in. It is then judged whether the voice instruction signal includes security sensitive information, and the signal is responded to directly if it does not. Finally, a legality judgment is made on any voice instruction signal that includes security sensitive information: if the signal is legal, it is responded to normally; if it is illegal, the response is refused and a prompt is given. This technical solution prevents voice instruction signals involving security risks from being responded to directly, thereby protecting the user's personal safety and interests from harm.
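The overall decision flow can be condensed into one function, sketched below. It follows the branch semantics given in the narrative description (a consistent image leads to a normal response; an inconsistent image falls back to the verification information). The sketch assumes each check has already been reduced to a boolean; the names `Request`, `Outcome`, and `process` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RESPOND = "respond normally"
    REFUSE = "prompt and refuse"


@dataclass
class Request:
    known_voiceprint: bool             # result of the voiceprint check
    security_sensitive: bool           # result of the sensitivity check
    image_matches: bool = False        # result of the image comparison
    verification_matches: bool = False # result of the verification comparison


def process(req: Request) -> Outcome:
    if not req.known_voiceprint:
        return Outcome.REFUSE   # unknown user
    if not req.security_sensitive:
        return Outcome.RESPOND  # no security risk involved
    if req.image_matches:
        return Outcome.RESPOND  # requester recognized by image
    if req.verification_matches:
        return Outcome.RESPOND  # requester confirmed by verification information
    return Outcome.REFUSE       # neither check succeeded


print(process(Request(known_voiceprint=True, security_sensitive=False)))
print(process(Request(known_voiceprint=True, security_sensitive=True)))
```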

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

An embodiment of the present invention also provides a processing device for a voice instruction signal, which includes: a receiving module, configured to receive the voice instruction signal; a judging module, configured to judge whether the voice instruction signal includes security sensitive information and to judge whether the voice instruction signal is legal; and a response module, configured to respond to the voice instruction signal when the judging module judges that the voice instruction signal is legal.

Fig. 5 is a block diagram of a processing apparatus for a voice instruction signal according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes:

a receiving module 42, configured to receive a voice instruction signal;

The receiving module 42 includes, but is not limited to, a sound card, a microphone, and the like. The receiving module 42 may be disposed on each terminal of the smart home system, and a user may add new devices to a user account of the smart home system to keep adding terminals to the smart home system.

A judging module 44, configured to judge whether the voice instruction signal includes security sensitive information, and under the condition that it is judged that the voice instruction signal includes the security sensitive information, judge whether the voice instruction signal is legal;

and a response module 46, configured to respond to the voice instruction signal when the voice instruction signal is judged to be legal.

With this apparatus, a voice instruction signal is received; whether the voice instruction signal includes security sensitive information is judged, and when it does, whether the voice instruction signal is legal is judged; and the voice instruction signal is responded to when it is legal. In the related art, whether to respond to a voice instruction signal is decided only by analyzing the input audio, so it cannot be determined whether the user is genuinely operating the system, and the smart home system carries a certain security risk. Judging the legality of the voice instruction signal solves this problem. The technical solution of the specific embodiments of the present application can therefore effectively improve the security of voice interaction.
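The cooperation of the three modules can be sketched as plain objects wired together. This is a hedged illustration, not the disclosed implementation: it assumes the voice instruction signal has already been converted to text, and the class names `JudgingModule` and `ProcessingDevice` and the injected callables are hypothetical.

```python
from typing import Callable, Optional


class JudgingModule:
    """Decides whether a signal may be responded to."""

    def __init__(
        self,
        is_sensitive: Callable[[str], bool],
        is_legal: Callable[[str], bool],
    ):
        self._is_sensitive = is_sensitive
        self._is_legal = is_legal

    def allows(self, signal: str) -> bool:
        # Legality is only checked for security sensitive signals.
        if not self._is_sensitive(signal):
            return True
        return self._is_legal(signal)


class ProcessingDevice:
    """Wires a receiving callable, a judging module, and a response callable."""

    def __init__(
        self,
        receive: Callable[[], str],
        judging: JudgingModule,
        respond: Callable[[str], str],
    ):
        self.receive = receive
        self.judging = judging
        self.respond = respond

    def run_once(self) -> Optional[str]:
        signal = self.receive()
        if self.judging.allows(signal):
            return self.respond(signal)
        return None  # response refused


judging = JudgingModule(
    is_sensitive=lambda s: "gas" in s,  # toy rule, for illustration only
    is_legal=lambda s: False,           # toy rule: no requester is verified
)
device = ProcessingDevice(lambda: "play a song", judging, lambda s: f"executed: {s}")
print(device.run_once())
```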

In one exemplary embodiment, the processing device further includes an image processing module, configured to acquire image information of the requester who sends the voice instruction signal.

It should be noted that the image processing module may be disposed in different terminals, with the acquired image information uploaded in a unified manner to the server of the smart home operating system. The image processing module includes an image capture device, which includes, but is not limited to, a camera, a digital camera, a radar, and an image capture card with a video capture function.

In an exemplary embodiment, the determining module is further configured to determine whether the voice instruction signal includes known voiceprint information.

That is, the determining module further analyzes the voice command signal, compares the analyzed voiceprint information with the known voiceprint information, and determines that the voice command signal contains the known voiceprint information if the analyzed voiceprint information is matched with the known voiceprint information. Known voiceprint information is stored in the smart home system.

Embodiments of the invention also provide a storage medium comprising a stored program, wherein the program performs any of the above methods when executed.

Alternatively, in the present embodiment, the storage medium may be configured to store program codes for performing the following steps:

s1, receiving a voice command signal;

s2, judging whether the voice instruction signal includes security sensitive information, and judging whether the voice instruction signal is legal under the condition that the voice instruction signal includes the security sensitive information;

s3, when the voice command signal is legal, the apparatus responds to the voice command signal.

Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

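s1, receiving a voice command signal;

s2, judging whether the voice instruction signal includes security sensitive information, and judging whether the voice instruction signal is legal under the condition that the voice instruction signal includes the security sensitive information;

s3, when the voice command signal is legal, the apparatus responds to the voice command signal.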

Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing program codes, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

An embodiment of the present invention also provides a smart home system that includes an electronic device, wherein the electronic device is any one of the electronic devices described above.

Furthermore, wireless communication technologies that the smart home system may adopt include Wi-Fi, ZigBee, and Bluetooth. ZigBee-based wireless communication has the advantages of low power consumption, high capacity, and strong anti-interference capability, while Wi-Fi-based wireless communication offers high data transmission efficiency and a long transmission distance. The functions of the smart home system may include smart home control, entertainment audio and video, environment control, home security, doorbell access, health monitoring, and other special convenience applications.

For example, the smart home system includes a smart gas stove. A user issues a voice instruction, which the smart gas stove or another terminal of the smart home system records and uploads to a server of the smart home operating system. The smart home operating system performs voice recognition on the audio file, extracts voice information including voiceprint information, and compares it with the known voiceprint information pre-stored in the smart home system. If the comparison shows no match, the voice information is judged not to include the known voiceprint information, and the smart gas stove may prompt the user, through its voice system or display system, that access is refused and refuse to respond.

If the comparison shows that the voice information includes the known voiceprint information, the smart home system proceeds to the next judgment, namely whether the voice information includes security sensitive information. The smart home system performs voice analysis and semantic understanding on the voice information to obtain the user's current interaction intention. If the responding operation is one that cannot cause a personal safety risk, such as entertainment audio and video, a weather query, or a calendar query, the voice information of the voice instruction signal is judged not to include security sensitive information, and the smart gas stove or another terminal of the smart home system can respond to the voice instruction signal immediately. If the responding operation is one that may cause a personal safety risk, such as turning on a switch or continuing heating, the voice information of the voice instruction signal is judged to include security sensitive information.

In that case, the smart gas stove or another terminal of the smart home system starts an image acquisition device to capture an image of the requester who initiated the voice instruction signal, and uploads the image file to the smart home system. The smart home system performs image recognition on the image file, extracts the image information of the requester, and compares it with the image information of the pre-stored legal requester stored in the smart home system. If the two match, the image information is judged to be consistent with the image information of the pre-stored legal requester, and the smart gas stove or another terminal of the smart home system immediately responds to the voice instruction signal. If the two do not match, the image information is judged to be inconsistent with the image information of the pre-stored legal requester, and the smart home system sends verification information. The requester who initiated the voice instruction signal can check the verification information at a terminal of the smart home system (such as a smart bracelet, a smartphone, or smart glasses) and feed it back to the smart home system. If the verification information fed back by the user matches the pre-stored verification information stored in the smart home system, the two are judged to be consistent, and the smart gas stove or another terminal of the smart home system immediately responds to the voice instruction signal. If the verification information fed back by the user does not match the pre-stored verification information stored in the smart home system, the two are judged to be inconsistent, and the smart gas stove or another terminal of the smart home system prompts the requester, through a display assembly or a speaker assembly, that access is refused and refuses to respond to the voice instruction signal.

Further, the smart gas stove includes a receiving module, a judging module, a response module, and an image processing module. The receiving module is configured to receive the voice instruction signal; the judging module is configured to judge whether the voice instruction signal includes security sensitive information and to judge whether the voice instruction signal is legal; the response module is configured to respond to the voice instruction signal when the judging module judges that the voice instruction signal is legal; and the image processing module is configured to acquire the image information of the requester who sends the voice instruction signal. The receiving module of the smart gas stove may be a microphone or a sound card.

When the smart home system includes a smart refrigerator, the smart refrigerator first receives a voice instruction signal, then judges whether the voice instruction signal includes security sensitive information, judges whether the voice instruction signal is legal when it does, and finally responds to the voice instruction signal when it is legal. The smart refrigerator includes a receiving module, a judging module, a response module, and an image processing module. The receiving module is configured to receive the voice instruction signal; the judging module is configured to judge whether the voice instruction signal includes security sensitive information and to judge whether the voice instruction signal is legal; the response module is configured to respond to the voice instruction signal when the judging module judges that the voice instruction signal is legal; and the image processing module is configured to acquire the image information of the requester who sends the voice instruction signal. The response module of the smart refrigerator includes an evaporator, a compressor, and the like.

The smart home system allows new home devices to be added to a user account, so that more and more devices are connected to the system. This inevitably generates more operating data, such as the temperature and clock data of an air conditioner, the open/closed state of indoor windows, and gas meter data, and this data is associated with the privacy of individuals and families to an unprecedented degree. If data protection is careless, highly private data such as personal habits may be leaked, and leakage of data related to home safety, such as door state, can directly endanger the home. The smart home system of the present application performs a legality judgment on voice instruction signals that involve a risk of privacy disclosure, which greatly reduces the risk of data leakage.

The smart home system may further include a smart security terminal, whose functions mainly include video monitoring, an intercom system, an all-in-one access control card, emergency help, smoke detection alarm, gas leakage alarm, broken glass detection alarm, and the like. The data generated by the smart security terminal is uploaded to a server of the smart home system for storage.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.
