Human-computer interaction method, apparatus, device, and storage medium

Document No.: 1800790 — Published: 2021-11-05

Reading note: This disclosure, "Human-computer interaction method, apparatus, device and storage medium", was designed and created by Hou Zaipeng on 2021-07-30. Abstract: The disclosure provides a human-computer interaction method, apparatus, device, and storage medium, relating to the field of artificial intelligence. The specific implementation scheme is as follows: in response to a preset condition being met, monitor for a voice instruction from the user within a preset duration; analyze the voice instruction to determine whether it is an invalid instruction; and, in response to determining that the voice instruction is invalid, ignore it and continue monitoring for the user's voice instructions. This implementation can enter a continuous conversation mode when the preset condition is met, so that the user does not have to wake the device repeatedly during human-computer interaction, improving interaction efficiency and user experience.

1. A human-computer interaction method, comprising:

in response to a preset condition being met, monitoring for a voice instruction from a user within a preset duration;

analyzing the voice instruction to determine whether the voice instruction is an invalid instruction;

and in response to determining that the voice instruction is an invalid instruction, ignoring the voice instruction and continuing to monitor for voice instructions from the user.

2. The method of claim 1, wherein the method further comprises:

determining that the preset condition is met in response to determining that a current scene belongs to a preset scene set.

3. The method of claim 1, wherein the method further comprises:

determining that the preset condition is met in response to receiving a preset voice instruction from the user.

4. The method of claim 1, wherein the analyzing the voice instruction to determine whether it is an invalid instruction comprises:

performing intention recognition on the voice instruction to determine a user intention;

and in response to determining that the user intention does not belong to a preset intention set, determining that the voice instruction is an invalid instruction.

5. The method of claim 1, wherein the method further comprises:

in response to determining that the voice instruction is a valid instruction, outputting response information.

6. The method of claim 5, wherein the outputting response information comprises:

extracting, from the voice instruction, words corresponding to slots in a reply template;

filling the words into the slots to obtain a reply text;

and generating and outputting response information based on the reply text.

7. The method of claim 4, wherein the performing intention recognition on the voice instruction to determine the user intention comprises:

performing intention recognition on the voice instruction using a pre-trained intention recognition model to determine the user intention;

and the method further comprises:

in response to receiving negative feedback information for the response information, generating a training sample from the voice instruction and retraining the intention recognition model.

8. The method of claim 7, wherein the generating a training sample from the voice instruction and retraining the intention recognition model comprises:

taking the voice instruction as a sample speech, and taking the intention identified by the intention recognition model as an erroneous intention corresponding to the sample speech;

and constructing a negative sample from the sample speech and the erroneous intention, and retraining the intention recognition model with the negative sample.

9. The method of claim 1, wherein the method further comprises:

entering a dormant state in response to no voice instruction being detected within the preset duration.

10. A human-computer interaction device, comprising:

an instruction monitoring unit configured to monitor for a voice instruction from a user within a preset duration in response to a preset condition being met;

an instruction analysis unit configured to analyze the voice instruction and determine whether it is an invalid instruction;

and a continuous monitoring unit configured to, in response to determining that the voice instruction is an invalid instruction, ignore the voice instruction and continue monitoring for voice instructions from the user.

11. The apparatus of claim 10, wherein the apparatus further comprises a condition determining unit configured to:

determine that the preset condition is met in response to determining that a current scene belongs to a preset scene set.

12. The apparatus of claim 10, wherein the apparatus further comprises a condition determining unit configured to:

determine that the preset condition is met in response to receiving a preset voice instruction from the user.

13. The apparatus of claim 10, wherein the instruction analysis unit is further configured to:

perform intention recognition on the voice instruction to determine a user intention;

and in response to determining that the user intention does not belong to a preset intention set, determine that the voice instruction is an invalid instruction.

14. The apparatus of claim 10, wherein the apparatus further comprises a response output unit configured to:

in response to determining that the voice instruction is a valid instruction, output response information.

15. The apparatus of claim 14, wherein the response output unit is further configured to:

extract, from the voice instruction, words corresponding to slots in a reply template;

fill the words into the slots to obtain a reply text;

and generate and output response information based on the reply text.

16. The apparatus of claim 13, wherein the instruction analysis unit is further configured to:

perform intention recognition on the voice instruction using a pre-trained intention recognition model to determine the user intention;

the apparatus further comprises a model training unit configured to:

in response to receiving negative feedback information for the response information, generate a training sample from the voice instruction and retrain the intention recognition model.

17. The apparatus of claim 16, wherein the model training unit is further configured to:

take the voice instruction as a sample speech, and take the intention identified by the intention recognition model as an erroneous intention corresponding to the sample speech;

and construct a negative sample from the sample speech and the erroneous intention, and retrain the intention recognition model with the negative sample.

18. The apparatus of claim 10, wherein the apparatus further comprises a sleep unit configured to:

enter a dormant state in response to no voice instruction being detected within the preset duration.

19. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.

20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-9.

21. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a human-computer interaction method, apparatus, device, and storage medium.

Background

At present, voice interaction on smart televisions generally takes one of two forms: single-round interaction and multi-round interaction. In single-round interaction, one wake-up allows one interaction; in multi-round interaction, one wake-up allows several interactions. Single-round interaction requires the wake word to be spoken before every voice pickup, and even multi-round interaction, which does not require activation for every command, supports only a limited number of wake-free voice inputs after each wake-up. When using the television, the user therefore has to invoke voice frequently with the activation word to enter each new voice command, and cannot operate the television continuously and smoothly by voice.

Disclosure of Invention

The present disclosure provides a human-computer interaction method, apparatus, device, and storage medium.

According to a first aspect, there is provided a human-computer interaction method, comprising: in response to a preset condition being met, monitoring for a voice instruction from a user within a preset duration; analyzing the voice instruction to determine whether it is an invalid instruction; and, in response to determining that the voice instruction is an invalid instruction, ignoring it and continuing to monitor for voice instructions from the user.

According to a second aspect, there is provided a human-computer interaction apparatus, comprising: an instruction monitoring unit configured to monitor for a voice instruction from a user within a preset duration in response to a preset condition being met; an instruction analysis unit configured to analyze the voice instruction and determine whether it is an invalid instruction; and a continuous monitoring unit configured to, in response to determining that the voice instruction is an invalid instruction, ignore the voice instruction and continue monitoring for voice instructions from the user.

According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.

According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method as described in the first aspect.

According to a fifth aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described in the first aspect.

According to the disclosed technology, a continuous conversation mode can be entered when the preset condition is met, so that the user does not have to wake the device repeatedly during human-computer interaction, improving interaction efficiency and user experience.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;

FIG. 2 is a flow diagram of one embodiment of a human-computer interaction method according to the present disclosure;

FIG. 3 is a schematic diagram of one application scenario of a human-computer interaction method according to the present disclosure;

FIG. 4 is a flow diagram of another embodiment of a human-computer interaction method according to the present disclosure;

FIG. 5 is a schematic diagram of an embodiment of a human-computer interaction device according to the present disclosure;

FIG. 6 is a block diagram of an electronic device for implementing a human-computer interaction method according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the human-computer interaction method or human-computer interaction device of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include intelligent end devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the intelligent terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user may use the intelligent terminal device 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as a speech recognition application, a speech generation application, etc., may be installed on the intelligent terminal devices 101, 102, 103. The intelligent terminal devices 101, 102, 103 may also be equipped with an image acquisition device, a microphone array, a speaker, etc.

The intelligent terminal devices 101, 102, 103 may be hardware or software. When the smart terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, electronic book readers, car computers, laptop portable computers, desktop computers, and the like. When the smart terminal 101, 102, 103 is software, it can be installed in the electronic devices listed above. It may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.

The server 105 may be a server providing various services, such as a background server providing support for the intelligent terminal devices 101, 102, 103. The background server may provide a speech processing model to the intelligent terminal devices 101, 102, 103, obtain a processing result, and feed the processing result back to the intelligent terminal devices 101, 102, 103.

The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.

It should be noted that the human-computer interaction method provided by the embodiment of the present disclosure is generally executed by the intelligent terminal devices 101, 102, and 103. Accordingly, the human-computer interaction device is generally disposed in the intelligent terminal apparatus 101, 102, 103.

It should be understood that the number of intelligent end devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of intelligent end devices, networks, and servers, as desired for implementation.

With continued reference to FIG. 2, a flow 200 of one embodiment of a human-computer interaction method according to the present disclosure is shown. The human-computer interaction method of this embodiment comprises the following steps:

Step 201, in response to a preset condition being met, monitor for a voice instruction from the user within a preset duration.

In this embodiment, the execution body of the human-computer interaction method may detect in real time whether the preset condition is satisfied. Here, the preset condition refers to a condition for opening a continuous conversation, which may include, but is not limited to: the user speaking a preset keyword, or the current scene belonging to a preset scene set. If the preset condition is met, the execution body may monitor for the user's voice instruction within a preset duration. The preset duration may be set by a technician according to the actual application scenario, for example 60 seconds. The execution body may monitor for the user's voice instruction through a communicatively connected sound collection device.
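As a minimal sketch of this behavior, assuming a hypothetical `mic.capture_utterance` API on the sound collection device and the 60-second example duration (the `listen_until` window is sketched after step 203 below):

```python
import time

PRESET_DURATION_S = 60  # example value from the text; set per application scenario

def run_assistant(mic, preset_condition_met):
    """Poll the preset condition in real time; once met, open a listening window."""
    while True:
        if preset_condition_met():                  # e.g. keyword or scene check
            deadline = time.monotonic() + PRESET_DURATION_S
            listen_until(mic, deadline)             # defined in the flow sketch below
        time.sleep(0.1)                             # avoid a busy loop
```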

Step 202, analyze the voice instruction to determine whether it is an invalid instruction.

In this embodiment, if the execution body detects a voice instruction from the user within the preset duration, it may determine whether the instruction is invalid. Specifically, the execution body may first determine the effective duration of the voice instruction; if the effective duration is less than a preset duration threshold, the instruction is considered invalid. Here, the effective duration may be the time between the start point and the end point of the voice instruction, determined by performing VAD (Voice Activity Detection) on it. Alternatively, the execution body may perform speech recognition on the voice instruction and judge whether the resulting text is an ill-formed sentence; if so, the instruction is considered invalid.
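Both invalidity checks can be sketched as follows; `run_vad`, `transcribe`, and `is_ill_formed` are hypothetical stand-ins for a real VAD module, ASR engine, and grammar check, and the threshold value is illustrative:

```python
MIN_EFFECTIVE_DURATION_S = 0.3  # illustrative threshold for the effective duration

def is_invalid_instruction(audio) -> bool:
    # Check 1: effective duration between the VAD start and end points.
    start_s, end_s = run_vad(audio)       # hypothetical VAD helper
    if end_s - start_s < MIN_EFFECTIVE_DURATION_S:
        return True
    # Check 2: recognize the speech and reject ill-formed text.
    text = transcribe(audio)              # hypothetical ASR helper
    return is_ill_formed(text)            # hypothetical grammar check
```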

Step 203, in response to determining that the voice instruction is an invalid instruction, ignore it and continue monitoring for the user's voice instructions.

If the execution body determines that the voice instruction is invalid, it may ignore the instruction and continue monitoring for the user's voice instructions. Here, monitoring may continue until the original preset duration expires, or for a fresh preset duration counted from the last time a valid instruction was detected.
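Tying steps 201-203 together, one plausible shape for the listening window (reusing the names from the sketches above; `handle_valid_instruction` and `enter_dormant_state` are hypothetical downstream hooks):

```python
def listen_until(mic, deadline):
    """Listening window of steps 201-203; invalid instructions are ignored."""
    while (remaining := deadline - time.monotonic()) > 0:
        audio = mic.capture_utterance(timeout=remaining)  # hypothetical mic API
        if audio is None:
            continue  # nothing heard yet; wait out the rest of the window
        if is_invalid_instruction(audio):  # from the sketch above
            continue  # step 203: ignore the invalid instruction, keep listening
        handle_valid_instruction(audio)    # hypothetical handler for valid input
        # Second variant in the text: restart the preset duration from the
        # last valid instruction instead of keeping the original deadline.
        deadline = time.monotonic() + PRESET_DURATION_S
    enter_dormant_state()  # hypothetical; anticipates step 407 below
```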

With continued reference to FIG. 3, a schematic diagram of one application scenario of a human-computer interaction method according to the present disclosure is shown. In the application scenario of FIG. 3, the user searches for a song through the smart television; the smart television recognizes that the preset condition is currently satisfied and starts the continuous conversation mode, then monitors for the user's voice instructions for 60 seconds after the mode is started. The user says "Play song XXX." The smart television replies "OK, playing song AA by XXX for you." While the song is playing, the user continues with "Louder." The smart television increases the volume and displays the volume value.

The human-computer interaction method provided by this embodiment of the disclosure can enter a continuous conversation mode when the preset condition is met, so that the user does not have to wake the device repeatedly during human-computer interaction, improving interaction efficiency and user experience.

With continued reference to FIG. 4, a flow 400 of another embodiment of a human-machine interaction method according to the present disclosure is shown. As shown in fig. 4, the method of the present embodiment may include the following steps:

Step 401a, in response to the current scene belonging to a preset scene set, determine that the preset condition is met.

In this embodiment, the execution body may determine whether the current scene belongs to a preset scene set; if so, the preset condition is considered met. Here, the scenes in the preset scene set may include, but are not limited to: television series searches, song searches, volume adjustment, audio content searches, and the like. The execution body assumes by default that these scenes generally require multiple rounds of interaction before the final content is played, so the continuous conversation mode is opened by default in them.

Step 401b, in response to receiving a preset voice instruction from the user, determine that the preset condition is met.

In this embodiment, the execution main body may further determine whether a preset voice instruction sent by the user is received. If the execution main body receives the voice command, the execution main body determines that the preset condition is met, and can enter a continuous conversation mode. The preset voice instruction may be a voice instruction for opening a continuous conversation (for example, "open a continuous conversation mode"), or a voice instruction for entering each scene in a preset scene set (for example, "i want to search for a tv show").
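A compact sketch of both checks (steps 401a and 401b); the scene names and trigger phrases are illustrative placeholders, not values from the disclosure:

```python
PRESET_SCENES = {"tv_series_search", "song_search", "volume_adjust", "audio_search"}
TRIGGER_PHRASES = {"open continuous conversation mode", "i want to search for a tv series"}

def preset_condition_met(current_scene: str, utterance_text: str = "") -> bool:
    # Step 401a: the current scene belongs to the preset scene set.
    if current_scene in PRESET_SCENES:
        return True
    # Step 401b: the user spoke a preset voice instruction.
    return utterance_text.lower() in TRIGGER_PHRASES
```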

Step 402, in response to the preset condition being met, monitor for the user's voice instruction within the preset duration.

Step 403, perform intention recognition on the voice instruction to determine the user intention; in response to determining that the user intention does not belong to a preset intention set, determine that the voice instruction is an invalid instruction.

In this embodiment, the execution body may perform intention recognition on the instruction and determine the user intention. Specifically, it may use a pre-trained intention recognition model that characterizes the correspondence between voice instructions and user intentions; the model may be, for example, a convolutional neural network. After determining the user intention, the execution body may determine whether it belongs to a preset intention set: if it does, the voice instruction is considered valid; if not, it is considered invalid. Here, the preset intention set may correspond to the various services the execution body can provide, which may include, but are not limited to: watching television series, listening to songs, viewing photos, checking the weather, and the like. Intentions outside the set may include, but are not limited to: chit-chat intentions, unclear intentions, and the like.
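In code, the validity decision reduces to a set-membership test on the model's prediction; the `intent_model.predict` interface and the intent names are assumptions for illustration:

```python
PRESET_INTENTS = {"watch_tv_series", "listen_song", "view_photos", "check_weather"}

def analyze_instruction(intent_model, audio):
    """Return (predicted intention, whether the instruction is valid)."""
    intent = intent_model.predict(audio)   # hypothetical pre-trained model API
    return intent, intent in PRESET_INTENTS

# e.g. a "chit_chat" or "unclear" prediction falls outside the set -> invalid.
```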

Step 404, in response to determining that the voice instruction is an invalid instruction, ignore it and continue monitoring for the user's voice instructions.

Step 405, in response to determining that the voice instruction is a valid instruction, output response information.

In this embodiment, if the execution body determines that the voice instruction is a valid instruction, it may output response information. The response information may include pictures, audio, video, interfaces, and the like. The execution body may use pre-stored speech as the response information, or may dynamically generate and output the response information according to the voice instruction.

In some optional implementations of this embodiment, the execution body may generate the response information through the following steps, not shown in FIG. 4: extract, from the voice instruction, the words corresponding to the slots in a reply template; fill the words into the slots to obtain a reply text; and generate and output response information based on the reply text.

In this implementation, a reply template may be stored in the execution body, for example "Playing YYY by XXX for you; you can switch or control playback by voice," where "XXX" and "YYY" are slots. The execution body may first perform speech recognition on the voice instruction to obtain the corresponding text, then input the text into a pre-trained model to determine the slot labels, take the words carrying those labels as the words corresponding to the slots, and fill them into the slots to obtain the reply text. Finally, response information is generated from the reply text: specifically, the execution body may synthesize speech for the reply text (speech synthesis) and use the synthesized speech as the response information.
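A minimal sketch of the template-filling step, using Python's `str.format` as a stand-in for slot substitution once the slot model has labeled the words:

```python
# Illustrative template; "{artist}" and "{song}" play the role of the XXX/YYY slots.
REPLY_TEMPLATE = ("Playing {song} by {artist} for you; "
                  "you can switch or control playback by voice.")

def build_reply_text(slot_words: dict) -> str:
    # slot_words maps slot names to the words extracted from the instruction.
    return REPLY_TEMPLATE.format(**slot_words)

print(build_reply_text({"artist": "XXX", "song": "AA"}))
# -> "Playing AA by XXX for you; you can switch or control playback by voice."
```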

The input to speech recognition is a time-domain speech signal, encoded as a vector, and the output is the corresponding text. After a segment of audio is input, and before recognition begins, it is often necessary to cut off the silence at the beginning and end to reduce interference with subsequent steps; this is the job of Voice Activity Detection (VAD). Through voice activity detection, the execution body may determine the start point and end point of a voice instruction. It may then digitize the speech between those points and extract features, obtaining Mel-Frequency Cepstral Coefficient (MFCC) speech features. The execution body may feed the MFCC features into a WaveNet network for processing; the WaveNet model combines dilated CNNs, residual networks, CTC, and LSTM, where the dilated CNN enlarges the receptive field of the convolution kernel so that longer-range context information can be used. Finally, a decoder performs decoding and outputs the final recognition result.
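The front end described here (silence trimming followed by MFCC extraction) can be sketched with librosa; the energy-based trim is a simple stand-in for a real VAD module, and the WaveNet-style acoustic model itself is not reproduced:

```python
import librosa

def asr_front_end(wav_path: str):
    # Load and resample the time-domain signal.
    signal, sr = librosa.load(wav_path, sr=16000)
    # Cut leading/trailing silence (simple energy-based stand-in for VAD).
    voiced, _ = librosa.effects.trim(signal, top_db=30)
    # Digitize into Mel-Frequency Cepstral Coefficient features for the
    # downstream acoustic model (e.g. a dilated-CNN WaveNet-style network).
    return librosa.feature.mfcc(y=voiced, sr=sr, n_mfcc=13)  # shape: (13, frames)
```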

For intention recognition and slot detection, the execution body may use the SlotRefine model (EMNLP 2020) for joint intent detection and slot filling. The model handles the two tasks of intent detection and slot filling jointly, achieves good results on intention understanding, and decodes faster than other existing models.

For speech synthesis, the execution body may use the ClariNet model. ClariNet uses an attention-based encoder-decoder module to learn the alignment between text characters and spectral frames. The hidden states of the decoder are fed to a Bridge-net for temporal processing and upsampling, and the final Bridge-net hidden states are fed to an audio waveform generation module (vocoder) that synthesizes the audio waveform; the waveform is then output as the speech corresponding to the response text. ClariNet achieves end-to-end generation from text to raw audio waveform, enabling joint optimization of the whole TTS system and markedly improving the naturalness of the synthesized speech compared with other models. In addition, ClariNet is a fully convolutional model whose performance exceeds that of RNN-based models.

In some optional implementations of this embodiment, the execution body may perform intention recognition on the voice instruction using a pre-trained intention recognition model to determine the user intention.

Step 406, in response to receiving negative feedback information for the response information, generate a training sample from the voice instruction and retrain the intention recognition model.

In this embodiment, if the execution body receives negative feedback information for the response information, it can generate training samples from the voice instruction and retrain the intention recognition model to improve its accuracy. The negative feedback information may be, for example, the user saying "go back" or "wrong." The execution body can judge whether the feedback is positive or negative by performing sentiment analysis on it; if it is negative, the currently recognized user intention is deemed inaccurate, that is, the output of the intention recognition model is wrong. In that case, the execution body may generate a training sample from the voice instruction, specifically by pairing the voice instruction with the recognized (incorrect) user intention, and retrain the model.

In some optional implementations of this embodiment, the execution body may generate training samples and retrain the intention recognition model through the following steps, not shown in FIG. 4: take the voice instruction as a sample speech, and take the intention identified by the intention recognition model as the erroneous intention corresponding to that sample speech; then construct a negative sample from the sample speech and the erroneous intention and retrain the intention recognition model with it.

In this implementation, the execution body may use the voice instruction as a sample speech and the user intention output by the intention recognition model as the erroneous intention corresponding to that sample speech, yielding a negative sample. During training, the intention recognition model can then be trained with such negative samples.
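The disclosure does not specify the loss used for these negative samples; one plausible choice, shown purely as an assumption, is an unlikelihood-style term that pushes probability mass away from the erroneous intention:

```python
import torch
import torch.nn.functional as F

def negative_sample_loss(logits: torch.Tensor, wrong_intent: torch.Tensor) -> torch.Tensor:
    """Unlikelihood-style penalty on the erroneous intention (an assumption).

    logits: (batch, num_intents) raw scores; wrong_intent: (batch,) label ids
    taken from the model's own mistaken predictions. Minimizing -log(1 - p)
    drives the probability of the erroneous intention toward zero.
    """
    probs = F.softmax(logits, dim=-1)
    p_wrong = probs.gather(1, wrong_intent.unsqueeze(1)).squeeze(1)
    return -torch.log1p(-p_wrong.clamp(max=1.0 - 1e-6)).mean()
```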

Step 407, in response to no voice instruction being detected within the preset duration, enter a dormant state.

In this embodiment, if the execution body detects no voice instruction within the preset duration, it can be concluded that the user currently does not need to operate the smart device (such as a smart television), so the execution body may enter the dormant state, ending the continuous conversation mode. When the user needs to operate the device again, it can simply be re-awakened.

The human-computer interaction method provided by this embodiment of the disclosure responds promptly when an instruction is identified as valid; when the recognized intention is wrong, the intention recognition model is retrained with the generated negative samples, improving its accuracy; and when no instruction arrives in time, the device enters a dormant state, saving power.

With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides an embodiment of a human-computer interaction device, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied to various electronic devices.

As shown in fig. 5, the human-computer interaction device 500 of the present embodiment includes: an instruction snoop unit 501, an instruction analysis unit 502, and a continuous snoop unit 503.

The instruction monitoring unit 501 is configured to monitor for a voice instruction from the user within a preset duration in response to a preset condition being met.

The instruction analysis unit 502 is configured to analyze the voice instruction and determine whether the voice instruction is an invalid instruction.

The continuous monitoring unit 503 is configured to, in response to determining that the voice instruction is an invalid instruction, ignore it and continue monitoring for the user's voice instructions.

In some optional implementations of this embodiment, the apparatus 500 may further include a condition determining unit, not shown in fig. 5, configured to: and determining that a preset condition is met in response to the fact that the current scene belongs to a preset scene set.

In some optional implementations of this embodiment, the apparatus 500 may further include a condition determining unit, not shown in fig. 5, configured to: and in response to receiving a preset voice instruction sent by a user, determining that a preset condition is met.

In some optional implementations of this embodiment, the instruction analysis unit 502 may be further configured to: perform intention recognition on the voice instruction to determine the user intention, and in response to determining that the user intention does not belong to a preset intention set, determine that the voice instruction is an invalid instruction.

In some optional implementations of this embodiment, the apparatus 500 may further include a response output unit, not shown in FIG. 5, configured to: output response information in response to determining that the voice instruction is a valid instruction.

In some optional implementations of this embodiment, the response output unit is further configured to: extract, from the voice instruction, the words corresponding to the slots in the reply template; fill the words into the slots to obtain a reply text; and generate and output response information based on the reply text.

In some optional implementations of this embodiment, the instruction analysis unit 502 may be further configured to: perform intention recognition on the voice instruction using a pre-trained intention recognition model to determine the user intention. Correspondingly, the apparatus 500 may further include a model training unit, not shown in FIG. 5, configured to: in response to receiving negative feedback information for the response information, generate a training sample from the voice instruction and retrain the intention recognition model.

In some optional implementations of this embodiment, the model training unit is further configured to: take the voice instruction as a sample speech, and take the intention identified by the intention recognition model as the erroneous intention corresponding to the sample speech; and construct a negative sample from the sample speech and the erroneous intention and retrain the intention recognition model with it.

In some optional implementations of this embodiment, the apparatus 500 may further include a sleep unit, not shown in FIG. 5, configured to: enter a dormant state in response to no voice instruction being detected within the preset duration.

It should be understood that the units 501 to 503 described in the human-computer interaction device 500 respectively correspond to the respective steps in the method described with reference to fig. 2. Thus, the operations and features described above for the human-computer interaction method are also applicable to the apparatus 500 and the units included therein, and are not described herein again.

In the technical solution of this disclosure, the collection, storage, and use of the personal information of the users involved comply with the provisions of applicable laws and regulations and do not violate public order and good morals.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to an embodiment of the present disclosure.

FIG. 6 shows a block diagram of an electronic device 600 that performs a human-computer interaction method according to an embodiment of the disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in FIG. 6, the electronic device 600 includes a processor 601 that may perform various suitable actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a memory 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 can also be stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An I/O (input/output) interface 605 is also connected to the bus 604.

Various components in the electronic device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a memory 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the electronic device 600 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

Processor 601 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of processor 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 601 performs the various methods and processes described above, such as a human-computer interaction method. For example, in some embodiments, the human-computer interaction method may be implemented as a computer software program tangibly embodied in a machine-readable storage medium, such as memory 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 600 via the ROM 602 and/or the communication unit 609. When loaded into RAM 603 and executed by processor 601, the computer program may perform one or more of the steps of the human-computer interaction method described above. Alternatively, in other embodiments, the processor 601 may be configured to perform the human-computer interaction method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code described above may be packaged as a computer program product. These program code or computer program products may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor 601, causes the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, a host product in the cloud computing service system, which remedies the defects of difficult management and weak service scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions of the present disclosure can be achieved.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
