AI intelligent alarm host based on voice recognition and image recognition technology

Document No.: 170592. Published: 2021-10-29.

Abstract: This technology, an AI intelligent alarm host based on voice recognition and image recognition technology, was designed by Wu Jinghua (吴景华) and Zeng Gang (曾刚) on 2021-07-28. The invention discloses an AI intelligent alarm host based on voice recognition and image recognition technology, belonging to the technical field of AI intelligent hosts. It comprises an alarm host and an AI camera, where the AI camera transmits the audio and video data it collects to the alarm host. The alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. The AI camera module can recognize dangerous scenes and report them on its own initiative: its built-in algorithm model can recognize cries for help and screams by voice recognition, and can recognize special scenes by image recognition. An alarm can therefore be raised immediately, which avoids the situation in which an injured person cannot trigger the alarm host during an emergency. The camera's recognition results and the video recorded before and after the alarm can be stored locally or in the cloud for subsequent investigation and evidence collection.

1. An AI intelligent alarm host based on voice recognition and image recognition technology, characterized by comprising an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host; the alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module; the warning light module, the alarm module, the communication module, the loudspeaker module and the sound pick-up module are all connected to the output end of the central processing module; the central processing module is also electrically connected to the AI camera module; and a stream pushing module, an image recognition module and an audio recognition module are built into the AI camera module.

2. The AI intelligent alarm host according to claim 1, wherein the audio recognition module comprises an audio capture module, an audio conversion module, an audio encoder and an NN module, an input terminal of the audio capture module is connected to the AI camera module, an output terminal of the audio capture module is connected to the audio conversion module, and the audio conversion module outputs the main audio and the alternate audio through the audio encoder.

3. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 2, wherein the output end of the audio capture module is further connected to the NN module and the audio encoder through a Queue, and audio recording is performed.

4. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 1, wherein the image recognition module comprises a video capture module, an FR channel, a DS1 channel, an image buffer module, a distortion correction module, a 2D graphics operation module, a 2D graphics rendering module and an encoding module; the input end of the video capture module is connected to the AI camera module; the output end of the video capture module is connected to the distortion correction module through the FR channel together with the image buffer module; the output end of the distortion correction module is connected to the 2D graphics operation module together with the image buffer module; the 2D graphics operation module is electrically connected to the 2D graphics rendering module; the 2D graphics operation module is further connected to the encoding module through a Queue; and the output end of the encoding module is connected to the stream pushing module.

5. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 4, wherein the 2D graphics operation module and the 2D graphics rendering module are electrically connected to a 2D graphics drawing module, and the audio recognition module is connected to the 2D graphics drawing module through a Queue.

6. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 4, wherein the output end of the video capture module is further connected to the 2D graphics operation module through the DS1 channel together with the image buffer module, and the output end of the 2D graphics operation module is connected to the NN module through a Queue.

7. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 6, wherein the FR channel and the DS1 channel connected to the output end of the video capture module are different channels: the FR channel is the channel with the largest resolution among the video frames captured by the video capture module, and the image frames flowing through it are used for video stream pushing, while the DS1 channel is used by the NN module for recognition.

8. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 1, wherein the stream pushing module not only pushes images and audio on site, but also pushes the captured images and audio to the network.

9. The AI intelligent alarm host based on voice recognition and image recognition technology according to claim 1, wherein a voice recognition algorithm and an image recognition algorithm are built into the AI camera.

Technical Field

The invention relates to the technical field of AI intelligent hosts, in particular to an AI intelligent alarm host based on voice recognition and image recognition technologies.

Background

An alarm system automatically detects intrusion into a protected monitoring area by physical methods or electronic technology, generates an alarm signal, alerts the personnel on duty for that area and displays possible countermeasures. The alarm host is an important facility for preventing incidents such as robbery and theft. Existing alarm hosts are mainly one-key alarm hosts: the device has only simple operation buttons, and when an emergency occurs the relevant button must be pressed, whereupon the host immediately dials a preset telephone number.

Existing alarm hosts meet some alarm requirements, but they have two drawbacks:

First, an alarm can be raised only by manually pressing a button, and in an emergency the threatened person may be unable to press it.

Second, existing alarm hosts generally have no camera. After a dangerous situation occurs only a voice intercom is possible; the scene cannot be seen and no evidentiary record of it can be obtained immediately, which makes subsequent rescue and investigation harder.

Disclosure of Invention

The invention aims to provide an AI intelligent alarm host based on voice recognition and image recognition technology that raises an alarm immediately and that keeps the camera's recognition results and the video recorded before and after the alarm for subsequent investigation and evidence collection, so as to solve the problems described in the Background.

In order to achieve the purpose, the invention provides the following technical scheme:

An AI intelligent alarm host based on voice recognition and image recognition technology comprises an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host. The alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. The warning light module, the alarm module, the communication module, the loudspeaker module and the sound pick-up module are all connected to the output end of the central processing module, the central processing module is also electrically connected to the AI camera module, and a stream pushing module, an image recognition module and an audio recognition module are built into the AI camera module.

Further, the audio recognition module comprises an audio capture module, an audio conversion module, an audio encoder and an NN module, wherein the input end of the audio capture module is connected to the AI camera module, the output end of the audio capture module is connected to the audio conversion module, and the audio conversion module outputs main audio and alternate audio through the audio encoder.

Further, the output end of the audio capture module is also connected to the NN module and the audio encoder through a Queue, and audio recording is performed.

Further, the image recognition module comprises a video capture module, an FR channel, a DS1 channel, an image buffer module, a distortion correction module, a 2D graphics operation module, a 2D graphics rendering module and an encoding module, wherein the input end of the video capture module is connected to the AI camera module, the output end of the video capture module is connected to the distortion correction module through the FR channel together with the image buffer module, the output end of the distortion correction module is connected to the 2D graphics operation module together with the image buffer module, the 2D graphics operation module is electrically connected to the 2D graphics rendering module, the 2D graphics operation module is further connected to the encoding module through a Queue, and the output end of the encoding module is connected to the stream pushing module.

Further, the 2D graphics operation module and the 2D graphics rendering module are electrically connected to a 2D graphics drawing module, and the audio recognition module is connected to the 2D graphics drawing module through a Queue.

Further, the output end of the video capture module is also connected to the 2D graphics operation module through the DS1 channel together with the image buffer module, and the output end of the 2D graphics operation module is connected to the NN module through a Queue.

Further, the FR channel and the DS1 channel connected to the output end of the video capture module are different channels: the FR channel is the channel with the largest resolution among the video frames captured by the video capture module, and the image frames flowing through it are used for video stream pushing, while the DS1 channel is used by the NN module for recognition.

Further, the stream pushing module not only pushes images and audio on site, but also pushes the captured images and audio to the network.

Further, a voice recognition algorithm and an image recognition algorithm are built into the AI camera.

Compared with the prior art, the invention has the following beneficial effects. The AI intelligent alarm host based on voice recognition and image recognition technology comprises an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host, and the alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. The AI camera module can recognize dangerous scenes and report them on its own initiative: its built-in algorithm model can recognize cries for help and screams by voice recognition, and can recognize special scenes by image recognition. An alarm can therefore be raised immediately, which avoids the situation in which an injured person cannot trigger the alarm host during an emergency. The camera's recognition results and the video recorded before and after the alarm can be stored locally or in the cloud for subsequent investigation and evidence collection.

Drawings

FIG. 1 is a block diagram of the overall structure of the present invention;

FIG. 2 is a schematic block diagram of an AI camera of the present invention;

FIG. 3 is a functional block diagram of an audio recognition module of the present invention;

FIG. 4 is a functional block diagram of an image recognition module of the present invention;

FIG. 5 is a functional block diagram of the active alarm of the present invention;

FIG. 6 is a block diagram of the encoding and push stream of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1-6:

An AI intelligent alarm host based on voice recognition and image recognition technology comprises an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host. The alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. The warning light module, the alarm module, the communication module, the loudspeaker module and the sound pick-up module are all connected to the output end of the central processing module, and the central processing module is also electrically connected to the AI camera module. A stream pushing module, an image recognition module and an audio recognition module are built into the AI camera module; the stream pushing module not only pushes images and audio on site, but also pushes the captured images and audio to the network.

As shown in FIG. 1, the alarm host is centered on the central processing module, which is connected to the warning light module, the alarm module, the communication module, the loudspeaker module, the sound pick-up module and the AI camera module. The central processing module is mainly responsible for scheduling the other modules: for example, when the AI camera detects a danger it notifies the central processing module, which calls the communication module and the alarm module to notify the relevant personnel. The warning light module displays the current state of the device; for example, red indicates that the device is raising an alarm and yellow indicates a device fault. The alarm module is activated after a one-key alarm or after the AI camera recognizes a danger, whereupon it sounds an alarm or contacts the relevant personnel on duty. The communication module mainly comprises Ethernet and a 4G or 5G communication module, and provides network transmission services to the upper layer. The loudspeaker module mainly plays the relevant warning voice during an alarm, and the sound pick-up module collects sound on site. The AI camera module has built-in recognition algorithms that monitor on-site images and voice; once a danger is found, it immediately notifies the central processing module, which calls the alarm module to raise an alarm, and the on-site images can be uploaded to the network in real time.

When a button is pressed or the camera detects a dangerous scene, a start signal is transmitted to the central processing module; after the start signal is received, the warning light, the loudspeaker and the sound pick-up are turned on. The communication module transmits the alarm signal and the audio-video signal to the client, the client immediately pops up a window announcing the alarm, and the camera feed is brought up so that the scene can be monitored in all directions.
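The scheduling role of the central processing module described above can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the class name `CentralProcessor`, its methods and the string log entries are hypothetical and do not come from the patent; only the red/yellow light meanings and the set of modules activated on a trigger follow the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class LightState(Enum):
    IDLE = "off"
    ALARM = "red"      # per the text: red means the device is raising an alarm
    FAULT = "yellow"   # per the text: yellow means the device has a fault


@dataclass
class CentralProcessor:
    """Hypothetical stand-in for the central processing module."""
    light: LightState = LightState.IDLE
    log: list = field(default_factory=list)

    def on_trigger(self, source: str) -> None:
        # source is "button" (one-key alarm) or "camera" (AI detection).
        self.light = LightState.ALARM
        self.log.append(f"alarm:{source}")      # alarm module
        self.log.append("speaker:warning")      # loudspeaker module
        self.log.append("pickup:on")            # sound pick-up module
        self.log.append("comm:notify-client")   # communication module

    def on_fault(self) -> None:
        self.light = LightState.FAULT


cpu = CentralProcessor()
cpu.on_trigger("camera")
```

Both trigger sources go through the same `on_trigger` path, which mirrors the text's statement that a button press and a camera detection lead to the same start signal.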

As shown in FIG. 3, the audio recognition module comprises an audio capture module, an audio conversion module, an audio encoder and an NN module. The input end of the audio capture module is connected to the AI camera module, the output end of the audio capture module is connected to the audio conversion module, and the audio conversion module outputs main audio and alternate audio through the audio encoder. The audio capture module captures audio PCM data from the audio capture device, and the audio conversion module resamples the input audio. The output end of the audio capture module is also connected to the NN module and the audio encoder through a Queue, and audio recording is performed.
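One possible reading of this audio data flow is sketched below: captured PCM samples are resampled once and the same chunk is handed through queues to the recognition and encoding consumers. The function names, the sample rates and the nearest-neighbour resampling method are assumptions made for illustration; the patent does not specify how resampling is done.

```python
from queue import Queue


def resample(pcm, src_rate, dst_rate):
    """Nearest-neighbour resampling of a list of PCM samples (illustrative only)."""
    n_out = len(pcm) * dst_rate // src_rate
    return [pcm[i * src_rate // dst_rate] for i in range(n_out)]


def fan_out(pcm, src_rate, dst_rate, nn_queue, enc_queue):
    """Resample once, then hand the same chunk to both consumers."""
    chunk = resample(pcm, src_rate, dst_rate)
    nn_queue.put(chunk)    # would be read by the NN module
    enc_queue.put(chunk)   # would be read by the audio encoder
    return chunk


nn_q, enc_q = Queue(), Queue()
out = fan_out(list(range(48)), 48000, 16000, nn_q, enc_q)
```

Using separate queues decouples the capture rate from the speed of recognition and encoding, which is presumably why the text places a Queue between these stages.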

As shown in FIG. 4, the image recognition module comprises a video capture module, an FR channel, a DS1 channel, an image buffer module, a distortion correction module, a 2D graphics operation module, a 2D graphics rendering module and an encoding module. The input end of the video capture module is connected to the AI camera module, the output end of the video capture module is connected to the distortion correction module through the FR channel together with the image buffer module, the output end of the distortion correction module is connected to the 2D graphics operation module together with the image buffer module, the 2D graphics operation module is electrically connected to the 2D graphics rendering module, the 2D graphics operation module is further connected to the encoding module through a Queue, and the output end of the encoding module is connected to the stream pushing module. The video capture module captures video frames in the AI camera. The distortion correction module performs geometric distortion correction on the input image. The 2D graphics operation module provides 2D graphics operations, including rectangle filling, bitmap copying, image scaling and image blending. The 2D graphics rendering module provides a simple 2D graphics API (application programming interface), including drawing, filling, font rendering and image loading. The encoding module encodes video frames into an H264/H265 code stream. The NN (neural network) module, after loading a specified neural network model, calls the underlying NPU (neural processing unit) to recognize input image frames in a specified format. The 2D graphics operation module and the 2D graphics rendering module are electrically connected to a 2D graphics drawing module, and the audio recognition module is connected to the 2D graphics drawing module through a Queue. The output end of the video capture module is also connected to the 2D graphics operation module through the DS1 channel together with the image buffer module, and the output end of the 2D graphics operation module is connected to the NN module through a Queue.

The FR channel and the DS1 channel connected to the output end of the video capture module are different channels: the FR channel is the channel with the largest resolution among the video frames captured by the video capture module, and the image frames flowing through it are used for video stream pushing, while the DS1 channel is used by the NN module for recognition.
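The split between the two channels can be illustrated with a toy frame represented as a nested list. This sketch assumes that the DS1 channel carries a reduced-resolution copy of the frame; the text states only that FR has the largest resolution and that DS1 feeds the NN module, so the downscaling factor and the helper names here are hypothetical.

```python
def downscale(frame, factor):
    """Keep every `factor`-th pixel in each dimension (crude, illustrative)."""
    return [row[::factor] for row in frame[::factor]]


def split_channels(frame, ds_factor=2):
    """Return (fr, ds1): the full frame for streaming, a reduced frame for the NN."""
    return frame, downscale(frame, ds_factor)


frame = [[x + 10 * y for x in range(8)] for y in range(4)]  # dummy 8x4 frame
fr, ds1 = split_channels(frame)
```

Feeding the recognizer a smaller frame while streaming the full one is a common way to keep inference cost down without degrading the video that operators see.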

A voice recognition algorithm and an image recognition algorithm are built into the AI camera. The voice recognition algorithm can recognize cries for help and screams, and the image recognition algorithm can recognize special scenes and the faces of high-risk persons. Once a relevant dangerous scene is detected, a start signal is transmitted to the central processing module, and the alarm signal and the audio-video signal are transmitted to the client through the communication module. In addition, the recognized images and the alarm-related video can be stored locally or in the cloud for later evidence collection.

The AI intelligent alarm host based on voice recognition and image recognition technology comprises an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host. The alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. A stream pushing module, an image recognition module and an audio recognition module are built into the AI camera module; the stream pushing module not only pushes images and audio on site, but also pushes the captured images and audio to the network. A voice recognition algorithm and an image recognition algorithm are built into the AI camera: the voice recognition algorithm can recognize cries for help and screams, and the image recognition algorithm can recognize special scenes and the faces of high-risk persons. Once a relevant dangerous scene is detected, a start signal is transmitted to the central processing module, and the alarm signal and the audio-video signal are transmitted to the client through the communication module; in addition, the recognized images and the alarm-related video can be stored locally or in the cloud for later evidence collection. The alarm host is centered on the central processing module, which is connected to the warning light module, the alarm module, the communication module, the loudspeaker module, the sound pick-up module and the AI camera module. The central processing module is mainly responsible for scheduling the modules: when the AI camera detects a danger it notifies the central processing module, which calls the communication module and the alarm module to notify the relevant personnel. The AI camera module has built-in recognition algorithms that monitor on-site images and voice; once a danger is found, the central processing module is immediately notified to call the alarm module, and the on-site images are uploaded to the network in real time.

The audio recognition module comprises an audio capture module, an audio conversion module, an audio encoder and an NN module. It works as follows: after the audio capture module captures audio PCM data from the audio capture device, the data is input to the audio conversion module for resampling; the resampled data is placed in a Queue to wait for recognition by the NN module; after recognition is confirmed, the resampled data is input to the audio encoder for encoding; and the encoded data is pushed to the network through a main audio module.

The image recognition module comprises a video capture module, an FR channel, a DS1 channel, an image buffer module, a distortion correction module, a 2D graphics operation module, a 2D graphics rendering module and an encoding module. It works as follows: after the AI camera captures a video frame through the video capture module, the video capture module inputs the image frame to the distortion correction module for geometric distortion correction; the corrected frame is input to the 2D graphics operation module for rotation; the frame is then input to the 2D graphics rendering module for operations such as drawing and font rendering; the resulting frame is input to the 2D graphics operation module again for scaling; the scaled frame is stored in a Queue to wait for the encoding module to encode it into an H264/H265 code stream; and finally the code stream is pushed to the network.
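The order of stages on the stream-pushing path described above can be sketched as a simple list of functions applied in sequence. The stages here are placeholders that only record that they ran; `make_stage`, `PUSH_PATH` and `run_push_path` are hypothetical names, and no real distortion correction, rendering or encoding is performed.

```python
from queue import Queue


def make_stage(name):
    """Build a placeholder stage that only records that it ran."""
    def stage(frame):
        return {**frame, "history": frame["history"] + [name]}
    return stage


PUSH_PATH = [
    make_stage("distortion_correction"),
    make_stage("rotate"),         # 2D graphics operation module
    make_stage("draw_overlay"),   # 2D graphics rendering module
    make_stage("scale"),          # 2D graphics operation module again
]


def run_push_path(frame, encode_queue):
    for stage in PUSH_PATH:
        frame = stage(frame)
    encode_queue.put(frame)       # the encoding module would consume from here
    return frame


q = Queue()
result = run_push_path({"id": 1, "history": []}, q)
```

Keeping the stage order in one list makes it easy to see that the 2D graphics operation module is visited twice, once for rotation and once for scaling, as the text describes.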

The active alarm principle of the AI intelligent alarm host based on voice recognition and image recognition technology is as follows. When a dangerous situation occurs, the video capture module captures a video frame. The image frame in the FR channel is input to the distortion correction module for geometric distortion correction, and the corrected frame is input to the 2D graphics operation module for rotation. The result recognized by the NN module is then merged in the 2D graphics rendering module, scaled by the 2D graphics operation module and stored in a Queue to wait for the encoding module; the encoded data is pushed to the network. Meanwhile, the image frame in the DS1 channel is scaled and rotated by the 2D graphics operation module, stored in a Queue and input to the NN module for recognition, and the result recognized by the NN module is input to the 2D graphics drawing module for drawing. Voice recognition is performed alongside image recognition: after the audio capture module captures audio PCM data from the audio capture device, the audio conversion module resamples it, the resampled data is input to the audio encoder for encoding, and the encoded data is finally pushed to the network.

Combined with the AI camera, the alarm host recognizes dangerous scenes and reports them on its own initiative. Its built-in algorithm model can recognize cries for help and screams by voice recognition, and can recognize special scenes by image recognition. An alarm can therefore be raised immediately, which avoids the situation in which an injured person cannot trigger the alarm host during an emergency. The camera's recognition results and the video recorded before and after the alarm can be stored locally or in the cloud for subsequent investigation and evidence collection.
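The patent does not say how video from before the alarm is retained. One common approach, shown here purely as an assumption, is a fixed-size ring buffer that always holds the most recent frames, so that a clip covering the time before and after the alarm can be assembled when the alarm fires. The class `EventRecorder` and its parameters are hypothetical.

```python
from collections import deque


class EventRecorder:
    """Keeps the last `pre` frames; on alarm, saves them plus the next `post` frames."""

    def __init__(self, pre, post):
        self.ring = deque(maxlen=pre)   # rolling window of recent frames
        self.post = post
        self.remaining = 0
        self.clip = []

    def alarm(self):
        self.clip = list(self.ring)     # frames from before the alarm
        self.remaining = self.post

    def feed(self, frame):
        if self.remaining > 0:          # still collecting post-alarm frames
            self.clip.append(frame)
            self.remaining -= 1
        self.ring.append(frame)


rec = EventRecorder(pre=3, post=2)
for f in range(10):
    if f == 6:
        rec.alarm()
    rec.feed(f)
```

In this run the alarm fires just before frame 6 is fed, so the clip holds the three frames before it and the two frames from the alarm onward; a real device would write `clip` to local storage or upload it to the cloud.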

In summary, the AI intelligent alarm host based on voice recognition and image recognition technology provided by the invention comprises an alarm host and an AI camera, wherein the AI camera transmits the collected audio and video data to the alarm host, and the alarm host comprises a central processing module, a warning light module, an alarm module, a communication module, a loudspeaker module, a sound pick-up module and an AI camera module. The AI camera module can recognize dangerous scenes and report them on its own initiative: its built-in algorithm model can recognize cries for help and screams by voice recognition, and can recognize special scenes by image recognition. An alarm can therefore be raised immediately, which avoids the situation in which an injured person cannot trigger the alarm host during an emergency. The camera's recognition results and the video recorded before and after the alarm can be stored locally or in the cloud for subsequent investigation and evidence collection.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

The above description covers only the preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the protection scope of the present invention.
