Sound source positioning-based shooting method and device, electronic equipment and storage medium

Document No.: 1111964 | Published: 2020-09-29

Abstract: This technology, "Sound source positioning-based shooting method and device, electronic equipment and storage medium", was created by 邹芳, 谢树家 and 张亚男 on 2020-06-30. The invention relates to artificial intelligence and provides a shooting method based on sound source positioning, comprising: receiving a shooting instruction and collecting sound information; performing sound source positioning on the sound information to obtain candidate sound source position information; analyzing the candidate sound source position information to determine target sound source position information; acquiring initial sound source position information, and calculating an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information; adjusting the shooting angle of the camera unit based on the adjustment angle; and controlling the camera unit to shoot. The invention further relates to blockchain technology: the sound information may be stored in a blockchain. The invention also provides a shooting device, an electronic device, and a computer-readable storage medium. The invention can improve sound source positioning accuracy and shooting efficiency.

1. A shooting method based on sound source positioning, applied to an electronic device, characterized in that the electronic device is communicatively connected to a sound acquisition unit and a camera unit, the method comprising:

receiving a shooting instruction, and controlling the sound acquisition unit to acquire sound information;

performing positioning analysis on the sound information using a sound source positioning algorithm to obtain candidate sound source position information corresponding to the sound information;

analyzing the candidate sound source position information to determine target sound source position information corresponding to the sound information;

acquiring initial sound source position information, and calculating an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

when the adjustment angle is greater than or equal to a preset angle threshold, adjusting the shooting angle of the camera unit based on the adjustment angle; and

controlling the camera unit to shoot.

2. The sound source localization-based shooting method according to claim 1, wherein the analyzing of the candidate sound source position information to determine target sound source position information corresponding to the sound information comprises:

when a plurality of pieces of candidate sound source position information exist, acquiring the distance corresponding to each piece of candidate sound source position information; and

taking the candidate sound source position information whose distance is less than or equal to a preset distance threshold as the target sound source position information.

3. The sound source localization-based shooting method according to claim 2, wherein the analyzing of the candidate sound source position information to determine target sound source position information corresponding to the sound information further comprises:

when a plurality of pieces of candidate sound source position information whose distance is less than or equal to the preset threshold exist, extracting sound features from the sound information, determining the identity information corresponding to the sound features, and recording it as first identity information;

acquiring real-time images of the plurality of candidate sound source positions whose distance is less than or equal to the preset threshold, identifying face areas in the real-time images, and performing identity recognition to obtain the identity information corresponding to each of the candidate sound source positions, recorded as second identity information; and

judging whether the second identity information contains identity information that matches the first identity information, and taking the candidate sound source position information corresponding to the matching second identity information as the target sound source position information.

4. The sound source localization-based shooting method according to claim 3, wherein the shooting instruction includes target identity information, and, before the acquiring of the real-time images of the candidate sound source positions whose distance is less than or equal to the preset threshold, the analyzing of the candidate sound source position information to determine the target sound source position information corresponding to the sound information further comprises:

acquiring the target identity information, and judging whether the first identity information is consistent with the target identity information; and

when the first identity information is judged to be consistent with the target identity information, acquiring the real-time images of the plurality of candidate sound source positions whose distance is less than or equal to the preset threshold.

5. The sound source localization-based shooting method according to claim 1, wherein the calculating of the adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information comprises:

acquiring the position coordinates of the camera unit;

constructing a straight line between the camera unit and the target sound source position and a straight line between the camera unit and the initial sound source position, respectively; and

calculating the angle between the constructed straight lines as the adjustment angle.

6. The sound source localization-based shooting method according to any one of claims 1 to 5, wherein the electronic device is communicatively connected to a photosensitive unit and a light supplementing unit, and before the controlling of the camera unit to shoot, the method further comprises:

acquiring, in real time, the ambient illuminance sent by the photosensitive unit;

when the ambient illuminance is less than or equal to a preset illuminance threshold, calculating an adjustment angle of the light supplementing unit based on the target sound source position information and the initial sound source position information; and

adjusting the illumination angle of the light supplementing unit based on the adjustment angle, and adjusting the illumination brightness of the light supplementing unit based on the ambient illuminance.

7. A shooting device, characterized in that the device comprises:

a sound acquisition module, configured to receive a shooting instruction and control the sound acquisition unit to acquire sound information;

a first positioning module, configured to perform positioning analysis on the sound information using a sound source positioning algorithm to obtain candidate sound source position information corresponding to the sound information;

a second positioning module, configured to analyze the candidate sound source position information and determine target sound source position information corresponding to the sound information;

a calculation module, configured to acquire initial sound source position information and calculate an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

an adjusting module, configured to adjust the shooting angle of the camera unit based on the adjustment angle when the adjustment angle is greater than or equal to a preset angle threshold; and

a shooting module, configured to control the camera unit to shoot.

8. The shooting device of claim 7, wherein the adjusting module is further configured to perform:

acquiring, in real time, the ambient illuminance sent by the photosensitive unit;

when the ambient illuminance is less than or equal to a preset illuminance threshold, calculating an adjustment angle of the light supplementing unit based on the target sound source position information and the initial sound source position information; and

adjusting the illumination angle of the light supplementing unit based on the adjustment angle, and adjusting the illumination brightness of the light supplementing unit based on the ambient illuminance.

9. An electronic device, comprising a memory and a processor, wherein the memory stores a sound source localization-based shooting program executable on the processor, and wherein the sound source localization-based shooting program when executed by the processor implements the steps of the sound source localization-based shooting method according to any one of claims 1 to 6.

10. A computer-readable storage medium, characterized by comprising a data storage area and a program storage area, wherein the data storage area stores data created through use of a blockchain node, and the program storage area stores a computer program which, when executed by a processor, implements the sound source localization-based shooting method according to any one of claims 1 to 6.

Technical Field

The present invention relates to the field of artificial intelligence and sound source localization technologies, and in particular, to a sound source localization-based shooting method and apparatus, an electronic device, and a computer-readable storage medium.

Background

In remote video teaching, the teaching room of a remote training classroom currently has a fixed camera and a fixed lighting angle. When the photographed target, particularly the teacher, moves about, neither the camera angle nor the lighting angle can follow, so the quality of the captured video is poor.

Disclosure of Invention

In view of the above, the present invention provides a sound source localization-based photographing method, apparatus, electronic device, and computer-readable storage medium, which mainly aim to improve photographing efficiency by improving accuracy of sound source localization.

In order to achieve the above object, the present invention provides a sound source localization-based photographing method, including:

receiving a shooting instruction, and controlling the sound acquisition unit to acquire sound information;

performing positioning analysis on the sound information using a sound source positioning algorithm to obtain candidate sound source position information corresponding to the sound information;

analyzing the candidate sound source position information to determine target sound source position information corresponding to the sound information;

acquiring initial sound source position information, and calculating an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

when the adjustment angle is greater than or equal to a preset angle threshold, adjusting the shooting angle of the camera unit based on the adjustment angle; and

controlling the camera unit to shoot.

Further, to achieve the above object, the present invention also provides a photographing apparatus including:

a sound acquisition module, configured to receive a shooting instruction and control the sound acquisition unit to acquire sound information;

a first positioning module, configured to perform positioning analysis on the sound information using a sound source positioning algorithm to obtain candidate sound source position information corresponding to the sound information;

a second positioning module, configured to analyze the candidate sound source position information and determine target sound source position information corresponding to the sound information;

a calculation module, configured to acquire initial sound source position information and calculate an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

an adjusting module, configured to adjust the shooting angle of the camera unit based on the adjustment angle when the adjustment angle is greater than or equal to a preset angle threshold; and

a shooting module, configured to control the camera unit to shoot.

In addition, to achieve the above object, the present invention also provides an electronic device comprising a memory and a processor, wherein the memory stores a sound source positioning-based shooting program that can run on the processor, and the program, when executed by the processor, implements any of the steps of the sound source positioning-based shooting method described above.

Further, to achieve the above object, the present invention also provides a computer-readable storage medium including a sound source localization-based photographing program, which when executed by a processor, can implement any of the steps in the sound source localization-based photographing method as described above.

The invention provides a sound source positioning-based shooting method and device, an electronic device, and a computer-readable storage medium, which: receive a shooting instruction and acquire sound information; perform sound source positioning on the sound information to obtain candidate sound source position information; analyze the candidate sound source position information to determine target sound source position information; acquire initial sound source position information and calculate an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information; adjust the shooting angle of the camera unit based on the adjustment angle; and control the camera unit to shoot. After the candidate sound source position information is identified, the target sound source position information is screened from it, which improves the accuracy of sound source positioning. In view of the camera's shooting range, the camera angle is adjusted if and only if the target sound source position information and the initial sound source position information satisfy a certain condition, rather than every time the sound source position changes, which reduces energy consumption and makes shooting more intelligent.

Drawings

FIG. 1 is a flowchart illustrating steps of an embodiment of a sound source localization-based photographing method according to the present invention;

FIG. 2 is a block diagram of a shooting device according to the present invention;

FIG. 3 is a diagram of an alternative hardware architecture of the electronic device of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.

The invention provides a shooting method based on sound source positioning. The method may be performed by an electronic device, which may be implemented by software and/or hardware.

Referring to fig. 1, a flowchart illustrating an embodiment of a sound source localization-based photographing method according to the present invention is shown.

In this embodiment, the sound source localization-based shooting method includes steps S1 to S6.

Step S1, receiving a shooting instruction, and controlling the sound acquisition unit to acquire sound information;

The present solution is explained by taking remote video teaching as an example. Because the camera angle and the lighting angle are currently fixed, when the target object photographed by the camera (for example, the lecturing teacher) moves a considerable distance, the camera cannot follow it, and the fixed-angle light cannot follow the target to provide intelligent fill lighting. The resulting video quality is poor (for example, shadows appear on the face), which degrades the overall effect of remote teaching and lowers teaching quality.

Therefore, in this embodiment, the sound source positioning algorithm is used to determine the position information of the target object, and the camera angle and the deflection angle of the fill light are adjusted in real time according to that position, so that the camera and the fill light always face the target object and the teaching quality is not impaired by a poor shooting angle or poor lighting.

In this embodiment, the shooting instruction may be issued through a client by the subject to be photographed, or by a person other than the subject.

The sound acquisition unit is used to collect sound information, which comprises the sound signals and the time differences with which different sound acquisition units receive them.

The sound collecting unit can be an electret microphone array or an MEMS microphone array. The microphone array can be an array formed by arranging a group of acoustic sensors at different positions in space according to a certain shape rule, and is a device for carrying out spatial sampling on a sound signal which propagates in space. The shape arrangement rule formed by arranging the acoustic sensors in the microphone array may be referred to as a topology of the microphone array, and the microphone array may be divided into a linear microphone array, a planar microphone array and a stereo microphone array according to the topology of the microphone array. As an example, a linear microphone array may indicate that the centers of the array elements of the microphone array are located on the same straight line, such as a horizontal array; the planar microphone array can represent that the centers of array elements of the microphone array are distributed on a plane, such as a triangular array, a circular array, a T-shaped array, an L-shaped array, a square array and the like; the stereo microphone array can represent that the centers of the array elements of the microphone array are distributed in a stereo space, such as a polyhedral array, a spherical array and the like.

After the sound signals are collected, the time at which each sound acquisition unit received the sound signal is obtained, and the time differences between the units are calculated. For example, when a microphone picks up a sound made by a person in the surrounding environment, the sound is recorded together with the time at which it was received. Because the microphones are installed at different positions, the sound reaches them at different times, which gives rise to the time differences.
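The time-difference calculation described above can be sketched as follows. This is a minimal illustration, assuming each microphone reports its arrival time in seconds on a shared clock; the function name `arrival_time_differences` and the microphone labels are hypothetical and do not come from the source.

```python
from itertools import combinations


def arrival_time_differences(arrival_times):
    """Return the pairwise arrival-time differences between microphones.

    arrival_times maps a microphone label to the time, in seconds on a
    shared clock (an assumption), at which it received the sound signal.
    The result maps (mic_i, mic_j) to t_j - t_i.
    """
    diffs = {}
    for mic_i, mic_j in combinations(sorted(arrival_times), 2):
        diffs[(mic_i, mic_j)] = arrival_times[mic_j] - arrival_times[mic_i]
    return diffs


# Hypothetical arrival times for three microphones.
times = {"a": 0.0100, "b": 0.0125, "c": 0.0090}
differences = arrival_time_differences(times)
```

A positive value for the pair `("a", "b")` means the sound reached microphone a first.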

Step S2, positioning and analyzing the sound information by using a sound source positioning algorithm to obtain alternative sound source position information corresponding to the sound information;

In this embodiment, a plurality of microphones may be installed in a given space (e.g., a classroom, a conference room, or a studio). The installation positions of the microphones are known, and their spatial coordinates are recorded in a preset storage path. The position of the sound source is calculated from the sound signals obtained by at least two microphones, the time differences with which the microphones received those signals, the pre-stored microphone positions, and the speed of sound. For example, when a speaker makes a sound in the space, microphones a and b each receive it, with a time difference t between the two arrivals; given the pre-stored spatial coordinates of microphones a and b and a fixed sound velocity V, the spatial coordinates of the sounding position, that is, the speaker's current position, can be calculated by an algorithm. The specific calculation is not detailed here.
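The source does not specify the positioning algorithm, so the following is only a sketch of one common approach: a grid search over time differences of arrival (TDOA). It makes several assumptions: a 2D floor plan, a sound speed of 343 m/s for V, noise-free measurements, and four microphones (a single time difference between two microphones constrains the source to a hyperbola, so more microphones are used here to obtain a unique position). The names `predicted_tdoas` and `locate_source` and the room dimensions are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value for V)


def predicted_tdoas(point, mics, speed=SPEED_OF_SOUND):
    """Arrival-time differences of each microphone relative to mics[0]."""
    d0 = math.dist(point, mics[0])
    return [(math.dist(point, m) - d0) / speed for m in mics[1:]]


def locate_source(mics, measured_tdoas, x_max, y_max, step=0.05):
    """Grid-search the room [0, x_max] x [0, y_max] for the point whose
    predicted time differences best match the measured ones."""
    best_point, best_error = None, float("inf")
    for i in range(int(round(x_max / step)) + 1):
        for j in range(int(round(y_max / step)) + 1):
            point = (i * step, j * step)
            error = sum(
                (p - m) ** 2
                for p, m in zip(predicted_tdoas(point, mics), measured_tdoas)
            )
            if error < best_error:
                best_point, best_error = point, error
    return best_point


# Hypothetical 4 m x 3 m room with a microphone in each corner.
mics = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
true_source = (1.0, 2.0)
measured = predicted_tdoas(true_source, mics)  # simulated, noise-free measurements
estimate = locate_source(mics, measured, x_max=4.0, y_max=3.0)
```

A grid search is simple to verify but slow for fine resolutions; a real system would more likely use a closed-form or iterative least-squares solver.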

Step S3, analyzing the candidate sound source position information, and determining the target sound source position information corresponding to the sound information;

It can be understood that real environments usually contain many, densely distributed sound sources, so the signal collected by the sound acquisition unit is usually a mixture that includes abnormal sound signals, for example car horns outside the space, speech from non-target persons, or other environmental noise. The positioning result may therefore contain multiple sound source positions; that is, the candidate sound source position information may comprise a plurality of sound source positions. Accurate target sound source position information therefore needs to be screened out from the plurality of candidates.

In this embodiment, the analyzing the candidate sound source position information to determine the target sound source position information corresponding to the sound information includes:

a1. when a plurality of pieces of candidate sound source position information exist, acquiring the distance corresponding to each piece; and

a2. taking the candidate sound source position information whose distance is less than or equal to a preset distance threshold as the target sound source position information.

The distance corresponding to a piece of sound source position information is the distance between that sound source and the center of the sound acquisition unit. It can be understood that the target sound source should lie within the current space, that is, its distance from the center of the sound acquisition unit should not exceed the preset distance threshold (i.e., the maximum distance from the center of the sound acquisition unit to the boundary of the current space). Excluding sound source positions whose distance is greater than the preset threshold effectively removes the influence of sound sources outside the space (e.g., car horns or people outside the room).
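Steps a1 and a2 amount to a distance filter, which can be sketched as below. The function name `filter_by_distance`, the 2D coordinates, and the threshold value are illustrative assumptions.

```python
import math


def filter_by_distance(candidates, array_center, max_distance):
    """Keep the candidate positions whose distance to the center of the
    sound acquisition unit is less than or equal to max_distance."""
    return [p for p in candidates if math.dist(p, array_center) <= max_distance]


# Hypothetical candidates; the last one lies far outside the room.
candidates = [(1.0, 2.0), (3.5, 0.5), (12.0, 9.0)]
in_room = filter_by_distance(candidates, array_center=(2.0, 1.5), max_distance=3.0)
```

If more than one candidate survives the filter, the identity-based screening of the other embodiment can be applied to the remaining positions.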

In another embodiment, the analyzing the candidate sound source position information to determine target sound source position information corresponding to the sound information further includes:

b1. when a plurality of pieces of candidate sound source position information whose distance is less than or equal to the preset threshold exist, extracting sound features from the sound information, determining the identity information corresponding to the sound features, and recording it as first identity information;

b2. acquiring real-time images of the plurality of candidate sound source positions whose distance is less than or equal to the preset threshold, identifying face areas in the real-time images, and performing identity recognition to obtain the identity information corresponding to each of the candidate sound source positions, recorded as second identity information; and

b3. judging whether the second identity information contains identity information that matches the first identity information, and taking the candidate sound source position information corresponding to the matching second identity information as the target sound source position information.

For example, after several candidate sound source positions within the current space have been determined, real-time images corresponding to those positions are collected and the identity of the person at each position is recognized. Sound features are extracted from the sound information, and it is judged whether the identity corresponding to the sound features matches one of the recognized identities. If so, the sound signal is attributed to that sound source, and the target sound source position information is thereby determined. Face recognition, voice feature extraction, and identity recognition are all prior art and are not described here. Through identity recognition and matching, this embodiment avoids mistaking the location of another speaker in the room for the target sound source position.

In other embodiments, when no second identity information matches the first identity information (e.g., no face is recognized), the sound source positioning is incorrect. This can happen, for example, when a sound emitted by the speaker is reflected by a background wall or another reflector before being picked up by the sound acquisition unit. In that case, positioning for the current sound signal ends, and when the speaker makes a sound again, the sound acquisition unit acquires the new sound signal and sound source positioning is performed again.
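The matching in step b3 can be sketched as a lookup over the recognized identities. Voiceprint and face recognition are treated as already done, as the source leaves them to the prior art; the function name `select_target_position` and the identity strings are hypothetical.

```python
def select_target_position(first_identity, identities_by_position):
    """Return the candidate position whose face identity matches the voice identity.

    first_identity: identity inferred from the sound features
        (first identity information).
    identities_by_position: maps each candidate position to the identity
        recognized from its real-time image (second identity information),
        or to None if no face was found there.
    Returns None when nothing matches, in which case positioning is
    treated as incorrect.
    """
    for position, second_identity in identities_by_position.items():
        if second_identity is not None and second_identity == first_identity:
            return position
    return None


# Hypothetical recognition results for two candidate positions.
second = {(1.0, 2.0): "teacher_01", (3.5, 0.5): "student_07"}
target = select_target_position("teacher_01", second)
no_match = select_target_position("teacher_02", second)
```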

In other embodiments, the shooting instruction includes identity information of the target speaker, and the identity information is recorded as target identity information; before step b2, the analyzing the candidate sound source position information to determine the target sound source position information corresponding to the sound information further includes:

acquiring target identity information in the shooting instruction, and judging whether the first identity information is consistent with the target identity information;

when the first identity information is judged to be consistent with the target identity information, executing step b2; or

when the first identity information is judged to be inconsistent with the target identity information, determining that sound source positioning has failed.

Adding this check avoids unnecessary computation and improves the efficiency of sound source positioning.
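The saving comes from skipping image capture and face recognition whenever the voice does not belong to the target speaker. A minimal sketch, assuming face recognition is available as a callable (`recognize_faces`, a hypothetical stand-in) that returns a mapping from candidate position to recognized identity:

```python
def locate_with_target_check(first_identity, target_identity, recognize_faces):
    """Run face recognition (step b2) only when the voice identity matches
    the target identity carried in the shooting instruction.

    Returns the matched position, or None if sound source positioning fails.
    """
    if first_identity != target_identity:
        return None  # positioning fails; the costly image step is skipped
    for position, identity in recognize_faces().items():
        if identity == first_identity:
            return position
    return None


calls = []  # records how often the (fake) recognizer is invoked


def fake_recognizer():
    calls.append("called")
    return {(1.0, 2.0): "teacher_01", (3.5, 0.5): "student_07"}


skipped = locate_with_target_check("student_07", "teacher_01", fake_recognizer)
found = locate_with_target_check("teacher_01", "teacher_01", fake_recognizer)
```

In this example the recognizer runs once, for the second call only.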

Step S4, acquiring initial sound source position information, and calculating an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

Here, the initial sound source position information may be default position information, that is, a preset starting position. The camera unit may be a camera for capturing video or images of the person to be photographed.

Taking a classroom as an example, the initial sound source position is the center of the podium.

In other embodiments, the initial sound source position information may also be the position of the most recently located target sound source, or the center coordinates of the area the camera is currently shooting.

In this embodiment, the calculating of the adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information includes:

c1. acquiring the position coordinates of the camera unit;

c2. constructing a straight line between the camera unit and the target sound source position and a straight line between the camera unit and the initial sound source position, respectively; and

c3. calculating the angle between the constructed straight lines as the adjustment angle.

It should be noted that the installation position coordinates of the camera are known and stored in the preset storage path. The position coordinates of the camera are obtained and, based on them together with the initial sound source position information and the target sound source position information, a straight line L1 between the camera and the initial sound source and a straight line L2 between the camera and the target sound source are determined. The angle between L1 and L2 is calculated as the adjustment angle of the camera.
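The angle between L1 and L2 can be computed from the two direction vectors that start at the camera. The sketch below assumes 2D floor-plan coordinates and returns an unsigned angle in degrees; the function name `adjustment_angle` and the coordinates are illustrative.

```python
import math


def adjustment_angle(camera, initial_source, target_source):
    """Unsigned angle in degrees between line L1 (camera to initial source)
    and line L2 (camera to target source), in a 2D floor plan (assumption)."""
    v1 = (initial_source[0] - camera[0], initial_source[1] - camera[1])
    v2 = (target_source[0] - camera[0], target_source[1] - camera[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # atan2(cross, dot) is the signed angle from v1 to v2; abs() drops the sign.
    return abs(math.degrees(math.atan2(cross, dot)))


# Hypothetical layout: camera at the origin, podium center straight ahead.
angle = adjustment_angle(
    camera=(0.0, 0.0), initial_source=(0.0, 4.0), target_source=(4.0, 4.0)
)
```

Keeping the sign of `atan2(cross, dot)` instead of taking the absolute value would also give the direction in which to rotate the camera.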

Step S5, when the adjustment angle is greater than or equal to a preset angle threshold, adjusting the shooting angle of the camera unit based on the adjustment angle;

Step S6, controlling the camera unit to shoot.

It can be understood that, at any given angle, the camera covers a certain area, which spans a certain angular range. If the calculated adjustment angle is small, the speaker at the sound source can still be captured without re-aiming the camera. The shooting angle of the camera is adjusted if and only if the calculated adjustment angle is greater than or equal to the preset angle threshold, after which the area corresponding to the target sound source position is photographed.
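The threshold rule of step S5 can be sketched as a small decision function. The name `plan_camera_move` and the 15-degree threshold are illustrative assumptions; the source does not give a threshold value.

```python
def plan_camera_move(adjustment_angle_deg, angle_threshold_deg):
    """Return the angle to rotate by, or 0.0 if the target is still in view.

    The camera moves if and only if the adjustment angle is greater than
    or equal to the preset angle threshold.
    """
    if adjustment_angle_deg >= angle_threshold_deg:
        return adjustment_angle_deg
    return 0.0


large_move = plan_camera_move(45.0, angle_threshold_deg=15.0)  # rotate
small_move = plan_camera_move(5.0, angle_threshold_deg=15.0)   # stay put
```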

In other embodiments, the step S5 further includes:

d1. acquiring, in real time, the ambient illuminance sent by the photosensitive unit;

d2. when the ambient illuminance is less than or equal to a preset illuminance threshold, calculating an adjustment angle of the light supplementing unit (e.g., a fill light) based on the target sound source position information and the initial sound source position information; and

d3. adjusting the illumination angle of the light supplementing unit based on the adjustment angle, and adjusting the illumination brightness of the light supplementing unit based on the ambient illuminance.

The position coordinates of the fill light are predetermined and stored in a preset storage path. The angle and brightness of the fill light are adjusted if and only if the ambient illuminance is less than or equal to the preset illuminance threshold. The adjustment angle of the fill light is calculated in substantially the same way as that of the camera, and the details are not repeated here.
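The source states that brightness is adjusted based on the ambient illuminance but does not say how. The sketch below assumes one possible mapping: the fill light stays off above the threshold, and at or below it the brightness rises linearly as the room gets darker. The function name `fill_light_brightness`, the lux values, and the linear mapping itself are all assumptions.

```python
def fill_light_brightness(ambient_lux, lux_threshold):
    """Return a brightness level in [0.0, 1.0] for the fill light.

    Above the threshold the light stays off (0.0). At or below it, the
    brightness rises linearly from 0.0 at the threshold to 1.0 in full
    darkness (an assumed mapping). lux_threshold must be positive.
    """
    if ambient_lux > lux_threshold:
        return 0.0
    return min(1.0, max(0.0, (lux_threshold - ambient_lux) / lux_threshold))


bright_room = fill_light_brightness(400.0, lux_threshold=300.0)
dim_room = fill_light_brightness(150.0, lux_threshold=300.0)
dark_room = fill_light_brightness(0.0, lux_threshold=300.0)
```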

In the shooting method based on sound source positioning described above, after the alternative sound source position information is identified, the target sound source position information is screened from it, which improves the accuracy of sound source positioning. Taking the shooting range of the camera into account, the camera angle is adjusted if and only if the target sound source position information and the initial sound source position information satisfy a certain condition, rather than every time the sound source position changes, which reduces energy consumption and makes shooting more intelligent.

The invention also provides a shooting device.

Fig. 2 is a block diagram of a shooting device according to an embodiment of the present invention.

The shooting device 10 of this embodiment may include modules 110 to 160. A module, which may also be referred to as a unit, refers in the present invention to a series of computer program segments that are stored in a memory of an electronic device, can be executed by a processor of the electronic device, and perform a fixed function.

In this embodiment, the functions of the modules/units are as follows:

the sound acquisition module 110 is configured to receive a shooting instruction and control the sound acquisition unit to acquire sound information;

In this embodiment, the shooting instruction may be issued through a client by the subject to be shot, or by a person other than the subject to be shot.

The sound information comprises the sound signals and the time differences with which the sound signals are received by different sound acquisition units.

The sound collection unit may be a microphone array. The microphone array can be an array formed by arranging a group of acoustic sensors at different positions in space according to a certain shape rule, and is a device for carrying out spatial sampling on a sound signal which propagates in space. The shape arrangement rule formed by arranging the acoustic sensors in the microphone array may be referred to as a topology of the microphone array, and the microphone array may be divided into a linear microphone array, a planar microphone array and a stereo microphone array according to the topology of the microphone array.

After the sound signal is collected, the sound collection module 110 obtains the time at which each sound collection unit received the sound signal and calculates the time differences between these receiving times. For example, when a microphone picks up the voice of a person in the surrounding environment, the voice is recorded together with the time of reception.

A first positioning module 120, configured to perform positioning analysis on the sound information by using a sound source positioning algorithm to obtain alternative sound source position information corresponding to the sound information;

In this embodiment, a plurality of microphones may be installed in a certain space (e.g., a classroom, a conference room, a studio, etc.). The position of the sound source is calculated from the sound signals picked up by at least two microphones, the time differences with which the microphones received them, the pre-stored microphone position information and the speed of sound. The installation positions of the microphones are known, and their spatial coordinates are recorded in a preset storage path. For example, when a speaker makes a sound in the space, microphones a and b each receive the sound, with a time difference t between them; given the pre-stored spatial coordinates of microphones a and b and a fixed sound velocity V, the first positioning module 120 can calculate the spatial coordinates of the position where the sound was made, that is, the current position of the speaker. The specific calculation method is not described here.
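Since the specification omits the calculation, the following is only one possible sketch of locating a source from time differences of arrival. It assumes a 2D room, a brute-force grid search and a speed of sound of 343 m/s; the function and parameter names are hypothetical. It also uses four microphones, because a single time difference between two microphones constrains the source to a hyperbola rather than to one point.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees Celsius

def locate_source(mics, time_diffs, room, step=0.05):
    """Grid-search the point whose predicted time differences best match.

    mics:       list of (x, y) microphone coordinates in metres
    time_diffs: time_diffs[i] is the arrival time at mics[i + 1] minus the
                arrival time at mics[0], in seconds
    room:       (width, height) in metres; the search covers the whole room
    """
    width, height = room
    nx, ny = int(round(width / step)), int(round(height / step))
    best, best_err = None, float("inf")
    for ix in range(nx + 1):
        for iy in range(ny + 1):
            point = (ix * step, iy * step)
            d0 = math.dist(point, mics[0])
            err = 0.0
            for mic, dt in zip(mics[1:], time_diffs):
                predicted = (math.dist(point, mic) - d0) / SPEED_OF_SOUND
                err += (predicted - dt) ** 2  # squared mismatch for this pair
            if err < best_err:
                best, best_err = point, err
    return best
```

The grid search is chosen here for readability; a practical system would more likely use a closed-form or iterative least-squares solver, and would first have to estimate the time differences from the recorded signals.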

A second positioning module 130, configured to analyze the candidate sound source position information, and determine target sound source position information corresponding to the sound information;

It can be understood that sound sources in the environment are often dense and numerous, and the sound signal collected by the sound collection unit is usually a mixed signal containing abnormal sound signals, for example car horns outside the space, speech from non-target persons or other environmental noise. The positioning result may therefore contain several pieces of sound source position information; that is, the candidate sound source position information may include a plurality of sound source positions. The second positioning module 130 therefore needs to screen accurate target sound source position information out of the candidate sound source position information.

In this embodiment, the analyzing the candidate sound source position information to determine the target sound source position information corresponding to the sound information includes:

a1. when a plurality of candidate sound source position information exists, acquiring distances corresponding to the plurality of candidate sound source position information; and

a2. taking the sound source position information whose distance is less than or equal to a preset distance threshold as the target sound source position information.

The distance corresponding to a piece of sound source position information is the distance between that sound source and the center of the sound collection unit. It can be understood that the target sound source should lie within the current space, that is, its distance from the center of the sound collection unit is less than or equal to a preset distance threshold (i.e., the maximum distance from the center of the sound collection unit to the boundary of the current space). Excluding sound source position information whose distance exceeds the preset distance threshold effectively avoids the influence of sound sources outside the space (e.g., car horns or people outside the room).
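Steps a1 and a2 amount to a distance filter around the center of the sound collection unit. A minimal sketch, assuming positions are coordinate tuples in metres and using the hypothetical name `filter_by_distance`:

```python
import math

def filter_by_distance(candidates, array_center, max_distance):
    """Keep the candidate positions that lie within max_distance of the
    center of the sound collection unit (the comparison is inclusive)."""
    return [p for p in candidates
            if math.dist(p, array_center) <= max_distance]
```

For example, with the array center at the origin and a threshold of 5 m, a candidate at (10, 0) would be discarded as lying outside the room, while candidates at (1, 1) and (3, 4) would be kept.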

In another embodiment, the analyzing the candidate sound source position information to determine target sound source position information corresponding to the sound information further includes:

b1. when a plurality of pieces of alternative sound source position information with the distance smaller than or equal to a preset threshold exist, extracting sound features from the sound information, judging identity information corresponding to the sound features, and recording the identity information as first identity information;

b2. acquiring real-time images of the multiple pieces of alternative sound source position information with the distance smaller than or equal to a preset threshold, identifying a face area from the real-time images, and performing identity identification to obtain identity information corresponding to the multiple pieces of alternative sound source position information, and recording the identity information as second identity information; and

b3. judging whether identity information matching the first identity information exists in the second identity information, and taking the alternative sound source position information corresponding to the second identity information that matches the first identity information as the target sound source position information.

For example, after several pieces of candidate sound source position information within the current space have been determined, real-time images corresponding to those positions are collected and the identity of the person at each position is recognized. Sound features are extracted from the sound information, and it is judged whether the identity corresponding to the sound features matches one of the recognized identities. If so, the sound signal is determined to come from that sound source, and the target sound source position information can be determined. Face recognition, voice feature extraction, identity recognition and the like are prior art and are not described here. Through identity recognition and matching, this embodiment avoids mistaking the position of another speaker in the room for the target sound source position.

In other embodiments, when no second identity information matches the first identity information (e.g., no face is recognized), the sound source positioning is incorrect. This can happen, for example, when a voice signal emitted by a speaker is reflected by a background wall or another reflecting surface before being picked up by the sound collection unit. In that case, sound source positioning for the picked-up sound signal ends, and when the speaker makes a sound again, the sound collection unit collects the sound signal again to perform sound source positioning.
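The matching in steps b1 to b3 can be sketched as follows. Voiceprint and face recognition themselves are outside the scope of this sketch: it assumes they have already produced identity labels, and the function name `select_target` and the data layout are hypothetical.

```python
def select_target(first_identity, candidates):
    """Pick the candidate position whose face-derived identity matches the
    voice-derived identity.

    first_identity: identity label obtained from the sound features
    candidates:     list of (position, second_identity) pairs, where
                    second_identity is None if no face was recognized there
    Returns the matching position, or None when nothing matches.
    """
    for position, second_identity in candidates:
        if second_identity is not None and second_identity == first_identity:
            return position
    return None  # no match: this positioning attempt is treated as failed
```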

In other embodiments, the shooting instruction includes identity information of the target speaker, and the identity information is recorded as target identity information; before step b2, the analyzing the candidate sound source position information to determine the target sound source position information corresponding to the sound information further includes:

acquiring target identity information in the shooting instruction, and judging whether the first identity information is consistent with the target identity information;

when the first identity information is judged to be consistent with the target identity information, executing step b2; or

when the first identity information is judged to be inconsistent with the target identity information, determining that the sound source positioning fails.

By adding the judgment, invalid data calculation can be avoided, and the sound source positioning efficiency is improved.
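The saving comes from skipping image acquisition and face recognition when the voice already rules out the target speaker. A minimal sketch, in which `run_face_matching` is a hypothetical callable standing in for steps b2 and b3:

```python
def locate_if_target(first_identity, target_identity, run_face_matching):
    """Run the costly face-matching steps only when the voice-derived
    identity equals the identity named in the shooting instruction."""
    if first_identity != target_identity:
        return None  # positioning fails early; no images are processed
    return run_face_matching()
```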

A calculating module 140, configured to obtain initial sound source position information, and calculate an adjustment angle of the camera unit based on the target sound source position information and the initial sound source position information;

Here, the initial sound source position information may be default position information, that is, preset starting position information. The camera unit may be a camera for capturing video or images of the person to be shot.

Taking a classroom as an example, the initial sound source position information is a podium center.

In other embodiments, the initial sound source position information may also be the position information of the most recently located target sound source, or the center coordinates of the camera's current shooting area.

In this embodiment, the calculating an adjustment angle of the image capturing unit based on the target sound source position information and the initial sound source position information includes:

c1. acquiring the position coordinates of the camera unit;

c2. constructing a straight line between the camera unit and the target sound source position and a straight line between the camera unit and the initial sound source position; and

c3. calculating the angle between the constructed straight lines as the adjustment angle.

It should be noted that the installation position coordinates of the camera are known and stored in the preset storage path. After the position coordinates of the camera are obtained, the calculation module 140 determines, from the camera coordinates, the initial sound source position information and the target sound source position information, a straight line L1 between the camera and the initial sound source position and a straight line L2 between the camera and the target sound source position, and calculates the angle between L1 and L2 as the adjustment angle of the camera.

The adjusting module 150 is configured to adjust a shooting angle of the camera unit based on the adjustment angle when the adjustment angle is greater than or equal to a preset angle threshold;

The shooting module 160 is configured to control the camera unit to shoot.

It can be understood that, at a given angle, the camera covers a certain area, and this area spans a certain angular range. Therefore, if the calculated adjustment angle is small, the speaker at the sound source can be captured without adjusting the camera. If and only if the calculated adjustment angle is greater than or equal to the preset angle threshold, the adjustment module 150 adjusts the shooting angle of the camera, and the shooting module 160 controls the camera unit to shoot the area corresponding to the target sound source position information.

In other embodiments, the adjusting module 150 is further configured to:

d1. acquiring the ambient illuminance sent by the photosensitive unit in real time;

d2. when the ambient illuminance is less than or equal to a preset illuminance threshold, calculating an adjustment angle of the light supplement unit (e.g., a light supplement lamp) based on the target sound source position information and the initial sound source position information; and

d3. adjusting the illumination angle of the light supplement unit based on the adjustment angle, and adjusting the illumination brightness of the light supplement unit based on the ambient illuminance.

The position coordinates of the light supplement lamp are predetermined and stored in a preset storage path. The adjustment module 150 adjusts the angle and the brightness of the light supplement lamp if and only if the ambient illuminance is less than or equal to the preset illuminance threshold. The step of calculating the adjustment angle of the light supplement lamp is substantially the same as that of calculating the adjustment angle of the camera, and is not described here again.

The embodiment of the invention also provides the electronic equipment.

Referring to fig. 3, a diagram of an alternative hardware architecture of the electronic device of the present invention is shown.

In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus.

The memory 11 includes at least one type of readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, e.g. a hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1.

The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the sound source localization-based photographing program 10, but also to temporarily store data that has been output or is to be output.

The processor 12 may be, in some embodiments, a Central Processing Unit (CPU), controller, microcontroller, microprocessor or other data processing chip for executing program code stored in the memory 11 or processing data, for example running the sound source localization-based photographing program 10.

The network interface 13 may optionally comprise a standard wired interface and a wireless interface (e.g., a Wi-Fi interface), and is typically used for establishing a communication connection between the electronic device 1 and another electronic device, e.g., a client terminal (not shown).

It is noted that fig. 3 only shows the electronic device 1 with components 11-13, and that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components, as will be appreciated by a person skilled in the art.

Optionally, the electronic device 1 may further comprise a user interface, which may comprise a display and an input unit such as a keyboard; optionally, the user interface may further comprise a standard wired interface and a wireless interface.

Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch screen, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

In the embodiment of the electronic device 1 shown in fig. 3, the program code of the sound source localization-based photographing program 10 is stored in the memory 11, which serves as a computer storage medium; when the processor 12 executes this program code, any step of the sound source localization-based shooting method described in the above embodiments can be implemented.

Further, if the integrated modules/units of the electronic device 1 are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, for example, the sound source localization-based photographing program 10; the data storage area may store data created according to the use of the blockchain node, and the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, apparatus, article, or method that includes the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
