Audio mixing method for teaching recording and broadcasting

Document No.: 1186294 | Published: 2020-09-22

Abstract: This technology, an audio mixing method for teaching recording and broadcasting, was designed by 任军军, 罗浩, 孙云云 and 孙旭光 on 2020-06-15. The invention discloses an audio mixing method for teaching recording and broadcasting. The method first collects the current audio frame signals of the n pickups participating in the mix and calculates their effective energy values, then sorts the pickups by effective energy value and updates and adjusts the mixing weights, and finally mixes the resulting output to obtain the final output signal. The method can accurately mix the pickup signals, the wireless microphone signal and the courseware audio signal, and the mixed audio has high clarity and good sound quality. It can accurately identify the various teaching scenarios that arise in teaching recording and broadcasting and update the mixing weight factors in real time according to the actual situation, so that the mixed audio sounds natural.

1. An audio mixing method for teaching recording and broadcasting, characterized by comprising the following steps:

Step 1: Collect the current audio frame signals of the n pickups participating in the mix and denote them x_i, where i = 1, 2, ..., n; denote the wireless microphone audio frame signal w and the courseware audio frame signal c;

Step 2: Calculate the effective energy value of each of the n collected pickup audio frame signals; the effective energy value of the k-th audio frame signal is:

Figure FDA0002540086660000011

where N is the total number of sampling points in the audio frame signal and i is the index of the sampling point;

Step 3: Sort the effective energy values of the n audio frame signals calculated in step 2, select the pickup whose current frame has the largest effective energy value, and denote that value S_m, where m is the index of that pickup;

Step 4: Let the mixing weights of the n pickups for the previous frame be Q_i, where i = 1, 2, ..., n; when the current frame is mixed, the mixing weights Q_i are updated according to the effective energy values; the update of Q_i is as follows:

Figure FDA0002540086660000012

where step is the step factor for updating the mixing weights, with a value in the range 0-1;

Step 5: The mixing weight Q_i takes values in the range 0-1; if the value of Q_i updated in step 4 falls outside this range, Q_i is adjusted as follows:

Step 6: When the current frames of the audio signals collected by the n pickups are mixed, the mixing weights Q_i adjusted in step 5 are used, and the mixed output y of the n pickups is

Step 7: Mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y. Specifically: perform silence detection separately on w and c; when sound is detected in w or c, gradually close the pickup mix output y, that is, update the mixing weight Q_y of the pickup mix output y so that it gradually decreases until Q_y = 0, where Q_y takes values in the range 0-1; when both w and c are detected as silent, update Q_y so that it gradually increases until Q_y = 1;

Step 8: Using the mixing weight Q_y calculated in step 7, mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y; the mixed output of the three is

z = w + c + Q_y·y

where z is the final output signal of the mix.

2. The audio mixing method for teaching recording and broadcasting according to claim 1, wherein one audio frame is 20 ms long, the sampling rate is 32 kHz, and the total number of sampling points N is 640.

3. The audio mixing method for teaching recording and broadcasting according to claim 1, wherein in step 7 the silence detection is implemented with a counter: the energy value of the current frame is first calculated, and the current frame and the previous frame are then smoothed to obtain a smoothed energy value; the smoothed energy value is compared with a set threshold T; when it is greater than T, the current frame is judged to be a speech segment and the counter X is reset to zero; when it is smaller than T, the counter X is incremented by 1 and compared with the set minimum detection frame number C; if X is greater than C, the frame is judged to be a non-speech segment, otherwise it is judged to be a speech segment.

4. The audio mixing method for teaching recording and broadcasting according to claim 3, wherein the threshold T is chosen as the energy value corresponding to ambient sound at about 60 dB, and the minimum detection frame number C is 5-10.

Technical Field

The invention relates to the technical field of audio for teaching recording and broadcasting, and in particular to an audio mixing method for teaching recording and broadcasting.

Background

A teaching recording and broadcasting system records the video, sound, courseware and other content of a lesson in a standard network format and delivers it over the network as a synchronous live broadcast and for later on-demand playback, so that high-quality teaching resources can be shared effectively. In such a system audio quality is very important, and it is directly affected by the mixing technique used.

Most current systems must record both the teacher's lecture and the classroom interaction between teacher and students, that is, the voices of students answering questions. The courseware audio played in class must also be recorded. In an actual class, the teacher usually wears a wireless microphone, such as a neck-worn or head-worn microphone, which is used for both recording and local sound reinforcement; the students' voices are collected by several pickups or microphones; and the courseware audio is taken from the line output of the computer.

A teaching recording and broadcasting system has to record the teacher, the students and the courseware, and traditionally one or more microphones are installed in the classroom for this purpose. Each microphone captures the ambient sound at its installation position, including the teacher's speech, the students' answers, and the courseware audio played through the classroom loudspeakers together with the locally amplified wireless microphone signal. Because there are many sound sources, reverberation is unavoidable, for example between the teacher's direct speech and the same speech amplified through the loudspeakers. In addition, courseware audio that is played through the loudspeakers and then picked up again is reproduced with much lower fidelity. These factors reduce the clarity of the recorded sound. Furthermore, if the signals of several pickups are mixed directly, sound from the same source reaches the pickups with different phases, which generally produces a "comb filter effect" and distorts the mixed sound.
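The comb filter effect mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible case, which is not taken from the patent: two pickups receive the same tone at equal level, one of them 1 ms later (a path difference of roughly 0.34 m), and the two signals are summed directly. The function name and the delay value are hypothetical.

```python
import cmath
import math

def two_mic_gain(freq_hz, delay_s):
    """Magnitude of a unit tone summed with a copy of itself delayed by delay_s."""
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))

DELAY_S = 0.001  # hypothetical 1 ms arrival-time difference between two pickups

# Frequencies where the delay equals an odd number of half periods cancel,
# while frequencies where it equals a whole number of periods are doubled.
for freq in (250, 500, 1000, 1500):
    print(freq, round(two_mic_gain(freq, DELAY_S), 3))
```

With this delay the summed response has notches at 500 Hz, 1500 Hz and so on, which is the comb-shaped spectrum that direct mixing of several pickups tends to produce.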

Disclosure of Invention

The invention aims, in view of the shortcomings of the prior art, to provide an audio mixing method for teaching recording and broadcasting that improves the quality of the recorded audio.

The purpose of the invention is achieved by the following technical solution: an audio mixing method for teaching recording and broadcasting, comprising the following steps:

Step 1: Collect the current audio frame signals of the n pickups participating in the mix and denote them x_i, where i = 1, 2, ..., n; denote the wireless microphone audio frame signal w and the courseware audio frame signal c;

Step 2: Calculate the effective energy value of each of the n collected pickup audio frame signals; the effective energy value of the k-th audio frame signal is:

where N is the total number of sampling points in the audio frame signal and i is the index of the sampling point;

Step 3: Sort the effective energy values of the n audio frame signals calculated in step 2, select the pickup whose current frame has the largest effective energy value, and denote that value S_m, where m is the index of that pickup;

Step 4: Let the mixing weights of the n pickups for the previous frame be Q_i, where i = 1, 2, ..., n; when the current frame is mixed, the mixing weights Q_i are updated according to the effective energy values; the update of Q_i is as follows:

where step is the step factor for updating the mixing weights, with a value in the range 0-1;

Step 5: The mixing weight Q_i takes values in the range 0-1; if the value of Q_i updated in step 4 falls outside this range, Q_i is adjusted as follows:

Step 6: When the current frames of the audio signals collected by the n pickups are mixed, the mixing weights Q_i adjusted in step 5 are used, and the mixed output y of the n pickups is

i = 1, 2, ..., n

Step 7: Mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y. Specifically: perform silence detection separately on w and c; when sound is detected in w or c, gradually close the pickup mix output y, that is, update the mixing weight Q_y of the pickup mix output y so that it gradually decreases until Q_y = 0, where Q_y takes values in the range 0-1; when both w and c are detected as silent, update Q_y so that it gradually increases until Q_y = 1;

Step 8: Using the mixing weight Q_y calculated in step 7, mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y; the mixed output of the three is

z = w + c + Q_y·y

where z is the final output signal of the mix.
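The formulas for steps 2, 4, 5 and 6 are images in the original publication and do not survive in this text. The block below is one plausible reading, not the patent's confirmed wording. It assumes that the "effective energy value" is the root-mean-square of the frame, that the update raises the weight of the loudest pickup m by step and lowers all others by step, that step 5 clamps the result to the range 0-1, and that step 6 is a plain weighted sum.

```latex
% Step 2 (assumed RMS): effective energy value of pickup k over N samples
S_k = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_k(i)^2}

% Step 4 (assumed form): move weights towards the loudest pickup m
Q_i \leftarrow
\begin{cases}
Q_i + \mathrm{step}, & i = m \\
Q_i - \mathrm{step}, & i \neq m
\end{cases}

% Step 5 (assumed form): keep each weight within 0-1
Q_i \leftarrow \min\bigl(1, \max(0, Q_i)\bigr)

% Step 6 (assumed form): weighted sum of the n pickup frames
y = \sum_{i=1}^{n} Q_i \, x_i
```

Under this reading a pickup that stays loudest for about 1/step consecutive frames reaches full weight while the others fade to zero, which matches the stated aim of avoiding abrupt switching.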

Further, one audio frame is 20 ms long, the sampling rate is 32 kHz, and the total number of sampling points N is 640.

Further, in step 7 the silence detection is implemented with a counter: the energy value of the current frame is first calculated, and the current frame and the previous frame are then smoothed to obtain a smoothed energy value; the smoothed energy value is compared with a set threshold T; when it is greater than T, the current frame is judged to be a speech segment and the counter X is reset to zero; when it is smaller than T, the counter X is incremented by 1 and compared with the set minimum detection frame number C; if X is greater than C, the frame is judged to be a non-speech segment, otherwise it is judged to be a speech segment.

Further, the threshold T is chosen as the energy value corresponding to ambient sound at about 60 dB, and the minimum detection frame number C is 5-10.

The beneficial effects of the invention are as follows: the mixing method can accurately mix the pickup signals, the wireless microphone signal and the courseware audio signal, and the mixed audio has high clarity and good sound quality. The method can accurately identify the various teaching scenarios that arise in teaching recording and broadcasting and update the mixing weight factors in real time according to the actual situation, so that the mixed audio sounds natural.

Drawings

FIG. 1 is a block diagram of the mixing method of the invention;

FIG. 2 is a flowchart of the mixing method of the invention;

FIG. 3 is a flowchart of the silence detection unit of the invention;

FIG. 4 is a structural diagram of the teaching recording and broadcasting audio system of the invention.

Detailed Description

The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.

The block diagram of the mixing method is shown in FIG. 1. It comprises: a pickup unit, which collects each audio signal participating in the mix; mixing weights connected to the pickup unit, which weight the signal collected by each pickup so that switching in the mixed output is not too noticeable; a mixer connected to the mixing weights, which mixes the weighted audio signals; a mixing weight connected to that mixer, which weights the output of the first mix; a wireless microphone unit, which collects the wireless microphone signal needed for the second mix; a courseware unit, which collects the courseware signal needed for the second mix; a silence detection unit connected to the wireless microphone unit and the courseware unit, which determines whether their audio signals contain speech; and a mixer connected to the silence detection unit, which mixes the pickup mix output, the wireless microphone signal and the courseware signal and outputs the final mixed signal.

The workflow of the mixing method is shown in FIG. 2. The specific implementation steps are as follows:

Step 1: Collect the current audio frame signals of the n pickups participating in the mix and denote them x_i, where i = 1, 2, ..., n. Denote the wireless microphone audio frame signal w and the courseware audio frame signal c.

Step 2: Calculate the effective energy value of each of the n collected pickup audio frame signals. The effective energy value of the k-th audio frame signal is:

Figure BDA0002540086670000031

where N is the total number of sampling points in the audio frame signal and i is the index of the sampling point. In the invention one frame is 20 ms long and the sampling rate is 32 kHz, so N = 640.

Step 3: Sort the effective energy values of the n audio signals calculated in step 2, select the pickup whose current frame has the largest effective energy value, and denote that value S_m, where m is the index of that pickup.

Step 4: Let the mixing weights of the n pickups for the previous frame be Q_i, where i = 1, 2, ..., n. When the current frame is mixed, the mixing weights Q_i are updated according to the effective energy values. The update of Q_i is as follows:

Figure BDA0002540086670000041

where step is the step factor for updating the mixing weights, with a value in the range 0-1. The value of step can be chosen to suit the actual situation.

Step 5: The mixing weight Q_i takes values in the range 0-1. If the value of Q_i updated in step 4 falls outside this range, Q_i is adjusted as follows:

Step 6: When the current frames of the audio signals collected by the n pickups are mixed, the updated mixing weights Q_i are used, and the mixed output y of the n pickups is

i = 1, 2, ..., n

Step 7: Mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y. Specifically: perform silence detection separately on w and c. When sound is detected in w or c, mute the pickup mix output y by updating its mixing weight Q_y so that it gradually decreases, which prevents a noticeable switching artifact. Q_y takes values in the range 0-1. When both w and c are detected as silent, update Q_y so that it gradually increases.

Step 8: Using the mixing weight Q_y calculated in step 7, mix the wireless microphone audio frame signal w, the courseware audio frame signal c and the pickup mix output y. The mixed output of the three is

z = w + c + Q_y·y

where z is the final output signal.
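Steps 2 to 6 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the patented implementation: the effective energy value is taken to be the RMS of the frame, and the weight update adds step to the loudest pickup and subtracts step from the others before clamping to 0-1. The source gives both formulas only as images, so these forms, the function names and the test signals are hypothetical.

```python
import math

FRAME_MS = 20
SAMPLE_RATE_HZ = 32_000
N = SAMPLE_RATE_HZ * FRAME_MS // 1000  # samples per frame: 640

def rms(frame):
    """Step 2: effective energy value, assumed here to be the RMS of the frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def update_weights(weights, frames, step=0.1):
    """Steps 3-5: favour the loudest pickup, then clamp every weight to 0-1.

    The +step / -step rule is an assumption, not the patent's stated formula.
    """
    energies = [rms(f) for f in frames]
    m = max(range(len(frames)), key=energies.__getitem__)  # loudest pickup
    updated = []
    for i, q in enumerate(weights):
        q = q + step if i == m else q - step
        updated.append(min(1.0, max(0.0, q)))
    return updated

def mix_pickups(weights, frames):
    """Step 6: weighted sum of the pickup frames, sample by sample."""
    return [sum(q * f[j] for q, f in zip(weights, frames))
            for j in range(len(frames[0]))]

# Two hypothetical pickups: one carries a 440 Hz tone, the other low-level noise.
loud = [0.5 * math.sin(2 * math.pi * 440 * j / SAMPLE_RATE_HZ) for j in range(N)]
quiet = [0.01] * N
weights = [0.5, 0.5]
for _ in range(10):  # ten consecutive frames with the same loudest pickup
    weights = update_weights(weights, [loud, quiet])
y = mix_pickups(weights, [loud, quiet])
print(N, weights)
```

Because the weights move by at most step per frame, the hand-over between pickups is spread over several 20 ms frames instead of happening in one jump.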

The silence detection procedure of the invention is shown in FIG. 3. First, the energy value of the current frame is calculated, and the current frame and the previous frame are smoothed to obtain a smoothed energy value. The smoothed energy value is compared with a set threshold T; when it is greater than T, the current frame is judged to be a speech segment and the counter X is reset to zero. When it is smaller than T, the counter X is incremented by 1 and compared with the set minimum detection frame number C; if X is greater than C, the frame is judged to be a non-speech segment, otherwise it is judged to be a speech segment. The threshold T is chosen as the energy value corresponding to ambient sound at about 60 dB, and the minimum detection frame number C is 5-10.
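The counter logic described above can be sketched as follows. Several details are assumptions because the text does not specify them: the frame energy is taken as the mean of the squared samples, the smoothing is a weighted average of the current energy and the previous smoothed value with a hypothetical factor alpha, a value exactly equal to T is treated as below the threshold, and the numeric threshold is arbitrary, since mapping "about 60 dB" to a digital energy value depends on the calibration of the input chain.

```python
class SilenceDetector:
    """Counter-based silence detection with a hold time of C frames."""

    def __init__(self, threshold, min_frames=5, alpha=0.5):
        self.threshold = threshold    # T: energy threshold (calibration assumed)
        self.min_frames = min_frames  # C: minimum detection frame number
        self.alpha = alpha            # smoothing weight, an assumption
        self.smoothed = None          # smoothed energy carried over from the last frame
        self.counter = 0              # X: consecutive frames below the threshold

    def is_speech(self, frame):
        """Return True for a speech segment, False for a non-speech segment."""
        energy = sum(s * s for s in frame) / len(frame)
        if self.smoothed is None:
            self.smoothed = energy
        else:
            self.smoothed = self.alpha * energy + (1 - self.alpha) * self.smoothed
        if self.smoothed > self.threshold:
            self.counter = 0          # speech detected: reset X
            return True
        self.counter += 1             # below threshold: count the frame
        return self.counter <= self.min_frames  # silent only once X exceeds C

detector = SilenceDetector(threshold=1e-4, min_frames=5)
loud, silent = [0.1] * 640, [0.0] * 640
decisions = [detector.is_speech(f) for f in [loud] * 3 + [silent] * 15]
print(decisions)
```

The counter keeps the decision at "speech" for C frames after the energy drops, so short pauses between words do not toggle the pickup mix on and off.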

The effects achieved by the invention are further described below with reference to an example.

FIG. 4 shows the basic components of a teaching recording and broadcasting audio system. The wireless microphone collects the teacher's speech. Pickup 1 and pickup 2 are installed at the front and rear of the student area; pickup 1 collects the voices of students in the front rows and pickup 2 those in the rear rows. The courseware signal is the audio output of the computer, that is, the audio of the courseware the teacher plays in class. All of these signals are connected to the audio mixer and, after mixing, are output to the recording and broadcasting host; the wireless microphone and courseware signals are also output to the loudspeakers for local sound reinforcement.

Table 1 lists, in chronological order, the teaching activities in the classroom over a period of time; Table 2 lists the content of the sound signals collected by the pickups during that period; and Table 3 compares the audio output of the conventional mixing method with that of the mixing method of the invention during that period.

Table 1. Teaching content in the period 0-t3

Period   Start time   End time   Teaching content
1        0            t1         Teacher speaking
2        t1           t2         Student speaking
3        t2           t3         Courseware playing

Table 2. Sound content collected by the pickups in the period 0-t3

Table 3. Comparison of the mix outputs of the different mixing methods in the period 0-t3

Compared with the conventional mixing method, the method of the invention reduces the number of sound sources in the mix, reduces the ambient noise in the output, and improves clarity in every teaching scenario. When the teacher is speaking or courseware is playing, ambient noise is removed completely. When a student is speaking, only the signal of the pickup closest to the speaker is output, which likewise reduces the number of mixed sources, reduces the superposition of ambient noise and reverberation, and eliminates the loss of speech clarity caused by the comb filter effect.
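The behaviour of steps 7 and 8 across the three periods of Table 1 can be sketched as below. The text says only that Q_y changes "gradually", so the linear ramp, the step size of 0.25, the five frames per period and the function names are assumptions made for illustration; the speech flags stand in for the outputs of the silence detection.

```python
def update_pickup_gain(q_y, mic_active, courseware_active, step=0.25):
    """Step 7: fade the pickup mix out while w or c has sound, back in otherwise."""
    if mic_active or courseware_active:
        return max(0.0, q_y - step)
    return min(1.0, q_y + step)

def final_mix(w, c, y, q_y):
    """Step 8: z = w + c + Q_y * y, computed sample by sample."""
    return [wi + ci + q_y * yi for wi, ci, yi in zip(w, c, y)]

# Hypothetical timeline following Table 1, five frames per period:
# (label, wireless microphone has sound, courseware has sound)
timeline = ([("teacher speaking", True, False)] * 5
            + [("student speaking", False, False)] * 5
            + [("courseware playing", False, True)] * 5)

q_y = 1.0
history = []
for label, mic_active, courseware_active in timeline:
    q_y = update_pickup_gain(q_y, mic_active, courseware_active)
    history.append((label, q_y))
    print(label, q_y)
```

In this sketch Q_y reaches 0 while the teacher speaks or courseware plays, so the pickups contribute nothing to z, and it returns to 1 while only a student is speaking.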

The embodiments described above are intended to illustrate the invention, not to limit it. Any modification or variation made to the invention within its spirit and within the scope of the claims falls within the scope of protection of the invention.
