Music beat detection method and system

Document No.: 1298237. Publication date: 2020-08-07.

Abstract: Music beat detection method and system (音乐节拍检测方法和系统), by 邓海峰 and 林立, dated 2020-04-14. The invention provides a music beat detection method and system. After music is received, the left and right channels of the two-channel audio are added to obtain a PCM data sequence of the music; the data sequence is summed in groups and averaged to simplify it, giving a simplified audio signal. The autocorrelation function of the simplified audio signal is computed, and peak detection on it yields all peaks. Envelope extraction on the simplified audio signal gives an audio envelope signal; first-order differencing, half-wave rectification and weighted compensation of the envelope signal give an audio peak signal, and dynamic-threshold peak detection outputs all beat points. Using all the peaks and all the beat points, a multipath search finds the optimal beat path, and the audio rhythm and beat points corresponding to that path are recorded. The method addresses music rhythm detection and the accurate positioning of specific rhythm points, estimating the beat value by searching possible beat values with an easily implemented autocorrelation function.

1. A music beat detection method, characterized by comprising the following steps:

an audio segmentation step: after music is received, adding the left and right channels of the two-channel audio to obtain a PCM data sequence of the music, summing the data sequence in groups and averaging each group to simplify it, thereby obtaining a simplified audio signal;

a beat detection step: computing the autocorrelation function of the simplified audio signal, then performing peak detection on it to obtain all peaks;

a peak extraction step: extracting the envelope of the simplified audio signal to obtain an audio envelope signal, applying first-order differencing, half-wave rectification and weighted compensation to the audio envelope signal to obtain an audio peak signal, and outputting all beat points by dynamic-threshold peak detection;

a multipath search step: using all the peaks and all the beat points, obtaining an optimal beat path by multipath search, and recording the audio rhythm and the beat points corresponding to the optimal beat path.

2. The music beat detection method according to claim 1, characterized in that the PCM data sequence $x[n]$ of the music is a one-dimensional discrete-time signal, the two channels of the music are $L[n]$ and $R[n]$, and $x[n] = L[n] + R[n]$;

the audio samples are grouped every 20 ms, and each group is summed and averaged to give $x_{deci}[n]$, where $m$ is the number of audio samples in 20 ms, $m$ is a constant, and $i$ indexes the samples within a group, $0 \le i < m$.

3. The music beat detection method according to claim 2, characterized in that extracting the envelope of the simplified audio signal comprises low-pass filtering the simplified audio signal with a cut-off frequency of 20 Hz, then processing it with a fast-charge, slow-release envelope detector that retains the protrusions of the envelope, and outputting the result as the audio envelope signal $x_{env}[n]$;

let $t[n]$ be the output of the low-pass filter with a cut-off frequency of 20 Hz applied to $x_{deci}[n]$; the fast-charge, slow-release envelope detector takes $t[n]$ as input and outputs $x_{env}[n]$.

4. The music beat detection method according to claim 3, characterized in that first-order differencing, half-wave rectification and weighted compensation are applied to the audio envelope signal $x_{env}[n]$ to highlight the peak points in the envelope, and the processed signal is output as the audio peak signal $x_s[n]$;

the first-order difference is $dx[n] = x_{env}[n] - x_{env}[n-1]$, the half-wave rectification is $hx[n] = \max(dx[n], 0)$, and the weighted compensation is $x_s[n] = 0.5 \times 31.5 \times hx[n] + 0.5 \times dx[n]$;

a dynamic-threshold peak detector is then applied to $x_s[n]$ to output all possible beat points $E[n]$.

5. The music beat detection method according to claim 2, characterized in that computing the autocorrelation function of the simplified audio signal comprises calling an autocorrelation function on $x_{deci}[n]$ to output the autocorrelation sequence $ac[n]$, and calling a peak detector on $ac[n]$ to find all peaks $E_{acPeak}[n]$, each $E_{acPeak}[n]$ storing a possible beat value.

6. A music beat detection system, characterized by comprising the following modules:

an audio segmentation module: after music is received, adding the left and right channels of the two-channel audio to obtain a PCM data sequence of the music, summing the data sequence in groups and averaging each group to simplify it, thereby obtaining a simplified audio signal;

a beat detection module: computing the autocorrelation function of the simplified audio signal, then performing peak detection on it to obtain all peaks;

a peak extraction module: extracting the envelope of the simplified audio signal to obtain an audio envelope signal, applying first-order differencing, half-wave rectification and weighted compensation to the audio envelope signal to obtain an audio peak signal, and outputting all beat points by dynamic-threshold peak detection;

a multipath search module: using all the peaks and all the beat points, obtaining an optimal beat path by multipath search, and recording the audio rhythm and the beat points corresponding to the optimal beat path.

7. The music beat detection system according to claim 6, characterized in that the PCM data sequence $x[n]$ of the music is a one-dimensional discrete-time signal, the two channels of the music are $L[n]$ and $R[n]$, and $x[n] = L[n] + R[n]$;

the audio samples are grouped every 20 ms, and each group is summed and averaged to give $x_{deci}[n]$, where $m$ is the number of audio samples in 20 ms, $m$ is a constant, and $i$ indexes the samples within a group, $0 \le i < m$.

8. The music beat detection system according to claim 7, characterized in that extracting the envelope of the simplified audio signal comprises low-pass filtering the simplified audio signal with a cut-off frequency of 20 Hz, then processing it with a fast-charge, slow-release envelope detector that retains the protrusions of the envelope, and outputting the result as the audio envelope signal $x_{env}[n]$;

let $t[n]$ be the output of the low-pass filter with a cut-off frequency of 20 Hz applied to $x_{deci}[n]$; the fast-charge, slow-release envelope detector takes $t[n]$ as input and outputs $x_{env}[n]$.

9. The music beat detection system according to claim 8, characterized in that first-order differencing, half-wave rectification and weighted compensation are applied to the audio envelope signal $x_{env}[n]$ to highlight the peak points in the envelope, and the processed signal is output as the audio peak signal $x_s[n]$;

the first-order difference is $dx[n] = x_{env}[n] - x_{env}[n-1]$, the half-wave rectification is $hx[n] = \max(dx[n], 0)$, and the weighted compensation is $x_s[n] = 0.5 \times 31.5 \times hx[n] + 0.5 \times dx[n]$;

a dynamic-threshold peak detector is then applied to $x_s[n]$ to output all possible beat points $E[n]$.

10. The music beat detection system according to claim 7, characterized in that computing the autocorrelation function of the simplified audio signal comprises calling an autocorrelation function on $x_{deci}[n]$ to output the autocorrelation sequence $ac[n]$, and calling a peak detector on $ac[n]$ to find all peaks $E_{acPeak}[n]$, each $E_{acPeak}[n]$ storing a possible beat value.

Technical Field

The invention relates to the technical field of audio data processing, in particular to a music beat detection method and a music beat detection system.

Background

Detecting the rhythm of a piece of music is crucial to music understanding and visualization. The problem is how to detect the rhythm and locate its beats accurately, so that anyone can tap along with a piece of music. With the rapid development of computer and multimedia technology, much research has addressed intelligent rhythm detection and rhythm tracking. One example is a monophonic music rhythm extraction method based on Bayesian theory, which introduces a Bayesian rhythm model and then uses a sequential Monte Carlo method to infer the positions of the bars and beats of a music fragment. Another is a rhythm extraction method based on the inside-outside algorithm, which analyzes musical elements and defines a probabilistic context-free grammar to describe relatively independent rhythm elements, so that a rhythm is converted into formal grammar sentences; the inside-outside algorithm yields the probability of each grammar rule used, and the probabilistic context-free grammar then guides a computer in generating rhythms. These approaches derive and compute the rhythm values of a piece from complex probabilistic theory, and all suffer from high computational complexity and low accuracy.

The prior art related to the present application is patent document CN 107103917 A, which discloses a music rhythm detection method and system. The method comprises: acquiring the audio data of the music; taking audio frames from the audio data one at a time as the current audio frame, computing the difference between the spectral energy sums of the current and previous audio frames as the energy difference of the current frame, and storing it; determining the energy threshold corresponding to the current audio frame; collecting the energy differences of the current audio frame and of two or more consecutive frames immediately before it, giving the energy differences of three or more audio frames; and, if those energy differences contain a peak that exceeds the energy threshold of the current audio frame, marking the audio frame at the peak as a rhythm point.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a music beat detection method and a music beat detection system.

The music beat detection method provided by the invention comprises the following steps:

an audio segmentation step: after music is received, adding the left and right channels of the two-channel audio to obtain a PCM data sequence of the music, summing the data sequence in groups and averaging each group to simplify it, thereby obtaining a simplified audio signal;

a beat detection step: computing the autocorrelation function of the simplified audio signal, then performing peak detection on it to obtain all peaks;

a peak extraction step: extracting the envelope of the simplified audio signal to obtain an audio envelope signal, applying first-order differencing, half-wave rectification and weighted compensation to the audio envelope signal to obtain an audio peak signal, and outputting all beat points by dynamic-threshold peak detection;

a multipath search step: using all the peaks and all the beat points, obtaining an optimal beat path by multipath search, and recording the audio rhythm and the beat points corresponding to the optimal beat path.

Preferably, the PCM data sequence $x[n]$ of the music is a one-dimensional discrete-time signal, the two channels of the music are $L[n]$ and $R[n]$, and $x[n] = L[n] + R[n]$;

the audio samples are grouped every 20 ms, and each group is summed and averaged to give $x_{deci}[n]$, where $m$ is the number of audio samples in 20 ms, $m$ is a constant, and $i$ indexes the samples within a group, $0 \le i < m$.

Preferably, extracting the envelope of the simplified audio signal comprises low-pass filtering the simplified audio signal with a cut-off frequency of 20 Hz, then processing it with a fast-charge, slow-release envelope detector that retains the protrusions of the envelope, and outputting the result as the audio envelope signal $x_{env}[n]$;

let $t[n]$ be the output of the low-pass filter with a cut-off frequency of 20 Hz applied to $x_{deci}[n]$; the fast-charge, slow-release envelope detector takes $t[n]$ as input and outputs $x_{env}[n]$.

Preferably, first-order differencing, half-wave rectification and weighted compensation are applied to the audio envelope signal $x_{env}[n]$ to highlight the peak points in the envelope, and the processed signal is output as the audio peak signal $x_s[n]$;

the first-order difference is $dx[n] = x_{env}[n] - x_{env}[n-1]$, the half-wave rectification is $hx[n] = \max(dx[n], 0)$, and the weighted compensation is $x_s[n] = 0.5 \times 31.5 \times hx[n] + 0.5 \times dx[n]$;

a dynamic-threshold peak detector is then applied to $x_s[n]$ to output all possible beat points $E[n]$.

Preferably, computing the autocorrelation function of the simplified audio signal comprises calling an autocorrelation function on $x_{deci}[n]$ to output the autocorrelation sequence $ac[n]$, and calling a peak detector on $ac[n]$ to find all peaks $E_{acPeak}[n]$, each $E_{acPeak}[n]$ storing a possible beat value.

The invention provides a music beat detection system, which comprises the following modules:

an audio segmentation module: after music is received, adding the left and right channels of the two-channel audio to obtain a PCM data sequence of the music, summing the data sequence in groups and averaging each group to simplify it, thereby obtaining a simplified audio signal;

a beat detection module: computing the autocorrelation function of the simplified audio signal, then performing peak detection on it to obtain all peaks;

a peak extraction module: extracting the envelope of the simplified audio signal to obtain an audio envelope signal, applying first-order differencing, half-wave rectification and weighted compensation to the audio envelope signal to obtain an audio peak signal, and outputting all beat points by dynamic-threshold peak detection;

a multipath search module: using all the peaks and all the beat points, obtaining an optimal beat path by multipath search, and recording the audio rhythm and the beat points corresponding to the optimal beat path.

Compared with the prior art, the invention has the following beneficial effects:

1. The invention solves the problems of detecting the rhythm of music and accurately locating the specific rhythm points.

2. The invention estimates the beat value by searching over possible beat values with an autocorrelation function that is easy to implement.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a process flow diagram of the present invention;

FIG. 2 is a flow chart of the beat detector of the present invention;

FIG. 3 is a flow chart of the peak detector of the present invention;

FIG. 4 is a flow chart of an envelope extractor of the present invention;

FIG. 5 is a flow chart of a peak extractor according to the present invention.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, and all such changes and modifications fall within the scope of the present invention.

The invention uses a time-domain peak detection algorithm to find all possible beat points, extracts the bpm value of the music through an autocorrelation function, and then locates all beats by multipath search over the possible beat points and the bpm value. The peak detection algorithm, which is mainly used to detect the peaks and troughs of sound and other waveforms, is based on a dynamic threshold. An audio sequence $A[n]$ is divided into 2-second segments, and the mean of each segment is computed as $AVG[n] = \frac{1}{l}\sum_{i=0}^{l-1} A[n \cdot l + i]$, where $AVG[n]$ is the mean of the $n$-th segment, $A[n]$ is the audio sequence to be processed (its value at index $n$ being the audio intensity at that point), and $l$ is the number of audio samples in 2 seconds. Within the $n$-th segment, with $n$ fixed, let $a_i = A[n \cdot l + i]$ for $0 \le i < l$; $AVG[n]$ serves as the dynamic threshold of that segment. If $a_i = A[n \cdot l + i] > AVG[n]$, then $n \cdot l + i$ is a possible peak. If $a_i$ is a possible peak but some value within about 50 ms of it is greater than $a_i$, the $i$-th peak is deleted from the candidate peak points.
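The dynamic-threshold peak detection described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `detect_peaks`, its parameters, and the reading of "about 50 ms around" as 50 ms on each side are assumptions made for the example.

```python
def detect_peaks(a, sample_rate, segment_seconds=2.0, neighborhood_seconds=0.05):
    """Return indices of candidate peaks in the sequence `a`.

    Each 2-second segment uses its own mean as the dynamic threshold AVG[n].
    A sample above the threshold is kept only if no sample within the
    neighbourhood (assumed here: 50 ms on each side) is larger.
    """
    seg_len = max(1, int(round(segment_seconds * sample_rate)))
    radius = int(round(neighborhood_seconds * sample_rate))
    peaks = []
    for start in range(0, len(a), seg_len):
        segment = a[start:start + seg_len]
        avg = sum(segment) / len(segment)  # dynamic threshold of this segment
        for i, value in enumerate(segment):
            idx = start + i
            if value <= avg:
                continue  # not above the segment mean: not a candidate
            lo = max(0, idx - radius)
            hi = min(len(a), idx + radius + 1)
            # Delete the candidate if any neighbour in the window is larger.
            if any(a[j] > value for j in range(lo, hi)):
                continue
            peaks.append(idx)
    return peaks
```

With a sequence sampled at 100 Hz, `detect_peaks(a, sample_rate=100)` uses 200-sample segments and a 5-sample radius. Equal neighbouring values are all kept in this sketch; a real detector would probably need a tie-breaking rule.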

First, the left and right channels of the two-channel audio in the received music are added to obtain the PCM data sequence $x[n]$, a one-dimensional discrete-time signal. Music is typically two-channel, that is, two one-dimensional discrete-time signals $L[n]$ and $R[n]$, and $x[n] = L[n] + R[n]$ is the data obtained at the start, where $n$ is the index into the data sequence and takes natural-number values. The audio samples are then grouped every 20 ms, and each group is summed and averaged: $x_{deci}[n] = \frac{1}{m}\sum_{i=0}^{m-1} x[n \cdot \frac{m}{2} + i]$. Here $m$ is the number of audio samples in 20 ms (for 44.1 kHz audio, $m = 882$), $m$ is a constant, $i$ indexes the samples within a group with $0 \le i < m$, and $n$ on the left-hand side indexes the groups. Each group overlaps its neighbours by 10 ms, which makes $x_{deci}[n]$ smoother; the offset $n \cdot \frac{m}{2}$ expresses this 10 ms overlap. The output $x_{deci}[n]$ is the simplified audio signal, a reduced version of $x[n]$.
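As a rough illustration of this simplification step, the sketch below sums the two channels and averages 20 ms groups that advance by 10 ms. The function name `downmix_and_decimate` and its parameters are hypothetical, and the 10 ms hop is inferred from the stated 10 ms overlap between adjacent groups.

```python
def downmix_and_decimate(left, right, sample_rate=44100,
                         window_seconds=0.02, hop_seconds=0.01):
    """Sum L and R into x[n], then average overlapping 20 ms groups."""
    x = [l + r for l, r in zip(left, right)]      # x[n] = L[n] + R[n]
    m = int(round(window_seconds * sample_rate))  # 882 samples at 44.1 kHz
    hop = int(round(hop_seconds * sample_rate))   # 441 samples: 10 ms overlap
    out = []
    # Only full groups are averaged; a trailing partial group is dropped.
    for start in range(0, len(x) - m + 1, hop):
        out.append(sum(x[start:start + m]) / m)
    return out
```

Under these assumptions the output has one value per 10 ms, that is, a frame rate of 100 values per second, which is the rate the later stages would operate at.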

Second, $x_{deci}[n]$ is low-pass filtered with a cut-off frequency of 20 Hz, which removes the high-frequency components of the audio signal, and is then passed through a fast-charge, slow-release envelope detector that retains the protrusions of the envelope as far as possible. The output is the audio envelope signal $x_{env}[n]$.

First-order differencing, half-wave rectification and weighted compensation are then applied to $x_{env}[n]$ to highlight the peak points in the envelope, and the result is output as the audio peak signal $x_s[n]$. The first-order difference is $dx[n] = x_{env}[n] - x_{env}[n-1]$, the half-wave rectification is $hx[n] = \max(dx[n], 0)$, and the weighted compensation is $x_s[n] = 0.5 \times 31.5 \times hx[n] + 0.5 \times dx[n]$. A dynamic-threshold peak detector applied to $x_s[n]$ outputs all possible beat points $E[n]$.

At the same time, an autocorrelation function is called on $x_{deci}[n]$ to output the autocorrelation sequence $ac[n]$, and a peak detector is called on $ac[n]$ to find all peaks $E_{acPeak}[n]$, each of which stores a possible beat value.

The beat points $E[n]$ and the peaks $E_{acPeak}[n]$ are then passed to a multipath search algorithm to obtain the position of the optimal beat path. $E[n]$ represents the final peak signal, which should contain all the beat points; $E_{acPeak}[n]$ represents all possible rhythm values, which should include the rhythm of the music. A path is defined as Path{b, {p1, p2, ...}}, where b is the rhythm value and p1, p2, p3, ... are the first, second, third and subsequent rhythm points. An objective function f(Path) is defined: the better the rhythm points of a path match the value b, the larger f(Path) is, and the worse they match, the smaller it is. Using the rhythm values in $E_{acPeak}[n]$ and the beat points $E[n]$, multiple paths {b, E[s]} are initialized at starting beat points. For each path the next beat point is predicted and tracked, a decision is made whether to add that beat point to the path, and the corresponding f(Path) is updated. The path with the largest f(Path) gives the rhythm of the audio and the corresponding beat points.
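The text does not give the form of f(Path), the matching tolerance, or how starting points are chosen, so the following is only one possible sketch of a multipath search. The scoring rule, the `tolerance` and `max_starts` parameters, and all function names are assumptions introduced for illustration. Beat positions and periods are both expressed in frames.

```python
from bisect import bisect_left


def nearest(sorted_points, target):
    """Return the element of `sorted_points` closest to `target`."""
    k = bisect_left(sorted_points, target)
    candidates = sorted_points[max(0, k - 1):k + 1]
    return min(candidates, key=lambda p: abs(p - target))


def track_path(beat_points, period, start, tolerance):
    """Follow one path {b, {p1, p2, ...}} and return (score, points)."""
    points, score = [start], 0.0
    position, end = start, beat_points[-1]
    while position + period <= end + tolerance:
        predicted = position + period          # predict the next beat point
        hit = nearest(beat_points, predicted)
        deviation = abs(hit - predicted)
        if deviation <= tolerance:
            points.append(hit)                 # candidate agrees with period b
            score += 1.0 - deviation / (tolerance + 1.0)
            position = hit
        else:
            position = predicted               # no match: keep the grid going
    return score, points


def best_path(beat_points, periods, tolerance=2, max_starts=4):
    """Try every (period, start) pair and keep the highest-scoring path."""
    beat_points = sorted(beat_points)
    best = None
    for period in periods:
        if period <= tolerance:
            continue  # tracking needs a period larger than the tolerance
        for start in beat_points[:max_starts]:
            score, points = track_path(beat_points, period, start, tolerance)
            if best is None or score > best[0]:
                best = (score, period, points)
    return best
```

`best_path` returns `None` when there are no beat candidates, otherwise a `(score, period, points)` tuple. If frames are 10 ms apart, a winning period of 50 frames corresponds to 0.5 s per beat, or 120 bpm.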

As shown in FIG. 1, the left and right channels of the two-channel audio are added to obtain the PCM data sequence $x[n]$. The samples are then grouped every 20 ms, each group overlapping its neighbours by 10 ms, and each group is summed and averaged to output $x_{deci}[n]$, where $m$ is the sequence length corresponding to 20 ms.

As shown in FIG. 4, $x_{deci}[n]$ is low-pass filtered with a cut-off frequency of 20 Hz and then processed by a fast-charge, slow-release envelope detector that retains the protrusions of the envelope as far as possible, giving $x_{env}[n]$. Let $t[n]$ be the output of the 20 Hz low-pass filter applied to $x_{deci}[n]$; the envelope detector takes $t[n]$ as input and outputs $x_{env}[n]$.
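The envelope detector's own formula is not reproduced in this text, so the sketch below substitutes a common textbook form: a one-pole low-pass filter followed by a detector that jumps up immediately and decays gradually. The filter type, the `release` coefficient, and the function names are all assumptions, not the patented definitions.

```python
import math


def low_pass(x, sample_rate, cutoff_hz=20.0):
    """One-pole low-pass filter (assumed filter type); returns t[n]."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for v in x:
        y += alpha * (v - y)
        out.append(y)
    return out


def envelope(t, release=0.9):
    """Fast-charge, slow-release envelope follower (assumed form)."""
    out, env = [], 0.0
    for v in t:
        if v > env:
            env = v  # fast charge: follow a rising input at once
        else:
            env = release * env + (1.0 - release) * v  # slow release
        out.append(env)
    return out
```

If $x_{deci}[n]$ is produced at 100 values per second, `envelope(low_pass(x_deci, 100.0))` gives a candidate $x_{env}[n]$; a `release` closer to 1 holds peaks longer.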

As shown in FIG. 5, first-order differencing, half-wave rectification and weighted compensation are applied to $x_{env}[n]$ to highlight the peak points in the envelope and output $x_s[n]$.
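The three operations can be written in a few lines. In this sketch the default weights copy the coefficients as they are printed in the text (0.5*31.5 and 0.5), and they are exposed as parameters because the intended values may differ. Setting the first output sample to zero, where no previous sample exists, is an assumption.

```python
def peak_signal(x_env, w_rect=0.5 * 31.5, w_diff=0.5):
    """Compute x_s[n] from the envelope x_env[n]."""
    if not x_env:
        return []
    xs = [0.0]  # assumed: no difference is defined for the first sample
    for n in range(1, len(x_env)):
        dx = x_env[n] - x_env[n - 1]          # first-order difference
        hx = max(dx, 0.0)                     # half-wave rectification
        xs.append(w_rect * hx + w_diff * dx)  # weighted compensation
    return xs
```

Rising edges of the envelope receive both weights while falling edges receive only the smaller difference weight, so onsets stand out in $x_s[n]$.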

As shown in FIG. 2 and FIG. 3, a dynamic-threshold peak detector is applied to $x_s[n]$ to output all possible beat points $E[i]$, which store the positions of the possible beat points. An autocorrelation function is called on $x_{deci}[n]$ to output the autocorrelation sequence $ac[n]$, and a peak detector is called on $ac[n]$ to find all peaks $E_{acPeak}[n]$, each of which stores a possible beat value. Several bpm values can be estimated from $ac[n]$, where bpm (beats per minute) characterizes the beat period; an implementation may refer to SoundTouch. Once several bpm values have been found, the best-fitting rhythm points can be selected on the basis of $E[i]$. Using $E_{acPeak}[n]$ and $E[n]$, the position of the optimal beat path is obtained by the multipath search algorithm.
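A minimal sketch of estimating bpm candidates from an autocorrelation sequence is shown below. It does not use or reproduce SoundTouch; the mean removal, the bpm search range, the simple local-maximum test, and the function names are assumptions chosen to keep the example short.

```python
def autocorrelation(x, max_lag):
    """Return ac[0..max_lag] of the mean-removed sequence x."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def tempo_candidates(x_deci, frame_rate=100.0, min_bpm=40.0, max_bpm=240.0):
    """Return (lag, bpm) pairs at positive local maxima of ac[n]."""
    if len(x_deci) < 2:
        return []
    max_lag = min(len(x_deci) - 1, int(frame_rate * 60.0 / min_bpm))
    min_lag = max(1, int(frame_rate * 60.0 / max_bpm))
    ac = autocorrelation(x_deci, max_lag)
    out = []
    for lag in range(min_lag, max_lag):
        is_peak = ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]
        if is_peak and ac[lag] > 0:
            # A lag of `lag` frames is a beat period of lag / frame_rate seconds.
            out.append((lag, 60.0 * frame_rate / lag))
    return out
```

For a signal with a pulse every 50 frames at 100 frames per second, the candidates include lag 50 (120 bpm) and its multiple lag 100 (60 bpm), which is why the beat points are still needed to choose among them.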

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
