Intelligent temperament proofreading system and method, storage medium, equipment and terminal

Document No.: 9748. Published: 2021-09-17.

Abstract: This technology, "Intelligent temperament proofreading system and method, storage medium, equipment and terminal" (一种智能音律校对系统、方法、存储介质、设备及终端), was designed by 卞艺衡, 甘寒琪, 黄喜琳 and 戴欣 on 2021-05-26. The invention belongs to the technical field of music and discloses an intelligent temperament proofreading system, method, storage medium, equipment and terminal. The intelligent temperament proofreading method comprises: in the two-dimensional code identification part, the main control chip initializes the FreeRTOS system and peripherals such as the camera and the screen, takes the camera image as input in grayscale mode, and outputs it for display on a screen connected by an HDMI cable; prompt information is then initialized in the upper left corner of the screen, and whether a two-dimensional code has currently been recognized is echoed at specified coordinates in the upper left corner of the display screen; the software retrieves the corresponding electronic music score data and echoes prompt data on a separate display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for the user to press the start key; the second part, namely the intonation recognition part, then begins and the cyclic sound detection module is entered. The invention innovatively provides real-time detection and proofreading of the user's intonation.

1. An intelligent temperament proofreading method is characterized by comprising the following steps:

in the two-dimensional code identification part, the FreeRTOS system and peripherals such as the camera and the screen are initialized, the camera image is input in grayscale mode, and the image is then output for display on a screen connected by an HDMI cable;

prompt information is initialized in the upper left corner of the screen, and whether a two-dimensional code has been recognized is echoed at specified coordinates in the upper left corner of the display screen;

the software retrieves the corresponding electronic music score data and echoes prompt data on a separate display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for an instruction input by the user;

the second part, namely the intonation recognition part, then begins and the cyclic sound detection module is entered.

2. The intelligent temperament proofreading method according to claim 1, wherein after the camera receives a picture containing the two-dimensional code, the picture is decoded using the open-source two-dimensional code recognition library zxing and the decoded content is retrieved; the content is echoed in the upper left corner of the display screen and compared with the music score identification data stored in the music library.

3. The intelligent temperament proofreading method of claim 1, wherein the solfege name of the single tone that should currently be played is displayed so that a beginner can understand it quickly, and recording and recognition are performed cyclically at intervals of 1 second.

4. The intelligent temperament proofreading method according to claim 1, wherein the specific process of sound detection by the sound detection module is as follows:

the sound played by the user is processed by filtering and fast Fourier transform;

it is first judged whether the sound is too quiet, that is, whether the user is not playing or is not facing the microphone; the sound is then compared with the standard pitch-frequency table in the library, and the comparison result is prompted on the display screen in real time;

if the intonation is accurate, the next single tone is judged, and so on until the whole piece is finished, after which a prompt that the practice is complete is given;

the real-time prompts comprise: sound too quiet, intonation accurate, pitch too high, and pitch too low.

5. The intelligent temperament proofreading method according to claim 1, wherein graying makes every element of the pixel matrix satisfy R = G = B, and the common value of the color components is then called the gray value;

common graying formulas are as follows:

1) Gray = B; Gray = G; Gray = R;

2) Gray = max(B, G, R);

3) Gray = (B + G + R)/3;

4) Gray = 0.072169B + 0.715160G + 0.212671R;

5) Gray = 0.11B + 0.59G + 0.3R;

where R, G and B are the red, green and blue components and Gray is the resulting gray value;

edges are detected while noise is attenuated as far as possible. The edge-detection template is 3×3 in size and combines a directional difference operation with local weighted averaging to extract edges: a weighted average is taken before the image gradient is computed, and differentiation follows. After this processing the approximate position of the code in the image can be determined and a complete two-dimensional code image extracted; finally, a sampling grid is established with the help of the alignment patterns and the position detection patterns, and the two-dimensional code is converted into a data matrix from which the data are obtained;

the correspondence between the signal before and after the FFT is as follows: let the sampling frequency be Fs, the signal frequency F and the number of sampling points N; the FFT result is then N complex points, each corresponding to one frequency point; if the peak amplitude of a frequency component of the original signal is A, the modulus of the corresponding point of the FFT result (other than the first point) is N/2 times A. The first point is the direct-current component, and its modulus is N times the direct-current component; the phase of each point is the phase of the signal at that frequency; the first point represents the direct-current component (0 Hz), and the point following the last point N (a hypothetical point N+1) represents the sampling frequency Fs; the N-1 points in between divide this range into N equal parts, and the frequency of the points increases in sequence;

the frequency represented by a point n is Fn = (n-1)·Fs/N. This formula shows that the frequency resolution is Fs/N: if the sampling frequency Fs is 1024 Hz and the number of sampling points is 1024, the resolution is 1 Hz. Sampling 1024 points at 1024 Hz takes exactly 1 second; that is, an FFT of a 1-second signal resolves to 1 Hz, while an FFT of a 2-second signal resolves to 0.5 Hz. To improve the frequency resolution the number of sampling points, i.e. the sampling time, must be increased; frequency resolution and sampling time are reciprocals of each other; according to the Nyquist sampling theorem, the maximum spectrum width after the FFT is only 1/2 of the sampling rate of the original signal: if the original signal is sampled at 4 GS/s, the maximum bandwidth after the FFT is 2 GHz; the reciprocal of the sampling period of the time-domain signal, i.e. the sampling rate, multiplied by a fixed coefficient gives the width of the transformed spectrum, i.e. Frequency Span = K·(1/ΔT), where ΔT is the sampling period and the value of K depends on whether the original signal is decimated before the FFT is performed, since decimation reduces the amount of FFT computation;

after the FFT, an audio spectrum with wider bandwidth, clearer features and finer resolution is obtained.

twelve-tone equal temperament is applied and calculated as follows: let the frequency f denote A4 in scientific pitch notation; under twelve-tone equal temperament thirteen distinct tones can be distinguished in the interval [f, 2f], and the frequency of a tone spanning five minor seconds is calculated by the formula:

$$f_5 = f_0 \cdot 2^{5/12}$$

where $f_0$ is the frequency of the reference tone (Hz) and $f_5$ is the frequency (Hz) of the tone five minor seconds above the reference tone;

for the pitch comparison, audio is recognized and compared by means of the fast Fourier transform; the samples in a WAV audio file are real numbers, so the imaginary part of the sampled data must be zero-filled as preprocessing; the fast Fourier transform is in essence the discrete Fourier transform, whose formula is:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$

where x is the sampled signal; N is the number of signal points; X is the frequency-domain signal after the discrete Fourier transform; n is the index of the time-domain sample points; k is the index of the frequency-domain values;

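for the fast Fourier transform, the coefficient vector $x(n)$ ($n = 0, 1, \ldots, N-1$) is divided into two vectors of even-indexed and odd-indexed terms:

$$x^{[0]} = [x(0), x(2), \ldots, x(N-2)]^{T}$$

$$x^{[1]} = [x(1), x(3), \ldots, x(N-1)]^{T}$$

these correspond to two new polynomials $X^{[0]}(a)$ and $X^{[1]}(a)$ respectively, so that the following three expressions are available:

$$X^{[0]}(a) = x_0 + x_2 a + x_4 a^2 + \cdots + x_{N-2}\, a^{N/2-1}$$

$$X^{[1]}(a) = x_1 + x_3 a + x_5 a^2 + \cdots + x_{N-1}\, a^{N/2-1}$$

$$X(a) = x_0 + x_1 a + x_2 a^2 + \cdots + x_{N-1}\, a^{N-1}$$

from which it follows that:

$$X(a) = X^{[0]}(a^2) + a\, X^{[1]}(a^2);$$

let $a = \omega_N^{k}$ with $\omega_N = e^{-j 2\pi / N}$; substituting into the formula above, by the cancellation lemma $(\omega_N^{k})^2 = \omega_{N/2}^{k}$:

$$X(\omega_N^{k}) = X^{[0]}(\omega_{N/2}^{k}) + \omega_N^{k}\, X^{[1]}(\omega_{N/2}^{k})$$

and by the halving lemma:

$$(\omega_N^{k+N/2})^2 = \omega_{N/2}^{k}, \qquad \omega_N^{k+N/2} = -\omega_N^{k}$$

it follows that:

$$X(\omega_N^{k+N/2}) = X^{[0]}(\omega_{N/2}^{k}) - \omega_N^{k}\, X^{[1]}(\omega_{N/2}^{k}), \quad k = 0, 1, \ldots, N/2-1$$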

the waveform resolution is determined by the time length of the original data samples:

$$\Delta R_{\omega} = \frac{1}{T}$$

where $\Delta R_{\omega}$ is the waveform resolution and T is the duration of the original data;

and the FFT resolution is determined by the sampling frequency and the number of data points participating in the FFT:

$$\Delta R_{fft} = \frac{F_s}{N}$$

where $\Delta R_{fft}$ is the FFT resolution, $F_s$ is the sampling rate and N is the number of data points.

6. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:

in the two-dimensional code identification part, the FreeRTOS system and peripherals such as the camera and the screen are initialized, the camera image is input in grayscale mode, and the image is then output for display on a screen connected by an HDMI cable;

prompt information is initialized in the upper left corner of the screen, and whether a two-dimensional code has been recognized is echoed at specified coordinates in the upper left corner of the display screen;

the software retrieves the corresponding electronic music score data and echoes prompt data on a separate display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for an instruction input by the user;

the second part, namely the intonation recognition part, then begins and the cyclic sound detection module is entered.

7. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:

in the two-dimensional code identification part, the FreeRTOS system and peripherals such as the camera and the screen are initialized, the camera image is input in grayscale mode, and the image is then output for display on a screen connected by an HDMI cable;

prompt information is initialized in the upper left corner of the screen, and whether a two-dimensional code has been recognized is echoed at specified coordinates in the upper left corner of the display screen;

the software retrieves the corresponding electronic music score data and echoes prompt data on a separate display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for an instruction input by the user;

the second part, namely the intonation recognition part, then begins and the cyclic sound detection module is entered.

8. An information data processing terminal, characterized in that the information data processing terminal is used for implementing the intelligent temperament proofreading method of any one of claims 1-5.

9. An intelligent temperament proofreading system for implementing the intelligent temperament proofreading method according to any one of claims 1-5, wherein the intelligent temperament proofreading system is provided with:

a main control chip;

the main control chip is respectively connected with the camera, the interface screen, the microphone, the keys and the SD card storage device.

10. The intelligent temperament proofreading system of claim 9, wherein the SD card storage device is connected to an SD card socket.

Technical Field

The invention belongs to the technical field of music, and particularly relates to an intelligent temperament proofreading system, method, storage medium, device and terminal.

Background

At present, in the practice of stringed instruments, the fingerboards of most stringed instruments carry no markings for the positions of the individual pitches, so beginners often can only mark the fingerboard with colored tape or correction fluid. This can damage an expensive wooden instrument, the number of pitches that can be marked is limited, and the tape and correction fluid easily fall off or fade during practice, so the result is unsatisfactory. Moreover, if no professional teacher is present during independent practice to point out intonation errors, a beginner easily develops incorrect muscle memory, which impairs later practice.

At the present stage, "Internet+" is a new form of internet development: the deep integration of the internet with traditional industries, using information and communication technology and internet platforms to create new forms of development, is an inevitable trend. The continuing integration of the internet with stringed instruments has accordingly produced a series of intelligent string-music software products built around internet technology and internet platforms.

Market research shows that the string-music apps currently on the market fall into two main types. The first mainly offers online course teaching; it depends on the user's independent practice and lacks targeted guidance. The second is mostly limited to tuners; these tuner apps are only suitable for tuning the string tension of stringed instruments such as guitars, violins and ukuleles before practice, and lack real-time dynamic monitoring of the whole performance.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) The prior art relies on the user's independent practice and lacks targeted guidance.

(2) Tuner apps in the prior art are only suitable for tuning the string tension of stringed instruments such as guitars, violins and ukuleles before practice, and lack real-time dynamic monitoring of the whole performance.

The difficulty in solving the above problems and defects is as follows: targeted guidance requires real-time feedback on the player's current performance together with evaluation and analysis of it. Most beginners hit a bottleneck early on because of particular problems that they are unable to correct themselves; when the problems remain unsolved they fall into a cycle of negative self-assessment, which hinders the progress and motivation of self-learners to some extent. Traditionally, targeted guidance is given through face-to-face offline teaching, whereas online teaching suffers a certain lag because teacher and student are separated in time and space, and most self-learners spend a great deal of effort and time getting over this hurdle.

Dynamic detection of the whole performance requires the product to detect, track and judge the player's entire performance, and requires the software to compare each tone played in real time against the standard and display feedback, which places high demands on the software's real-time processing and feedback. In addition, compared with the existing string-tension tuning function, real-time analysis of the performance audio places higher demands on the sound library and on the software's capacity for rapid analysis.

The significance of solving the above problems and defects is as follows: as an intelligent stringed-instrument product, the invention uses intelligent recognition, detection and error correction to bring innovation to the teaching of stringed instruments. Analysis of its functions and applications can make more professionals in related industries aware of the "Internet+" era, so that beginners can maintain correctness even when practicing alone, and the way instruments are taught can gradually change. This changes previous teaching thinking to some extent and also offers guidance for combining other instruments with the internet. The teaching convenience brought by intelligent products helps promote the healthy development of stringed instruments and of music in general. In addition to intelligently and rapidly recognizing and judging the player's intonation, the invention can be extended with intelligent image recognition of a whole score, giving the user more freedom of use. An intelligent scoring mechanism can also be added to analyze the user's intonation, rhythm and expression comprehensively, so that the user receives a quantitative practice evaluation report, making the invention more professional and its functions more complete. As people pursue a better life, demand for the arts grows day by day, and more and more people are beginning to enjoy music and take up instruments. In the future, instrumental teachers will no longer need to correct each student's intonation one by one, and people will not even need to enroll in a course to learn an instrument, but can use fragments of spare time to practice without leaving home.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides an intelligent temperament proofreading system, an intelligent temperament proofreading method, a storage medium, equipment and a terminal.

The invention is realized as follows: an intelligent temperament proofreading method comprises the following steps:

step one: in the two-dimensional code identification part, the main control chip initializes the FreeRTOS system and calls the configuration of peripherals such as the camera and the screen; data are acquired and processed using the camera and the drp library functions, transmitted through a serial port once acquired, invalid or incomplete data are discarded, and the matching character string information is selected. Meanwhile, the camera image is input in grayscale mode, and the 0/1 information carried by the black and white modules of the matrix in the image is recognized through the three steps of filtering, enhancement and detection; the data characters are converted into a bit stream, and each group of 8 bits forms a codeword, the codewords together making up the data codeword sequence. Knowing this data codeword sequence is equivalent to knowing the data content of the two-dimensional code. The image is then output for display on the screen connected by an HDMI cable;
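The grouping of the recognized 0/1 stream into 8-bit codewords can be illustrated with a minimal sketch. The function name `bits_to_codewords`, the most-significant-bit-first ordering and the sample bit stream are assumptions made for illustration; a real two-dimensional code decoder also handles masking, interleaving and error correction, none of which is shown here.

```python
def bits_to_codewords(bits):
    """Group a 0/1 bit stream into 8-bit codewords.

    Assumption: bits arrive most significant bit first.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit stream length must be a multiple of 8")
    codewords = []
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit  # shift in one bit at a time
        codewords.append(value)
    return codewords


# Hypothetical bit stream: the two ASCII bytes of the text "A4".
bits = [0, 1, 0, 0, 0, 0, 0, 1,
        0, 0, 1, 1, 0, 1, 0, 0]
codewords = bits_to_codewords(bits)
text = bytes(codewords).decode("ascii")
```

Here the codeword sequence directly yields the encoded text, which is the sense in which knowing the codeword sequence amounts to knowing the data content.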

step two: after the camera receives a picture containing the two-dimensional code, the picture is decoded using the open-source two-dimensional code recognition library zxing, and the decoded content is retrieved and echoed in the upper left corner of the display screen; the prompt information is initialized, whether a two-dimensional code has been recognized is echoed at specified coordinates in the upper left corner of the display screen, and the decoded content is compared with the music score identification data stored in the music library. This step lets the user know as early as possible whether the recognition function of the software has been activated successfully;
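The comparison of the decoded content with the stored score identifiers can be sketched as a simple lookup. Everything named here (`SCORE_LIBRARY`, `find_score`, the identifier strings and the note lists) is hypothetical; the sketch takes the decoded string as given and does not call zxing.

```python
# Hypothetical music library: score identifier -> score data.
SCORE_LIBRARY = {
    "SCORE-001": {"title": "Scale exercise", "notes": ["A4", "B4", "C#5"]},
    "SCORE-002": {"title": "Open strings", "notes": ["G3", "D4", "A4", "E5"]},
}


def find_score(decoded_text, library):
    """Return (status message for the screen corner, score data or None)."""
    key = decoded_text.strip() if decoded_text else ""
    score = library.get(key)
    if score is None:
        return "QR code: not recognized", None
    return "QR code: recognized (" + score["title"] + ")", score


status, score = find_score(" SCORE-001\n", SCORE_LIBRARY)
```

Returning the status text together with the score keeps the screen feedback and the score retrieval in one place, so the display can be updated whether or not a match was found.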

step three: the software retrieves the corresponding electronic music score data and echoes prompt data on a separate display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for the user to press the start key, which completes the preparation before recognition;

step four: the second part, namely the intonation recognition part, begins; the cyclic sound detection module is formally entered, the solfege name of the single tone that should currently be played is displayed so that a beginner can understand it quickly, and recording and recognition are performed cyclically at intervals of 1 second.

Further, in step four, the specific process of sound detection by the sound detection module is as follows:

the sound played by the user is processed by filtering and fast Fourier transform;

it is first judged whether the sound is too quiet, that is, whether the user is not playing or is not facing the microphone; the sound is then compared with the standard pitch-frequency table in the library, and the comparison result is prompted on the display screen in real time;

if the intonation is accurate, the next tone is judged, and so on until the whole piece is finished, after which a prompt that the practice is complete is given.

Further, the real-time prompts include: sound too quiet, intonation accurate, pitch too high, and pitch too low.
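The four prompts and the note-by-note progression can be sketched as follows. The amplitude threshold, the tolerance of 20 cents and the function names are assumptions chosen for illustration; the text does not specify how close a tone must be to count as accurate.

```python
import math


def classify_note(measured_hz, amplitude, target_hz,
                  min_amplitude=0.05, tolerance_cents=20.0):
    """Return one of the four real-time prompts for a single 1-second window.

    min_amplitude and tolerance_cents are assumed values, not from the text.
    """
    if amplitude < min_amplitude or measured_hz <= 0:
        return "sound too quiet"
    # Deviation from the target in cents (100 cents = one semitone).
    cents = 1200.0 * math.log2(measured_hz / target_hz)
    if cents > tolerance_cents:
        return "pitch too high"
    if cents < -tolerance_cents:
        return "pitch too low"
    return "intonation accurate"


def practice(measurements, targets):
    """Walk through the target tones, advancing only on an accurate tone."""
    prompts = []
    index = 0
    for hz, amplitude in measurements:
        if index >= len(targets):
            break
        result = classify_note(hz, amplitude, targets[index])
        prompts.append(result)
        if result == "intonation accurate":
            index += 1
    if index == len(targets):
        prompts.append("practice complete")
    return prompts


prompts = practice(
    [(440.0, 0.01), (450.0, 0.5), (440.0, 0.5), (494.0, 0.5)],
    [440.0, 493.88],
)
```

Measuring the deviation in cents rather than in hertz keeps the tolerance musically uniform across low and high tones.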

Furthermore, in the intelligent temperament proofreading method, graying makes every element of the pixel matrix satisfy R = G = B, and the common value of the color components is then called the gray value;

common graying formulas are as follows:

1) Gray = B; Gray = G; Gray = R;

2) Gray = max(B, G, R);

3) Gray = (B + G + R)/3;

4) Gray = 0.072169B + 0.715160G + 0.212671R;

5) Gray = 0.11B + 0.59G + 0.3R;

where R, G and B are the red, green and blue components and Gray is the resulting gray value;
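As a minimal sketch of how these graying formulas might be applied, the following pure-Python functions convert RGB pixels using the maximum formula or either weighted formula. The function names, the method labels and the nested-list image layout are assumptions for illustration.

```python
def to_gray(r, g, b, method="weighted"):
    """Gray value of one pixel under the chosen formula."""
    if method == "max":
        return max(r, g, b)
    if method == "weighted":
        return 0.11 * b + 0.59 * g + 0.3 * r
    if method == "weighted_precise":
        return 0.072169 * b + 0.715160 * g + 0.212671 * r
    raise ValueError("unknown method: " + method)


def gray_image(rgb_rows, method="weighted"):
    """Convert rows of (R, G, B) tuples into rows of integer gray values."""
    return [[round(to_gray(r, g, b, method)) for (r, g, b) in row]
            for row in rgb_rows]


# One row with a pure green and a pure blue pixel.
gray = gray_image([[(0, 255, 0), (0, 0, 255)]])
```

The weighted formulas give green the largest coefficient, so the green pixel comes out much brighter than the blue one.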

edges are detected while noise is attenuated as far as possible. The edge-detection template is 3×3 in size and combines a directional difference operation with local weighted averaging to extract edges: a weighted average is taken before the image gradient is computed, and differentiation follows. After this processing the approximate position of the code in the image can be determined and a complete two-dimensional code image extracted; finally, a sampling grid is established with the help of the alignment patterns and the position detection patterns, and the two-dimensional code is converted into a data matrix from which the data are obtained;
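The 3×3 template described here, a directional difference combined with local weighted averaging, matches the usual description of the Sobel operator. The text does not name the operator, so the sketch below assumes Sobel kernels; the function name and the |gx| + |gy| magnitude approximation are also illustrative choices.

```python
# Assumed Sobel kernels: the 1-2-1 weights do the local averaging,
# the -1/0/+1 pattern does the directional difference.
GX = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
GY = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(gray):
    """Approximate gradient magnitude |gx| + |gy|; border pixels stay 0."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = abs(gx) + abs(gy)
    return out


# A 4x4 test image with a vertical black-to-white edge in the middle.
image = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
```

On this test image the interior pixels next to the edge respond strongly, while a uniform image produces no response at all.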

the correspondence between the signal before and after the FFT is as follows: let the sampling frequency be Fs, the signal frequency F and the number of sampling points N; the FFT result is then N complex points, each corresponding to one frequency point; if the peak amplitude of a frequency component of the original signal is A, the modulus of the corresponding point of the FFT result (other than the first point) is N/2 times A. The first point is the direct-current component, and its modulus is N times the direct-current component; the phase of each point is the phase of the signal at that frequency; the first point represents the direct-current component (0 Hz), and the point following the last point N (a hypothetical point N+1) represents the sampling frequency Fs; the N-1 points in between divide this range into N equal parts, and the frequency of the points increases in sequence;

the frequency represented by a point n is Fn = (n-1)·Fs/N. This formula shows that the frequency resolution is Fs/N: if the sampling frequency Fs is 1024 Hz and the number of sampling points is 1024, the resolution is 1 Hz. Sampling 1024 points at 1024 Hz takes exactly 1 second; that is, an FFT of a 1-second signal resolves to 1 Hz, while an FFT of a 2-second signal resolves to 0.5 Hz. To improve the frequency resolution the number of sampling points, i.e. the sampling time, must be increased; frequency resolution and sampling time are reciprocals of each other; according to the Nyquist sampling theorem, the maximum spectrum width after the FFT is only 1/2 of the sampling rate of the original signal: if the original signal is sampled at 4 GS/s, the maximum bandwidth after the FFT is 2 GHz; the reciprocal of the sampling period of the time-domain signal, i.e. the sampling rate, multiplied by a fixed coefficient gives the width of the transformed spectrum, i.e. Frequency Span = K·(1/ΔT), where ΔT is the sampling period and the value of K depends on whether the original signal is decimated before the FFT is performed, since decimation reduces the amount of FFT computation;
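These relations can be checked numerically with a short sketch. It uses a direct discrete Fourier transform from the standard library instead of an optimized FFT, and the test signal (a direct-current offset of 3 plus a 5 Hz cosine of amplitude 2, sampled at 64 Hz for 1 second) is an assumption chosen to keep the numbers simple.

```python
import cmath
import math


def bin_frequency(n, fs, num_points):
    """Frequency in Hz of point n, counted from 1 as in the text."""
    return (n - 1) * fs / num_points


def frequency_resolution(fs, num_points):
    """Spacing between neighboring frequency points, Fs/N."""
    return fs / num_points


def dft(x):
    """Direct O(N^2) discrete Fourier transform, adequate for a small demo."""
    count = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / count)
                for n in range(count))
            for k in range(count)]


fs = 64   # sampling frequency in Hz (assumed)
N = 64    # number of sampling points, i.e. 1 second of signal
signal = [3.0 + 2.0 * math.cos(2 * math.pi * 5 * n / fs) for n in range(N)]
spectrum = dft(signal)

dc_component = abs(spectrum[0]) / N       # first point: modulus is N times DC
amplitude = abs(spectrum[5]) / (N / 2)    # point 6 (index 5): modulus is N/2 times A
```

Dividing the moduli by N and N/2 recovers the offset and the amplitude of the test signal, and point 6 maps back to 5 Hz through `bin_frequency`.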

after the FFT, an audio spectrum with wider bandwidth, clearer features and finer resolution is obtained.

Twelve-tone equal temperament is applied and calculated as follows: let the frequency f denote A4 in scientific pitch notation; under twelve-tone equal temperament thirteen distinct tones can be distinguished in the interval [f, 2f], and the frequency of a tone spanning five minor seconds is calculated by the formula:

$$f_5 = f_0 \cdot 2^{5/12}$$

where $f_0$ is the frequency of the reference tone (Hz) and $f_5$ is the frequency (Hz) of the tone five minor seconds above the reference tone;
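A short sketch of the equal-temperament calculation follows. It assumes the common concert pitch of 440 Hz for A4, which the text does not state, and the function name is illustrative.

```python
def equal_temperament(f0, semitones):
    """Frequency of the tone `semitones` minor seconds above f0."""
    return f0 * 2 ** (semitones / 12)


A4 = 440.0  # assumption: standard concert pitch

# The thirteen tones in [f, 2f]: the reference, eleven steps, and the octave.
scale = [equal_temperament(A4, k) for k in range(13)]
five_up = equal_temperament(A4, 5)  # five minor seconds above A4
```

Each step multiplies the frequency by the twelfth root of two, so twelve steps double it exactly and five steps above A4 land near 587.33 Hz.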

for the pitch comparison, audio is recognized and compared by means of the fast Fourier transform; the samples in a WAV audio file are real numbers, so the imaginary part of the sampled data must be zero-filled as preprocessing; the fast Fourier transform is in essence the discrete Fourier transform, whose formula is:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$

where x is the sampled signal; N is the number of signal points; X is the frequency-domain signal after the discrete Fourier transform; n is the index of the time-domain sample points; k is the index of the frequency-domain values;

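for the fast Fourier transform, the coefficient vector $x(n)$ ($n = 0, 1, \ldots, N-1$) is divided into two vectors of even-indexed and odd-indexed terms:

$$x^{[0]} = [x(0), x(2), \ldots, x(N-2)]^{T}$$

$$x^{[1]} = [x(1), x(3), \ldots, x(N-1)]^{T}$$

these correspond to two new polynomials $X^{[0]}(a)$ and $X^{[1]}(a)$ respectively, so that the following three expressions are available:

$$X^{[0]}(a) = x_0 + x_2 a + x_4 a^2 + \cdots + x_{N-2}\, a^{N/2-1}$$

$$X^{[1]}(a) = x_1 + x_3 a + x_5 a^2 + \cdots + x_{N-1}\, a^{N/2-1}$$

$$X(a) = x_0 + x_1 a + x_2 a^2 + \cdots + x_{N-1}\, a^{N-1}$$

from which it follows that:

$$X(a) = X^{[0]}(a^2) + a\, X^{[1]}(a^2);$$

let $a = \omega_N^{k}$ with $\omega_N = e^{-j 2\pi / N}$; substituting into the formula above, by the cancellation lemma $(\omega_N^{k})^2 = \omega_{N/2}^{k}$:

$$X(\omega_N^{k}) = X^{[0]}(\omega_{N/2}^{k}) + \omega_N^{k}\, X^{[1]}(\omega_{N/2}^{k})$$

and by the halving lemma:

$$(\omega_N^{k+N/2})^2 = \omega_{N/2}^{k}, \qquad \omega_N^{k+N/2} = -\omega_N^{k}$$

it follows that:

$$X(\omega_N^{k+N/2}) = X^{[0]}(\omega_{N/2}^{k}) - \omega_N^{k}\, X^{[1]}(\omega_{N/2}^{k}), \quad k = 0, 1, \ldots, N/2-1$$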
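The even/odd split above leads to the standard recursive radix-2 FFT. The following is a minimal sketch, assuming the input length is a power of two and the negative-exponent convention for the transform; it is an illustration of the technique, not the implementation used on the device.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed terms
    odd = fft(x[1::2])   # transform of the odd-indexed terms
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle            # first half of the outputs
        out[k + n // 2] = even[k] - twiddle   # second half reuses the same terms
    return out


# Real samples are passed directly; their imaginary part is implicitly zero.
spectrum = fft([0.0, 1.0, 0.0, -1.0])
```

Each level of recursion halves the problem and reuses the two half-size transforms for both halves of the output, which is where the saving over the direct transform comes from.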

the waveform resolution is determined by the time length of the original data samples:

$$\Delta R_{\omega} = \frac{1}{T}$$

where $\Delta R_{\omega}$ is the waveform resolution and T is the duration of the original data;

and the FFT resolution is determined by the sampling frequency and the number of data points participating in the FFT:

$$\Delta R_{fft} = \frac{F_s}{N}$$

where $\Delta R_{fft}$ is the FFT resolution, $F_s$ is the sampling rate and N is the number of data points.

Another objective of the present invention is to provide an intelligent temperament proofreading system for implementing the intelligent temperament proofreading method, wherein the intelligent temperament proofreading system is provided with a main control chip;

the main control chip is respectively connected with the camera, the interface screen, the microphone, the keys and the SD card storage device.

Further, the SD card storage device is connected to the SD card socket.

Taken together, the above technical solutions give the invention the following advantages and positive effects: intonation is the most fundamental and important part of learning an instrument, and it is hard for string beginners to master because the fingerboards of most stringed instruments carry no markings for the positions of the pitches. In traditional string study, beginners usually mark the fingerboard with colored tape or correction fluid, which does considerable damage to an expensive wooden instrument; the number of pitches that can be marked is limited, and the tape and correction fluid easily fall off or fade during practice, so the result is unsatisfactory. Moreover, if no professional teacher is present during independent practice to point out intonation errors, a beginner will in the long run easily develop incorrect muscle memory, and the effect of practice is greatly reduced.

Meanwhile, market research shows that most products in the industry are limited to tuners. A tuner is an instrument accessory, chiefly an electronic tuner, used for tuning various stringed instruments; there are violin tuners, viola tuners, bass tuners, guitar tuners, guzheng tuners, ukulele tuners and so on. A tuner works on vibration and acoustic principles and is used in preparation before practice: it can make a simple comparison for each string of an instrument such as a guitar or violin, but it is a tuning device rather than one that helps the user practice by correcting tones automatically, and it lacks real-time dynamic monitoring of the whole performance. In view of these limitations, the invention makes up for the shortcomings of tuners and innovatively provides real-time detection and proofreading of the user's intonation. During practice the invention monitors in real time whether each tone played is accurate, indicates the result, and suggests the direction of improvement. Before practice, the score to be practiced is identified by scanning a two-dimensional code, the corresponding score data are retrieved from the score library, and detection and proofreading of the user's intonation begin. The invention not only serves as a tuner but also deepens the user's grasp of each tone during practice, filling a gap in the industry for users' independent practice.

In conclusion, the invention has practical significance. From the perspective of string beginners, it is simple and easy to use, has a clear and concise interface, protects the instrument better, and improves practice efficiency and the player's confidence through intelligent tracking and recognition of the practice piece; from the perspective of teachers, it reduces the teaching burden, since the teacher no longer needs to supervise students' practice all the time; from the perspective of commercial value, the product has a wide range of application and a large potential audience, and there is currently a lack of comparable products with similar functions, so the invention has a relatively large competitive advantage; once the product is promoted, its simplicity and ease of operation can be expected to attract a large number of users, giving it high commercial value.

The invention is designed on a development board from the development platform provided by Renesas, and functions as follows: before practice, the score to be practiced is identified by scanning a two-dimensional code; after processing, the corresponding score data are retrieved from the score library, and the user's intonation is detected and proofread in real time so as to help beginners practice the instrument. The invention is simple and easy to use, its interface is clear and concise, and it protects the instrument better. It can also intelligently track and recognize the practice piece, monitor the pitch of each tone in real time during practice, indicate whether each tone played is accurate, and suggest the direction of improvement. This distinctive online accompanied-practice mode improves the quality of the user's practice, addresses the time and labor that accompanied practice consumes, and simplifies the steps of traditional teaching, thereby lowering the cost and threshold of learning stringed instruments, promoting their development, and fitting the current theme of the smart home.

In addition, the invention is based on a fairly comprehensive survey of the market for software and hardware related to instrument learning, and is designed around the difficulties of learning an instrument and the characteristics of particular instruments. It has high practical value, and the cooperation of its sub-modules improves the usability of the system. Building on the functions of existing instrument tuners and on signal processing knowledge such as the fast Fourier transform, the invention designs and combines sub-modules for two-dimensional code scanning, pitch recognition, and screen feedback, and performs pitch recognition well on single notes played by the user.

Drawings

Fig. 1 is a flowchart of an intelligent temperament proofreading method according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of an intelligent temperament proofreading system according to an embodiment of the present invention.

Fig. 3 is a flowchart of a two-dimensional code identification method according to an embodiment of the present invention.

Fig. 4 is a flowchart of the intelligent temperament verification software according to the embodiment of the present invention.

In the figures: 1. camera; 2. HDMI interface screen; 3. screen; 4. RZ/A2M main control chip; 5. microphone; 6. key; 7. SD card storage device; 8. SD card socket.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

In view of the problems in the prior art, the present invention provides an intelligent temperament proofreading system, method, storage medium, device and terminal, which are described in detail below with reference to the accompanying drawings.

Those skilled in the art can also implement the intelligent temperament proofreading method provided by the present invention by using other steps, and the intelligent temperament proofreading method provided by the present invention in fig. 1 is only a specific embodiment.

As shown in fig. 1, the intelligent temperament proofreading method provided in the embodiment of the present invention includes:

S101: in the two-dimensional code recognition part, the main control chip initializes the FreeRTOS system and peripherals such as the camera and the screen, takes the camera image as input in grayscale mode, and outputs it for display on the screen connected by the HDMI cable.

S102: the prompt information in the upper left corner of the screen is initialized, and whether a two-dimensional code has been recognized is echoed in the upper left corner of the display screen by specifying coordinates.

S103: the software retrieves the corresponding electronic score data and echoes the prompt data on another display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for the user's corresponding instruction.

S104: at this point the second part, intonation recognition, begins, and the cyclic sound detection module is entered.

In S102 provided by the embodiment of the present invention, after the camera captures a picture containing a two-dimensional code, the picture is decoded with the open-source two-dimensional code recognition library ZXing and the decoded content is retrieved. The decoded content is echoed in the upper left corner of the display screen and is also compared with the score identification data stored in the score library.

In S103 provided by the embodiment of the present invention, the solfège name of the single note to be played next is displayed first, so that the beginner can grasp it quickly; recording and recognition are then performed cyclically at intervals of 1 second.

In S103 provided by the embodiment of the present invention, the specific process of sound detection by the sound detection module is as follows:

the sound played by the user is processed by filtering and fast Fourier transform;

first, it is judged whether the sound is too quiet, that is, whether the user is not playing or is not facing the microphone; the sound is then compared with the standard pitch-frequency table in the library to obtain a comparison result, which is prompted on the display screen in real time;

if the intonation is accurate, the next single note is judged, and so on until the whole piece is finished, after which a prompt that the exercise is complete is shown; the whole software system has a relatively clear flow, achieves real-time detection, and has relatively good recognition accuracy.

The real-time prompts cover four cases: the sound is too quiet, the intonation is accurate, the pitch is too high, and the pitch is too low.

As shown in fig. 2, in the intelligent temperament proofreading system provided by the embodiment of the invention, the RZ/A2M main control chip 4 is connected to the camera 1, the HDMI interface screen 2, the screen 3, the microphone 5, the key 6, and the SD card storage device 7, respectively; the SD card storage device 7 is connected to the SD card socket 8.

The specific implementation process of the invention is as follows:

When practice starts, the user clicks the 'start practice' button, opens the matching physical score book, and uses the camera module of the invention to scan the two-dimensional code of the chapter to be practiced. While scanning, the HDMI liquid crystal display module shows the live camera image so that the user can easily align the two-dimensional code with the camera. After scanning, the software retrieves the electronic score data corresponding to the scanned code and prompts that the scan succeeded; the user may then follow either the physical paper score or the electronic score, according to personal preference, and start practicing. At this point the microphone starts collecting audio data. The lower left corner of the software interface shows an audio-capture animation that changes in real time with the user's playing, which makes the interface more attractive and indicates the state of the sound currently being sampled. The mini keyboard animation in the lower right corner is updated in real time with the note to be played, prompting a beginner which key should currently be pressed. The progress bar in the middle is refreshed in real time as the user plays, showing the current progress. The upper right corner shows, in real time, the note currently being played, the accuracy of the user's playing, and a suggestion for improvement, so that after playing a note the user can tell whether it was in tune.
If the note is accurate, the user can move on to the next note; if it is inaccurate, the user can adjust the intonation according to the suggestion (for example, a violin beginner can adjust the finger position on the fingerboard, and a piano beginner can change the key pressed) until the intonation is accurate, and then move on to the next note. After all notes have been played, the upper right corner of the software interface shows an end-of-performance message.

The main innovations of the invention are as follows. Because intonation is judged by means of the fast Fourier transform, the method is suitable for judging performances on a variety of instruments rather than being limited to one instrument, so beginners on different instruments can skip most of the setup and start using it quickly. In addition, the FFT-based intonation judgment software occupies little memory, which makes the software lightweight and spares the user memory-usage concerns. The invention also provides an interface for changing and updating the score data: by pressing the option button, the user can choose where the electronic score files are stored, and can store them on an external disk when the score data are too large, which gives a high degree of freedom. The user can also download scores to, or delete them from, that folder at any time, so the scores can be updated as the practice content is enriched, which extends the practical service life of the invention.

The technical solution of the present invention is further described with reference to the following specific examples.

1.1 The system scheme of the invention mainly comprises a two-dimensional code image recognition module, a sound acquisition and recognition module, and an HDMI liquid crystal display module. Based on the basic principle of the fast Fourier transform (FFT), the scheme makes full use of the image recognition capability of the RZ/A2M chip and the FreeRTOS system to provide a convenient and fast note detection scheme. Taking both detection accuracy and detection speed into account, a module is introduced that samples a specific segment of the sound signal and compares it, so as to obtain a more accurate intonation result for the string tone. The two-dimensional code recognition module processes and captures images using the camera and the DRP library functions of the Renesas RZ/A2M microcontroller; after capture, data are transmitted through a serial port, invalid or incomplete data are discarded, and the matching character string is selected.

The seven basic note names in string music are C, D, E, F, G, A and B; in numbered notation they are written 1, 2, 3, 4, 5, 6 and 7 and sung as do, re, mi, fa, so, la and si. The standard frequencies of the string scale lie between 40 Hz and 4000 Hz. According to the sampling theorem, the sampling frequency must be more than twice the maximum signal frequency to ensure that the signal is not distorted. The string tone is a short-time stationary signal; the sound acquisition and recognition module samples a single channel through the ADC of the microcontroller at a sampling rate of 44100 Hz, well above the 8000 Hz minimum implied by a 4000 Hz upper limit. The actual recording time is set to 1 s, which is convenient for recognition and does not occupy excessive system space. The pitch obtained after FFT processing is compared with the single-tone frequency table for string music to determine the pitch difference, thereby helping players correct their own playing. The HDMI liquid crystal display module is connected to the HDMI port of the microcontroller and displays the camera image in real time.

1.2 principle of implementation

Conventional two-dimensional code recognition mainly relies on the Canny edge detection algorithm in OpenCV to identify points with obvious brightness changes in the coding region and the functional region of the code matrix; that is, the 0/1 information carried by the black and white modules is recognized through three steps: filtering, enhancement, and detection. The functional region comprises the finder patterns, separators, timing patterns, and alignment patterns; it is not used for data encoding, and the code is surrounded by a blank quiet zone.

The general idea of feature recognition is as follows. First, relying on the three finder patterns at the upper left, lower left, and upper right of the two-dimensional code, the image is smoothed and binarized and the rough outline is searched for. Second, the positions of the finder patterns are determined: in the triangle formed by their three center points, the vertex with the largest angle is the upper-left corner of the two-dimensional code. The lower-left and upper-right corners are then distinguished by the orientation of the two sides of that angle, after which perspective or affine correction is applied to the code. Third, the code data are decoded: the codeword elements are segmented and recognized, the data characters are converted into a bit stream, and every 8 bits form one codeword of the data codeword sequence. Once the data codeword sequence is known, the data content of the two-dimensional code is known. The existing microcontroller recognition process is shown in fig. 3 and mainly comprises three modules: image preprocessing, preliminary positioning, and target extraction.
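The second step above can be illustrated with a short sketch. It assumes that the three finder-pattern centers have already been located and are given as (x, y) pixel coordinates with the y axis pointing down, and that the code is rotated but not mirrored. The function names `angle_at` and `order_finder_centers` are hypothetical and are not taken from the invention's software.

```python
import math

def angle_at(p, q, r):
    """Interior angle (radians) at vertex p of the triangle p, q, r."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def order_finder_centers(centers):
    """Return (top_left, top_right, bottom_left) from three center points.

    The vertex with the largest angle is taken as the top-left corner.
    The other two are told apart by the sign of a cross product, which
    assumes image coordinates (y down) and no mirroring.
    """
    a, b, c = centers
    candidates = [
        (angle_at(a, b, c), a, b, c),
        (angle_at(b, a, c), b, a, c),
        (angle_at(c, a, b), c, a, b),
    ]
    _, top_left, p, q = max(candidates, key=lambda item: item[0])
    cross = ((p[0] - top_left[0]) * (q[1] - top_left[1])
             - (p[1] - top_left[1]) * (q[0] - top_left[0]))
    if cross > 0:
        return top_left, p, q
    return top_left, q, p

# Upright code: top-left at the origin, top-right along x, bottom-left along y.
ordered = order_finder_centers([(0, 10), (10, 0), (0, 0)])
```

Once the three corners are ordered in this way, they can serve as reference points for the perspective or affine correction mentioned above.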

A two-dimensional code image captured directly by an ordinary camera is a color RGB image, in which the pixel matrix consists of three 800 × 800 color component matrices R, G, and B. To simplify processing on the microcontroller, the image is converted to grayscale, that is, every element of the pixel matrix is made to satisfy R = G = B; this common value is called the gray value.

Common graying formulas are as follows:

1) Gray = B; Gray = G; Gray = R;

2) Gray = max(B, G, R);

3) Gray = (B + G + R)/3;

4) Gray = 0.072169B + 0.715160G + 0.212671R;

5) Gray = 0.11B + 0.59G + 0.3R;

in the formulas: R, G, B are the red, green, and blue components, respectively; Gray is the resulting gray value.
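As an illustration, the weighted average of formula 5) can be applied pixel by pixel as sketched below. The image is assumed to be a nested list of (R, G, B) tuples with components in 0 to 255; the function names `to_gray` and `gray_image` are hypothetical.

```python
def to_gray(r, g, b):
    """Weighted-average gray value: Gray = 0.11B + 0.59G + 0.3R."""
    return 0.11 * b + 0.59 * g + 0.3 * r

def gray_image(rgb_rows):
    """Convert rows of (R, G, B) tuples into rows of integer gray values."""
    return [[round(to_gray(r, g, b)) for (r, g, b) in row] for row in rgb_rows]

# White, black, and pure green pixels.
sample = gray_image([[(255, 255, 255), (0, 0, 0), (0, 255, 0)]])
```

Because the three weights sum to 1, a pixel that already satisfies R = G = B keeps its value after conversion.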

Binarization converts the grayscale image into an image containing only black and white; the choice of threshold directly determines the result. The Sobel operator enlarges the template of the edge detection operator so that noise is suppressed as far as possible while edges are detected. Its template size is 3×3, and it extracts edges by combining a directional difference with a local weighted average: the weighted average is taken first and the difference afterwards, which improves robustness to noise. The region containing the two-dimensional code has more complex edge curves and therefore shows larger projection values on the coordinate axes; after processing, the approximate position of the code in the image can be determined and the complete two-dimensional code image extracted.
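A minimal sketch of the two operations described above follows. It assumes a grayscale image stored as a nested list of integers, a fixed global threshold of 128 (an assumed value; the source does not state how the threshold is chosen), and the standard 3×3 Sobel templates. The gradient magnitude is approximated as |Gx| + |Gy|, and border pixels are left at zero.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def binarize(gray, threshold=128):
    """Map every pixel to 0 (black) or 255 (white) using a global threshold."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray]

def sobel_magnitude(gray):
    """Approximate gradient magnitude |Gx| + |Gy| with 3x3 Sobel templates."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = abs(gx) + abs(gy)
    return out

# A vertical black-to-white edge gives a strong response along the edge.
edge_image = [[0, 0, 255, 255] for _ in range(3)]
edges = sobel_magnitude(edge_image)
```

Summing the edge response along rows and columns gives the projection values from which the approximate position of the code can be estimated.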

Locating the finder patterns is very important for detecting the two-dimensional code. Their characteristic black : white : black : white : black module ratio of 1 : 1 : 3 : 1 : 1 is unique within the code under any mask pattern, so whether and how the code has been rotated can be determined from the center coordinates of the three finder patterns. Finally, a sampling grid is established with the help of the alignment patterns and the finder patterns, and the two-dimensional code is converted into a data matrix from which the data are read.
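The ratio test for a finder pattern can be sketched as a scan over one binarized row. The sketch assumes the row is a list of 0/1 values with 1 for black, and a tolerance of half a module width (an assumed value). The function names `run_lengths` and `has_finder_ratio` are hypothetical.

```python
def run_lengths(row):
    """Collapse a row of 0/1 pixels (1 = black) into [value, length] runs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs

def has_finder_ratio(row, tolerance=0.5):
    """Check whether the row contains black:white:black:white:black = 1:1:3:1:1."""
    runs = run_lengths(row)
    expected = [1, 1, 3, 1, 1]
    for start in range(len(runs) - 4):
        window = runs[start:start + 5]
        if [value for value, _ in window] != [1, 0, 1, 0, 1]:
            continue
        lengths = [length for _, length in window]
        unit = sum(lengths) / 7.0  # estimated width of one module in pixels
        if all(abs(length - ratio * unit) <= tolerance * unit
               for length, ratio in zip(lengths, expected)):
            return True
    return False

# Two pixels per module: 2 black, 2 white, 6 black, 2 white, 2 black.
row = [0, 0] + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0, 0]
found = has_finder_ratio(row)
```

A full detector would repeat the same test along columns and confirm that the horizontal and vertical candidates cross at a common center.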

The fast Fourier transform (FFT) is an algorithm that converts a directly measured raw signal from its original domain (usually time or space) into a frequency-domain representation, or vice versa. Audio processing needs the FFT because most everyday signals are complex composite signals whose useful features are hard to extract in the time domain, whereas characteristics such as frequency, phase, and amplitude become visible once the signal is converted to the frequency domain.

The main principle is that any continuously measured time-domain signal can be represented as a superposition of infinitely many sinusoidal signals of different frequencies. When a sine wave is fed into a linear system, no new frequency components are generated (a nonlinear system such as a frequency converter does generate new frequency components, called harmonics). Feeding unit-amplitude sine waves of different frequencies into a linear system and recording the output amplitude against frequency gives the amplitude-frequency characteristic of the system; recording the output phase against frequency gives its phase-frequency characteristic. The FFT computes the transform rapidly by factorizing the DFT matrix into a product of sparse (mostly zero) factors.

As one link in the recognition chain, the microcontroller can only process discrete, finite-length signals. The input values of a DFT computed on a processor are sample values acquired after the ADC (for example by a digital oscilloscope), and the number of input samples determines the size of the computation. The transformed spectrum contains the same number of points, but about half of them are redundant and are usually not displayed, so the useful information lies in N/2 + 1 points. The FFT is used instead of the direct DFT because it greatly simplifies the computation: computing the DFT directly takes on the order of N^2 operations (N being the number of input samples), whereas the FFT takes on the order of N·log2(N) operations.

The correspondence between the signals before and after the FFT is as follows:

Assume that the sampling frequency is Fs, the signal frequency is F, and the number of sampling points is N. The result of the FFT is N complex numbers, each corresponding to one frequency point; the modulus of a point is the amplitude characteristic at that frequency. If the peak amplitude of a component of the original signal is A, the modulus of the corresponding point of the FFT result (other than the first point, the DC component) is N/2 times A. The first point is the DC component, and its modulus is N times the DC level. The phase of each point is the phase of the signal at that frequency. The first point represents the DC component (0 Hz), and the point following the last point N (a point that does not actually exist; it is assumed here as point N + 1, and can also be viewed as the first point split in two with one half moved to the end) represents the sampling frequency Fs. The interval between them is divided into N equal parts by the N − 1 points in between, and the frequency of the points increases in sequence.
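The amplitude relations above can be checked numerically with a direct DFT. The test signal is an assumed example: a cosine with exactly 5 periods in 64 samples, peak amplitude 3, and a DC offset of 2. Python lists are indexed from 0, so `spectrum[0]` is the first point in the text and `spectrum[5]` is the sixth.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform of a list of samples."""
    count = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / count)
            for n in range(count))
        for k in range(count)
    ]

def make_tone(count, cycles, peak, dc_level):
    """Cosine with `cycles` whole periods in `count` samples plus a DC offset."""
    return [dc_level + peak * math.cos(2 * math.pi * cycles * n / count)
            for n in range(count)]

spectrum = dft(make_tone(count=64, cycles=5, peak=3.0, dc_level=2.0))
dc_modulus = abs(spectrum[0])    # close to N * dc_level = 64 * 2 = 128
tone_modulus = abs(spectrum[5])  # close to (N / 2) * peak = 32 * 3 = 96
```

Dividing the modulus of a non-DC point by N/2, and that of the first point by N, therefore recovers the original amplitudes.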

The frequency represented by point n is Fn = (n − 1)·Fs/N. It follows that the frequency resolution is Fs/N: if the sampling frequency Fs is 1024 Hz and the number of sampling points is 1024, the resolution is 1 Hz. Sampling 1024 points at 1024 Hz takes exactly 1 second; that is, an FFT of a 1-second signal resolves frequencies to 1 Hz, and an FFT of a 2-second signal resolves them to 0.5 Hz. To improve the frequency resolution, the number of sampling points, and hence the sampling time, must be increased. Frequency resolution and sampling time are inversely proportional.
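The two relations above amount to the following arithmetic. Point numbers are 1-based, as in the text; the function names `bin_frequency` and `frequency_resolution` are hypothetical.

```python
def bin_frequency(n, fs, num_points):
    """Frequency in Hz of point n (1-based): Fn = (n - 1) * Fs / N."""
    return (n - 1) * fs / num_points

def frequency_resolution(fs, num_points):
    """Spacing in Hz between adjacent points: Fs / N."""
    return fs / num_points

one_second = frequency_resolution(1024, 1024)   # 1024 points at 1024 Hz
two_seconds = frequency_resolution(1024, 2048)  # twice the sampling time
a4_point = bin_frequency(441, 44100, 44100)     # point 441 of a 1 s recording
```

With the 44100 Hz sampling rate and 1 s recording time used by the invention, adjacent points are 1 Hz apart, so point 441 corresponds to 440 Hz.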

The width of the spectrum after the transform also corresponds to the original signal. According to the Nyquist sampling theorem, the spectrum width after the FFT is at most 1/2 of the sampling rate of the original signal; if the sampling rate is 4 GS/s, the bandwidth after the FFT is at most 2 GHz. The width of the transformed spectrum equals the reciprocal of the sampling period of the time-domain signal (that is, the sampling rate) multiplied by a fixed coefficient: Frequency Span = K·(1/ΔT), where ΔT is the sampling period and the value of K depends on whether the original signal is decimated before the FFT is performed, since decimation reduces the amount of FFT computation.

After FFT, audio with wider frequency spectrum, clearer characteristics and more accurate resolution can be obtained.

1.3 design calculation

1.3.1 application and calculation of twelve-tone equal temperament

Pitch recognition is one of the most important elements of the invention. Frequency determines pitch, and a key point is how to associate specific frequency values with the notes of the scale. The twelve-tone equal temperament, in common use worldwide, is a very good tool for determining the relative pitch of tones. It divides an octave into twelve equal parts by frequency ratio; each part is a semitone (a minor second), and two parts form a whole tone (a major second). Twelve-tone equal temperament picks discrete tones out of the continuous range of frequencies, and the interval in degrees expresses the frequency relationship between two tones. Each time the pitch rises by an octave, the frequency doubles. If A4 in scientific pitch notation (solfège name la) has frequency f, thirteen different tones (including those with frequencies f and 2f) can be distinguished within the interval [f, 2f] according to twelve-tone equal temperament. For example, for a tone five minor seconds above the reference tone, the frequency is calculated as:

$$f_5 = f_0 \times 2^{5/12}$$

in the formula: f_0 is the frequency of the reference tone/Hz; f_5 is the frequency/Hz of the tone five minor seconds above the reference tone.
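The same relation generalizes to any number of semitones, which is how a pitch-frequency table can be generated. The sketch below takes the reference frequency as a parameter; the example values use 440 Hz for A4 as an assumed round figure, and the function name `equal_temperament` is hypothetical.

```python
def equal_temperament(f0, semitones):
    """Frequency of the tone `semitones` minor seconds above f0 (negative = below)."""
    return f0 * 2 ** (semitones / 12)

# Thirteen tones from f to 2f inclusive, starting at the reference tone.
octave = [equal_temperament(440.0, step) for step in range(13)]
five_up = equal_temperament(440.0, 5)    # about 587.33 Hz
middle_c = equal_temperament(440.0, -9)  # about 261.63 Hz
```

Adjacent tones differ by a constant ratio of 2^(1/12), so the spacing in Hz grows with pitch; a fixed tolerance in Hz is therefore relatively tighter for high notes than for low ones.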

According to the above theory, when the frequency of the tone A4 is defined as 440.010 Hz, a reference table of pitch and frequency for the middle four octaves, which is also the frequency range mainly used in the invention, can be obtained, as shown in Table 1:

TABLE 1 Pitch and frequency LUT

From the table, an approximate detection range can be designed for each tone; the system takes ±5 Hz as the acceptable error range.

1.3.2 calculation of Pitch alignment

The audio format used in this work is mainly WAV. A WAV file mainly consists of three chunks: RIFF, FORMAT, and DATA. The FORMAT chunk contains information such as the number of channels, the sampling rate, the number of data bytes per second, and the number of bits per sample. To make the sampling precision as high as possible, the sampling rate of the invention is generally set to 44100 Hz; the requirement on channels is low, and mono acquisition is sufficient. The data area of the DATA chunk is the main focus. Because the invention relies on the fast Fourier transform to recognize and compare audio, and the samples in a WAV file are real numbers, the imaginary part of each sample is preprocessed by filling it with zero.
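The preprocessing described above can be sketched with Python's standard `wave` module. To keep the example self-contained, a short mono 16-bit test tone is first written to an in-memory buffer instead of being recorded; the helper names `make_wav_bytes` and `read_wav_as_complex` are hypothetical, and the sketch handles only mono 16-bit PCM data.

```python
import io
import math
import struct
import wave

def make_wav_bytes(freq_hz, seconds=1.0, rate=44100):
    """Build a mono, 16-bit PCM WAV file in memory containing a sine tone."""
    count = round(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(count)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)    # mono acquisition
        writer.setsampwidth(2)    # 16 bits per sample
        writer.setframerate(rate)
        writer.writeframes(frames)
    return buffer.getvalue()

def read_wav_as_complex(data):
    """Return (sampling rate, samples as complex numbers with zero imaginary part)."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        if reader.getnchannels() != 1 or reader.getsampwidth() != 2:
            raise ValueError("this sketch handles mono 16-bit PCM only")
        rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [complex(sample, 0.0) for sample in samples]

rate, samples = read_wav_as_complex(make_wav_bytes(440.0, seconds=0.01))
```

The resulting list of complex samples can be passed directly to a complex-input FFT routine.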

The pitch comparison part mainly applies the fast Fourier transform, which is in essence a discrete Fourier transform. The formula of the discrete Fourier transform is:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$

in the formula: x is the sampled signal; N is the number of signal points; X is the frequency-domain signal after the discrete Fourier transform; n is the index of the time-domain sampling point; k is the index of the frequency-domain value.

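The fast Fourier transform splits the coefficients x(n) (n = 0, 1, …, N − 1) into two vectors, one of even-indexed terms and one of odd-indexed terms:

$$x^{[0]} = [x(0), x(2), \ldots, x(N-2)]^T$$

$$x^{[1]} = [x(1), x(3), \ldots, x(N-1)]^T$$

They correspond to two new polynomials $X^{[0]}(a)$ and $X^{[1]}(a)$, so the following three expressions are available:

$$X(a) = x_0 + x_1 a + x_2 a^2 + \cdots + x_{N-1} a^{N-1}$$

$$X^{[0]}(a) = x_0 + x_2 a + x_4 a^2 + \cdots + x_{N-2} a^{N/2-1}$$

$$X^{[1]}(a) = x_1 + x_3 a + x_5 a^2 + \cdots + x_{N-1} a^{N/2-1}$$

Thus it can be deduced that:

$$X(a) = X^{[0]}(a^2) + a X^{[1]}(a^2)$$

Let $a = W_N^k$, where $W_N = e^{-j 2\pi / N}$, and substitute it into the above formula. According to the cancellation lemma:

$$W_{dN}^{dk} = W_N^k$$

and the halving lemma:

$$\left(W_N^{k+N/2}\right)^2 = \left(W_N^k\right)^2 = W_{N/2}^k$$

it follows that, for k = 0, 1, …, N/2 − 1:

$$X(W_N^k) = X^{[0]}(W_{N/2}^k) + W_N^k X^{[1]}(W_{N/2}^k)$$

$$X(W_N^{k+N/2}) = X^{[0]}(W_{N/2}^k) - W_N^k X^{[1]}(W_{N/2}^k)$$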
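The even/odd decomposition above leads directly to a recursive radix-2 decimation-in-time FFT. The following is a minimal sketch for input lengths that are powers of two, using the convention W_N = e^{-j2π/N}; it is an illustration of the technique, not the implementation used on the RZ/A2M.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    count = len(samples)
    if count == 0 or count & (count - 1):
        raise ValueError("length must be a power of two")
    if count == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of x(0), x(2), ...
    odd = fft(samples[1::2])   # transform of x(1), x(3), ...
    out = [0j] * count
    for k in range(count // 2):
        twiddle = cmath.exp(-2j * math.pi * k / count) * odd[k]
        out[k] = even[k] + twiddle               # X(W_N^k)
        out[k + count // 2] = even[k] - twiddle  # X(W_N^(k + N/2))
    return out

impulse_spectrum = fft([1, 0, 0, 0])   # flat spectrum of ones
constant_spectrum = fft([1, 1, 1, 1])  # all energy in the first (DC) point
```

Each level of recursion does work proportional to N and there are log2(N) levels, which is where the N·log2(N) operation count comes from.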

With the radix-2 decimation-in-time FFT algorithm, each decomposition halves the size of the DFT to be computed, which greatly reduces the amount of computation compared with the direct DFT and completes the conversion from time-domain signal to frequency-domain signal quickly. To separate adjacent frequencies and obtain better recognition performance, the frequency resolution must be improved; both the waveform resolution and the FFT resolution should be considered. The waveform resolution is determined by the duration of the original data samples:

$$\Delta R_\omega = \frac{1}{T}$$

in the formula: ΔR_ω is the waveform resolution; T is the duration of the original data.

The FFT resolution is determined by the sampling frequency and the number of data points participating in the FFT:

$$\Delta R_{fft} = \frac{F_s}{N}$$

in the formula: ΔR_fft is the FFT resolution; F_s is the sampling rate; N is the number of data points.

To display a smoother frequency-domain curve, zeros can be appended to the end of the time-domain signal, which is equivalent to interpolation in the frequency domain. However, it is the waveform resolution that ultimately determines whether two signal components of similar frequency can be distinguished, so zero padding alone cannot meet the requirement and the sampling time must be extended. To obtain a result accurate to 1 Hz, the sampling time should be kept at about 1 s.

1.4 hardware framework

In this work, the RZ/A2M serves as the main control chip and is connected to various peripherals to form a complete intonation recognition and echo prompt system. The peripherals include two display screens, one showing the live picture while the two-dimensional code is scanned and the other showing the single-note recognition result and operation prompts during playing, which together provide a good environment for human-computer interaction; the system also includes basic peripherals such as the key, the microphone, the camera, and the SD card storage device. The hardware block diagram is shown in fig. 2.

1.5 software flow

The basic flow of the software is as follows. The software is divided into a two-dimensional code image recognition part and an intonation recognition part, and applies image processing and sound processing techniques.

As shown in fig. 4, the two-dimensional code recognition part first initializes the FreeRTOS system and peripherals such as the camera and the screen, takes the camera image as input in grayscale mode, and outputs it for display on the screen connected by the HDMI cable. The prompt information in the upper left corner of the screen is then initialized, and whether a two-dimensional code has been recognized is echoed there by specifying coordinates. After the camera captures a picture containing a two-dimensional code, the picture is decoded with the open-source two-dimensional code recognition library ZXing and the decoded content is retrieved; it is echoed in the upper left corner of the display screen and also compared with the score identification data stored in the score library. The software then retrieves the corresponding electronic score data and echoes the prompt data on another display screen dedicated to the interactive interface, prompting the user to start practicing and waiting for the user to press the start key. At this point the second part, intonation recognition, begins. After the start key is pressed, the cyclic sound detection module is entered. The solfège name of the single note to be played next is displayed first, so that a beginner can grasp it quickly, and recording and recognition are then performed cyclically at intervals of 1 second.

The sound played by the user is filtered and processed by the fast Fourier transform. It is first judged whether the sound is too quiet, that is, whether the user is not playing; the sound is then compared with the standard pitch-frequency table in the library, and the comparison result is prompted on the display screen in real time as one of four cases: sound too quiet, intonation accurate, pitch too high, or pitch too low. If the intonation is accurate, the next single note is judged, and so on until the whole piece is finished, after which a prompt that the exercise is complete is shown. The whole software system has a relatively clear flow, achieves real-time detection, and has relatively good recognition accuracy.
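The decision logic of the detection loop can be sketched as follows. The sketch assumes that each 1-second recording has already been reduced to a measured frequency in Hz and a loudness level between 0 and 1. The loudness threshold `min_level` is an assumed placeholder, while the ±5 Hz tolerance is the error range stated in this document. The function names `classify_note` and `run_practice` are hypothetical.

```python
def classify_note(measured_hz, target_hz, level, min_level=0.05, tolerance_hz=5.0):
    """Return one of the four real-time prompts for a single recording."""
    if level < min_level:
        return "sound too quiet"
    if measured_hz > target_hz + tolerance_hz:
        return "pitch too high"
    if measured_hz < target_hz - tolerance_hz:
        return "pitch too low"
    return "intonation accurate"

def run_practice(targets_hz, readings):
    """Step through the score, advancing only when a note is judged accurate.

    `readings` yields one (measured_hz, level) pair per recording.
    Returns the list of prompts shown and whether the piece was completed.
    """
    prompts = []
    index = 0
    for measured_hz, level in readings:
        if index >= len(targets_hz):
            break
        prompt = classify_note(measured_hz, targets_hz[index], level)
        prompts.append(prompt)
        if prompt == "intonation accurate":
            index += 1
    return prompts, index >= len(targets_hz)

# Two target notes and five simulated recordings.
prompts, finished = run_practice(
    [440.0, 493.88],
    [(0.0, 0.0), (450.0, 0.5), (441.0, 0.5), (480.0, 0.5), (495.0, 0.5)],
)
```

In this run the first note is accepted on the third recording and the second note on the fifth, at which point the exercise-complete prompt would be shown.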

1.6 function

Recognizing and detecting the two-dimensional code

Before the user formally starts practicing, image recognition is performed through the camera module of the RZ/A2M microprocessor; the two-dimensional code is recognized after scanning, and the score to be practiced is determined by reading the information in the code.

Detecting the intonation of single notes

When the user formally starts practicing, the microphone peripheral collects audio data, and the corresponding score data are retrieved from the score library to detect and check the accuracy of the user's playing in real time. As the user plays, the system detects the intonation of each note in real time.

Feeding the intonation result back to the user

Whether each note was played accurately, and how to correct it, is indicated to the user by displaying whether the note is high, on pitch, or low. If the intonation is accurate, the next note is judged, and so on until the whole piece is finished, after which the user is prompted that the exercise is complete.

1.7 index

an intelligent temperament proofreading system is designed based on the Renesas RZ/A2M microprocessor;

the design has the characteristics of good practicability, strong independence, low power consumption, and the like;

design parameters and performance index requirements: the sampling rate of the audio file is 44100 Hz; the sampling time is 1 s; the frequency detection range of each tone allows an error of ±5 Hz.

The technical solution of the present invention is further described below with reference to experiments.

1 test device

The intelligent temperament proofreading assistant comprises the following parts: an RZ/A2M microprocessor, an HDMI interface screen 1, a camera, a screen 2, a microphone, a key, a standard SD card storage device, a standard SD card socket, and the GarageBand app.

In the invention, the RZ/A2M serves as the main control chip and is connected to several peripherals to form a complete intonation recognition and echo prompt system. The peripherals include two display screens, which respectively show the live picture while the two-dimensional code is scanned and the single-note recognition result and operation prompts during playing. The key is used to start practice, the microphone captures the notes the user plays, the camera is used to recognize the two-dimensional code, and the SD card storage device stores the score information.

The invention mainly uses the GarageBand app to simulate the sounds of different instruments for testing. GarageBand is digital music creation software developed by Apple Inc.; one of its functions is to simulate the sounds of various instruments, and it is used here to simulate guitar and violin sounds for testing.

2 test environment set-up

To verify that each part of the system operates normally, an overall system test structure was built. Its main purpose is to test the correctness of the program and the normal operation of the equipment by simulating an actual instrument practice environment. The test environment is built on the MATLAB platform and is used for preliminary debugging and for evaluating the work against its performance indices.

The test environment went through two revisions. Initially, an ordinary script served as the carrier: the frequency of each recognized single tone was printed in the command line window, the result was evaluated, and the sampling parameters were adjusted. The first revision used the GUI design tool to lay out a preliminary interface and restructured the software code into callbacks for the individual interface elements. The second revision used the App Designer design tool to beautify the interface and improve its functions, and was packaged into a portable exe file, which is the software version of the invention.

The test environment is built to provide ideas and directions for improving the invention, to collect different improvement schemes, and to improve the user friendliness and stability of the product. The final test environment is mainly the second revised version, and its construction emphasizes ease of use and aesthetics, software porting stability, and the accuracy of recognizing the test user's intonation.

2.1 Ease of use and aesthetics

First, regarding ease of use and aesthetics, the style-optimized interface is composed of a score display module, a progress bar module, a prompt display window module, a piano animation prompt module, and a key group. The "score playing" block displays three kinds of content. It first displays a "loading" animation while the two-dimensional code is being scanned.

The corresponding score is then displayed during playing for the user to consult, and a real-time volume bar animation below the score shows the running state. This makes the interface more attractive and lets users of different ages get started easily. The progress bar module sits at the boundary of the user interface, marks the current playing progress, and refreshes automatically as the user plays, so that the user knows the current position; it also serves to divide the working interface. The prompt display window is located in the upper right corner of the interface and includes a group of three-color indicator lamps and a text display module. The lamps light up automatically according to the pitch of the playing, so that the user receives the prompt faster and young users can operate the system without reading the prompt text. The piano animation in the lower right corner is not only decorative but also helps beginners become familiar with which key corresponds to each single tone while using the test environment. The attractive, easy-to-use test interface helps testing go smoothly and also suggests ideas and directions for improving the invention.

2.2 software migration stability

To improve software porting stability, the folder locations must be configurable on each machine. An options key is therefore added to the key group module, and the settings are offered in a pop-up window so that the interface stays tidy. For convenience, a "browse" key is placed after each option so that the paths are easy to check and change.

2.3 Pitch recognition accuracy

The test environment must support testing of pitch recognition accuracy. The software therefore displays the single-tone detection result and the pitch matching result in the display area, and also prints the currently recognized sound frequency in the command line window in real time, which makes it easy to collect test data and results.

3 test protocol

3.1 image testing

In the image test, several two-dimensional codes are recognized through the camera. After the camera receives a picture containing a two-dimensional code, the picture is decoded with an open-source two-dimensional code recognition library, and the decoded content is retrieved and displayed at the upper left corner of the display screen, thereby testing the accuracy of the two-dimensional code image recognition part.

3.2 Cyclic Sound detection Module testing

On the audio side, the cyclic sound detection module is tested. Notes played by the user are collected through the microphone, with a sampling time of 1 s per single tone and a sampling rate of 44100 Hz. The main control chip filters each single tone, applies a fast Fourier transform, and compares the result with the standard pitch frequency table in the library to judge whether the note was played correctly. The cyclic sound detection module uses screen 2 to display prompt information for each note during playing, so the test must judge whether the four states shown on the screen, "sound too small", "accurate intonation", "pitch too high" and "pitch too low", are displayed correctly. Because the user will not necessarily practice in a quiet environment, a noise interference test is added to judge the accuracy of the module. Since the main target group of the invention is string instrument beginners, and violin beginners above all, the detected pitches lie within the range covered by the first position of the violin, whose highest pitch is B2; the highest pitch to be tested is therefore also B2 (solfège name si).
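The frequency-estimation step of this module can be illustrated with a small sketch. It is a minimal example under stated assumptions, not the firmware running on the main control chip: it uses a textbook radix-2 FFT written in pure Python, truncates the recording to the largest power-of-two length, applies a Hann window, omits the filtering stage, and assumes the strongest spectral bin is the fundamental (which real instruments with strong harmonics may violate). The function names `fft`, `dominant_frequency` and `sine` are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def fft(x: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples: Sequence[float], sample_rate: int) -> float:
    """Return the frequency (Hz) of the strongest spectral bin."""
    if len(samples) < 4:
        raise ValueError("need at least four samples")
    # Assumption: truncate to the largest power of two so radix-2 FFT applies.
    n = 1 << (len(samples).bit_length() - 1)
    # A Hann window reduces spectral leakage from the finite recording.
    windowed = [
        samples[i] * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i in range(n)
    ]
    spectrum = fft(windowed)
    # Skip bin 0 (DC) and search only below the Nyquist frequency.
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = 1 + max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / n


def sine(freq_hz: float, sample_rate: int, count: int) -> List[float]:
    """Synthetic tone standing in for a microphone recording."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]
```

Calling `dominant_frequency(sine(440.0, 44100, 8192), 44100)` returns a value within one bin width (44100 / 8192, about 5.4 Hz) of 440 Hz. A longer recording narrows the bin width; the estimate would then be compared with the table entry for the expected note.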

According to the judgment standard, the invention simulates the following six conditions during testing:

(1) "Sound too small" test

The distance between the device and the place where the user plays is set to 5 meters, the output volume of the GarageBand app used in the test is set to 30%, and screen 2 is observed to see whether it displays "sound too small" during playing. To ensure accuracy and rigor, the test is repeated 10 times: the distance is set to 0.5 m, 1 m, 2 m, 3 m and 4 m, and each distance is tested twice, once at 30% volume and once at 50% volume, observing each time whether screen 2 displays "sound too small".
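The ten repeated cases and a possible loudness check can be sketched as follows. The grid of distances and volumes comes from the description above. The detection rule does not: the source does not say how "sound too small" is decided, so the root-mean-square (RMS) level check, the threshold `QUIET_RMS_THRESHOLD` and the function names are all assumptions made for illustration, with samples assumed to be normalized to the range [-1, 1].

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

# Hypothetical threshold; a real device would calibrate this value.
QUIET_RMS_THRESHOLD = 0.02

DISTANCES_M = [0.5, 1, 2, 3, 4]   # distances listed in the test description
VOLUMES_PERCENT = [30, 50]        # app volume settings listed in the description


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_too_quiet(samples: Sequence[float],
                 threshold: float = QUIET_RMS_THRESHOLD) -> bool:
    """Assumed rule: report 'sound too small' when the RMS level is below threshold."""
    return rms(samples) < threshold


def volume_test_grid() -> List[Tuple[float, int]]:
    """Every (distance, volume) pair: 5 distances x 2 volumes = 10 cases."""
    return list(product(DISTANCES_M, VOLUMES_PERCENT))
```

Each pair returned by `volume_test_grid()` corresponds to one run in which the tester records whether screen 2 shows "sound too small".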

(2) "Accurate intonation" test

After practice starts, a two-dimensional code is scanned to identify the score, the whole score is played with the correct notes, and screen 2 is observed to see whether it displays "accurate intonation". To ensure accuracy and rigor, 10 different pieces are tested in turn.

(3) "Pitch too high" test

After practice starts, a two-dimensional code is scanned to identify the score, each note is played with a tone higher than the corresponding tone in the score, and screen 2 is observed to see whether it displays "pitch too high". To ensure accuracy and rigor, every tone to be tested is tested 10 times with the higher tone.

(4) "Pitch too Low" test

After practice starts, a two-dimensional code is scanned to identify the score, each note is played with a tone lower than the corresponding tone in the score, and screen 2 is observed to see whether it displays "pitch too low". To ensure accuracy and rigor, every tone to be tested is tested 10 times with the lower tone.

(5) Noise interference test

The invention is tested in environments with continuous noise of 40-50 dB, 60-70 dB and 70-80 dB, and the accuracy of each sound detection is observed.

(6) Single tone frequency range testing

The system is tested in a closed dormitory environment with a sampling rate of 44100 Hz and an actual audio recording time of 1 s. Under twelve-tone equal temperament, each note has a standard frequency in each octave, and the frequency data measured by the program is compared with this standard frequency. The seven basic notes Do, Re, Mi, Fa, So, La and Si in the fifth octave are selected for the test. They are played on a string instrument, the system performs an FFT on the input to obtain the measured frequency, and the result is compared with the standard frequency.
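The standard frequencies used for comparison can be derived from twelve-tone equal temperament, in which each semitone multiplies the frequency by the twelfth root of two. The sketch below assumes the common reference A4 = 440 Hz, fixed-do solfège (Do = C), and scientific pitch notation for octave numbers; the source does not state its reference pitch or octave numbering, so the octave passed in here may not match the document's "fifth octave". The names `note_frequency` and `SEMITONES_FROM_C` are hypothetical.

```python
A4_HZ = 440.0  # assumed reference pitch

# Semitone offsets of the seven basic notes above C (fixed-do assumption).
SEMITONES_FROM_C = {"Do": 0, "Re": 2, "Mi": 4, "Fa": 5, "So": 7, "La": 9, "Si": 11}


def note_frequency(solfege: str, octave: int) -> float:
    """Equal-temperament frequency in Hz of a basic note in a given octave."""
    # MIDI-style note number: C4 is 60 and A4 is 69.
    number = 12 * (octave + 1) + SEMITONES_FROM_C[solfege]
    return A4_HZ * 2 ** ((number - 69) / 12)


def octave_table(octave: int) -> dict:
    """Standard frequencies of Do..Si for one octave."""
    return {name: note_frequency(name, octave) for name in SEMITONES_FROM_C}
```

With these assumptions, `octave_table(5)` gives the reference column against which measured frequencies would be compared; La in that octave is 880 Hz, one octave above the 440 Hz reference.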

3.3 test data and results analysis

3.3.1 image testing

Ten two-dimensional codes were recognized through the camera in the test, and the results were all displayed accurately in the upper left corner of the display screen.

3.3.2 Cyclic Sound detection Module testing

(1) The "sound too small" test:

In the 10 experiments, "sound too small" was displayed in only two cases: at a distance of 3 meters with the GarageBand volume set to 30%, and at a distance of 4 meters with the GarageBand volume set to 30%. Considering that the test site was a closed space, the test result is accurate within the allowable error range.

(2) "Accurate intonation" test:

The 10 different scores were played with the correct tones and were all identified accurately.

(3) "pitch too high" test:

All tones to be tested were played 10 times with the higher tone, and every result showed "pitch too high".

(4) "Pitch too Low" test:

All tones to be tested were played 10 times with the lower tone, and every result showed "pitch too low".

(5) Noise interference test:

The invention still identifies single tones accurately in environments of 40-50 dB and 50-60 dB, so it can identify single tones accurately outdoors or in noisy environments of up to 60 dB.

(6) Single tone frequency range test:

Based on the test results, the absolute value of the frequency difference between the measured data and the standard data is listed in the deviation column, and the error rate is calculated. As shown in Table 2, the measured frequencies of the standard notes differ slightly from the standard frequencies: the maximum and minimum frequency differences are 1.727 Hz and 0.021 Hz respectively, the error of every note is less than 0.3%, and the average error rate is 0.112%, which is very small.
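The deviation and error-rate columns can be computed as sketched below. The formula is an assumption: the source does not define "error rate", and it is taken here as the absolute frequency difference divided by the standard frequency. The function names are hypothetical, and the numbers in the usage note are placeholders chosen for illustration, not values from Table 2.

```python
from typing import Dict, List, Tuple


def error_rate_percent(measured_hz: float, standard_hz: float) -> float:
    """Assumed definition: |measured - standard| / standard, in percent."""
    return abs(measured_hz - standard_hz) / standard_hz * 100.0


def summarize(pairs: List[Tuple[float, float]]) -> Dict[str, float]:
    """Summarize (measured, standard) pairs the way the results are reported."""
    diffs = [abs(measured - standard) for measured, standard in pairs]
    rates = [error_rate_percent(measured, standard) for measured, standard in pairs]
    return {
        "max_diff_hz": max(diffs),
        "min_diff_hz": min(diffs),
        "max_error_percent": max(rates),
        "mean_error_percent": sum(rates) / len(rates),
    }
```

For instance, `summarize([(881.0, 880.0), (439.5, 440.0)])` uses two made-up measurements to produce the maximum and minimum differences and the mean error rate in the same form as the figures quoted above.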

Table 2 test results for the sixth octave

To further test the effectiveness of the designed system, the seven basic notes of another octave, the fifth, were selected for a repeat test under the same conditions, and the mean error rate was calculated, as shown in Table 3.

As can be seen from Table 3, the errors for this octave are all less than 0.2%: the maximum error is 0.193%, the minimum error is 0.007%, and the average error is 0.102%, which is well within the frequency spacing between adjacent semitones defined by twelve-tone equal temperament.
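The comparison with the semitone spacing can be made concrete by converting a relative frequency error into cents, where one equal-temperament semitone is 100 cents. This conversion is a standard music-theory formula added here for illustration; it is not part of the reported test procedure, and the function name `error_in_cents` is hypothetical.

```python
import math

# Frequency ratio between adjacent semitones in twelve-tone equal temperament.
SEMITONE_RATIO = 2 ** (1 / 12)


def error_in_cents(error_percent: float) -> float:
    """Convert a relative frequency error in percent into cents."""
    return 1200.0 * math.log2(1.0 + error_percent / 100.0)


def semitone_spacing_percent() -> float:
    """Relative frequency step between adjacent semitones, in percent."""
    return (SEMITONE_RATIO - 1.0) * 100.0
```

Adjacent semitones differ by about 5.9% in frequency, so the reported maximum error of 0.193% corresponds to roughly 3.3 cents, a small fraction of one semitone.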

TABLE 3 fifth octave test results

The test result proves that the system can effectively calculate and analyze the sound of the player in real time, and has small error.

Based on a survey of mobile instrument-assistant apps currently on the market, this system aims to better help string instrument beginners correct and teach themselves. It combines image recognition, in which a two-dimensional code is scanned to identify a score stored in the library, with real-time dynamic monitoring of the practicer's fingering and playing errors. This effectively addresses the time and labor cost of offline accompanied practice, simplifies the steps of traditional teaching, and reflects the trend of future life. Taking the relatively mature string instrument assistant apps on Android and iOS as examples, their content falls roughly into two types. The first type is online lesson apps such as Finger and other guitar lesson apps, whose main content is user social networking and zero-basics online lessons. They are led by teaching scores and rely on the user's self-directed practice, but lack professional intonation correction training for the individual. The second type is tuning software such as solo and Guitartunana. These apps are used for preparation before practicing string instruments such as the guitar and ukulele, that is, for adjusting the tension and pitch of the individual strings.

Drawing on the typical practice experience of beginners, the invention identifies a gap in what existing apps offer them: correcting a beginner's playing during score practice. When beginners first start practicing from a score, the fingerboard carries no pitch markers, so their fingering and pitch memory are often unsteady, and when they make mistakes they can usually build muscle memory only by repeatedly checking artificial marks on the fingerboard. In addition, the conventional practice of marking the fingerboard with brightly colored tape or correction fluid has drawbacks: the number of marks is limited, the marks come off easily, and the instrument itself can be damaged.

Based on the above survey, the present invention proposes the following innovative solution: the user scans a two-dimensional code to call up a score stored in the library, and while the score is displayed, whether each played note is high or low is shown dynamically. This avoids the drawbacks of traditional practice and memorization and gives the user a better self-practice experience.

It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a disk, CD- or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, or by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.

The above description is only a specific embodiment of the present invention and is not to be construed as limiting its scope of protection; all modifications, equivalents and improvements made within the spirit and principles of the invention are intended to fall within the scope defined by the appended claims.
