Music generation method based on Monte Carlo tree search

Document No.: 1965015. Published: 2021-12-14.

Reading note: this invention, "Music generation method based on Monte Carlo tree search" (一种基于蒙特卡洛树搜索的音乐生成方法), was created by 殷渝杰, 王越, 姚奕芃, 罗中祺, 王子俊, 廖奕凯, 李伟睿, 陈晨兆阳 and 张雨骁 on 2021-08-25. Abstract: The invention relates to the technical field of music and discloses a music generation method based on Monte Carlo tree search. The method is built around four elements: key, musical form, harmony, and rhythm and melody. The piece is in 4/4 time at a tempo of 120, and each fixed-length song section A, B, C, … is 16 bars long. Compared with traditional composition, the method moves the composing process entirely onto a terminal device: the user works directly on a computer, and the corresponding music is generated automatically by preset programs. This makes composing more convenient and faster, greatly reduces the tools required, and lets the user compose anytime and anywhere.

1. A music generation method based on Monte Carlo tree search, characterized in that the method is built around key, musical form, harmony, and rhythm and melody; the piece is in 4/4 time at a tempo of 120; each fixed-length song section A, B, C, … is 16 bars long; and the method comprises the following steps:

S1, key generation: the key is selected by the user;

S2, form generation: the user inputs a number giving the total number of sections in the structure, and the corresponding structure is returned; for example, an input of 3 returns A-B-A, an input of 5 returns A-B-A-C-A, and so on; within section A, the internal structure is a+a+b+a, a+b+b+a, a+b+a+a, or the like, where each small form unit is 4 bars long;

S3, harmony generation: first, chord structures are listed according to the definitions of triads, seventh chords and ninth chords; then, following Sposobin's Harmony, a progression of 2, 4 or 8 chords is generated within each small form unit; with 2 chords, each chord occupies 2 bars, and with 8 chords, each occupies 0.5 bar; the triads and seventh chords are used for harmonic accompaniment, and the ninth chords are used for melody generation;

S4, rhythm generation: this step takes two bars as one processing unit; in 4/4 time a quarter note is one beat and each bar has 4 beats, so the unit contains 8 quarter-note positions; not all 8 positions sound: according to a number given by the user, the positions that sound are selected from the 8, and the remaining positions are handled with ties, forming the rhythm of the piece;

S5, melody generation: on the basis of the given rhythm, notes are filled into the selected positions; because a chord environment has already been established for the melody, the notes are chosen from the chord so that the melody agrees with it: notes are randomly selected from the 5 tones that make up the chord, randomly arranged and combined, and filled into the positions selected in the previous step.

2. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: in step S2, A is the verse and B is the chorus; A occurs in the piece at least as often as B, and B at least as often as C, so that A is primary, B is secondary, and so on; the other letters denote new sections different from A and B; and the first two sections must be A-B.

3. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: in step S3, the chords are arranged into a sequence; a chord may be repeated in the sequence, but not more than 2 times; and the sequence must contain the tonic (I) chord and the submediant (VI) chord of the selected key.

4. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: in step S3, a list is generated according to the definitions of the various triads, seventh chords and ninth chords; for example, the major triad has the structure [0,4,7], and so on.

5. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: in step S3, the lowest note of a chord is called the root of the chord, and the note name of the root can be used to name the chord.

6. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: the chord in step S5 is a ninth chord.

7. The music generation method based on Monte Carlo tree search according to claim 1, characterized in that: after the final piece is generated, it is output as four audio tracks: two tracks carry the main melody, one track carries the piano accompaniment, and the fourth track carries the bass line of the whole piece, which balances the frequency range of the piece.

Technical Field

The invention relates to the technical field of music, in particular to a music generation method based on Monte Carlo tree search.

Background

Music is an art form and cultural activity whose medium is regular sound waves organized in time. Its basic elements include loudness, pitch, duration and timbre. These basic elements combine to form the common formal elements of music, such as rhythm, melody and harmony, as well as dynamics, tempo, mode, form and texture. The formal elements are the expressive means of music, and different types of music may emphasize or ignore some of them. Music is performed with various instruments and vocal techniques and is divided into instrumental music, vocal music, and works that combine singing with instruments. At present, music is composed either directly on an instrument or with an instrument combined with a computer. Either way, an instrument is indispensable, so the tools needed for composing are cumbersome, which makes it hard for users to compose quickly and conveniently.

Disclosure of Invention

(I) Technical problem to be solved

In view of the shortcomings of the prior art, the invention provides a music generation method based on Monte Carlo tree search, which addresses the problem raised in the Background section above.

(II) Technical solution

To achieve the above purpose, the invention provides the following technical solution: a music generation method based on Monte Carlo tree search, built around key, musical form, harmony, and rhythm and melody, in which the piece is in 4/4 time at a tempo of 120 and each song section A, B, C, … is 16 bars long. The method comprises the following steps:

S1, key generation: the key is selected by the user;

S2, form generation: the user inputs a number giving the total number of sections in the structure, and the corresponding structure is returned; for example, an input of 3 returns A-B-A, an input of 5 returns A-B-A-C-A, and so on;

S3, harmony generation: first, chord structures are listed according to the definitions of triads, seventh chords and ninth chords; then, following Sposobin's Harmony, a progression of 2, 4 or 8 chords is generated within a small form unit; with 2 chords, each chord occupies 2 bars, and with 8 chords, each occupies 0.5 bar; the triads and seventh chords are used for harmonic accompaniment, and the ninth chords are used for melody generation;

S4, rhythm generation: this step takes two bars as one processing unit; in 4/4 time a quarter note is one beat and each bar has 4 beats, so the unit contains 8 quarter-note positions; not all 8 positions sound: according to a number given by the user, it is decided which of the 8 positions sound and which do not, forming the rhythm of the piece;

S5, melody generation: on the basis of the given rhythm, notes are filled into the selected positions; because a chord environment has already been established for the melody, the notes are chosen from the chord so that the melody agrees with it: notes are randomly selected from the 5 tones that make up the chord, randomly arranged and combined, and filled into the positions selected in the previous step.

Preferably, in step S2, A is the verse and B is the chorus; A occurs in the piece at least as often as B, and B at least as often as C, so that A is primary, B is secondary, and so on; the other letters denote new sections different from A and B; and the first two sections must be A-B.
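The form constraints above can be illustrated with a minimal Python sketch. The source gives only two worked examples (3 returns A-B-A, 5 returns A-B-A-C-A) and does not state the general rule, so the rondo-like rule used here (A on every odd-numbered position, a new letter on every even-numbered position) is an assumption, and the names `generate_form` and `satisfies_constraints` are hypothetical.

```python
from collections import Counter
from string import ascii_uppercase


def generate_form(num_sections: int) -> list[str]:
    """Return a section layout such as ['A', 'B', 'A'].

    Assumed rule (not stated in the source): positions 1, 3, 5, ... repeat
    section A, and positions 2, 4, 6, ... each introduce the next new letter.
    """
    if num_sections < 1 or num_sections // 2 >= len(ascii_uppercase):
        raise ValueError("num_sections is outside the supported range")
    form = []
    next_new = 1  # index of the next unused letter (B, C, ...)
    for position in range(num_sections):
        if position % 2 == 0:
            form.append("A")
        else:
            form.append(ascii_uppercase[next_new])
            next_new += 1
    return form


def satisfies_constraints(form: list[str]) -> bool:
    """Check the stated constraints: the form starts A-B, and
    count(A) >= count(B) >= count(C) >= ..."""
    if len(form) >= 2 and form[:2] != ["A", "B"]:
        return False
    counts = Counter(form)
    ordered = [counts[letter] for letter in sorted(counts)]
    return all(a >= b for a, b in zip(ordered, ordered[1:]))


print(generate_form(3))  # ['A', 'B', 'A']
print(generate_form(5))  # ['A', 'B', 'A', 'C', 'A']
```

Under this assumed rule every generated form satisfies the stated ordering of occurrence counts, but other rules that meet the same constraints are equally possible.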

Preferably, in step S3, the chords are arranged into a sequence; a chord may be repeated in the sequence, but not more than 2 times; and the sequence must contain the tonic (I) chord and the submediant (VI) chord of the selected key.
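A chord sequence that respects these constraints could be sampled as in the following Python sketch. It works on scale degrees only and makes several assumptions that the source does not confirm: "not more than 2 times" is read as at most two occurrences of a degree, candidate chords are the seven diatonic degrees, the order is random, and the progression fills a 4-bar unit (which is consistent with 2 chords lasting 2 bars each and 8 chords lasting 0.5 bar each). The names `sample_progression` and `bars_per_chord` are hypothetical.

```python
import random
from collections import Counter

DEGREES = [1, 2, 3, 4, 5, 6, 7]  # diatonic scale degrees of the selected key


def sample_progression(length: int, rng: random.Random) -> list[int]:
    """Sample a progression of 2, 4 or 8 scale degrees.

    Constraints taken from the text: degrees 1 and 6 must both appear,
    and no degree appears more than twice (assumed reading).
    """
    if length not in (2, 4, 8):
        raise ValueError("length must be 2, 4 or 8")
    progression = [1, 6]  # the two required chords
    counts = Counter(progression)
    while len(progression) < length:
        candidate = rng.choice(DEGREES)
        if counts[candidate] < 2:
            progression.append(candidate)
            counts[candidate] += 1
    rng.shuffle(progression)  # assumed: the order is chosen at random
    return progression


def bars_per_chord(length: int, unit_bars: int = 4) -> float:
    """Duration of each chord when `length` chords share one unit."""
    return unit_bars / length


rng = random.Random(0)
print(sample_progression(4, rng), bars_per_chord(4))
```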

Preferably, in step S3, a list is generated according to the definitions of the various triads, seventh chords and ninth chords; for example, the major triad has the structure [0,4,7], and so on.

Preferably, in step S3, the lowest note of a chord is called the root of the chord, and the note name of the root can be used to name the chord.
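The chord list and root naming can be sketched in Python as follows. Only the major triad [0,4,7] is given in the source; the other interval lists are the standard textbook definitions and are included as assumptions, as are the names `CHORD_INTERVALS`, `build_chord` and `chord_label` and the use of MIDI note numbers for pitches.

```python
# Semitone offsets above the root. Only "major_triad" comes from the source;
# the remaining entries are standard definitions added for illustration.
CHORD_INTERVALS = {
    "major_triad": [0, 4, 7],
    "minor_triad": [0, 3, 7],
    "major_seventh": [0, 4, 7, 11],
    "dominant_seventh": [0, 4, 7, 10],
    "minor_seventh": [0, 3, 7, 10],
    "major_ninth": [0, 4, 7, 11, 14],
    "dominant_ninth": [0, 4, 7, 10, 14],
    "minor_ninth": [0, 3, 7, 10, 14],
}

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def build_chord(root_midi: int, quality: str) -> list[int]:
    """Return the MIDI note numbers of a root-position chord."""
    return [root_midi + offset for offset in CHORD_INTERVALS[quality]]


def chord_label(root_midi: int, quality: str) -> str:
    """Name the chord after the note name of its root (its lowest note)."""
    return f"{NOTE_NAMES[root_midi % 12]} {quality}"


print(build_chord(60, "major_triad"))   # [60, 64, 67]
print(chord_label(60, "major_ninth"))   # C major_ninth
```

A ninth chord built this way has exactly 5 tones, which matches the 5 tones from which step S5 draws its melody notes.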

Preferably, the chord in step S5 is a ninth chord.

Preferably, after the final piece is generated, it is output as four audio tracks: two tracks carry the main melody, one track carries the piano accompaniment, and the fourth track carries the bass line of the whole piece, which is used to suppress the high frequencies of the piece.

(III) Advantageous effects

The invention provides a music generation method based on Monte Carlo tree search, which has the following beneficial effects:

Compared with traditional composition, the invention moves the composing process entirely onto a terminal device. A musician can work directly on a computer, and the corresponding music is generated automatically by preset programs. This makes composing more convenient and faster and greatly reduces the tools required, so that the musician can compose at any time and in any place and does not lose inspiration while looking for composing tools.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention provides the following technical solution: a music generation method based on Monte Carlo tree search, built around key, musical form, harmony, and rhythm and melody, in which the piece is in 4/4 time at a tempo of 120 and each song section A, B, C, … is 16 bars long. The method comprises the following steps:

S1, key generation: the key is selected by the user;

S2, form generation: the user inputs a number giving the total number of sections in the structure, and the corresponding structure is returned; for example, an input of 3 returns A-B-A, an input of 5 returns A-B-A-C-A, and so on, where A is the verse and B is the chorus; A occurs in the piece at least as often as B, and B at least as often as C, so that A is primary, B is secondary, and so on; the other letters denote new sections different from A and B; and the first two sections must be A-B;

S3, harmony generation: first, chord structures are listed according to the definitions of triads, seventh chords and ninth chords; then, following Sposobin's Harmony, a progression of 2, 4 or 8 chords is generated within a small form unit; with 2 chords, each chord occupies 2 bars, and with 8 chords, each occupies 0.5 bar; the triads and seventh chords are used for harmonic accompaniment, and the ninth chords are used for melody generation; the chords are arranged into a sequence, in which a chord may be repeated, but not more than 2 times, and which must contain the tonic (I) and submediant (VI) chords of the selected key; a list is generated according to the definitions of the various triads, seventh chords and ninth chords, for example, the major triad has the structure [0,4,7], and so on; the lowest note of a chord is called the root of the chord, and the note name of the root can be used to name the chord;

S4, rhythm generation: this step takes two bars as one processing unit; in 4/4 time a quarter note is one beat and each bar has 4 beats, so the unit contains 8 quarter-note positions; not all 8 positions sound: according to a number given by the user, it is decided which of the 8 positions sound and which do not, forming the rhythm of the piece;

S5, melody generation: on the basis of the given rhythm, notes are filled into the selected positions; because a chord environment has already been established for the melody, the notes are chosen from the chord so that the melody agrees with it: notes are randomly selected from the 5 tones of the ninth chord, randomly arranged and combined, and filled into the positions selected in the previous step.
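Steps S4 and S5 can be sketched together in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the sounding positions are drawn uniformly at random, each sounding position receives an independently chosen chord tone (one possible reading of "randomly selected, randomly arranged and combined"), the tempo of 120 is taken to mean 120 quarter-note beats per minute, and the names `generate_rhythm`, `fill_melody` and `slot_start_times` are hypothetical.

```python
import random

SLOTS_PER_UNIT = 8  # two 4/4 bars, one slot per quarter note


def generate_rhythm(num_onsets: int, rng: random.Random) -> list[bool]:
    """Choose which of the 8 quarter-note positions sound (True)."""
    if not 0 <= num_onsets <= SLOTS_PER_UNIT:
        raise ValueError("num_onsets must be between 0 and 8")
    onsets = set(rng.sample(range(SLOTS_PER_UNIT), num_onsets))
    return [slot in onsets for slot in range(SLOTS_PER_UNIT)]


def fill_melody(rhythm: list[bool], ninth_chord: list[int],
                rng: random.Random) -> list:
    """Put a random chord tone on every sounding position.

    Non-sounding positions are marked None; whether they become rests or
    ties is left open here.
    """
    if len(ninth_chord) != 5:
        raise ValueError("a ninth chord is expected to have 5 tones")
    return [rng.choice(ninth_chord) if sounds else None for sounds in rhythm]


def slot_start_times(tempo_bpm: float = 120.0) -> list[float]:
    """Start time in seconds of each slot, assuming tempo is in quarter-note BPM."""
    return [slot * 60 / tempo_bpm for slot in range(SLOTS_PER_UNIT)]


rng = random.Random(0)
rhythm = generate_rhythm(5, rng)                # user asked for 5 sounding notes
melody = fill_melody(rhythm, [60, 64, 67, 71, 74], rng)  # C major ninth
print(list(zip(slot_start_times(), melody)))
```

At the assumed 120 beats per minute, each quarter-note position lasts 0.5 s, so one two-bar unit lasts 4 s and one 16-bar section lasts 32 s.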

After the final piece is generated, it is output as four audio tracks: two tracks carry the main melody, one carries the piano accompaniment, and the fourth carries the bass line of the whole piece, which is used to suppress the high frequencies of the piece.

The method further comprises a neural network part, a backbone network part, and a combination part. In the neural network part, adjacent related tokens in REMI, a beat-based, MIDI-like representation, are merged into a new compound word (CP) in order to shorten the sequence, so that the sequence length of a whole song falls within the range that the attention mechanism can handle. In the CP representation, note information such as dynamics (velocity), chord and pitch is first merged into a CP, and auxiliary and timing information such as tempo, beat and bar is then merged in. Next, the resulting CP is mapped to a single high-dimensional vector: a segment of the vector is reserved for each of the six symbols, and if a symbol is absent from the original CP, an [ignore] symbol representing null is filled in. Finally, a linear layer transforms the concatenated high-dimensional vector to a lower dimension.
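The grouping and padding step of the CP representation can be sketched in Python as follows. The sketch covers only the construction of fixed-width compound words; the embedding and the linear layer are omitted. The six field names and their order are an assumption drawn from the symbols listed in the paragraph above, and `to_compound_word` and `compress` are hypothetical names.

```python
# Assumed field layout: the six symbols named in the text, in an arbitrary order.
FIELDS = ("bar", "beat", "tempo", "chord", "pitch", "velocity")
IGNORE = "[ignore]"  # placeholder for a symbol that is absent from the word


def to_compound_word(event: dict) -> tuple:
    """Turn one group of related tokens into a fixed-width compound word."""
    unknown = set(event) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    return tuple(event.get(field, IGNORE) for field in FIELDS)


def compress(events: list[dict]) -> list[tuple]:
    """Replace each token group with one compound word, shortening the sequence."""
    return [to_compound_word(event) for event in events]


events = [
    {"bar": 1, "beat": 1, "chord": "C"},   # timing and harmony tokens
    {"pitch": 60, "velocity": 80},         # note tokens
]
for word in compress(events):
    print(word)
```

In this toy example, five individual tokens become a sequence of two compound words, each with all six slots filled either by a value or by `[ignore]`.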

In practice, it was found that tempo information and dynamics (velocity) information are unnecessary for music generation and interfere to some extent with the generation of key information such as pitch. The tempo and dynamics information of the original model is therefore removed in actual use.

Pitch is treated as the combination of an octave and one of 12 pitch names, rather than as a single independent vector. MIDI uses 128 pitches in total, which span 11 octaves (the last one incomplete). For the embedding, vectors for the 12 pitch names and the 11 octaves are initialized, and a complete pitch is obtained by concatenating the two vectors. In this way, prior knowledge of music theory is introduced into the neural network to some extent. This representation has several benefits. For example, it is easier for the network to find connections between similar melodies that differ only in octave. Moreover, pitches are very unevenly distributed in the data, and some occur very rarely, so an embedding trained on whole pitches alone performs poorly for them and the network struggles to extract information from it. The concatenated representation generalizes better: even if a pitch rarely appears in the data set, its octave and its pitch name are both trained extensively, so the model can recover the information for that pitch by combining the two.
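The decomposition can be sketched in plain Python. The toy embedding tables below are placeholders rather than trained values, the vector sizes are arbitrary, and `split_pitch` and `pitch_embedding` are hypothetical names. The octave index here is simply the MIDI number divided by 12, which is an assumption about how the octaves are counted.

```python
NUM_PITCH_NAMES = 12
NUM_OCTAVES = 11  # 128 MIDI pitches span 11 blocks of 12, the last one incomplete


def split_pitch(midi_number: int) -> tuple[int, int]:
    """Return (octave_index, pitch_name_index) for a MIDI note number."""
    if not 0 <= midi_number <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    return divmod(midi_number, NUM_PITCH_NAMES)


def pitch_embedding(midi_number: int, octave_table: list, name_table: list) -> list:
    """Concatenate the octave vector and the pitch-name vector."""
    octave, name = split_pitch(midi_number)
    return octave_table[octave] + name_table[name]  # list concatenation


# Toy tables: 11 + 12 = 23 vectors instead of 128 separate pitch vectors.
octave_table = [[float(i)] * 2 for i in range(NUM_OCTAVES)]
name_table = [[float(j)] * 3 for j in range(NUM_PITCH_NAMES)]

print(split_pitch(60))                                 # (5, 0)
print(pitch_embedding(60, octave_table, name_table))   # [5.0, 5.0, 0.0, 0.0, 0.0]
print(pitch_embedding(72, octave_table, name_table))   # [6.0, 6.0, 0.0, 0.0, 0.0]
```

MIDI notes 60 and 72 share the same pitch-name part of the vector and differ only in the octave part, which is the property the paragraph above relies on.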

In the backbone network part, Rotary Position Embedding (RoPE) is adopted. It is constructed using complex numbers and provides a relative position encoding that is compatible with linear attention.
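The complex-number construction can be illustrated with a small pure-Python sketch. It shows only the rotation itself and the relative-position property, not its integration into linear attention, which the source does not detail. The frequency schedule with base 10000 is the commonly used default and is an assumption here, as are the names `rope` and `dot`.

```python
import cmath


def rope(vector: list[float], position: int, base: float = 10000.0) -> list[float]:
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... as complex numbers.

    Pair i is multiplied by exp(1j * position * theta_i), where
    theta_i = base ** (-2 * i / d) and d is the vector length.
    """
    dim = len(vector)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    rotated = []
    for i in range(dim // 2):
        theta = base ** (-2 * i / dim)
        z = complex(vector[2 * i], vector[2 * i + 1])
        z *= cmath.exp(1j * position * theta)
        rotated.extend([z.real, z.imag])
    return rotated


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.7, 2.0]
k = [1.5, 0.4, -0.6, 0.9]
# Both pairs of positions are 2 apart, so the two scores agree
# up to floating-point rounding.
print(dot(rope(q, 3), rope(k, 1)))
print(dot(rope(q, 12), rope(k, 10)))
```

Because the score between a rotated query and a rotated key depends only on the difference of their positions, the encoding is relative even though each vector is rotated by its absolute position.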

In the combination part, 4 bars are selected from the melody generated by the neural network and harmony is added to them, which leads into the logic of steps S1 to S5. In addition, two 4-bar segments can be cut from an AI-generated piece of more than 30 bars and used to generate different sections. Because the AI-generated music already contains harmony, notes with a MIDI number of 40 or lower can be deleted directly, and where two notes sound simultaneously, the higher one is taken as the melody. Following the algorithm in S5, the melody notes within each harmonic-rhythm unit are collected, the chord in which these notes occur most often is determined, and that chord is selected as the harmony. Once the harmony is determined, the music-theory algorithm takes over, and a MIDI file can then be generated.
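The melody extraction and chord selection described above can be sketched in Python. The sketch assumes that notes are given as (onset, MIDI number) pairs, that "simultaneously" means an identical onset, that chords are compared by pitch class, and that ties between chords are broken by candidate order; none of these details is specified in the source, and `extract_melody` and `choose_chord` are hypothetical names.

```python
def extract_melody(notes: list[tuple[float, int]]) -> list[tuple[float, int]]:
    """Drop notes with MIDI number <= 40 and keep the highest note per onset."""
    highest = {}
    for onset, pitch in notes:
        if pitch <= 40:
            continue  # treated as accompaniment or bass and removed
        if onset not in highest or pitch > highest[onset]:
            highest[onset] = pitch
    return sorted(highest.items())


def choose_chord(melody_pitches: list[int], candidates: dict[str, set[int]]) -> str:
    """Pick the candidate chord whose pitch classes cover the most melody notes.

    On a tie, the first candidate in dictionary order wins (an assumption).
    """
    def score(name: str) -> int:
        return sum(1 for pitch in melody_pitches if pitch % 12 in candidates[name])

    return max(candidates, key=score)


notes = [(0.0, 36), (0.0, 60), (0.0, 64), (1.0, 62), (1.0, 40), (2.0, 67)]
melody = extract_melody(notes)
print(melody)  # [(0.0, 64), (1.0, 62), (2.0, 67)]

candidates = {"C": {0, 4, 7}, "G": {7, 11, 2}}
print(choose_chord([60, 64, 67, 62], candidates))  # C
```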

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
