Song list extraction method based on a morphological method

Document No.: 1905446  Published: 2021-11-30

Reading note: this technology, a song list extraction method based on a morphological method (一种基于形态学方法的歌单提取方法), was created by 李文熙, 冯瑞, 王鑫, and 郭干城 on 2021-08-23. The invention relates to a song list extraction method based on a morphological method, comprising the following steps: (1) a text detection module detects the text lines appearing in a song list screenshot and passes the positions of all text lines to the region filtering module; (2) the region filtering module filters out text regions that do not meet the requirements and retains the real song content information; (3) the region merging module merges related text lines, thereby assembling richer song information; (4) the information extraction module performs the final extraction, filtering the merged regions again to obtain the real song entries. The invention can remove text that does not belong to the song list and can combine different types of information belonging to the same song, which improves retrieval efficiency. It is suitable for all kinds of music software: migrating a song list from one music application to another can improve the initial efficiency of the software.

1. A song list extraction method based on a morphological method, used for extracting song information from a song list screenshot, characterized in that it comprises the following steps:

step (1): a text detection module detects the text lines appearing in the song list screenshot and passes the positions of all text lines to the region filtering module;

step (2): a region filtering module filters out text regions that do not meet the requirements and retains the real song content information;

step (3): a region merging module merges related text lines, thereby assembling richer song information;

step (4): an information extraction module performs the final information extraction, filtering the merged regions again to obtain the real song entry information.

2. The song list extraction method based on a morphological method according to claim 1, characterized in that: the text detection module in step (1) comprises a detection algorithm and a recognition algorithm, which together detect the position of each text line in the song list screenshot and the text content of that line.

3. The song list extraction method based on a morphological method according to claim 1, characterized in that: the region filtering module in step (2) applies prior conditions, based on information such as the proportions and position of each rectangular box, to remove text boxes that do not match song list information; under these rules a text box is filtered out when the upper-left corner of the text line lies to the right of the central axis of the image, when its aspect ratio is greater than 1, or when its area is too small.

4. The song list extraction method based on a morphological method according to claim 1, characterized in that: the region merging module in step (3) uses a merging method based on the dilation operation, which connects the song name region and the album name region of the same song into one connected region while leaving the regions of different songs unconnected.

5. The song list extraction method based on a morphological method according to claim 1, characterized in that: the information extraction module in step (4) comprises a region classification unit and a structuring unit; the obtained regions are classified to locate the real song list entries, and the song information in them is then extracted using the layout characteristics of the text.

6. The song list extraction method based on a morphological method according to claim 5, characterized in that: the classification unit uses a neural network to classify the image regions; the classification result is either correct or incorrect, where a correct result denotes a song information region and an incorrect result denotes a region containing other on-screen phone information.

7. The song list extraction method based on a morphological method according to claim 5, characterized in that: the structuring unit integrates the text in each region into a structured record according to the following rules: the first line of text in a region is the song name and the second line is the album name; if there is no second line, the album name is left empty; the playing duration is attached to the album name text, so if the album name ends with a time, that part is cut off and used as the playing duration.

Technical Field

The invention relates to a song list extraction method, and in particular to a song list extraction method based on a morphological method, which extracts the song information in a song list from an image and improves the structured extraction of useful information from a screenshot.

Background

With the rapid development of the internet, streaming media has become an integral part of daily life. Music is an important part of streaming media, and more and more internet companies are paying attention to the music market. As copyright awareness grows, competition among music applications is increasingly a competition over music copyrights, and when copyrights move, listeners consider moving to different software as well. A major obstacle to such a move is migrating the song list: users have usually collected a large amount of music they like in their original music application, and searching for each song again one by one costs a great deal of time. A batch import function can therefore make user migration much easier.

The traditional way of importing songs relies on links: the original music software must provide a web page sharing interface, and the receiving music software accesses and parses that page to identify its content. Not all music software offers such an interface, however, which makes migration very difficult for users.

Disclosure of Invention

In view of the above problems, the present invention aims to provide a song list extraction method based on a morphological method, which extracts the song information in a song list from an image and improves the structured extraction of useful information from a screenshot.

The invention solves the technical problem through the following technical solution: a song list extraction method based on a morphological method, used for extracting song information from a song list screenshot, comprising the following steps:

step (1): a text detection module detects the text lines appearing in the song list screenshot and passes the positions of all text lines to the region filtering module;

step (2): a region filtering module filters out text regions that do not meet the requirements and retains the real song content information;

step (3): a region merging module merges related text lines, thereby assembling richer song information;

step (4): an information extraction module performs the final information extraction, filtering the merged regions again to obtain the real song entry information.

In a specific embodiment of the present invention, the text detection module in step (1) comprises a detection algorithm and a recognition algorithm, which together detect the position of each text line in the song list screenshot and the text content of that line.

In a specific embodiment of the present invention, the region filtering module in step (2) applies prior conditions, based on information such as the proportions and position of each rectangular box, to remove text boxes that do not match song list information; under these rules a text box is filtered out when the upper-left corner of the text line lies to the right of the central axis of the image, when its aspect ratio is greater than 1, or when its area is too small.

In a specific embodiment of the present invention, the region merging module in step (3) uses a merging method based on the dilation operation, which connects the song name region and the album name region of the same song into one connected region while leaving the regions of different songs unconnected.

In a specific embodiment of the present invention, the information extraction module in step (4) comprises a region classification unit and a structuring unit; the obtained regions are classified to locate the real song list entries, and the song information in them is then extracted using the layout characteristics of the text.

In a specific embodiment of the present invention, the classification unit uses a neural network to classify the image regions; the classification result is either correct or incorrect, where a correct result denotes a song information region and an incorrect result denotes a region containing other on-screen phone information.

In a specific embodiment of the present invention, the structuring unit integrates the text in each region into a structured record according to the following rules: the first line of text in a region is the song name and the second line is the album name; if there is no second line, the album name is left empty; the playing duration is attached to the album name text, so if the album name ends with a time, that part is cut off and used as the playing duration.

The positive effects of the invention are as follows. Because the method includes a text detection module, it can extract the song information in a song list from an image, without relying on the original music software to provide a migration interface. In addition, the region filtering module and the region merging module improve the structured extraction of useful information from the screenshot.

Drawings

FIG. 1 is a system architecture diagram of the present invention.

The reference numerals in the drawing denote the following:

text detection module 1, region filtering module 2, region merging module 3, and information extraction module 4.

Detailed Description

The following provides a detailed description of the preferred embodiments of the present invention with reference to the accompanying drawings.

As shown in FIG. 1, the invention provides a song list extraction method based on a morphological method, used for extracting song information from a song list screenshot, comprising the following steps:

step (1): a text detection module detects the text lines appearing in the song list screenshot and passes the positions of all text lines to the region filtering module;

step (2): a region filtering module filters out text regions that do not meet the requirements and retains the real song content information;

step (3): a region merging module merges related text lines, thereby assembling richer song information;

step (4): an information extraction module performs the final information extraction, filtering the merged regions again to obtain the real song entry information.

The text detection module in step (1) comprises a detection algorithm and a recognition algorithm, which together detect the position of each text line in the song list screenshot and the text content of that line.

The region filtering module in step (2) applies prior conditions, based on information such as the proportions and position of each rectangular box, to remove text boxes that do not match song list information; under these rules a text box is filtered out when the upper-left corner of the text line lies to the right of the central axis of the image, when its aspect ratio is greater than 1, or when its area is too small.

The region merging module in step (3) uses a merging method based on the dilation operation, which connects the song name region and the album name region of the same song into one connected region while leaving the regions of different songs unconnected.

The information extraction module in step (4) comprises a region classification unit and a structuring unit; the obtained regions are classified to locate the real song list entries, and the song information in them is then extracted using the layout characteristics of the text.

The classification unit uses a neural network to classify the image regions; the classification result is either correct or incorrect, where a correct result denotes a song information region and an incorrect result denotes a region containing other on-screen phone information.

The structuring unit integrates the text in each region into a structured record according to the following rules: the first line of text in a region is the song name and the second line is the album name; if there is no second line, the album name is left empty; the playing duration is attached to the album name text, so if the album name ends with a time, that part is cut off and used as the playing duration.

The success of deep learning has opened up new possibilities in many fields. Deep learning and traditional methods each have their own advantages in different scenarios, so combining them can yield better algorithm performance.

The invention combines the advantages of deep learning with those of the traditional morphological method to design a song list extraction method based on a morphological method, which extracts the song name, album name, and playing duration of each music entry in a song list image.

To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the song list extraction method based on a morphological method is described in detail below with reference to an embodiment and the accompanying drawing.

The system in this embodiment is implemented on a Linux platform with at least one GPU.

As shown in FIG. 1, the song list extraction method uses a text detection module 1, a region filtering module 2, a region merging module 3, and an information extraction module 4.

The text detection module 1 detects the text lines appearing in the song list screenshot and passes the positions of all text lines to the next module.

In this embodiment, the text detection module processes the song list screenshot: a text detection algorithm obtains the position of each text line in the image, and a text recognition algorithm extracts the content of each text line.

In this embodiment, the text detection module uses several deep learning units to complete text detection. Specifically, the text detection module 1 comprises a detection algorithm unit and a recognition algorithm unit.

The detection algorithm unit processes the song list screenshot directly and obtains the text positions in the image line by line; each position is represented by the coordinates of the upper-left corner of a rectangular box together with the length and width of the box.

The recognition algorithm unit processes each text line region: it crops each text region out individually and recognizes the corresponding characters.
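The box representation and the crop-then-recognize step can be sketched as follows. This is only an illustration under stated assumptions: the `TextLine` class, the `crop` and `recognize_all` helpers, and the `recognize` callable (a stand-in for the deep learning recognition model, which the text does not specify) are hypothetical names, and the image is modelled as a plain row-major list of pixel rows.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """One detected text line: upper-left corner plus box size (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int
    text: str = ""


def crop(image, line):
    """Cut the rectangle of one text line out of a row-major image."""
    rows = image[line.y:line.y + line.height]
    return [row[line.x:line.x + line.width] for row in rows]


def recognize_all(image, lines, recognize):
    """Run `recognize` (assumed: patch -> str) on every cropped text line."""
    return [
        TextLine(l.x, l.y, l.width, l.height, recognize(crop(image, l)))
        for l in lines
    ]
```

Passing the recognizer in as a parameter keeps the sketch independent of any particular OCR model.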

The region filtering module 2 filters out text regions that do not meet the requirements and retains the real song content information.

In this embodiment, the region filtering module filters on three criteria: position, area, and shape. A text line is filtered out when its upper-left corner lies to the right of the central axis of the image, when its aspect ratio is greater than 1, or when its area is too small.
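The three filtering rules can be sketched as a simple predicate. The sketch rests on two assumptions that the text does not settle: the aspect ratio is read as height divided by width (so boxes taller than they are wide are dropped, since a horizontal text line is normally wider than tall), and "too small" is modelled by a hypothetical `min_area` threshold whose default of 200 pixels is purely illustrative.

```python
def keep_text_box(box, image_width, min_area=200):
    """Return True if an (x, y, w, h) text box survives the three prior rules.

    Assumptions: aspect ratio means height / width, and min_area is a
    made-up threshold; the source gives no concrete value.
    """
    x, y, w, h = box
    if x > image_width / 2:   # position: upper-left corner right of the central axis
        return False
    if h > w:                 # shape: height / width greater than 1
        return False
    if w * h < min_area:      # area: box too small
        return False
    return True


def filter_text_boxes(boxes, image_width, min_area=200):
    """Keep only the boxes that pass every rule."""
    return [b for b in boxes if keep_text_box(b, image_width, min_area)]
```

In practice the area threshold would probably need to scale with the screenshot resolution.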

The region merging module 3 merges related text lines, thereby assembling richer song information.

In this embodiment, region merging is carried out with a dilation operation. Specifically, the filtered text lines are first drawn on a new image, in which the text line regions are represented by 1 and everything else by 0. The image is then convolved with a dilation kernel of a chosen size to obtain a processed image. Finally, connected-component analysis is performed on the processed image to find the positions of all mutually disconnected regions.
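The draw, dilate, and label sequence can be sketched in pure Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the (x, y, w, h) box tuples, the 4-connectivity, and the default 1 x 5 kernel are assumptions chosen for the example, and the dilation is written as a sliding window over set pixels instead of a literal convolution. A real system would more likely use an image library such as OpenCV.

```python
from collections import deque


def draw_mask(boxes, width, height):
    """Draw each (x, y, w, h) text box as 1s on a zero-filled binary mask."""
    mask = [[0] * width for _ in range(height)]
    for x, y, w, h in boxes:
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                mask[row][col] = 1
    return mask


def dilate(mask, kernel_w, kernel_h):
    """Binary dilation with a centred kernel_w x kernel_h rectangle (odd sizes)."""
    height, width = len(mask), len(mask[0])
    rx, ry = kernel_w // 2, kernel_h // 2
    out = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            if mask[row][col]:
                for r in range(max(row - ry, 0), min(row + ry + 1, height)):
                    for c in range(max(col - rx, 0), min(col + rx + 1, width)):
                        out[r][c] = 1
    return out


def connected_regions(mask):
    """Return the bounding box (x, y, w, h) of every 4-connected region of 1s."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for row in range(height):
        for col in range(width):
            if not mask[row][col] or seen[row][col]:
                continue
            queue = deque([(row, col)])
            seen[row][col] = True
            x0 = x1 = col
            y0 = y1 = row
            while queue:  # breadth-first flood fill of one region
                r, c = queue.popleft()
                x0, x1 = min(x0, c), max(x1, c)
                y0, y1 = min(y0, r), max(y1, r)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return regions


def merge_text_lines(boxes, width, height, kernel_w=1, kernel_h=5):
    """Merge vertically close text lines into one region per song (sketch)."""
    mask = draw_mask(boxes, width, height)
    return connected_regions(dilate(mask, kernel_w, kernel_h))
```

The kernel height is the key design choice: it has to bridge the gap between a song name and its album line while staying smaller than the gap between two songs. In this sketch a tall, narrow kernel is used so that dilation does not also join unrelated boxes horizontally.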

The information extraction module 4 performs the final information extraction, filtering the merged regions again to obtain the real song entry information.

In this embodiment, the information extraction module may use several units to extract the valid information. Specifically, the information extraction module 4 comprises a region classification unit, a character filtering unit, and an information integration unit.

The region classification unit filters out non-song regions a second time. Specifically, the original image is segmented according to the extracted regions, and a classifier judges whether each region is a song region. A song region is kept and passed on to the next unit; a region that is not a song region is deleted.

The character filtering unit filters useless characters out of each region. The filtered content includes punctuation marks, single digits, and other data that would interfere with retrieval.
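One possible reading of this rule is sketched below: whitespace-separated tokens that are a single digit, or that consist only of punctuation or symbol characters, are treated as noise and dropped. The helper names `is_noise` and `clean_line` and the token-level granularity are assumptions; the text does not say exactly how the useless characters are identified.

```python
import unicodedata


def is_noise(token):
    """True for a lone digit or a token made only of punctuation/symbols."""
    if len(token) == 1 and token.isdigit():
        return True
    return all(
        unicodedata.category(ch).startswith(("P", "S")) for ch in token
    )


def clean_line(text):
    """Drop noise tokens from one recognized text line (sketch)."""
    return " ".join(tok for tok in text.split() if not is_noise(tok))
```

A time such as `03:45` survives because it contains digits as well as punctuation, which matters if the playing duration is extracted afterwards. Note that this simple rule would also strip a legitimate single digit from a title.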

The information integration unit integrates the song name, album name, and playing duration of each song region according to the following rules: the first line of text in a region is the song name and the second line is the album name; if there is no second line, the album name is left empty; the playing duration is attached to the album name text, so if the album name ends with a time, that part is cut off and used as the playing duration.
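The integration rule can be sketched as follows. The `SongEntry` type and the `structure_region` function are hypothetical names, and the time pattern (`m:ss`, `mm:ss`, or `h:mm:ss` at the very end of the album line) is an assumption about how durations appear in a screenshot; the text only says that the album name may end with a time.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed duration format at the end of the album line, e.g. "3:05" or "1:02:45".
_TRAILING_TIME = re.compile(r"\s*(\d{1,2}:\d{2}(?::\d{2})?)\s*$")


@dataclass
class SongEntry:
    name: str
    album: str
    duration: Optional[str]


def structure_region(lines: List[str]) -> SongEntry:
    """Turn the recognized text lines of one merged region into a song entry."""
    name = lines[0].strip() if lines else ""
    album = lines[1].strip() if len(lines) > 1 else ""  # empty if no second line
    duration = None
    match = _TRAILING_TIME.search(album)
    if match:
        duration = match.group(1)               # cut the time off the album text
        album = album[:match.start()].rstrip()
    return SongEntry(name, album, duration)
```

For example, `structure_region(["Hey Jude", "Past Masters 07:11"])` yields the name `Hey Jude`, the album `Past Masters`, and the duration `07:11`.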

The flow of the song list extraction method in this embodiment is described below:

In this embodiment, the entry point of the song list extraction method based on a morphological method is the text detection module 1, which receives the song list screenshot, detects the text regions, and passes the detection results to the region filtering module 2 as coordinates. That module performs a preliminary filtering of the extracted information, screens out regions that do not meet the requirements, and passes the remaining regions to the region merging module 3. That module uses the morphological method to connect the text lines belonging to the same song into a single region and passes the new regions to the information extraction module 4, which filters the regions further and structures the information in the real song regions to obtain the final result.
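The data flow between the four modules can be sketched as a chain of calls. The function `extract_song_list` and its four stage parameters are hypothetical; each stage is passed in as a callable because the text does not fix the concrete models or thresholds behind them.

```python
def extract_song_list(screenshot, detect, filter_regions, merge_regions, extract_info):
    """Run the four modules in order; every stage is an assumed callable.

    detect:         screenshot -> list of detected text lines   (module 1)
    filter_regions: text lines -> text lines that pass the rules (module 2)
    merge_regions:  text lines -> merged per-song regions        (module 3)
    extract_info:   regions    -> structured song entries        (module 4)
    """
    text_lines = detect(screenshot)
    kept_lines = filter_regions(text_lines)
    song_regions = merge_regions(kept_lines)
    return extract_info(song_regions)
```

Keeping the stages separate in this way would allow, for instance, the detection model to be swapped without touching the morphological merging step.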

In practical application, the song list extraction method based on a morphological method can be deployed in any music software. It does not require the original music software to provide any interface: a phone screenshot is enough to migrate the song list information out of the original music software.

Because the song list extraction method based on a morphological method includes a text detection module, it can extract the song information in a song list from an image, without relying on the original music software to provide a migration interface. In addition, the region filtering module and the region merging module improve the structured extraction of useful information from the screenshot.

The foregoing shows and describes the general principles, main features, and advantages of the present invention. Those skilled in the art will understand that the present invention is not limited to the embodiments described above, which merely illustrate its principles, and that various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined by the appended claims and their equivalents.
