Media data analysis method and device, computer equipment and storage medium

Document No.: 1889405 Publication date: 2021-11-26

Abstract: This technology, "Media data analysis method and device, computer equipment and storage medium", was designed by Ma Jie (马杰) on 2021-09-06. The present application relates to a media data analysis method, system, computer device and storage medium. The method comprises: crawling media information corresponding to preset keywords, and analyzing the media information to determine copyright information of target media data, so that the copyright information of the target media data is known in advance. If the target media data is media data that the user likes, it can then be processed in advance.

1. A media data analysis method, the media data analysis method comprising:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

2. The media data analysis method according to claim 1, wherein the preset keyword is a preset keyword related to copyright information of the media data.

3. The media data analysis method according to claim 1, wherein before the crawling of the media information corresponding to the preset keyword, the method further comprises:

acquiring media data preference data of a user;

and determining preset keywords according to the preference data of the media data.

4. The method of media data analysis of claim 1, the method further comprising:

and sending the copyright information to a terminal.

5. The method of media data analysis of claim 1, the method further comprising:

and downloading the target media data corresponding to the copyright information according to the copyright information.

6. A media data analysis apparatus, characterized in that the media data analysis apparatus comprises:

the crawling module is used for crawling media information corresponding to preset keywords;

and the copyright information determining module is used for analyzing the media information and determining the copyright information of the target media data.

7. The apparatus of claim 6, wherein the preset keyword is a preset keyword related to copyright information of the media data.

8. The media data analysis apparatus according to claim 6, wherein the media data analysis apparatus further comprises:

the favorite data acquisition module is used for acquiring media data favorite data of a user;

and the keyword determining module is used for determining preset keywords according to the favorite data of the media data.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 5 are implemented when the computer program is executed by the processor.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.

Technical Field

The present application relates to the field of big data technologies, and in particular, to a method and an apparatus for analyzing media data, a computer device, and a storage medium.

Background

At present, platforms place increasing emphasis on copyright protection, whether to protect the rights and interests of creators or to generate revenue. For example, a music APP may require a user to purchase a membership to listen to or download songs, and a video APP may require a membership to watch or cache videos. However, APPs compete with one another for resources. For example, music APP A holds all music of singer A (including music A1, A2 and A3), and music APP B holds all music of singer B (including music B1, B2, B3 and B4), and listening to either singer requires payment on the respective APP. A user who likes only music A1 and music B4 must therefore purchase memberships of both music APP A and music APP B, and some users need memberships of even more music and video APPs. Purchasing a membership on each APP merely because the user likes a certain song or video there is wasteful for the user and, in the long run, reduces user stickiness.

Therefore, the prior art has the following technical problems: a user cannot enjoy favorite media data without purchasing an APP membership, and having to purchase memberships of multiple APPs is wasteful for the user and reduces user stickiness.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a media data analysis method, apparatus, computer device and storage medium for solving the above technical problems.

In a first aspect, a media data analysis method is provided, where the media data analysis method includes:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

In one embodiment, the preset keyword is a preset keyword related to copyright information of the media data.

In one embodiment, before the crawling of the media information corresponding to the preset keyword, the method further includes:

acquiring media data preference data of a user;

and determining preset keywords according to the preference data of the media data.

In one embodiment, the method further comprises:

and sending the copyright information to a terminal.

In one embodiment, the method further comprises:

and downloading the target media data corresponding to the copyright information according to the copyright information.

In a second aspect, there is provided a media data analysis apparatus, comprising:

the crawling module is used for crawling media information corresponding to preset keywords;

and the copyright information determining module is used for analyzing the media information and determining the copyright information of the target media data.

In one embodiment, the preset keyword is a preset keyword related to copyright information of the media data.

In one embodiment, the media data analysis apparatus further includes:

the favorite data acquisition module is used for acquiring media data favorite data of a user;

and the keyword determining module is used for determining preset keywords according to the favorite data of the media data.

In a third aspect, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the following steps are implemented:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

In a fourth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

The media data analysis method, apparatus, computer device and storage medium described above crawl media information corresponding to preset keywords, analyze the media information, and determine the copyright information of the target media data, so that the copyright information of the target media data is known in advance.

Drawings

FIG. 1 is a flow diagram illustrating a method for media data analysis in one embodiment;

FIG. 2 is a flow chart illustrating a method for media data analysis in accordance with another embodiment;

FIG. 3 is a block diagram showing the structure of a media data analysis device according to an embodiment;

FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In one embodiment, as shown in fig. 1, there is provided a media data analysis method including the steps of:

step 101, crawling media information corresponding to preset keywords;

The preset keywords are keywords of interest to the user. Optionally, the preset keyword is a preset keyword related to copyright information of the media data.

In the embodiment of the invention, the preset keywords are keywords of interest to the user and relate to the media data and its copyright information. For example, if a user often searches for songs on the network, the preset keywords may be "Zhou Jielun", "music copyright", "copyright charge", "song charge", and so on.

In the embodiment of the invention, a web crawler is used to crawl the media information corresponding to the preset keywords. A web crawler (also known as a web spider or web robot, and more often called a web chaser in the FOAF community) is a program or script that automatically captures web information according to certain rules. Other, less common names include ant, automatic indexer, emulator, and worm. Crawling techniques belong to the prior art in this field and are not described further here.

In the embodiment of the invention, if the preset keywords are "Zhou Jielun" and "copyright charge", media information about all songs and charges related to "Zhou Jielun" and "copyright charge" is crawled from sources such as search engines and microblogs.
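By way of illustration only, step 101 might be sketched as follows. The sketch assumes a list of seed URLs and an injectable `fetch` function, neither of which is specified in this application; the function name `crawl`, the example URLs and the page texts are likewise hypothetical. A page is kept only when its text mentions every preset keyword.

```python
from typing import Callable, Dict, Iterable, List


def crawl(seed_urls: Iterable[str],
          keywords: List[str],
          fetch: Callable[[str], str]) -> Dict[str, str]:
    """Return {url: page_text} for pages that mention every preset keyword."""
    wanted = [k.lower() for k in keywords]
    hits: Dict[str, str] = {}
    for url in seed_urls:
        try:
            text = fetch(url)
        except OSError:
            continue  # skip pages that cannot be fetched
        lowered = text.lower()
        if all(k in lowered for k in wanted):
            hits[url] = text
    return hits


# Offline stand-in for a real HTTP fetcher (hypothetical pages).
PAGES = {
    "https://example.com/a": "Singer A: copyright charge starts next week for music A1.",
    "https://example.com/b": "Singer B releases a new album.",
}


def fake_fetch(url: str) -> str:
    return PAGES[url]


result = crawl(PAGES, ["singer A", "copyright charge"], fake_fetch)
print(sorted(result))
```

Passing the fetcher as a parameter keeps the keyword filter separate from network access, so a real HTTP client could be substituted without changing the filtering logic.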

Step 102, analyzing the media information and determining the copyright information of the target media data.

The copyright information indicates whether a fee must be paid to enjoy the target media data.

In the embodiment of the invention, after the media information about all songs and charges related to "Zhou Jielun" and "copyright charge" is crawled, the media information is analyzed to determine which songs require payment and which can be enjoyed for free.

In the embodiment of the present invention, media information corresponding to the preset keywords is crawled and analyzed to determine the copyright information of the target media data, so that this copyright information is known in advance (for example, that the target media data will become chargeable next week, will be available only to VIP members from next week, or will remain free).
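The application does not specify how the analysis of step 102 is carried out. As one possible sketch, assuming a simple rule-based approach, each sentence of the crawled text could be checked for cue phrases next to a known title. The cue lists, the function name `analyze` and the sample text are all hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical cue phrases; a real system would need a far richer model.
PAID_CUES = ("copyright charge", "will be charged", "vip only", "members only")
FREE_CUES = ("free to listen", "free to watch", "no charge")


def analyze(media_info: List[str], titles: List[str]) -> Dict[str, str]:
    """Map each known title to 'paid', 'free', or 'unknown'."""
    status = {t: "unknown" for t in titles}
    for text in media_info:
        # Judge sentence by sentence so a cue attaches to the right title.
        for sentence in re.split(r"[.!?]\s*", text):
            s = sentence.lower()
            for title in titles:
                if title.lower() not in s:
                    continue
                if any(c in s for c in PAID_CUES):
                    status[title] = "paid"
                elif any(c in s for c in FREE_CUES) and status[title] == "unknown":
                    status[title] = "free"
    return status


info = ["Music A1 will be charged from next week. Music A2 stays free to listen."]
copyright_info = analyze(info, ["Music A1", "Music A2", "Music A3"])
print(copyright_info)
```

In this sketch a "paid" finding overrides an earlier "free" one, on the assumption that missing an upcoming charge is the costlier mistake for the user.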

The embodiment of the invention applies big data technology, including information crawling and big data analysis, both of which are currently popular technologies. Using these technologies, the invention crawls information by keywords related to the copyright information of media data in a specific scenario and determines the copyright information of the target media data, thereby determining which songs require payment and which can be enjoyed for free. The invention therefore has prominent substantive features and represents notable progress.

In one embodiment, as shown in fig. 2, there is provided a media data analysis method, including the steps of:

Step S201, acquiring media data preference data of a user;

The media data preference data represents genre preference data, person preference data, or the like. For example, the media data preference data indicates which genre of song the user likes to listen to, which singer's songs the user likes, which actor's series the user likes, which genre of series the user likes, and so on.

In the embodiment of the present invention, the terminal of each user or the server may collect the media data preference data of the user through the APPs on the user terminal (music APP, video APP, microblog) and the search records on the user terminal.

Step S202, determining preset keywords according to the favorite data of the media data;

In the embodiment of the invention, the media data preference data is subjected to word segmentation, cluster analysis and the like to obtain the preset keywords. The preset keywords are thus keywords derived from the user's preferences.
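As a minimal sketch of step S202, the following replaces the word segmentation and cluster analysis mentioned above with a much simpler stand-in: splitting search records into words and counting frequencies. The stop-word list, the function name `preset_keywords`, the sample records and the appended copyright term are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import List, Sequence

STOPWORDS = {"the", "a", "of", "by", "to", "and"}  # hypothetical list


def preset_keywords(search_records: Sequence[str],
                    top_n: int = 2,
                    copyright_terms: Sequence[str] = ("copyright charge",)) -> List[str]:
    """Pick the most frequent words and append copyright-related terms."""
    counts: Counter = Counter()
    for record in search_records:
        # Crude segmentation: lowercase alphanumeric runs.
        for word in re.findall(r"[a-z0-9]+", record.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    frequent = [word for word, _ in counts.most_common(top_n)]
    return frequent + list(copyright_terms)


records = ["jazz piano live", "jazz trio", "piano sonata", "jazz"]
keywords = preset_keywords(records)
print(keywords)
```

For Chinese-language records a dedicated segmenter would be needed in place of the regular expression, and a clustering step could group related terms before the most representative ones are selected.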

Step S203, crawling media information corresponding to preset keywords;

Step S204, analyzing the media information and determining copyright information of the target media data.

In the embodiment of the invention, because the preset keywords are determined according to the media data preference data, the crawled media information relates to the user's preferences, and the target media data obtained by analysis is a song or video the user likes. The copyright information of the user's favorite song or video is thus obtained, and the target media data can be processed in advance, for example watched or downloaded before charging begins.

Optionally, after the media information is analyzed and the copyright information of the target media data is determined, the copyright information is sent to the terminal.

In the embodiment of the invention, after the copyright information of the target media data is determined, the copyright information is sent to the terminal to inform the user, so that the user can enjoy or download the target media data in time before charging begins.

In one embodiment, a media data analysis method is provided, including:

acquiring media data preference data of a user; determining preset keywords according to the favorite data of the media data; crawling media information corresponding to preset keywords; analyzing the media information to determine copyright information of the target media data; and downloading the target media data corresponding to the copyright information.

In one embodiment, a media data analysis method is provided, including:

acquiring media data preference data of a user; determining preset keywords according to the favorite data of the media data; crawling media information corresponding to preset keywords; analyzing the media information to determine copyright information of the target media data; and if the copyright information is the charging information, downloading the target media data corresponding to the copyright information.

If the copyright information indicates that the target media data will be charged, the target media data corresponding to the copyright information is downloaded before charging begins. If the copyright information indicates that the target media data will not be charged, no processing is performed, or the terminal user is informed that the favorite media data (i.e. the target media data) is not charged for the time being.

In one embodiment, a media data analysis method is provided that addresses how to remind each user that favorite media data is about to be charged when media data of a plurality of users is analyzed. The specific implementation process is as follows:
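The branching described above can be sketched as follows, assuming the copyright information has already been reduced to a "charging" or "non-charging" label per title. The function name `process`, the labels and the injected `download` and `notify` callables are hypothetical.

```python
from typing import Callable, Dict, List


def process(copyright_info: Dict[str, str],
            download: Callable[[str], None],
            notify: Callable[[str], None]) -> None:
    """Download titles that are about to be charged; report the rest."""
    for title, status in copyright_info.items():
        if status == "charging":
            download(title)  # fetch before the fee takes effect
        else:
            notify(f"{title} is not charged for now")


# Lists stand in for a real downloader and a real notification channel.
downloaded: List[str] = []
messages: List[str] = []
process({"music A1": "charging", "music B4": "non-charging"},
        downloaded.append, messages.append)
print(downloaded, messages)
```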

acquiring a terminal identifier; the terminal identification, the preset keyword, the target media data and the copyright information have corresponding relation;

and sending the copyright information to a terminal corresponding to the terminal identification according to the corresponding relation.

In the embodiment of the present invention, a terminal identifier X is also acquired when media information corresponding to a preset keyword is crawled. For example, the terminal identifier X may be attached to each preset keyword, so that the terminal identifier X corresponds to the preset keyword. When the media information corresponding to the preset keyword is crawled, the terminal identifier X is also attached to the crawled media information. The media information is then analyzed, and when the copyright information of the target media data is determined, the terminal identifier X is attached to the target media data and the copyright information. In this way, the terminal identifier, the preset keyword, the target media data, and the copyright information correspond to one another.

If the copyright information attached with the terminal identifier X is charging information, the copyright information can be sent to the terminal corresponding to the terminal identifier X to remind the user that the favorite media data is about to be charged, so that the user can enjoy or download the media data in advance.

For ease of understanding, suppose by way of example that there are a user a and a user b. The terminal identifier of user a's terminal is M, and that of user b's terminal is N. The preset keywords of user a are "Zhou Jielun" and "copyright charge", and the preset keywords of user b are "song" and "copyright charge".

When media information corresponding to "Zhou Jielun" and "copyright charge" is crawled, the terminal identifier M is also acquired and attached to the crawled media information. The media information is then analyzed, and when the copyright information of the target media data (for example, a song such as "Snow") is determined, the terminal identifier M is attached to the target media data and the copyright information, so that the terminal identifier, the preset keywords, the target media data, and the copyright information correspond to one another.

When media information corresponding to "song" and "copyright charge" is crawled, the terminal identifier N is also acquired and attached to the crawled media information. The media information is then analyzed, and when the copyright information of the target media data (for example, the TV drama "Langya Bang") is determined, the terminal identifier N is attached to the target media data and the copyright information, so that the terminal identifier, the preset keywords, the target media data, and the copyright information correspond to one another.

If the copyright information attached with the terminal identifier N is charging information and the copyright information attached with the terminal identifier M is non-charging information, the copyright information indicating that the TV drama is about to be charged can be sent to the terminal corresponding to the terminal identifier N, that is, to user b, reminding user b that the favorite TV drama is about to be charged so that user b can enjoy or download it in advance. The copyright information attached with the terminal identifier M may be left unprocessed, or it may be sent to the terminal corresponding to the terminal identifier M, that is, to user a, informing user a that the favorite songs are temporarily free and need not be downloaded in advance, which prevents the terminal from downloading too many songs and occupying storage space.
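The correspondence among terminal identifier, preset keywords, target media data and copyright information might be represented as one record per analysis result, as in the sketch below. The `Record` type, the `dispatch` function, the status labels and the placeholder titles are assumptions for illustration; only the identifiers M and N come from the example above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Record:
    terminal_id: str            # e.g. "M" or "N"
    keywords: Tuple[str, ...]   # preset keywords used for crawling
    target: str                 # target media data (placeholder title)
    status: str                 # "charging" or "non-charging"


def dispatch(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Group reminder messages by the terminal that should receive them."""
    outbox: Dict[str, List[str]] = {}
    for r in records:
        if r.status == "charging":
            outbox.setdefault(r.terminal_id, []).append(
                f"{r.target} will be charged soon")
    return outbox


records = [
    Record("M", ("singer A", "copyright charge"), "favorite song", "non-charging"),
    Record("N", ("song", "copyright charge"), "favorite drama", "charging"),
]
outbox = dispatch(records)
print(outbox)
```

Only terminal N receives a reminder here, which matches the case in which the copyright information attached with M is left unprocessed.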

It should be understood that although the steps in the flowcharts of fig. 1 and 2 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1 and 2 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 3, there is provided a media data analysis apparatus including:

the crawling module 301 is used for crawling media information corresponding to preset keywords;

a copyright information determining module 302, configured to analyze the media information and determine copyright information of the target media data.

In an optional embodiment, the preset keyword is a preset keyword related to copyright information of the media data.

In an optional embodiment, the media data analysis device further comprises:

the favorite data acquisition module is used for acquiring media data favorite data of a user;

and the keyword determining module is used for determining preset keywords according to the favorite data of the media data.

In an optional embodiment, the media data analysis device further comprises:

and the sending module is used for sending the copyright information to a terminal.

In an optional embodiment, the media data analysis device further comprises:

and the downloading module is used for downloading the target media data corresponding to the copyright information according to the copyright information.

For specific limitations of the media data analysis device, reference may be made to the above limitations of the media data analysis method, which are not repeated here. The modules in the media data analysis device may be implemented wholly or partially in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each module.
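For a software implementation, the module structure might be sketched as a class that composes the crawling module 301 and the copyright information determining module 302. The class name, method name and stub modules below are hypothetical and serve only to show how the modules could be wired together.

```python
from typing import Callable, Dict, List


class MediaDataAnalysisApparatus:
    """Composes a crawling module (301) and a copyright module (302)."""

    def __init__(self,
                 crawling_module: Callable[[List[str]], List[str]],
                 copyright_module: Callable[[List[str]], Dict[str, str]]):
        self.crawling_module = crawling_module
        self.copyright_module = copyright_module

    def run(self, preset_keywords: List[str]) -> Dict[str, str]:
        media_info = self.crawling_module(preset_keywords)
        return self.copyright_module(media_info)


# Stub modules so the sketch runs offline.
def stub_crawl(keywords: List[str]) -> List[str]:
    return [f"{keywords[0]}: music A1 will be charged next week"]


def stub_analyze(media_info: List[str]) -> Dict[str, str]:
    charged = "charged" in media_info[0]
    return {"music A1": "charging" if charged else "non-charging"}


apparatus = MediaDataAnalysisApparatus(stub_crawl, stub_analyze)
outcome = apparatus.run(["singer A", "copyright charge"])
print(outcome)
```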

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a media data analysis method.

Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computing devices to which the disclosed aspects apply. A particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

In one embodiment, the preset keyword is a preset keyword related to copyright information of the media data.

In one embodiment, the processor, when executing the computer program, further performs the steps of:

acquiring media data preference data of a user;

and determining preset keywords according to the preference data of the media data.

In one embodiment, the processor, when executing the computer program, further performs the steps of:

and sending the copyright information to a terminal.

In one embodiment, the processor, when executing the computer program, further performs the steps of:

and downloading the target media data corresponding to the copyright information according to the copyright information.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:

crawling media information corresponding to preset keywords;

and analyzing the media information to determine the copyright information of the target media data.

In one embodiment, the preset keyword is a preset keyword related to copyright information of the media data.

In one embodiment, the computer program when executed by the processor further performs the steps of:

acquiring media data preference data of a user;

and determining preset keywords according to the preference data of the media data.

In one embodiment, the computer program when executed by the processor further performs the steps of:

and sending the copyright information to a terminal.

In one embodiment, the computer program when executed by the processor further performs the steps of:

and downloading the target media data corresponding to the copyright information according to the copyright information.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described, but any combination should be considered within the scope of this specification as long as it involves no contradiction.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
