Algorithm for analyzing film-watching mood by using AI (artificial intelligence)

Document No.: 1846117 · Publication date: 2021-11-16

Note: this technology, "An algorithm for analyzing film-watching mood by using AI (artificial intelligence)", was designed and created by 王宇廷, 白志勇, 李梦雪 and 陈鹏飞 on 2021-08-20. Its main content is as follows: the invention discloses, in the technical field of film watching, an algorithm for analyzing film-watching mood using AI, comprising the following steps: S1, video label classification library: label each video by department, doctor, disease, problem type and patient population. Using artificial intelligence in place of manual work gives an absolute lead in speed. Manual editing carries a "human" factor: editors differ in ability and aesthetics, so output quality varies from person to person, whereas an artificial intelligence that keeps learning from a large number of samples greatly reduces the error rate while also raising quality, steadily approaching the upper limit. On the basis of these two points, the average cost can be reduced. Privacy is better protected, because the audience's private data need not be collected directly. Finally, the artistic value of the video is improved.

1. An algorithm for analyzing a viewing mood using AI, comprising the steps of:

S1, video label classification library: label each video by department, doctor, disease, problem type and patient population;

S2, material label classification library: label all materials, including video clips, audio, background audio, pictures and animated GIFs; the label types are the same as those used for the videos in step S1, and each material additionally needs its own emotion labels (for example soothing, cheerful or healing), which support the judgments of the material recommendation algorithm;

S3, video segment retrieval algorithm: formally process the video, divide it into several segments with the video segment retrieval algorithm, and analyze the content of each segment;

S4, emotion inference algorithm: build multi-modal data from the video material, perform emotion computation and emotion assessment with an emotion model, and finally output an emotion; the multi-modal data are used to infer which emotion corresponds to each segment;

S5, material recommendation algorithm: take the emotion output of step S4 as input, use the material recommendation algorithm to find suitable materials in the material library, and insert them into the video;

S6, synthesis rendering technology: render and synthesize the video segments; during synthesis the algorithm considers emotional continuity, the fade-in/fade-out settings of the music (0.3 to 1 second), the reasonable placement of materials and similar factors, and finally outputs the finished video.

2. The algorithm for analyzing a viewing mood using AI according to claim 1, wherein: in step S4, the multi-modal data establishment comprises: extracting, recognizing and analyzing the visual, speech and text modalities of the video for classification.

3. The algorithm for analyzing a viewing mood using AI according to claim 1, wherein in step S4:

the emotion models include a discrete model (the Ekman model), dimensional models (the PAD three-dimensional emotion model and the Plutchik emotion cone model) and a componential model (the Plutchik model);

the emotion computation comprises model fitting and model verification.

4. The algorithm for analyzing a viewing mood using AI according to claim 1, wherein in step S3 the video segment retrieval algorithm is: extract the audio content of the video and run speech recognition on it to obtain the video subtitle information; train a text summary extraction model based on the Baidu ERNIE-GEN model, and extract a summary from each subtitle sentence to obtain the subtitle sentence summaries.

5. The algorithm for analyzing a viewing mood using AI according to claim 4, wherein: a text semantic matching model is trained based on the Baidu ERNIE-NLP model, and the similarity between the keywords and the material library labels is computed through text semantic matching to obtain the material labels with the highest similarity.

6. The algorithm for analyzing a viewing mood using AI according to claim 1, wherein in step S4 the emotion inference algorithm is: train an emotion recognition model based on the Baidu ERNIE-NLP model, run emotion recognition on each subtitle sentence to obtain a per-sentence emotion value, and take the emotion value with the highest weight as the overall emotion of the video.

7. The algorithm for analyzing a viewing mood using AI according to claim 1, wherein in step S5 the material recommendation algorithm is: classify and store the images, music and other materials in the material library by labels such as content and emotion value; first find the material content label through semantic matching, then match it to the corresponding recommended material through the emotion value.

Technical Field

The invention relates to the technical field of film watching, and in particular to an algorithm for analyzing film-watching mood using AI.

Background

Film and television art is a composite of temporal art and spatial art: like a temporal art, it unfolds pictures over a duration to form a complete screen image, and like a spatial art, it develops the image within the picture space, so that a work gains the expressive power of multiple means and modes. Film and television art covers the artistic effects expressed by movies, by television, and by the two together; the movie is the origin of film and television art, and television is one of its derivatives.

In current post-production editing of film and television, an editor guesses the audience's emotion from the development of the plot and manually matches corresponding music and atmosphere-building special effects; the short-video industry works the same way and likewise needs editing personnel. This approach places high demands on personnel (aesthetics, professional ability and so on) and is time-consuming.

Technical means that have already been commercialized, including analyzing and judging the viewing experience by capturing the audience's voice and facial expressions on site, are not applicable to videos that have not yet been released and are still in production, and capturing the audience's voice and facial expressions carries a risk of privacy invasion.

At present there is no practical method that uses artificial intelligence to analyze the plot and predict the audience's emotion in advance. If such a judgment could be made in advance, artificial intelligence could recommend suitable background music, sound effects, video special effects and video materials for the corresponding plot; the present algorithm for analyzing film-watching mood using AI is therefore provided.

Disclosure of Invention

The present invention is directed to an algorithm for analyzing viewing mood using AI, so as to solve the problems raised in the background art above.

To achieve this purpose, the invention provides the following technical scheme:

an algorithm for analyzing a viewing mood using AI, comprising the steps of:

S1, video label classification library: label each video by department, doctor, disease, problem type and patient population;

S2, material label classification library: label all materials, including video clips, audio, background audio, pictures and animated GIFs; the label types are the same as those used for the videos in step S1, and each material additionally needs its own emotion labels (for example soothing, cheerful or healing), which support the judgments of the material recommendation algorithm;

S3, video segment retrieval algorithm: formally process the video, divide it into several segments with the video segment retrieval algorithm, and analyze the content of each segment;

S4, emotion inference algorithm: build multi-modal data from the video material, perform emotion computation and emotion assessment with an emotion model, and finally output an emotion; the multi-modal data are used to infer which emotion corresponds to each segment;

S5, material recommendation algorithm: take the emotion output of step S4 as input, use the material recommendation algorithm to find suitable materials in the material library, and insert them into the video;

S6, synthesis rendering technology: render and synthesize the video segments; during synthesis the algorithm considers emotional continuity, the fade-in/fade-out settings of the music (0.3 to 1 second), the reasonable placement of materials and similar factors, and finally outputs the finished video.

Preferably, in step S4, the multi-modal data establishment comprises: extracting, recognizing and analyzing the visual, speech and text modalities of the video for classification.

Preferably, in step S4:

the emotion models include a discrete model (the Ekman model), dimensional models (the PAD three-dimensional emotion model and the Plutchik emotion cone model) and a componential model (the Plutchik model);

the emotion computation comprises model fitting and model verification.

Preferably, in step S3, the video segment retrieval algorithm is: extract the audio content of the video and run speech recognition on it to obtain the video subtitle information; train a text summary extraction model based on the Baidu ERNIE-GEN model, and extract a summary from each subtitle sentence to obtain the subtitle sentence summaries.

Preferably, a text semantic matching model is trained based on the Baidu ERNIE-NLP model, and the similarity between the keywords and the material library labels is computed through text semantic matching to obtain the material labels with the highest similarity.

Preferably, in step S4, the emotion inference algorithm is: train an emotion recognition model based on the Baidu ERNIE-NLP model, run emotion recognition on each subtitle sentence to obtain a per-sentence emotion value, and take the emotion value with the highest weight as the overall emotion of the video.

Preferably, in step S5, the material recommendation algorithm is: classify and store the images, music and other materials in the material library by labels such as content and emotion value; first find the material content label through semantic matching, then match it to the corresponding recommended material through the emotion value.

Compared with the prior art, the invention has the following beneficial effects:

first, artificial intelligence replaces manual work and holds an absolute lead in speed;

second, manual editing carries a "human" factor: different editors differ in ability and aesthetics, so the quality of their output varies from person to person, whereas an artificial intelligence that continuously learns from a large number of samples greatly reduces the error rate while also improving quality, steadily approaching the upper limit;

third, on the basis of the two points above, the average cost can be reduced;

fourth, privacy is better protected, because the audience's private data need not be collected directly;

fifth, the artistic value of the video is improved.

Drawings

FIG. 1 is a schematic view of the overall process of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, the present invention provides a technical solution:

an algorithm for analyzing a viewing mood using AI, comprising the steps of:

S1, video label classification library: label each video by department, doctor, disease, problem type and patient population;

S2, material label classification library: label all materials, including video clips, audio, background audio, pictures and animated GIFs; the label types are the same as those used for the videos in step S1, and each material additionally needs its own emotion labels (for example soothing, cheerful or healing), which support the judgments of the material recommendation algorithm (a data-structure sketch of these two label libraries follows step S6 below);

S3, video segment retrieval algorithm: formally process the video, divide it into several segments with the video segment retrieval algorithm, and analyze the content of each segment;

S4, emotion inference algorithm: build multi-modal data from the video material, perform emotion computation and emotion assessment with an emotion model, and finally output an emotion; the multi-modal data are used to infer which emotion corresponds to each segment;

S5, material recommendation algorithm: take the emotion output of step S4 as input, use the material recommendation algorithm to find suitable materials in the material library, and insert them into the video;

S6, synthesis rendering technology: render and synthesize the video segments; during synthesis the algorithm considers emotional continuity, the fade-in/fade-out settings of the music (0.3 to 1 second), the reasonable placement of materials and similar factors, and finally outputs the finished video.
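To make steps S1 and S2 concrete, the following is a minimal sketch of the two label classification libraries. It is an illustration only: the field names, the example values and the `Material` type are hypothetical choices, not part of the original disclosure.

```python
from dataclasses import dataclass, field
from typing import List

# Label types shared by the video library (S1) and the material library (S2):
# department, doctor, disease, problem type, patient population.
@dataclass
class ContentLabels:
    department: str
    doctor: str
    disease: str
    problem_type: str
    patient_population: str

# Materials (video clips, audio, background audio, pictures, animated GIFs)
# additionally carry their own emotion labels, e.g. "soothing", "cheerful".
@dataclass
class Material:
    path: str                      # location of the asset file
    kind: str                      # "clip" | "audio" | "bgm" | "picture" | "gif"
    labels: ContentLabels
    emotion_labels: List[str] = field(default_factory=list)

# Hypothetical example entry in the material label classification library.
bgm = Material(
    path="assets/bgm/soft_piano.mp3",
    kind="bgm",
    labels=ContentLabels("cardiology", "Dr. Zhang", "hypertension",
                         "treatment Q&A", "elderly patients"),
    emotion_labels=["soothing", "healing"],
)
```

Keeping the shared content labels and the material-specific emotion labels separate mirrors the requirement that material labels extend, rather than replace, the video label types.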

Referring to fig. 1, in step S4, the multi-modal data establishment comprises: extracting, recognizing and analyzing the visual, speech and text modalities of the video for classification.
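A minimal sketch of this multi-modal data establishment, assuming three hypothetical extractor stubs; in a real system they would wrap a vision model, a speech recognizer and a subtitle parser respectively.

```python
from typing import Dict, List

def extract_visual_form(video_path: str) -> List[str]:
    # Placeholder: a real system would sample key frames and run a vision model.
    return [f"keyframe features of {video_path}"]

def extract_speech_form(video_path: str) -> List[str]:
    # Placeholder for speech recognition / prosody analysis of the audio track.
    return [f"speech features of {video_path}"]

def extract_text_form(video_path: str) -> List[str]:
    # Placeholder for subtitle and on-screen-text extraction.
    return [f"subtitle text of {video_path}"]

def build_multimodal_data(video_path: str) -> Dict[str, List[str]]:
    """Assemble the visual, speech and text modalities used in step S4."""
    return {
        "visual": extract_visual_form(video_path),
        "speech": extract_speech_form(video_path),
        "text": extract_text_form(video_path),
    }
```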

referring to fig. 1, in step S4:

the emotion models include: discrete model (Ekman model), dimension model (PAD three-dimensional emotion model, practick emotion cone model), composition model (Plutchik model);

the emotion calculation comprises the following steps: model fitting and model verification;
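To make the dimensional branch concrete, the sketch below assesses an emotion in PAD (pleasure-arousal-dominance) space by assigning a segment's computed coordinates to the nearest labelled anchor. The anchor coordinates are illustrative values, not calibrated norms from the disclosure.

```python
import math
from typing import Dict, Tuple

# Illustrative anchor points in PAD space, each axis on a -1..1 scale.
PAD_ANCHORS: Dict[str, Tuple[float, float, float]] = {
    "joy":     ( 0.8,  0.5,  0.4),
    "sadness": (-0.6, -0.3, -0.3),
    "fear":    (-0.6,  0.6, -0.4),
    "calm":    ( 0.4, -0.5,  0.2),
}

def assess_emotion(pad: Tuple[float, float, float]) -> str:
    """Return the anchor emotion nearest to the given PAD coordinates."""
    return min(PAD_ANCHORS, key=lambda label: math.dist(pad, PAD_ANCHORS[label]))

# Example: a mildly pleasant, low-arousal segment is assessed as "calm".
print(assess_emotion((0.3, -0.4, 0.1)))  # -> calm
```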

referring to fig. 1, in step S3, the video segment search algorithm is: extracting audio content of the video, and performing voice recognition on the audio content to obtain video subtitle information; training based on a Baidu ERNIE-GEN model to obtain a text abstract extraction model, and extracting the abstract of each sentence of subtitle information to obtain a subtitle sentence abstract;

referring to fig. 1, the text speech matching model is obtained based on the Baidu ERNIE-NLP model training, and similarity between the keyword and the tags in the material library is calculated through text semantic matching, so as to obtain the material tags with the highest similarity;

referring to fig. 1, in step S4, the emotion inference algorithm is: training based on a Baidu ERNIE-NLP model to obtain an emotion recognition model, performing emotion recognition calculation on each sentence of subtitles to obtain an emotion value of each sentence, and taking the emotion value with the highest weight as the overall emotion of the video;

referring to fig. 1, in step S5, the material recommendation algorithm is to classify and store images, music, and the like in the material library according to labels such as content, emotion value, and the like; firstly, finding a material content label through semantic matching, and then matching a corresponding recommended material through an emotion value;

the working principle is as follows: video label classification library: labeling the video according to departments, doctors, diseases, problem types and disease groups; material label classification library: all materials including video clips, audio, background audio, pictures, dynamic pictures GIF and the like are labeled; the label type is the same as that of the video in the first step, and meanwhile, the material also needs own emotion labels (such as emotion labels for relieving, cheering, curing and the like, so that the judgment and the use of a material recommendation algorithm are facilitated); video clip retrieval algorithm: formally processing the video, dividing the video into a plurality of segments through a video segment retrieval algorithm, and analyzing the content of each segment; and (3) emotion inference algorithm: establishing multi-modal data according to the video material, performing emotion calculation and emotion assessment through an emotion model, finally outputting emotion, and deducing what the emotion corresponding to each fragment is through the multi-modal data establishment; and (3) a material recommendation algorithm: intervening the emotion output in the step S4, using a material recommendation algorithm, finding a proper material from a material library, and inserting the material into the video; a synthesis rendering technology: rendering and synthesizing the video clips, wherein the algorithm considers the situations of emotion connection, fade-in and fade-out indexes (time is 0.3-1 second) of music, reasonable positions of materials and the like during synthesis, and finally the video clips are sliced.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
