Video recommendation method, device, equipment and storage medium

Document No.: 1831329 Publication date: 2021-11-12

Reading note: this technology, "Video recommendation method, device, equipment and storage medium", was designed by 李雪 on 2021-08-13. Abstract: Embodiments of the invention provide a video recommendation method, apparatus, device and storage medium. When a video request instruction input by a user is received, a plurality of videos to be recommended are obtained based on the video request instruction. For every two videos to be recommended, the similarity of the two videos is calculated based on their video content and taken as a first similarity. Based on the first similarities, a first number of videos are determined from the plurality of videos to be recommended as target videos, such that the first similarity of any two target videos is smaller than a preset similarity threshold. The target videos are then recommended to the user. Because the content similarity of any two target videos is low, the target videos are unlikely to contain repeated videos, so recommending them avoids, to a certain extent, recommending repeated videos to the user and thereby reduces the waste of display positions used for displaying videos in the client.

1. A method for video recommendation, the method comprising:

when a video request instruction input by a user is received, acquiring a plurality of videos to be recommended based on the video request instruction;

for every two videos to be recommended, calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended;

determining a first number of videos from the plurality of videos to be recommended as target videos based on the first similarity; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value;

and recommending the target video to the user.

2. The method according to claim 1, wherein the video request instruction carries a search keyword;

the obtaining of the plurality of videos to be recommended based on the video request instruction includes:

performing word segmentation processing on the search keywords to obtain a plurality of keywords which are used as keywords to be matched;

determining, from a plurality of preset videos, videos whose video feature tags intersect with the keywords to be matched, as first candidate videos;

selecting a second number of videos from the first candidate videos as second candidate videos based on the uploading time of the first candidate videos and/or click parameters in a first historical time period;

and selecting a third number of videos from the second candidate videos as videos to be recommended.

3. The method according to claim 2, wherein the selecting a third number of videos from the second candidate videos as the videos to be recommended comprises:

and selecting a third number of videos from the second candidate videos as videos to be recommended based on the video feature tags of the second candidate videos, the identification of the user uploading the second candidate videos, the click parameters of the second candidate videos in a second historical time period and the uploading time of the second candidate videos.

4. The method according to claim 1, wherein the calculating the similarity of the two videos to be recommended as the first similarity based on the video contents of the two videos to be recommended comprises:

and calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended.

5. The method according to claim 4, wherein the calculating the similarity of the two videos to be recommended as the first similarity based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended comprises:

processing the sampled video frames and video titles of the two videos to be recommended based on a pre-trained text vectorization model to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

6. The method according to claim 4, wherein before calculating the similarity of the two videos to be recommended as the first similarity based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended, the method further comprises:

acquiring respective target keywords of the two videos to be recommended; the target keywords of the video to be recommended comprise: keywords used when the user searches the video to be recommended in the third history time period;

the calculating the similarity of the two videos to be recommended based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended as a first similarity includes:

and calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature labels of the two videos to be recommended and the target keywords of the two videos to be recommended.

7. The method according to claim 6, wherein the calculating the similarity of the two videos to be recommended as the first similarity based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature labels of the two videos to be recommended, and the target keywords of the two videos to be recommended comprises:

processing the sampled video frames, the video titles, the video feature labels and the target keywords of the two videos to be recommended based on a pre-trained text vectorization model to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

8. The method of claim 7, wherein the processing the sample video frames, the video titles, the video feature labels and the target keywords of the two videos to be recommended based on the pre-trained text vectorization model to obtain the feature vectors of the two videos to be recommended comprises:

determining videos searched by the user based on the target keywords in the third history time period as videos to be processed aiming at each target keyword of the two videos to be recommended;

generating an information sequence corresponding to the target keyword; wherein the information sequence comprises: the respective video information of the target keyword and the video to be processed; in the information sequence, each piece of video information is positioned behind the target keyword, and the video information is arranged according to the sequence of the click times of the corresponding videos to be processed in the third history time period from large to small; video information of a video to be processed includes: the video identification, the video title, the sampling video frame and the video characteristic label of the video to be processed;

and inputting the information sequence corresponding to each target keyword into a pre-trained text vectorization model, and aiming at the video information of each video to be recommended, obtaining the feature vector of the video identifier in the video information of the video to be recommended, which is output by the text vectorization model, and taking the feature vector as the feature vector of the video to be recommended.

9. A video recommendation apparatus, characterized in that the apparatus comprises:

a first acquisition module, configured to acquire, when a video request instruction input by a user is received, a plurality of videos to be recommended based on the video request instruction;

the first determining module is used for calculating the similarity of two videos to be recommended as a first similarity according to the video contents of the two videos to be recommended aiming at every two videos to be recommended;

the second determining module is used for determining a first number of videos from the plurality of videos to be recommended as target videos based on the first similarity; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value;

and the recommending module is used for recommending the target video to the user.

10. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any of claims 1 to 8 when executing a program stored in the memory.

11. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-8.

Technical Field

The invention relates to the technical field of internet, in particular to a video recommendation method, device, equipment and storage medium.

Background

With the rapid development of internet technology, more and more functions are provided for users by clients, and users can browse various network resources through the clients. For example, when a user needs to watch a video, the user may input a search keyword to the client. When receiving a search keyword input by a user, the client may transmit the search keyword to the server. Then, the server may determine a video (which may be referred to as a video to be recommended) associated with the search keyword and recommend the video to be recommended to the user.

In the prior art, if videos with the same title exist in videos to be recommended, when a video is recommended to a user, one of the videos with the same title in the videos to be recommended may be selected to be recommended to the user, so as to avoid recommending repeated videos.

However, if the title of the video to be recommended is a title customized by the network user when uploading the video to be recommended, there may be a plurality of videos with the same content and different titles. Accordingly, based on the above processing, a plurality of videos having the same content but different titles may be recommended to the user, that is, repeated videos may be recommended to the user, which results in a waste of display positions for displaying videos in the client.

Disclosure of Invention

An object of the embodiments of the present invention is to provide a video recommendation method, apparatus, device, and storage medium, so as to reduce waste of a display position for displaying a video in a client. The specific technical scheme is as follows:

In a first aspect of the present invention, there is provided a video recommendation method, including:

when a video request instruction input by a user is received, acquiring a plurality of videos to be recommended based on the video request instruction;

for every two videos to be recommended, calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended;

determining a first number of videos from the plurality of videos to be recommended as target videos based on the first similarity; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value;

and recommending the target video to the user.

Optionally, the video request instruction carries a search keyword;

the obtaining of the plurality of videos to be recommended based on the video request instruction includes:

performing word segmentation processing on the search keywords to obtain a plurality of keywords which are used as keywords to be matched;

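determining, from a plurality of preset videos, videos whose video feature tags intersect with the keywords to be matched, as first candidate videos;

selecting a second number of videos from the first candidate videos as second candidate videos based on the uploading time of the first candidate videos and/or click parameters in a first historical time period;

and selecting a third number of videos from the second candidate videos as videos to be recommended.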

Optionally, the selecting a third number of videos from the second candidate videos as videos to be recommended includes:

selecting a third number of videos from the second candidate videos as videos to be recommended based on the video feature tags of the second candidate videos, the identification of the user uploading the second candidate videos, the click parameters of the second candidate videos in a second historical time period and the uploading time of the second candidate videos.

Optionally, the calculating, based on the video contents of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity includes:

calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended.

Optionally, the calculating, based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity includes:

processing the sampled video frames and video titles of the two videos to be recommended based on a pre-trained text vectorization model to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, before the calculating the similarity of the two videos to be recommended based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended as the first similarity, the method further includes:

acquiring respective target keywords of the two videos to be recommended; the target keywords of a video to be recommended comprise: keywords used when the user searches the video to be recommended in the third history time period;

the calculating the similarity of the two videos to be recommended based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended as a first similarity includes:

calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature tags of the two videos to be recommended and the target keywords of the two videos to be recommended.

Optionally, the calculating, based on the video content of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature tags of the two videos to be recommended, and the target keywords of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity includes:

processing the sampled video frames, the video titles, the video feature tags and the target keywords of the two videos to be recommended based on a pre-trained text vectorization model to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, the processing, based on the pre-trained text vectorization model, the sampled video frames, the video titles, the video feature tags, and the target keywords of the two videos to be recommended to obtain respective feature vectors of the two videos to be recommended includes:

determining videos searched by the user based on the target keywords in the third history time period as videos to be processed aiming at each target keyword of the two videos to be recommended;

In a second aspect of the present invention, there is also provided a video recommendation apparatus, including:

and the recommending module is used for recommending the target video to the user.

Optionally, the video request instruction carries a search keyword;

the first obtaining module is specifically configured to perform word segmentation processing on the search keywords to obtain a plurality of keywords serving as keywords to be matched;

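determining, from a plurality of preset videos, videos whose video feature tags intersect with the keywords to be matched, as first candidate videos;

selecting a second number of videos from the first candidate videos as second candidate videos based on the uploading time of the first candidate videos and/or click parameters in a first historical time period;

and selecting a third number of videos from the second candidate videos as videos to be recommended.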

Optionally, the first obtaining module is specifically configured to select a third number of videos from the second candidate videos as videos to be recommended based on the video feature tag of the second candidate video, the identifier of the user who uploads the second candidate video, the click parameter of the second candidate video in a second historical time period, and the upload time of the second candidate video.

Optionally, the first determining module is specifically configured to calculate, based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity.

Optionally, the first determining module is specifically configured to process the sampled video frames and video titles of the two videos to be recommended based on a pre-trained text vectorization model, so as to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, the apparatus further comprises:

the second acquisition module is used for acquiring respective target keywords of the two videos to be recommended; the target keywords of the video to be recommended comprise: keywords used when the user searches the video to be recommended in the third history time period;

the first determining module is specifically configured to calculate, based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature labels of the two videos to be recommended, and the target keywords of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity.

Optionally, the first determining module is specifically configured to process, based on a pre-trained text vectorization model, a sampled video frame, a video title, a video feature tag, and a target keyword of the two videos to be recommended, so as to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, the first determining module is specifically configured to determine, for each target keyword of the two videos to be recommended, a video searched by the user based on the target keyword in the third history time period as a video to be processed;

In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

and the processor is used for realizing any video recommendation method steps when executing the program stored in the memory.

In yet another aspect of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the video recommendation methods described above.

In yet another aspect of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the video recommendation methods described above.

According to the video recommendation method provided by the embodiment of the invention, when a video request instruction input by a user is received, a plurality of videos to be recommended are obtained based on the video request instruction; for every two videos to be recommended, calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended; determining a first number of videos from a plurality of videos to be recommended as target videos based on the first similarity; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value; and recommending the target video to the user.

With the above processing, the first similarity of any two of the target videos is smaller than the preset similarity threshold, and the first similarity represents how similar the video content of two videos to be recommended is. The content similarity of any two target videos is therefore low, and the target videos are unlikely to contain repeated videos. Recommending the target videos to the user thus avoids, to a certain extent, recommending repeated videos, which in turn reduces the waste of display positions used for displaying videos in the client.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

Fig. 1 is a flowchart of a video recommendation method provided in an embodiment of the present invention;

Fig. 2 is a flowchart of another video recommendation method provided in an embodiment of the present invention;

Fig. 3 is a flowchart of another video recommendation method provided in an embodiment of the present invention;

Fig. 4 is a flowchart of another video recommendation method provided in an embodiment of the present invention;

Fig. 5 is a flowchart of another video recommendation method provided in an embodiment of the present invention;

Fig. 6 is a flowchart of another video recommendation method provided in an embodiment of the present invention;

Fig. 7 is a block diagram of a video recommendation apparatus according to an embodiment of the present invention;

Fig. 8 is a structural diagram of an electronic device provided in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

Referring to Fig. 1, Fig. 1 is a flowchart of a video recommendation method provided in an embodiment of the present invention. The method is applied to an electronic device that recommends videos to a user; the electronic device may be a client or a server.

The method may comprise the steps of:

S101: when a video request instruction input by a user is received, obtaining a plurality of videos to be recommended based on the video request instruction.

S102: for every two videos to be recommended, calculating the similarity of the two videos based on their video content, as a first similarity.

S103: determining a first number of videos from the plurality of videos to be recommended as target videos based on the first similarities.

The first similarity of any two of the target videos is smaller than a preset similarity threshold.

S104: recommending the target videos to the user.

In the video recommendation method provided by the embodiment of the invention, the first similarity of any two of the target videos is smaller than the preset similarity threshold, and the first similarity represents how similar the video content of two videos to be recommended is. The content similarity of any two target videos is therefore low, and the target videos are unlikely to contain repeated videos. Recommending the target videos to the user thus avoids, to a certain extent, recommending repeated videos, which in turn reduces the waste of display positions used for displaying videos in the client.
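
Step S103 only requires that any two target videos have a first similarity below the threshold; the text here does not say how such a set is found. The following is a minimal sketch of one possible strategy, a greedy pass over the candidates in their given order. The function name `select_target_videos`, the callable `similarity`, and the toy data are all hypothetical and used only for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

V = TypeVar("V")


def select_target_videos(
    candidates: Sequence[V],
    similarity: Callable[[V, V], float],
    threshold: float,
    first_number: int,
) -> List[V]:
    """Greedily keep videos whose similarity to every kept video is below threshold.

    This greedy order-dependent pass is an assumption, not the patent's stated method.
    """
    targets: List[V] = []
    for video in candidates:
        if len(targets) >= first_number:
            break
        # Keep the video only if it is dissimilar to every video already kept.
        if all(similarity(video, kept) < threshold for kept in targets):
            targets.append(video)
    return targets


# Toy data: (video id, content label). Videos "a" and "b" share the same content.
videos = [("a", "cat"), ("b", "cat"), ("c", "dog"), ("d", "bird")]


def toy_similarity(x, y):
    # Hypothetical stand-in for the first similarity: 1.0 for identical content.
    return 1.0 if x[1] == y[1] else 0.0


print(select_target_videos(videos, toy_similarity, 0.5, 2))
# [('a', 'cat'), ('c', 'dog')]
```

In this toy run the duplicate "b" is skipped because its similarity to "a" is not below the threshold, so the two display positions go to videos with different content.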

In step S101, the electronic device may be an intelligent terminal such as a mobile phone or a computer, or may also be a video application installed in the intelligent terminal, and the electronic device may receive a video request instruction input by a user. Or, the electronic device may also be a server, and accordingly, the client may send a video request instruction input by the user to the electronic device.

In one implementation, when a user needs to watch a video, the user can input a video request instruction to the electronic device. For example, the electronic device may be provided with a video recommendation page, and when the user refreshes the video recommendation page, the electronic device may determine that a video request instruction input by the user is received. Then, the electronic device can obtain a plurality of videos to be recommended based on the video request instruction.

In another implementation manner, a search box may be disposed in a display interface of the electronic device, and when a user needs to watch a video, the user may input a search keyword in the search box to search for the video. Accordingly, when the electronic device receives the search keyword, it may be determined that the video request instruction is received.

In one embodiment of the invention, the video request instruction carries a search keyword. Accordingly, on the basis of fig. 1, referring to fig. 2, step S101 may include the steps of:

S1011: when a video request instruction input by a user is received, performing word segmentation on the search keyword to obtain a plurality of keywords as keywords to be matched.

S1012: determining, from a plurality of preset videos, videos whose video feature tags intersect with the keywords to be matched, as first candidate videos.

S1013: selecting a second number of videos from the first candidate videos as second candidate videos based on the uploading time of the first candidate videos and/or the click parameters in the first historical time period.

S1014: selecting a third number of videos from the second candidate videos as the videos to be recommended.

The preset videos are all videos in the database, and the videos in the database are determined by technicians according to videos which can be currently provided. For example, the preset video may be all videos that are currently online, or may be all videos that are uploaded by the user.

A video feature tag for a video may include: a feature tag representing the video type of the video, a feature tag representing the video subject of the video, and an identification of the user who uploaded the video, etc.

When a user needs to watch a video, the user may input a search keyword to the electronic device. When receiving a search keyword input by a user, the electronic device may perform word segmentation processing on the received search keyword based on a preset algorithm to obtain a plurality of keywords as keywords to be matched. The preset algorithm may be a forward maximum matching method, or the preset algorithm may also be a reverse maximum matching method, or the preset algorithm may also be a shortest path word segmentation method, and the like, but is not limited thereto.
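
As an illustration of the forward maximum matching method named above, the sketch below scans the query from left to right and, at each position, takes the longest piece found in a vocabulary. The vocabulary, the `max_len` limit, and the single-character fallback are assumptions made for this example and are not specified by the source.

```python
from typing import List, Set


def forward_max_match(text: str, vocabulary: Set[str], max_len: int = 5) -> List[str]:
    """Segment text by repeatedly taking the longest vocabulary word at the cursor."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabulary and an unspaced query string.
print(forward_max_match("funnycatvideo", {"funny", "cat", "video"}))
# ['funny', 'cat', 'video']
```

The reverse maximum matching method mentioned in the same paragraph would scan from the end of the string instead; it is not shown here.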

Furthermore, the electronic device can acquire a plurality of videos to be recommended from a plurality of preset videos based on the keywords to be matched.

In one implementation, for each preset video, the electronic device may determine a video feature tag of the video. Then, the electronic device may determine whether the video feature tag of the video intersects with the keyword to be matched, and if the video feature tag of the video intersects with the keyword to be matched, that is, at least one video feature tag of the video is the same as the keyword to be matched, the video may be determined as the video to be recommended.
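
The intersection test described above can be sketched with set operations. The mapping from video identifiers to tag sets and the names `first_candidate_videos` and `preset_videos` are hypothetical; the source does not prescribe a data layout.

```python
from typing import Dict, Iterable, List, Set


def first_candidate_videos(
    preset_videos: Dict[str, Set[str]], keywords: Iterable[str]
) -> List[str]:
    """Return ids of videos with at least one feature tag equal to a keyword."""
    keyword_set = set(keywords)
    # A non-empty set intersection means the video matches at least one keyword.
    return [vid for vid, tags in preset_videos.items() if tags & keyword_set]


# Hypothetical preset videos: id -> video feature tags.
preset_videos = {
    "v1": {"comedy", "cat"},
    "v2": {"news"},
    "v3": {"cat", "uploader_42"},
}
print(first_candidate_videos(preset_videos, ["funny", "cat", "video"]))
# ['v1', 'v3']
```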

In another implementation, the electronic device may determine the videos whose video feature tags intersect with the keywords to be matched as the first candidate videos.

Then, the electronic device may select, as the second candidate video, a second number of videos whose uploading time is closest to the current time from the first candidate videos according to the uploading time of each first candidate video.

Alternatively, the electronic device may sort the first candidate videos in descending order of their click parameters in the first historical time period and select the top second number of videos as the second candidate videos. The click parameter of a video may be the click rate of the video, the click count of the video, or the sum of the two.

Alternatively, for each first candidate video, the electronic device may calculate a weighted sum of the upload time of the first candidate video and its click parameter within the first historical time period. The electronic device may then sort the first candidate videos in descending order of the weighted sum and select the top second number of videos as the second candidate videos.
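
The weighted-sum ranking can be sketched as follows. The source does not say how the upload time is turned into a number or which weights are used, so this example makes three labelled assumptions: the upload time is a Unix timestamp, both quantities are min-max normalised to [0, 1] before weighting, and the weights default to 0.5 each. The `Candidate` class and `top_by_weighted_sum` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    video_id: str
    upload_ts: float  # upload time as a Unix timestamp (assumption)
    clicks: float     # click parameter in the first historical time period


def _normalise(values: List[float]) -> List[float]:
    """Min-max normalise to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def top_by_weighted_sum(
    cands: List[Candidate], second_number: int,
    w_time: float = 0.5, w_click: float = 0.5,
) -> List[Candidate]:
    """Rank by w_time * recency + w_click * clicks and keep the top second_number."""
    if not cands:
        return []
    recency = _normalise([c.upload_ts for c in cands])
    clicks = _normalise([c.clicks for c in cands])
    scored = sorted(
        zip(cands, recency, clicks),
        key=lambda item: w_time * item[1] + w_click * item[2],
        reverse=True,
    )
    return [cand for cand, _, _ in scored[:second_number]]


cands = [
    Candidate("old_popular", 100, 1000),
    Candidate("new_unpopular", 300, 0),
    Candidate("mid", 200, 600),
]
print([c.video_id for c in top_by_weighted_sum(cands, 1)])  # ['mid']
```

With equal weights, the video that is moderately recent and moderately popular scores 0.55 and outranks the two extremes, which each score 0.5.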

The second number may be set by a skilled person based on experience, for example, the second number may be 1000, or the second number may also be 1500, but is not limited thereto. The first historical period may be set by a technician based on experience, for example, the first historical period may be one day before the current time, or may also be three days before the current time, but is not limited thereto.

Then, the electronic device may determine a third number of videos to be recommended from the second candidate videos in the following two ways. Wherein the third number may be set empirically by a skilled person, the third number may be the same as the second number, or the third number may be different from the second number.

Mode 1: the electronic device may directly use the second candidate videos as the videos to be recommended. In this case, the third number is the same as the second number.

If the second candidate videos are used directly as the videos to be recommended, the number of videos to be recommended is large, and the electronic device subsequently needs a large amount of computation to determine the first similarity of every two videos to be recommended.

Mode 2: to reduce the amount of computation needed to determine the first similarity of every two videos to be recommended, and thereby improve video recommendation efficiency, step S1014 may include the following step: selecting a third number of videos from the second candidate videos as videos to be recommended based on the video feature tags of the second candidate videos, the identifiers of the users who uploaded the second candidate videos, the click parameters of the second candidate videos in a second historical time period, and the uploading time of the second candidate videos.

The second historical period of time may be set empirically by a technician, may be the same as the first historical period of time, or may be different from the first historical period of time.

In one implementation, the electronic device may input the video feature tag of the second candidate video, the identifier of the user who uploads the second candidate video, the click parameter of the second candidate video in the second historical time period, and the uploading time of the second candidate video to a pre-trained scoring prediction model, so as to obtain a score of the second candidate video output by the scoring prediction model. The score prediction model may be a DNN (Deep Neural Network) model or may also be a NCF (Neural Collaborative Filtering) model.

Then, the electronic device may select, from the second candidate videos, the first third number of videos as videos to be recommended according to the descending order of the corresponding scores. In this case, the third number is different from the second number, for example, the third number may be 10% of the second number, or the third number may be 15% of the second number, but is not limited thereto.

The score prediction model is trained on first preset training samples, which may include: the video feature tags of a first sample video, the identifier of the user who uploaded the first sample video, the click parameter of the first sample video in a fourth historical time period, the uploading time of the first sample video, and the score of the first sample video. The score of the first sample video may be determined according to whether the user watched it: for example, the score is 1 if the user watched the first sample video and 0 if the user did not.

The electronic device may use the video feature tags of the first sample video, the identifier of the user who uploaded the first sample video, the click parameter of the first sample video in the fourth historical time period, and the uploading time of the first sample video as input data of a score prediction model with an initial structure, use the score of the first sample video as the expected output data, and adjust the model parameters until a preset convergence condition is reached, thereby obtaining the trained score prediction model.

Based on the above processing, the number of videos to be recommended can be reduced, which in turn reduces the amount of computation the electronic device needs to determine the first similarity of every two videos to be recommended and improves video recommendation efficiency.

In step S102, the electronic device may calculate a first similarity of every two videos to be recommended in the following manner.

The first method is as follows: for each video to be recommended, the electronic device may extract video frames from the video to be recommended to obtain the sampled video frames of the video to be recommended. For example, the electronic device may extract the first frame, a middle frame, and the last frame of the video to be recommended as its sampled video frames, or the electronic device may extract video frames from the video to be recommended at a preset sampling interval as its sampled video frames.

Then, for each sampled video frame of the video to be recommended, the electronic device may obtain the pixel value of each pixel point of the sampled video frame and generate the feature matrix of the sampled video frame from these pixel values. For example, the pixel values of the pixel points of an RGB (red, green, and blue) image lie in the range 0 to 255, and this range may be divided into 16 sub-ranges of 16 pixel values each. For each pixel point of the sampled video frame, if its pixel value falls in the sub-range 0-15, the feature element corresponding to the pixel point is determined to be 0; if it falls in the sub-range 16-31, the feature element is determined to be 1; if it falls in the sub-range 32-47, the feature element is determined to be 2; and so on. The feature elements obtained for all pixel points form the feature matrix of the sampled video frame.

Furthermore, for every two videos to be recommended, the electronic device may calculate the Hamming distance between the feature matrices of the corresponding sampled video frames of the two videos, and take the mean of these Hamming distances as the first similarity of the two videos to be recommended.
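The first method can be sketched as follows. This is only an illustrative sketch under several assumptions that are not stated in the text: each sampled frame is represented as a single channel of integer pixel values, both videos have the same number of sampled frames of the same size, and each sub-range covers 16 consecutive values so that a pixel value v maps to v // 16. The names `feature_matrix`, `hamming_distance` and `mean_hamming_distance` are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # one channel of a frame, pixel values in 0..255


def feature_matrix(frame: Frame, bin_width: int = 16) -> Frame:
    """Replace each pixel value by the index of its sub-range (0..15 when bin_width is 16)."""
    return [[pixel // bin_width for pixel in row] for row in frame]


def hamming_distance(a: Frame, b: Frame) -> int:
    """Count the positions at which two equally sized feature matrices differ."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def mean_hamming_distance(frames_a: List[Frame], frames_b: List[Frame]) -> float:
    """Average the Hamming distance over corresponding sampled frames of two videos."""
    distances = [
        hamming_distance(feature_matrix(fa), feature_matrix(fb))
        for fa, fb in zip(frames_a, frames_b)
    ]
    return sum(distances) / len(distances)


# Two hypothetical videos, each with two sampled 2x2 frames.
video_a = [[[0, 15], [16, 255]], [[100, 100], [100, 100]]]
video_b = [[[3, 16], [31, 240]], [[101, 102], [103, 104]]]
result = mean_hamming_distance(video_a, video_b)
```

In this example the first pair of frames differs in one feature element and the second pair in none, so the mean is 0.5.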

The second method is as follows: in an embodiment of the present invention, on the basis of fig. 1 and referring to fig. 3, step S102 may include the following step:

S1021: calculating the similarity of the two videos to be recommended as the first similarity, based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended.

In one implementation, for each of the two videos to be recommended, the electronic device may generate a content feature vector of a video content of the video to be recommended and a title feature vector of a video title of the video to be recommended. The manner in which the electronic device generates the content feature vector and the title feature vector may be found in the description of the following embodiments.

Then, the electronic device may calculate the similarity of the content feature vectors of the two videos to be recommended (which may be referred to as a second similarity), and calculate the similarity of the title feature vectors of the two videos to be recommended (which may be referred to as a third similarity), respectively. And further, calculating the weighted sum of the second similarity and the third similarity of the two videos to be recommended to obtain the first similarity of the two videos to be recommended.

In another implementation, on the basis of fig. 3, referring to fig. 4, step S1021 may include the following steps:

S10211: processing the sampled video frames and the video titles of the two videos to be recommended based on a pre-trained text vectorization model to obtain the respective feature vectors of the two videos to be recommended.

S10212: calculating the similarity between the feature vectors of the two videos to be recommended as the first similarity.

The text vectorization model may be a word2vec (word to vector) model.

In one implementation, the electronic device may process the sampled video frames and the video titles of the two videos to be recommended based on a pre-trained text vectorization model to obtain the respective feature vectors of the two videos to be recommended. The manner in which the electronic device determines the feature vector of a video to be recommended is described in the following embodiments.

Furthermore, the electronic device may calculate the similarity between the feature vectors of the two videos to be recommended as the first similarity. For example, the electronic device may calculate the Euclidean distance between the feature vectors of the two videos to be recommended, or the cosine similarity between them.
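The two measures mentioned above can be sketched as follows. This is a minimal sketch assuming the feature vectors are plain sequences of floats of equal length; the function names are hypothetical, and returning 0.0 for a zero vector is an assumption made here, not something the text specifies.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Square root of the summed squared element-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

Note that the two measures point in opposite directions: the Euclidean distance grows as the vectors diverge, while the cosine similarity grows as they align, so a comparison against the preset similarity threshold has to take into account which measure is in use.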

The third method is as follows: in an embodiment of the present invention, on the basis of fig. 3 and referring to fig. 5, before step S1021, the method may further include the following step:

S1022: acquiring the respective target keywords of the two videos to be recommended.

The target keywords of a video to be recommended comprise the search keywords used by users when searching for the video to be recommended in the third history time period.

Accordingly, step S1021 may include the steps of:

S10213: calculating the similarity of the two videos to be recommended as the first similarity, based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature labels of the two videos to be recommended and the target keywords of the two videos to be recommended.

The third history time period can be set by a technician according to experience, and can be the same as or different from the first history time period and the second history time period.

In one embodiment of the present invention, step S10213 may include the steps of:

Step 1: processing the sampled video frames, video titles, video feature labels and target keywords of the two videos to be recommended based on a pre-trained text vectorization model to obtain the respective feature vectors of the two videos to be recommended;

Step 2: calculating the similarity between the feature vectors of the two videos to be recommended as the first similarity.

In one embodiment of the present invention, step 1 may comprise the steps of:

Step 11: for each target keyword of the two videos to be recommended, determining the videos searched by the user based on the target keyword in the third history time period as videos to be processed.

Step 12: generating an information sequence corresponding to the target keyword.

The information sequence comprises the target keyword and the video information of each video to be processed. In the information sequence, each piece of video information is positioned after the target keyword, and the pieces of video information are arranged in descending order of the number of clicks of the corresponding videos to be processed in the third history time period. The video information of a video to be processed includes: the video identifier, the video title, the sampled video frames and the video feature tag of the video to be processed.

Step 13: inputting the information sequence corresponding to each target keyword into a pre-trained text vectorization model and, for the video information of each video to be recommended, obtaining the feature vector of the video identifier in that video information, as output by the text vectorization model, as the feature vector of the video to be recommended.

The video information of each video to be recommended belongs to the video information of the videos to be processed searched by the user based on the target keyword in the third history time period.

The videos to be processed are the videos searched by the user based on the target keyword in the third history time period, and the target keyword is a keyword used when the user searched for the video to be recommended in the third history time period. A search based on the target keyword may also return videos other than the video to be recommended; that is, the video to be recommended is one of the videos searched by the user based on the target keyword and therefore belongs to the videos to be processed.

Therefore, the information sequences corresponding to the target keywords input to the text vectorization model all include the video information of the two videos to be recommended, and accordingly, feature vectors of the video information (including video identifiers, video titles, sampled video frames and video feature labels) of the two videos to be recommended can be obtained based on the text vectorization model, and the feature vectors of the video identifiers of the two videos to be recommended can be obtained and used as the respective feature vectors of the two videos to be recommended.
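The information sequence of step 12 could be assembled roughly as follows. This is a hypothetical sketch: the `VideoInfo` class, its field names, and the representation of sampled video frames as string tokens are assumptions made for illustration, since the text does not specify how frames are encoded before they enter the text vectorization model.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VideoInfo:
    video_id: str
    title: str
    frame_tokens: Sequence[str]  # hypothetical token stand-ins for sampled video frames
    tags: Sequence[str]          # video feature tags
    clicks: int                  # clicks within the third history time period


def build_information_sequence(keyword: str, videos: List[VideoInfo]) -> List[str]:
    """Place the keyword first, then each video's information, most-clicked video first."""
    ordered = sorted(videos, key=lambda video: video.clicks, reverse=True)
    sequence = [keyword]
    for video in ordered:
        sequence.append(video.video_id)
        sequence.append(video.title)
        sequence.extend(video.frame_tokens)
        sequence.extend(video.tags)
    return sequence


# Hypothetical videos found with the target keyword "pets".
videos = [
    VideoInfo("id1", "cat video", ["f1"], ["pets"], clicks=5),
    VideoInfo("id2", "dog video", ["f2"], ["pets", "dogs"], clicks=9),
]
sequence = build_information_sequence("pets", videos)
```

Because the video identifier appears as its own element of the sequence, a model trained on such sequences can output a vector for that element, which is what step 13 takes as the feature vector of the video.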

The video identification of a video may be a technician-set number indicating the uniqueness of the video.

The text vectorization model is obtained by training based on a second preset training sample. The second preset training sample may include: sample keywords, the video information of the videos searched by the user based on the sample keywords (which may be referred to as second sample videos), and the respective feature vectors of the video identifier, the video title, the sampled video frames, and the video feature tag in the video information of each second sample video, determined based on a one-hot encoding algorithm.

The electronic device can determine an information sequence that includes the sample keyword and video information of the second sample video. Then, the electronic device may use an information sequence corresponding to the sample keyword as input data of the text vectorization model of the initial structure, use respective feature vectors of the video identifier, the video title, the sampled video frame, and the video feature tag in the video information of the second sample video as output data of the text vectorization model of the initial structure, adjust model parameters of the text vectorization model of the initial structure, and obtain the trained text vectorization model when a preset convergence condition is reached.

In one implementation, the electronic device may calculate the similarity between the feature vectors of the two videos to be recommended as the first similarity. For example, the electronic device may calculate the Euclidean distance between the feature vectors of the two videos to be recommended, or the cosine similarity between them.

For example, the electronic device may calculate the Euclidean distance between the feature vectors of the two videos to be recommended, as the first similarity, based on the following formula:

$$d = \sqrt{\sum_{i=1}^{N} \left(x_{1i} - x_{2i}\right)^{2}}$$

where $d$ represents the Euclidean distance between the feature vectors of the two videos to be recommended, $N$ represents the number of elements in each of the two feature vectors, $x_{1i}$ represents the $i$-th element in the feature vector of one of the two videos to be recommended, and $x_{2i}$ represents the $i$-th element in the feature vector of the other video to be recommended.

In another implementation manner, for every two videos to be recommended, the electronic device may calculate similarity of feature vectors of video identifiers of the two videos to be recommended, similarity of feature vectors of video titles of the two videos to be recommended, similarity of feature vectors of sampled video frames of the two videos to be recommended, and similarity of feature vectors of video feature labels of the two videos to be recommended respectively. Then, the electronic device may calculate a weighted sum of the similarities as a first similarity of the two videos to be recommended.
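The weighted sum described above can be illustrated with a small sketch. The field names and the weight values below are hypothetical; the text does not state which weights are used.

```python
from typing import Dict


def weighted_first_similarity(
    similarities: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Combine per-field similarities into one value using per-field weights."""
    if similarities.keys() != weights.keys():
        raise ValueError("each similarity needs exactly one weight")
    return sum(weights[name] * similarities[name] for name in similarities)


# Hypothetical per-field similarities of two videos and hypothetical weights.
field_similarities = {"id": 0.5, "title": 0.8, "frames": 0.6, "tags": 1.0}
field_weights = {"id": 0.1, "title": 0.3, "frames": 0.4, "tags": 0.2}
combined = weighted_first_similarity(field_similarities, field_weights)
```

With these example numbers the combined value is 0.05 + 0.24 + 0.24 + 0.20 = 0.73.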

In addition, to improve video recommendation efficiency, the electronic device may acquire the target keywords used by users for searching in the third history time period and the videos searched, and calculate the feature vector of each video based on the target keywords and the video information of each video. The electronic device may then store the computed feature vector of each video locally. Subsequently, after determining the videos to be recommended based on a received video request instruction, the electronic device may directly obtain the locally stored feature vectors of the videos to be recommended and calculate the first similarity of the feature vectors of every two videos to be recommended. The electronic device can then recommend videos to the user based on the first similarities. Compared with calculating the feature vectors only after the videos to be recommended have been determined, this improves video recommendation efficiency, reduces user waiting time, and improves user experience.

In step S103, the first number may be set by a technician according to experience, for example, the first number may be the number of videos that can be displayed on the display interface of the client. The preset similarity threshold may be set by a skilled person according to experience, for example, the preset similarity threshold may be 0.3, or the preset similarity threshold may also be 0.2, but is not limited thereto.

The electronic device may obtain the score of each video to be recommended. Then, in descending order of score, the electronic device may select from the videos to be recommended the top first number of videos whose first similarity is smaller than the preset similarity threshold, as the target videos.
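One way to read this selection is as a greedy pass over the videos in descending score order, keeping a video only if its first similarity to every video already kept is below the threshold. The sketch below follows that reading, which is an interpretation rather than a procedure spelled out in the text; the function name and the use of `frozenset` pairs as lookup keys are hypothetical, and the similarity values are assumed to grow with resemblance.

```python
from typing import Dict, FrozenSet, List


def select_target_videos(
    scores: Dict[str, float],
    pair_similarity: Dict[FrozenSet[str], float],
    first_number: int,
    threshold: float,
) -> List[str]:
    """Greedily keep high-scoring videos that stay below the threshold pairwise."""
    selected: List[str] = []
    for video in sorted(scores, key=lambda v: scores[v], reverse=True):
        if len(selected) >= first_number:
            break
        # Keep the video only if it is dissimilar to every video kept so far.
        if all(pair_similarity[frozenset((video, kept))] < threshold for kept in selected):
            selected.append(video)
    return selected


# Hypothetical scores and pairwise first similarities for four videos.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
pair_similarity = {
    frozenset(("a", "b")): 0.9, frozenset(("a", "c")): 0.1, frozenset(("a", "d")): 0.2,
    frozenset(("b", "c")): 0.1, frozenset(("b", "d")): 0.1, frozenset(("c", "d")): 0.5,
}
targets = select_target_videos(scores, pair_similarity, first_number=2, threshold=0.3)
```

Here "b" is skipped because it is too similar to the higher-scoring "a", so the two target videos are "a" and "c".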

Videos to be recommended whose first similarity is small may nevertheless have the same cover picture. If such videos are all determined to be target videos and recommended to the user, the user browsing them may assume that they have the same content and view only one of them, which degrades user experience and wastes display positions for displaying videos in the client.

In one implementation, the electronic device may select, in descending order of score, the top fourth number of videos whose first similarity is smaller than the preset similarity threshold from the videos to be recommended, as third alternative videos. The fourth number is greater than the first number. For each third alternative video, the electronic device may obtain the cover picture of the third alternative video.

Then, the electronic device may select one video from the third candidate videos as a video to be compared according to the order of the scores from large to small, and determine whether a video identical to the cover picture of the video to be compared exists in the third candidate video. If the video identical to the cover picture of the video to be compared does not exist, the electronic equipment can determine that the video to be compared is the target video, select the next video from the third alternative videos according to the sequence of scores from large to small to serve as the video to be compared, and continuously judge whether the video identical to the cover picture of the video to be compared exists in the third alternative videos.

If the video identical to the video cover picture to be compared exists, the electronic equipment can determine that the video to be compared is the target video, and remove the video identical to the video cover picture to be compared from the third alternative video. Then, the electronic device may select a next video from the third candidate videos as a video to be compared according to the order of the scores from large to small, continue to determine whether a video identical to the cover picture of the video to be compared exists in the third candidate videos, and so on until the first number of target videos are determined.

Based on the above processing, recommending videos with the same cover picture to the user can be avoided, which reduces the waste of display positions for displaying videos in the client and improves user experience.
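The cover-picture check described above can be sketched as follows. The sketch assumes that each cover picture is represented by a comparable key (for example a hash of the image), which is an assumption made here; remembering the keys of covers already used has the same effect as removing the videos with an identical cover from the third alternative videos. The function name and tuple layout are hypothetical.

```python
from typing import List, Set, Tuple

# (video_id, score, cover_key); cover_key is a hypothetical fingerprint of the cover picture.
Candidate = Tuple[str, float, str]


def dedupe_by_cover(candidates: List[Candidate], first_number: int) -> List[str]:
    """Walk candidates in descending score order, keeping one video per cover picture."""
    seen_covers: Set[str] = set()
    targets: List[str] = []
    for video_id, _score, cover_key in sorted(candidates, key=lambda c: c[1], reverse=True):
        if len(targets) >= first_number:
            break
        if cover_key in seen_covers:
            continue  # a higher-scoring video with the same cover was already kept
        seen_covers.add(cover_key)
        targets.append(video_id)
    return targets


# Hypothetical third alternative videos; "a" and "b" share a cover picture.
third_candidates = [("a", 0.9, "c1"), ("b", 0.8, "c1"), ("c", 0.7, "c2"), ("d", 0.6, "c3")]
cover_targets = dedupe_by_cover(third_candidates, first_number=2)
```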

In step S104, in one implementation, if the electronic device is a client, the electronic device may directly display the target video in the display interface for the user to watch.

In another implementation, if the electronic device is a server, the electronic device may send the target video to the client, so that the client displays the target video in the display interface for the user to watch.

Referring to fig. 6, fig. 6 is a flowchart of another video recommendation method provided in an embodiment of the present invention.

In the offline stage, the electronic device may perform image embedding processing based on the acquired target keywords used by users for searching in the third history time period and the videos searched; that is, the electronic device may calculate the feature vector of each video based on the target keywords and the video information of each video. The electronic device may then perform image-text similarity calculation on the videos; that is, it may obtain the locally stored feature vectors of the videos, calculate the similarity of the feature vectors of every two videos, and store the similarities locally.

In an Online stage, when the electronic device receives a search keyword input by a user, the electronic device may recall a video from a corpus (database), that is, the electronic device may determine, from a plurality of preset videos, a video in which a video feature tag and the keyword to be matched have an intersection, and obtain a first candidate video. For each first alternative video, the electronic device may calculate a weighted sum of the upload time of the first alternative video and the click parameter of the first alternative video over a first historical period of time.

The electronic device may perform coarse ranking on the first alternative videos to obtain the second alternative videos. That is, the electronic device may select the top second number of videos from the first alternative videos, in descending order of the corresponding weighted sums, as the second alternative videos.
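The coarse ranking step can be illustrated with the following sketch. It assumes that the upload time and the click parameter have already been converted to comparable numeric values (called `recency` and `clicks` here), and the weights 0.4 and 0.6 are hypothetical; none of these details are given in the text.

```python
from typing import Dict, List


def coarse_rank(
    candidates: List[Dict[str, float]],
    second_number: int,
    time_weight: float = 0.4,
    click_weight: float = 0.6,
) -> List[Dict[str, float]]:
    """Keep the `second_number` candidates with the largest weighted sums."""

    def weighted_sum(candidate: Dict[str, float]) -> float:
        return time_weight * candidate["recency"] + click_weight * candidate["clicks"]

    return sorted(candidates, key=weighted_sum, reverse=True)[:second_number]


# Hypothetical first alternative videos with normalized features.
first_candidates = [
    {"id": 1, "recency": 1.0, "clicks": 0.0},  # weighted sum 0.4
    {"id": 2, "recency": 0.0, "clicks": 1.0},  # weighted sum 0.6
    {"id": 3, "recency": 0.5, "clicks": 0.5},  # weighted sum 0.5
]
second_candidates = coarse_rank(first_candidates, second_number=2)
```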

The electronic device can perform fine ranking on the second alternative videos to obtain videos to be recommended, that is, the electronic device can select a third number of videos from the second alternative videos as the videos to be recommended based on the video feature tags of the second alternative videos, the identifiers of users who upload the second alternative videos, click parameters of the second alternative videos in a second historical time period, and upload time of the second alternative videos.

Then, the electronic device can perform image-text diversity control on the videos to be recommended, that is, the electronic device can obtain the first similarity between the locally stored videos to be recommended. The electronic equipment can determine a first number of target videos of which the first similarity of any two videos is smaller than a preset similarity threshold (namely theta) from a plurality of videos to be recommended.

For example, the videos to be recommended may include 11 videos: item1, item2, item3, ..., item10, item11. For each video to be recommended, if the first similarity between that video and every other video to be recommended is smaller than the preset similarity threshold (namely θ), the electronic device can determine that the video is a target video. For example, if the first similarity between item3 and item1 is less than the preset similarity threshold, i.e., sim(1, 3) < θ, the first similarity between item3 and item2 is less than the preset similarity threshold, i.e., sim(2, 3) < θ, and so on, up to the first similarity between item3 and item11, i.e., sim(11, 3) < θ, then the electronic device may determine item3 as a target video.

Further, the electronic device may recommend the target video to the user. For example, the electronic device may display the target video according to Twiddler (preset presentation rule).

Based on the video recommendation method provided by the embodiment of the invention, the first similarity of any two videos in the target video is smaller than the preset similarity threshold, and as the first similarity can represent the similarity of the video contents of the videos to be recommended, the similarity of the video contents of any two videos in the target video is lower, the probability of repeated videos existing in the target video is lower, the target video is recommended to the user, the problem of recommending the repeated videos to the user can be avoided to a certain extent, and further, the waste of display positions for displaying the videos in the client can be reduced.

Corresponding to the embodiment of the method in fig. 1, referring to fig. 7, fig. 7 is a block diagram of a video recommendation apparatus provided in an embodiment of the present invention, where the apparatus includes:

the first obtaining module 701 is configured to, when a video request instruction input by a user is received, obtain a plurality of videos to be recommended based on the video request instruction;

a first determining module 702, configured to calculate, for every two videos to be recommended, a similarity between the two videos to be recommended based on video contents of the two videos to be recommended, as a first similarity;

a second determining module 703, configured to determine, based on each first similarity, a first number of videos from the multiple videos to be recommended as target videos; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value;

and a recommending module 704, configured to recommend the target video to a user.

Optionally, the video request instruction carries a search keyword;

the first obtaining module 701 is specifically configured to perform word segmentation processing on the search keyword to obtain a plurality of keywords serving as keywords to be matched;

determining videos with intersection of the video feature labels and the keywords to be matched from a plurality of preset videos, and taking the videos as first alternative videos;

and selecting a third number of videos from the second alternative videos as videos to be recommended.

Optionally, the first obtaining module 701 is specifically configured to select a third number of videos from the second candidate videos as videos to be recommended based on the video feature tag of the second candidate video, the identifier of the user who uploads the second candidate video, the click parameter of the second candidate video in a second historical time period, and the upload time of the second candidate video.

Optionally, the first determining module 702 is specifically configured to calculate, based on the video contents of the two videos to be recommended and the video titles of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity.

Optionally, the first determining module 702 is specifically configured to process the sampled video frames and video titles of the two videos to be recommended based on a pre-trained text vectorization model, so as to obtain feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, the apparatus further comprises:

the first determining module 702 is specifically configured to calculate, based on the video contents of the two videos to be recommended, the video titles of the two videos to be recommended, the video feature labels of the two videos to be recommended, and the target keywords of the two videos to be recommended, a similarity of the two videos to be recommended as a first similarity.

Optionally, the first determining module 702 is specifically configured to process the sampled video frames, the video titles, the video feature labels and the target keywords of the two videos to be recommended based on a pre-trained text vectorization model, so as to obtain respective feature vectors of the two videos to be recommended;

and calculating the similarity between the feature vectors of the two videos to be recommended as a first similarity.

Optionally, the first determining module 702 is specifically configured to determine, for each target keyword of the two videos to be recommended, a video searched by the user based on the target keyword in the third history time period as a video to be processed;

Based on the video recommendation device provided by the embodiment of the invention, the first similarity of any two videos in the target video is smaller than the preset similarity threshold, and the first similarity can represent the similarity of the video contents of the videos to be recommended, so that the similarity of the video contents of any two videos in the target video is low, the probability of repeated videos existing in the target video is low, the target video is recommended to the user, the problem of recommending the repeated videos to the user can be avoided to a certain extent, and further, the waste of display positions for displaying the videos in the client can be reduced.

An embodiment of the present invention further provides an electronic device, as shown in fig. 8, including a processor 801, a communication interface 802, a memory 803, and a communication bus 804, where the processor 801, the communication interface 802, and the memory 803 complete mutual communication through the communication bus 804;

a memory 803 for storing a computer program;

the processor 801 is configured to implement the following steps when executing the program stored in the memory 803:

when a video request instruction input by a user is received, acquiring a plurality of videos to be recommended based on the video request instruction;

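for every two videos to be recommended, calculating the similarity of the two videos to be recommended as a first similarity based on the video contents of the two videos to be recommended;

determining, based on each first similarity, a first number of videos from the plurality of videos to be recommended as target videos; the first similarity of any two videos in the target videos is smaller than a preset similarity threshold value;

and recommending the target video to the user.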

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In still another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the video recommendation method in any of the above embodiments.

In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video recommendation method of any of the above embodiments.

In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless connection (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, the electronic device, the computer-readable storage medium, and the computer program product embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and in relation to them, reference may be made to the partial description of the method embodiments.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.
