Data recommendation method, device, equipment and storage medium

Document No.: 1889430 Publication date: 2021-11-26

Reading note: This technology, a data recommendation method, device, equipment and storage medium, was designed by 龚静, 张恒, 吕有才 and 詹乐 on 2021-08-31. Abstract: The embodiment of the invention relates to the field of artificial intelligence and discloses a data recommendation method, device, equipment and storage medium. The method comprises: acquiring portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait; extracting feature information from the portrait information, wherein the feature information comprises user features and content features; classifying the user features and the content features, and expanding the classified user features and content features to obtain an extended feature set; inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set, wherein the content candidate set comprises a plurality of candidate text contents; and ranking the plurality of candidate text contents by using a specified ranking algorithm, and sending the plurality of candidate text contents to the user terminal in the ranked order, so that the efficiency and accuracy of data recommendation are improved. The invention also relates to blockchain technology; for example, the portrait information can be written into a blockchain for use in scenarios such as data forensics.

1. A method for recommending data, comprising:

acquiring portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait;

extracting feature information from the portrait information, wherein the feature information comprises user features and content features;

classifying the user features and the content features, and expanding the classified user features and content features to obtain an extended feature set;

inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents;

and ranking the plurality of candidate text contents by using a specified ranking algorithm, and sending the plurality of candidate text contents to the user terminal in the ranked order.

2. The method of claim 1, wherein the extracting feature information from the portrait information comprises:

performing word segmentation on the portrait information to obtain a word sequence corresponding to the portrait information;

calculating the TF-IDF value of each word in the word sequence, and selecting the largest TF-IDF value as the initial feature information of the portrait information;

inputting the word sequence into an LSTM-CRF model, and performing feature extraction on the word sequence through a segmentation mapping external feature layer of the LSTM-CRF model to obtain token representation information;

merging the initial feature information and the token representation information through a fully connected layer of the LSTM-CRF model to obtain merged feature information;

and inputting the merged feature information into a CRF layer of the LSTM-CRF model to obtain the feature information of the portrait information.

3. The method of claim 2, wherein the classifying the user features and the content features and expanding the classified user features and content features to obtain an extended feature set comprises:

inputting the user features and the content features into a pre-trained BERT model to obtain category information of the user features and the content features;

determining, according to the category information of the user features and the content features, synonymous user features and synonymous content features corresponding to the category information;

and determining that the user features, the content features, the synonymous user features and the synonymous content features constitute the extended feature set.

4. The method of claim 3, wherein before the inputting the user features and the content features in the extended feature set into the trained recall model, the method further comprises:

inputting each user feature and each content feature in the extended feature set into a pre-trained BERT model to obtain a ranking score evaluation index of each user feature and each content feature;

determining, according to the ranking score evaluation indexes of the user features and the content features, the descending order of the ranking score evaluation indexes as the arrangement order of the user features and the content features;

and sorting the user features and the content features in the extended feature set according to the arrangement order.

5. The method of claim 4, wherein the inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features comprises:

extracting corresponding user feature vectors and content feature vectors from the user features and the content features in the sorted extended feature set;

and inputting the user feature vectors and the content feature vectors into the trained recall model to obtain a plurality of candidate text contents, and determining the content candidate set according to the plurality of candidate text contents.

6. The method of claim 5, wherein before the inputting the user features and the content features in the extended feature set into the trained recall model to obtain the content candidate set corresponding to the user features and the content features, the method further comprises:

obtaining a sample set comprising a plurality of pieces of sample portrait information, the sample portrait information comprising a sample user portrait and a sample content portrait;

extracting sample user features corresponding to the sample user portraits and sample content features corresponding to the sample content portraits in the sample set;

and inputting the sample user features and the sample content features into a preset neural network model for training to obtain the recall model.

7. The method of claim 6, wherein the ranking the plurality of candidate text contents by using a specified ranking algorithm comprises:

scoring each candidate text content in the content candidate set by using the specified ranking algorithm to obtain a score of each candidate text content;

and ranking the plurality of candidate text contents in descending order of their scores.

8. A data recommendation device, comprising:

an acquisition unit configured to acquire portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait;

an extraction unit configured to extract feature information from the portrait information, the feature information including user features and content features;

an extension unit configured to classify the user features and the content features, and to expand the classified user features and content features to obtain an extended feature set;

a recall unit configured to input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set includes a plurality of candidate text contents;

and a pushing unit configured to rank the plurality of candidate text contents by using a specified ranking algorithm, and to send the plurality of candidate text contents to the user terminal in the ranked order.

9. A computer device comprising a processor and a memory, wherein the memory is configured to store a computer program and the processor is configured to invoke the computer program to perform the method of any of claims 1-7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1-7.

Technical Field

The invention relates to the field of artificial intelligence, in particular to a data recommendation method, device, equipment and storage medium.

Background

A recommendation system recommends information that a user is interested in according to the user's information needs, interests and the like. At present, a mainstream recommendation system generally comprises an indexing stage, a recall stage and a ranking stage, wherein the recall stage selects contents, within a limited response time, directly from the content candidate set obtained in the indexing stage and sends the selected contents to the ranking stage. However, this approach is constrained by the huge candidate set and by real-time and complexity requirements, so it is difficult for the recommendation processing logic to achieve a good recommendation effect.

Disclosure of Invention

Embodiments of the present invention provide a data recommendation method, apparatus, device, and storage medium, which help to improve the efficiency and accuracy of obtaining a content candidate set corresponding to user features and content features, thereby improving the efficiency and accuracy of data recommendation.

In a first aspect, an embodiment of the present invention provides a data recommendation method, including:

acquiring portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait;

extracting feature information from the portrait information, wherein the feature information comprises user features and content features;

classifying the user features and the content features, and expanding the classified user features and content features to obtain an extended feature set;

inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents;

and ranking the plurality of candidate text contents by using a specified ranking algorithm, and sending the plurality of candidate text contents to the user terminal in the ranked order.

Further, the extracting feature information from the portrait information includes:

performing word segmentation on the portrait information to obtain a word sequence corresponding to the portrait information;

calculating the TF-IDF value of each word in the word sequence, and selecting the largest TF-IDF value as the initial feature information of the portrait information;

inputting the word sequence into an LSTM-CRF model, and performing feature extraction on the word sequence through a segmentation mapping external feature layer of the LSTM-CRF model to obtain token representation information;

merging the initial feature information and the token representation information through a fully connected layer of the LSTM-CRF model to obtain merged feature information;

and inputting the merged feature information into a CRF layer of the LSTM-CRF model to obtain the feature information of the portrait information.

Further, the classifying the user features and the content features, and expanding the classified user features and content features to obtain an extended feature set includes:

inputting the user features and the content features into a pre-trained BERT model to obtain category information of the user features and the content features;

determining, according to the category information of the user features and the content features, synonymous user features and synonymous content features corresponding to the category information;

and determining that the user features, the content features, the synonymous user features and the synonymous content features constitute the extended feature set.

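Further, before the inputting the user features and the content features in the extended feature set into the trained recall model, the method further includes:

inputting each user feature and each content feature in the extended feature set into a pre-trained BERT model to obtain a ranking score evaluation index of each user feature and each content feature;

determining, according to the ranking score evaluation indexes of the user features and the content features, the descending order of the ranking score evaluation indexes as the arrangement order of the user features and the content features;

and sorting the user features and the content features in the extended feature set according to the arrangement order.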

Further, the inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features includes:

extracting corresponding user feature vectors and content feature vectors from the user features and the content features in the sorted extended feature set;

and inputting the user feature vectors and the content feature vectors into the trained recall model to obtain a plurality of candidate text contents, and determining the content candidate set according to the plurality of candidate text contents.

Further, before the inputting the user features and the content features in the extended feature set into the trained recall model and obtaining the content candidate set corresponding to the user features and the content features, the method further includes:

obtaining a sample set comprising a plurality of sample portrait information, the sample portrait information comprising a sample user portrait and a sample content portrait;

extracting sample user features corresponding to the sample user portraits and sample content features corresponding to the sample content portraits in the sample set;

and inputting the sample user features and the sample content features into a preset neural network model for training to obtain the recall model.

Further, the ranking the plurality of candidate text contents using a specified ranking algorithm includes:

scoring each candidate text content in the content candidate set by using the specified ranking algorithm to obtain a score of each candidate text content;

and ranking the plurality of candidate text contents in descending order of their scores.

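In a second aspect, an embodiment of the present invention provides a data recommendation apparatus, including:

an acquisition unit configured to acquire portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait;

an extraction unit configured to extract feature information from the portrait information, the feature information including user features and content features;

an extension unit configured to classify the user features and the content features, and to expand the classified user features and content features to obtain an extended feature set;

a recall unit configured to input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set includes a plurality of candidate text contents;

and a pushing unit configured to rank the plurality of candidate text contents by using a specified ranking algorithm, and to send the plurality of candidate text contents to the user terminal in the ranked order.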

In a third aspect, an embodiment of the present invention provides a computer device, including a processor and a memory, where the memory is used to store a computer program, and the processor is configured to call the computer program to execute the method of the first aspect.

In a fourth aspect, the present invention provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the method of the first aspect.

The embodiment of the invention can acquire portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait; extract feature information from the portrait information, wherein the feature information comprises user features and content features; classify the user features and the content features, and expand the classified user features and content features to obtain an extended feature set; input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents; and rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order. In this way, the efficiency and accuracy of obtaining the content candidate set corresponding to the user features and the content features are improved, and the efficiency and accuracy of data recommendation are improved accordingly.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic flow chart diagram of a data recommendation method provided by an embodiment of the invention;

FIG. 2 is a schematic block diagram of a data recommendation apparatus according to an embodiment of the present invention;

FIG. 3 is a schematic block diagram of a computer device provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The data recommendation method provided by the embodiment of the invention can be applied to a data recommendation device, and in some embodiments, the data recommendation device is arranged in computer equipment. In certain embodiments, the computer device includes, but is not limited to, one or more of a smartphone, tablet, laptop, and the like.

The embodiment of the invention can acquire portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait; extract feature information from the portrait information, wherein the feature information comprises user features and content features; classify the user features and the content features, and expand the classified user features and content features to obtain an extended feature set; input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents; and rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order. In this way, the embodiment of the invention helps to improve the efficiency and accuracy of obtaining the content candidate set corresponding to the user features and the content features, thereby improving the efficiency and accuracy of data recommendation.

The embodiment of the application can acquire and process related data (such as portrait information of a user) based on artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best result.

Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, machine learning/deep learning, and the like.

The data recommendation method provided by an embodiment of the present invention is described schematically below with reference to FIG. 1.

Referring to FIG. 1, FIG. 1 is a schematic flowchart of a data recommendation method according to an embodiment of the present invention. As shown in FIG. 1, the method may be executed by a data recommendation device, where the data recommendation device is disposed in a computer device. Specifically, the method of the embodiment of the present invention includes the following steps.

S101: acquiring portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait.

In the embodiment of the invention, the data recommendation device can acquire portrait information of a user in a target service scene, wherein the portrait information comprises a user portrait and a content portrait.

In some embodiments, the user portrait includes, but is not limited to, a customer manager portrait, a customer portrait, and the like. In some embodiments, the customer manager portrait includes: basic information of the customer manager, such as personal basic information, post information, and main responsible services; management and performance information, such as customer group information, product preference, performance ranking, and business development information; and behavior information of the customer manager, namely conscious positive feedback on cases or products, such as likes, favorites, downloads, follows, dwell time, search popularity, and positive comments, and conscious negative feedback, such as marking content as not interesting, negative comments, and reports. In some embodiments, the customer portrait includes basic customer information, such as age, native place, education level, occupation, held products, and risk level; and customer behavior information, including conscious feedback on products and positive or negative evaluations of the customer manager.

In certain embodiments, the portrait information includes, but is not limited to, content in the form of text, pictures, video, and the like.
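As an illustration of how the portrait information described above might be held in memory, the following is a minimal sketch. The class names, field names and sample values are assumptions made for this example; the source does not prescribe any data layout.

```python
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """Hypothetical container for a customer manager or customer portrait."""
    user_id: str
    basic_info: dict = field(default_factory=dict)          # e.g. post, age, occupation
    performance: dict = field(default_factory=dict)         # e.g. product preference
    positive_feedback: list = field(default_factory=list)   # e.g. likes, favorites
    negative_feedback: list = field(default_factory=list)   # e.g. reports


@dataclass
class ContentPortrait:
    """Hypothetical container for one piece of content (a case or product)."""
    content_id: str
    form: str        # "text", "picture" or "video"
    text: str = ""   # raw text, or text converted from a picture or video


@dataclass
class PortraitInfo:
    """Portrait information = one user portrait plus content portraits."""
    user: UserPortrait
    contents: list = field(default_factory=list)


portrait = PortraitInfo(
    user=UserPortrait(
        "u1",
        basic_info={"post": "customer manager"},
        positive_feedback=["like:case-7"],
    ),
    contents=[ContentPortrait("case-7", "text", "retirement plan case study")],
)
```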

S102: extracting feature information from the portrait information, wherein the feature information comprises user features and content features.

In the embodiment of the present invention, the data recommendation device may extract feature information from the portrait information, where the feature information includes a user feature and a content feature.

In one embodiment, when extracting feature information from the portrait information, the data recommendation device may extract the feature information according to a specified algorithm. In some embodiments, the specified algorithm includes, but is not limited to, natural language processing (NLP), optical character recognition (OCR), and the like.

In one embodiment, if the portrait information includes a picture or video, the picture or video may be first converted into text, and then feature information may be extracted from the converted text.

In one embodiment, when extracting feature information from the portrait information, the data recommendation device may perform word segmentation on the portrait information to obtain a word sequence corresponding to the portrait information; calculate the TF-IDF value of each word in the word sequence, and select the largest TF-IDF value as the initial feature information of the portrait information; input the word sequence into an LSTM-CRF model, and perform feature extraction on the word sequence through a segmentation mapping external feature layer of the LSTM-CRF model to obtain token representation information; merge the initial feature information and the token representation information through a fully connected layer of the LSTM-CRF model to obtain merged feature information; and input the merged feature information into a CRF layer of the LSTM-CRF model to obtain the feature information of the portrait information.
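The TF-IDF part of this step can be sketched as follows. This is only an illustration under stated assumptions: the documents are already segmented into words, the smoothed IDF formula is one common variant chosen for the example, and the sketch returns the word that attains the largest TF-IDF value. The LSTM-CRF stages are not reproduced here. The function names `tfidf_scores` and `initial_feature` are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(doc_tokens, corpus_tokens):
    """Return {word: TF-IDF value} for one tokenized document against a corpus."""
    n_docs = len(corpus_tokens)
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for word, count in counts.items():
        tf = count / total
        df = sum(1 for doc in corpus_tokens if word in doc)
        # Smoothed IDF; this particular formula is an assumption of the sketch.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = tf * idf
    return scores


def initial_feature(doc_tokens, corpus_tokens):
    """Pick the word with the largest TF-IDF value as the initial feature."""
    scores = tfidf_scores(doc_tokens, corpus_tokens)
    return max(scores, key=scores.get)


# Toy word sequences standing in for segmented portrait information.
corpus = [
    ["fund", "product", "risk", "fund"],
    ["insurance", "product", "customer"],
    ["customer", "product", "manager"],
]
top_word = initial_feature(corpus[0], corpus)
```

In this toy corpus, "fund" occurs twice in the first document and in no other document, so it receives the largest value, while "product" occurs in every document and is weighted down.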

S103: classifying the user features and the content features, and expanding the classified user features and content features to obtain an extended feature set.

In the embodiment of the present invention, the data recommendation device may classify the user features and the content features, and expand the classified user features and content features to obtain an extended feature set.

In one embodiment, when classifying the user features and the content features and expanding the classified user features and content features to obtain an extended feature set, the data recommendation device may input the user features and the content features into a pre-trained BERT model to obtain category information of the user features and the content features; determine, according to the category information, synonymous user features and synonymous content features corresponding to the category information; and determine that the user features, the content features, the synonymous user features and the synonymous content features form the extended feature set. Expanding the user features and the content features helps to improve the accuracy of content recall.
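The classify-then-expand step can be sketched as below. The keyword lookup in `classify` is a deliberately simple stand-in for the pre-trained BERT classifier, and the category names and synonym table are invented for the example; none of them come from the source.

```python
def classify(feature, category_keywords):
    """Stand-in for the pre-trained BERT classifier (assumption of this sketch)."""
    for category, keywords in category_keywords.items():
        if feature in keywords:
            return category
    return "other"


def expand_features(features, category_keywords, synonyms_by_category):
    """Return the original features plus the synonymous features of their categories."""
    expanded = []
    for feature in features:
        category = classify(feature, category_keywords)
        for item in [feature, *synonyms_by_category.get(category, [])]:
            if item not in expanded:  # keep first-seen order, drop duplicates
                expanded.append(item)
    return expanded


# Hypothetical categories and synonym table.
category_keywords = {"wealth": {"fund"}, "protection": {"insurance"}}
synonyms_by_category = {
    "wealth": ["fund", "mutual fund", "investment product"],
    "protection": ["insurance", "policy"],
}
extended = expand_features(
    ["fund", "insurance"], category_keywords, synonyms_by_category
)
```

Here `extended` holds both original features together with the synonyms of their categories, which is the role the extended feature set plays before recall.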

In an embodiment, when expanding the classified user features and content features to obtain an extended feature set, the data recommendation device may also expand them by manual labeling, for example by manually supplementing detailed feature information of the service scene to which a case belongs, customer group features corresponding to a product, and the like. Manually expanding the user features and the content features further improves the accuracy of content recall.

S104: inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents.

In this embodiment of the present invention, the data recommendation device may input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, where the content candidate set includes a plurality of candidate text contents.

In one embodiment, before inputting the user features and the content features in the extended feature set into the trained recall model, the data recommendation device may input each user feature and each content feature in the extended feature set into a pre-trained BERT model to obtain a ranking score evaluation index of each user feature and each content feature; determine, according to the ranking score evaluation indexes, the descending order of the ranking score evaluation indexes as the arrangement order of the user features and the content features; and sort the user features and the content features in the extended feature set according to the arrangement order.
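The ordering of features by ranking score can be sketched as follows. The score table stands in for the output of the pre-trained BERT model, and its values are made up for the example.

```python
def order_features(features, score_fn):
    """Sort features from the largest ranking score to the smallest."""
    return sorted(features, key=score_fn, reverse=True)


# Hypothetical ranking score evaluation indexes, as a model might output them.
ranking_scores = {"fund": 0.9, "policy": 0.4, "insurance": 0.7}
ordered = order_features(list(ranking_scores), ranking_scores.get)
```

Because Python's `sorted` is stable, features with equal scores keep their original relative order.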

In one embodiment, when inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, the data recommendation device may extract corresponding user feature vectors and content feature vectors from the user features and the content features in the sorted extended feature set; and input the user feature vectors and the content feature vectors into the trained recall model to obtain a plurality of candidate text contents, and determine the content candidate set according to the plurality of candidate text contents.
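The source does not specify the internals of the recall model. As a rough illustration of vector-based recall, the sketch below scores each content vector against a user vector by dot product and keeps the top k; the dot-product scorer, the vectors and the content identifiers are all assumptions standing in for the trained model.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall(user_vec, content_vecs, k):
    """Return the ids of the k contents whose vectors best match the user vector."""
    ranked = sorted(
        content_vecs,
        key=lambda cid: dot(user_vec, content_vecs[cid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical feature vectors.
user_vec = [1.0, 0.0, 0.5]
content_vecs = {
    "case-a": [0.9, 0.1, 0.4],
    "case-b": [0.0, 1.0, 0.0],
    "case-c": [0.5, 0.0, 1.0],
}
candidates = recall(user_vec, content_vecs, k=2)
```

The returned identifiers play the role of the content candidate set that is later passed to the ranking stage.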

In an embodiment, after determining the content candidate set according to the plurality of candidate text contents, the data recommendation device may calculate the distance between the user feature vector and each content feature vector, add an index identifier to each candidate text content in ascending order of distance, and store the index identifiers in a Redis cache. When a recommendation request sent by the user terminal is subsequently received, the device can then quickly query, according to the target index identifier carried in the request, whether target recommended content corresponding to that identifier exists in the Redis cache, which further improves recommendation efficiency. In some embodiments, the distance between the user feature vector and the content feature vector may be calculated by a similarity calculation or the like.
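The distance-ordered indexing and cache lookup can be sketched as follows. Cosine distance is one possible choice of similarity-based distance, a plain dictionary stands in for the Redis cache, and the key format `idx:<rank>` is invented for the example.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def build_index(user_vec, content_vecs):
    """Assign index identifiers in ascending order of distance (0 = closest)."""
    ordered = sorted(
        content_vecs,
        key=lambda cid: cosine_distance(user_vec, content_vecs[cid]),
    )
    return {f"idx:{rank}": cid for rank, cid in enumerate(ordered)}


def lookup(cache, target_index):
    """Return the cached content id for an index identifier, or None if absent."""
    return cache.get(target_index)


cache = {}  # plain dict standing in for the Redis cache (assumption)
cache.update(
    build_index(
        [1.0, 0.0],
        {"case-a": [0.0, 1.0], "case-b": [1.0, 0.1], "case-c": [1.0, 1.0]},
    )
)
hit = lookup(cache, "idx:0")
miss = lookup(cache, "idx:9")
```

With a real Redis deployment, the dictionary operations would be replaced by the corresponding client calls; the ordering logic stays the same.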

In one embodiment, before inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, the data recommendation device may obtain a sample set comprising a plurality of pieces of sample portrait information, the sample portrait information comprising a sample user portrait and a sample content portrait; extract sample user features corresponding to the sample user portraits and sample content features corresponding to the sample content portraits in the sample set; and input the sample user features and the sample content features into a preset neural network model for training to obtain the recall model.

In an embodiment, when the sample user features and the sample content features are input into a preset neural network model for training to obtain the recall model, they may be input into a specified neural network model to obtain a loss function value, and the loss function value is compared with a preset threshold. If the comparison result does not satisfy a preset condition, the model parameters of the specified neural network model are adjusted, and the sample user features and the sample content features are input into the adjusted neural network model for retraining. When the comparison result between the obtained loss function value and the preset threshold satisfies the preset condition, the recall model is determined to be obtained.
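The train-until-the-loss-meets-a-threshold loop can be sketched as below. A linear scorer trained with gradient descent on binary cross-entropy stands in for the preset neural network model, and "loss at or below the threshold" is assumed as the preset condition; the source fixes neither. The threshold, learning rate, round limit and samples are illustrative values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(weights, samples):
    """Mean binary cross-entropy and its gradient for a linear scorer."""
    n = len(samples)
    loss = 0.0
    grad = [0.0] * len(weights)
    for features, label in samples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        loss -= label * math.log(p) + (1 - label) * math.log(1 - p)
        for i, x in enumerate(features):
            grad[i] += (p - label) * x
    return loss / n, [g / n for g in grad]


def train(samples, threshold=0.2, lr=0.5, max_rounds=5000):
    """Adjust parameters until the loss meets the threshold (assumed condition)."""
    weights = [0.0] * len(samples[0][0])
    loss, grad = loss_and_grad(weights, samples)
    rounds = 0
    while loss > threshold and rounds < max_rounds:
        weights = [w - lr * g for w, g in zip(weights, grad)]
        loss, grad = loss_and_grad(weights, samples)
        rounds += 1
    return weights, loss


# Hypothetical samples: (bias, user-side feature, content-side feature), label.
samples = [
    ([1.0, 1.0, 0.0], 1),
    ([1.0, 0.0, 1.0], 0),
    ([1.0, 0.9, 0.1], 1),
    ([1.0, 0.2, 0.8], 0),
]
weights, final_loss = train(samples)
```

The `max_rounds` cap is a safeguard added for the sketch so that the loop terminates even if the threshold is never reached.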

S105: ranking the plurality of candidate text contents by using a specified ranking algorithm, and sending the plurality of candidate text contents to the user terminal in the ranked order.

In the embodiment of the present invention, the data recommendation device may rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order.

In one embodiment, when ranking the plurality of candidate text contents by using a specified ranking algorithm, the data recommendation device may score each candidate text content in the content candidate set by using the specified ranking algorithm to obtain a score of each candidate text content, and rank the plurality of candidate text contents in descending order of their scores. In certain embodiments, the specified ranking algorithm includes, but is not limited to, a multi-objective ranking algorithm.
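Scoring and descending-order ranking can be sketched as follows. A weighted sum of per-objective scores is used here as one simple form of multi-objective scoring; the objective names, weights and candidate values are assumptions for the example, not the algorithm of the source.

```python
def score(candidate, weights):
    """Weighted sum over per-objective scores (hypothetical combination rule)."""
    return sum(
        weights[name] * value for name, value in candidate["objectives"].items()
    )


def rank_candidates(candidates, weights):
    """Order candidates from the highest score to the lowest."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)


# Hypothetical candidates with predicted per-objective scores.
candidates = [
    {"id": "case-a", "objectives": {"click": 0.2, "favorite": 0.9}},
    {"id": "case-b", "objectives": {"click": 0.8, "favorite": 0.1}},
    {"id": "case-c", "objectives": {"click": 0.5, "favorite": 0.5}},
]
weights = {"click": 0.7, "favorite": 0.3}
ranked_ids = [c["id"] for c in rank_candidates(candidates, weights)]
```

The resulting order is the order in which the candidate text contents would be sent to the user terminal.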

In the embodiment of the invention, the data recommendation device can acquire portrait information of a user in a target service scenario, wherein the portrait information comprises a user portrait and a content portrait; extract feature information from the portrait information, wherein the feature information comprises user features and content features; classify the user features and the content features, and extend the classified user features and content features to obtain an extended feature set; input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents; and rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order. In this way, the efficiency and accuracy of obtaining the content candidate set corresponding to the user features and the content features are improved, which in turn improves the efficiency and accuracy of data recommendation.

The embodiment of the invention also provides a data recommendation device, which comprises units for executing the method of any one of the foregoing embodiments. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of a data recommendation device according to an embodiment of the present invention. The data recommendation device of this embodiment includes: an acquisition unit 201, an extraction unit 202, an extension unit 203, a recall unit 204, and a push unit 205.

an acquisition unit 201, configured to acquire portrait information of a user in a target service scenario, where the portrait information includes a user portrait and a content portrait;

an extraction unit 202, configured to extract feature information from the portrait information, where the feature information includes user features and content features;

an extension unit 203, configured to classify the user features and the content features, and extend the classified user features and content features to obtain an extended feature set;

a recall unit 204, configured to input the user features and the content features in the extended feature set into a trained recall model, so as to obtain a content candidate set corresponding to the user features and the content features, where the content candidate set includes a plurality of candidate text contents;

a push unit 205, configured to rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order.
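
The way the five units hand data to one another can be sketched as a simple pipeline. This is only a structural illustration: the class name `DataRecommendationDevice`, the field names, and the stub behaviours passed in below are hypothetical placeholders, and none of them implement the actual extraction, extension, recall, or ranking logic of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class DataRecommendationDevice:
    acquire: Callable[[str], Dict[str, str]]                           # unit 201
    extract: Callable[[Dict[str, str]], Tuple[List[str], List[str]]]   # unit 202
    extend: Callable[[List[str], List[str]], List[str]]                # unit 203
    recall: Callable[[List[str]], List[str]]                           # unit 204
    push: Callable[[List[str]], List[str]]                             # unit 205

    def recommend(self, user_id: str) -> List[str]:
        """Run the units in order and return the ranked contents to send."""
        portrait = self.acquire(user_id)
        user_features, content_features = self.extract(portrait)
        extended = self.extend(user_features, content_features)
        candidates = self.recall(extended)
        return self.push(candidates)


# Stub units, for illustration only.
device = DataRecommendationDevice(
    acquire=lambda user_id: {"user": "likes funds", "content": "fund news"},
    extract=lambda portrait: (portrait["user"].split(), portrait["content"].split()),
    extend=lambda user_feats, content_feats: sorted(set(user_feats) | set(content_feats)),
    recall=lambda features: [f"article about {name}" for name in features],
    push=lambda candidates: list(reversed(candidates)),
)
result = device.recommend("u1")
```

Keeping each unit behind a plain callable makes it possible to replace one stage, for example the recall model, without touching the others.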

Further, when the extraction unit 202 extracts feature information from the portrait information, it is specifically configured to:

perform word segmentation on the portrait information to obtain a word sequence corresponding to the portrait information;

calculate the TF-IDF value of each word in the word sequence, and select the word with the largest TF-IDF value as the initial feature information of the portrait information;

merge the initial feature information and the token representation information through a fully connected layer of an LSTM-CRF model to obtain merged feature information;

and input the merged feature information into a CRF layer of the LSTM-CRF model to obtain the feature information of the portrait information.
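
The first two steps above (word segmentation followed by TF-IDF selection) can be sketched as follows; the LSTM-CRF steps are not covered. The sketch assumes the portrait text has already been segmented into a list of words, uses a smoothed IDF variant chosen for illustration (the embodiment does not specify the exact formula), and the names `tf_idf_scores` and `top_tf_idf_word` as well as the toy corpus are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_scores(word_sequence: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """TF-IDF of every word in one segmented document, relative to a corpus."""
    counts = Counter(word_sequence)
    total_words = len(word_sequence)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        tf = count / total_words
        doc_freq = sum(1 for doc in corpus if word in doc)
        # Smoothed IDF (an assumed variant): never negative, no division by zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[word] = tf * idf
    return scores


def top_tf_idf_word(word_sequence: List[str], corpus: List[List[str]]) -> str:
    """Word with the largest TF-IDF value, used as the initial feature."""
    scores = tf_idf_scores(word_sequence, corpus)
    return max(scores, key=scores.get)


# Toy corpus of already-segmented portrait texts.
CORPUS = [
    ["user", "likes", "fund", "news"],
    ["user", "likes", "stock", "news"],
    ["user", "reads", "insurance", "news"],
]
initial_feature = top_tf_idf_word(CORPUS[0], CORPUS)
```

In this toy corpus, words that appear in every portrait ("user", "news") receive the lowest IDF, so the word that is specific to the first portrait is selected.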

Further, when the extension unit 203 classifies the user features and the content features and extends the classified user features and content features to obtain the extended feature set, it is specifically configured to:

input the user features and the content features into a pre-trained BERT model to obtain category information of the user features and the content features;

determine that the user features, the content features, the synonymous user features, and the synonymous content features constitute the extended feature set.

Further, before the recall unit 204 inputs the user features and the content features in the extended feature set into the trained recall model, it is further configured to:

sort the user features and the content features in the extended feature set according to the sorting order.

Further, when the recall unit 204 inputs the user features and the content features in the extended feature set into the trained recall model to obtain the content candidate set corresponding to the user features and the content features, it is specifically configured to:

extract corresponding user feature vectors and content feature vectors from the user features and the content features in the sorted extended feature set;

Further, before the recall unit 204 inputs the user features and the content features in the extended feature set into the trained recall model to obtain the content candidate set corresponding to the user features and the content features, it is further configured to:

obtain a sample set comprising a plurality of pieces of sample portrait information, each piece of sample portrait information comprising a sample user portrait and a sample content portrait;

extract sample user features corresponding to the sample user portraits and sample content features corresponding to the sample content portraits in the sample set;

and input the sample user features and the sample content features into a preset neural network model for training to obtain the recall model.

Further, when the push unit 205 ranks the plurality of candidate text contents by using the specified ranking algorithm, it is specifically configured to:

score each candidate text content in the content candidate set by using the specified ranking algorithm to obtain a score for each candidate text content;

and rank the plurality of candidate text contents in descending order of score.

Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device provided in an embodiment of the present invention. In some embodiments, the computer device shown in fig. 3 may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, the input device 302, the output device 303, and the memory 304 are connected by a bus 305. The memory 304 is used for storing a computer program, the computer program includes a program, and the processor 301 is used for executing the program stored in the memory 304. The processor 301 is configured to invoke the program to perform:

acquiring portrait information of a user in a target service scenario, wherein the portrait information comprises a user portrait and a content portrait;

extracting feature information from the portrait information, wherein the feature information comprises user features and content features;

classifying the user features and the content features, and extending the classified user features and content features to obtain an extended feature set;

inputting the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents;

and ranking the plurality of candidate text contents by using a specified ranking algorithm, and sending the plurality of candidate text contents to the user terminal in the ranked order.

Further, when the processor 301 extracts feature information from the portrait information, it is specifically configured to:

perform word segmentation on the portrait information to obtain a word sequence corresponding to the portrait information;

calculate the TF-IDF value of each word in the word sequence, and select the word with the largest TF-IDF value as the initial feature information of the portrait information;

merge the initial feature information and the token representation information through a fully connected layer of an LSTM-CRF model to obtain merged feature information;

and input the merged feature information into a CRF layer of the LSTM-CRF model to obtain the feature information of the portrait information.

Further, when the processor 301 classifies the user features and the content features and extends the classified user features and content features to obtain the extended feature set, it is specifically configured to:

input the user features and the content features into a pre-trained BERT model to obtain category information of the user features and the content features;

determine that the user features, the content features, the synonymous user features, and the synonymous content features constitute the extended feature set.

Further, before the processor 301 inputs the user features and the content features in the extended feature set into the trained recall model, it is further configured to:

sort the user features and the content features in the extended feature set according to the sorting order.

Further, when the processor 301 inputs the user features and the content features in the extended feature set into the trained recall model to obtain the content candidate set corresponding to the user features and the content features, it is specifically configured to:

extract corresponding user feature vectors and content feature vectors from the user features and the content features in the sorted extended feature set;

Further, before the processor 301 inputs the user features and the content features in the extended feature set into the trained recall model to obtain the content candidate set corresponding to the user features and the content features, it is further configured to:

obtain a sample set comprising a plurality of pieces of sample portrait information, each piece of sample portrait information comprising a sample user portrait and a sample content portrait;

extract sample user features corresponding to the sample user portraits and sample content features corresponding to the sample content portraits in the sample set;

and input the sample user features and the sample content features into a preset neural network model for training to obtain the recall model.

Further, when the processor 301 ranks the plurality of candidate text contents by using the specified ranking algorithm, it is specifically configured to:

score each candidate text content in the content candidate set by using the specified ranking algorithm to obtain a score for each candidate text content;

and rank the plurality of candidate text contents in descending order of score.

In the embodiment of the invention, the computer device can acquire portrait information of a user in a target service scenario, wherein the portrait information comprises a user portrait and a content portrait; extract feature information from the portrait information, wherein the feature information comprises user features and content features; classify the user features and the content features, and extend the classified user features and content features to obtain an extended feature set; input the user features and the content features in the extended feature set into a trained recall model to obtain a content candidate set corresponding to the user features and the content features, wherein the content candidate set comprises a plurality of candidate text contents; and rank the plurality of candidate text contents by using a specified ranking algorithm, and send the plurality of candidate text contents to the user terminal in the ranked order. In this way, the efficiency and accuracy of obtaining the content candidate set corresponding to the user features and the content features are improved, which in turn improves the efficiency and accuracy of data recommendation.

It should be understood that, in the embodiment of the present invention, the processor 301 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The input device 302 may include a touch pad, a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.

The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.

In a specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation described in the method embodiment shown in fig. 1 provided in this embodiment of the present invention, and may also execute the implementation of the data recommendation apparatus described in fig. 2 in this embodiment of the present invention, which is not described herein again.

The embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the data recommendation method described in the embodiment corresponding to fig. 1 may be implemented, or the data recommendation device described in the embodiment corresponding to fig. 2 may be implemented, which is not described herein again.

The computer-readable storage medium may be an internal storage unit of the data recommendation device according to any of the foregoing embodiments, for example, a hard disk or a memory of the data recommendation device. The computer-readable storage medium may also be an external storage device of the data recommendation device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the data recommendation device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the data recommendation device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the data recommendation device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a computer-readable storage medium and includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned computer-readable storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code. The computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to the use of the blockchain node, and the like.

It is emphasized that, in order to further ensure the privacy and security of the data, the data may also be stored in a node of a blockchain. A blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
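
The idea of data blocks linked by a cryptographic method can be illustrated with a minimal hash-chain sketch. It is a single-process toy under stated assumptions: SHA-256 is chosen here for the linking hash, the record layout is invented for the example, and there is no network, consensus mechanism, or encryption, so it should not be read as the blockchain platform the embodiment has in mind. The function names `block_hash`, `append_block`, and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64


def block_hash(index, prev_hash, records):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "records": records},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Append a block holding one batch of records, linked to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "records": records,
        "hash": block_hash(index, prev_hash, records),
    }
    chain.append(block)
    return block


def verify_chain(chain):
    """Return True only if every block's hash and back-link are intact."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["prev_hash"], block["records"]):
            return False
        prev_hash = block["hash"]
    return True


# Example: write two batches of (invented) portrait records, then verify.
chain = []
append_block(chain, [{"user": "u1", "portrait": "likes funds"}])
append_block(chain, [{"user": "u2", "portrait": "reads insurance news"}])
chain_is_valid = verify_chain(chain)
```

Because each block's hash covers the previous block's hash, altering a stored record changes the recomputed hash and makes `verify_chain` fail, which is the property that makes such a structure useful for data forensics.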

The above description covers only some embodiments of the present invention, and the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall fall within the scope of protection of the present invention.
