Information searching method and device based on picture and storage medium

Document No.: 1952854 Publication date: 2021-12-10

Reading note: This technology, "Information searching method and device based on picture and storage medium", was created by Xie Jinlin, Zhao Xiaole, and Wu Feng (谢金林, 赵晓乐, 吴风) on 2021-09-01. The application discloses a picture-based information search method and apparatus, and a storage medium. The method comprises: acquiring a source picture serving as a search clue; acquiring language text information related to the source picture from an information source related to the source picture; determining a first keyword set related to the source picture according to the language text information; and searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

1. A picture-based information search method, characterized by comprising:

acquiring a source picture serving as a search clue;

acquiring language text information related to the source picture from an information source related to the source picture;

determining a first keyword set related to the source picture according to the language text information; and

searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

2. The method of claim 1, wherein determining a first set of keywords associated with the source picture from the language text information comprises:

generating a second keyword set related to the language text information by using a preset natural language processing model; and

expanding the second keyword set according to a preset knowledge graph to generate the first keyword set.

3. The method of claim 2, wherein the operation of generating a second set of keywords related to the language text information using a preset natural language processing model comprises:

generating abstract text information and a third keyword set related to the language text information by using a preset semantic analysis model;

generating a first word sequence corresponding to the abstract text information and the third keyword set;

generating a second word sequence corresponding to the second keyword set according to the first word sequence by using a preset sequence processing model; and

generating the second keyword set according to the second word sequence.

4. The method of claim 3, wherein the operation of generating the abstract text information related to the language text information by using a preset semantic analysis model comprises:

generating corresponding sentence vectors for the sentences in the language text information by using a preset sentence vector model;

clustering the sentences according to sentence vectors of the sentences so as to generate a plurality of sentence categories related to the sentences;

selecting sentences closest to the centroids of the sentence categories from the plurality of sentence categories, respectively; and

generating the abstract text information according to the selected sentences.

5. The method of claim 3, wherein the operation of generating a third keyword set related to the language text information by using a preset semantic analysis model comprises: generating the third keyword set related to the language text information by using a preset topic model.

6. The method of claim 1, wherein the act of obtaining language text information associated with the source picture from an information source associated with the source picture comprises:

determining an approximate picture that approximates the source picture; and

obtaining the language text information from an information source associated with the source picture and the approximate picture.

7. The method of claim 6, wherein the operation of determining an approximate picture that approximates the source picture comprises:

extracting picture features of the source picture;

determining the similarity between the source picture and each picture in a preset picture set according to the picture features of the source picture and the picture features of the pictures in the picture set; and

determining the approximate picture according to the similarity between the source picture and each picture in the picture set.

8. A storage medium comprising a stored program, wherein the method of any one of claims 1 to 7 is performed by a processor when the program is run.

9. A picture-based information search apparatus (600), comprising:

a picture acquisition module (610) for acquiring a source picture as a search cue;

a language text information acquisition module (620) for acquiring language text information related to the source picture from an information source related to the source picture;

a keyword determination module (630) for determining a first keyword set related to the source picture according to the language text information; and

a search module (640) for searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

10. A picture-based information search apparatus (700), comprising:

a processor; and

a memory coupled to the processor and configured to provide the processor with instructions for performing the following steps:

acquiring a source picture serving as a search clue;

acquiring language text information related to the source picture from an information source related to the source picture;

determining a first keyword set related to the source picture according to the language text information; and

searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for searching information based on pictures, and a storage medium.

Background

Picture-based information search technology is widely used. A user can input a picture into a search platform, and the platform searches the internet for information sources, such as web pages, that contain the picture and returns them to the user, who can then access them to obtain information related to the picture. The search platform can also return the text information related to the picture in the information source, so that the user can view it directly.

However, existing picture search technology mostly returns the information of the information source containing the picture directly to the user and lacks further processing of that information, so it cannot provide the user with more accurate and comprehensive information.

No effective solution has yet been proposed for the technical problem that existing picture search technology lacks further processing of the information of the information source containing the picture and therefore cannot provide the user with more accurate and comprehensive information.

Disclosure of Invention

Embodiments of the present disclosure provide a method, an apparatus, and a storage medium for searching information based on pictures, so as to at least solve the technical problem that information of an information source including a picture is not further processed in the prior art, so that more accurate and comprehensive information cannot be provided to a user.

According to an aspect of the embodiments of the present disclosure, there is provided a picture-based information search method, including: acquiring a source picture serving as a search clue; acquiring language text information related to a source picture from an information source related to the source picture; determining a first keyword set related to the source picture according to the language text information; and searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

According to another aspect of the embodiments of the present disclosure, there is also provided a storage medium including a stored program, wherein the method described above is performed by a processor when the program is executed.

According to another aspect of the embodiments of the present disclosure, there is also provided a picture-based information search apparatus including: a picture acquisition module for acquiring a source picture serving as a search clue; a language text information acquisition module for acquiring language text information related to the source picture from an information source related to the source picture; a keyword determination module for determining a first keyword set related to the source picture according to the language text information; and a search module for searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

According to another aspect of the embodiments of the present disclosure, there is also provided a picture-based information search apparatus including: a processor; and a memory coupled to the processor and configured to provide the processor with instructions for performing the following steps: acquiring a source picture serving as a search clue; acquiring language text information related to the source picture from an information source related to the source picture; determining a first keyword set related to the source picture according to the language text information; and searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

In the embodiments of the present disclosure, after the language text information related to the source picture is acquired from the web page and the web address, the language text information is not simply output to the user. Instead, keywords related to the source picture are determined according to the language text information, and a search is then performed using the keywords to obtain search results related to the source picture. In this way, the computing device can break through the limitations of the information sources associated with the source picture and perform further searches based on the keywords newly determined from the language text information. Information related to the source picture can therefore be searched comprehensively and accurately, which solves the technical problem that existing picture search technology lacks further processing of the information of the information source containing the picture and therefore cannot provide the user with more accurate and comprehensive information.

Drawings

The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure. In the drawings:

fig. 1 is a hardware block diagram of a computing device for implementing the method according to embodiment 1 of the present disclosure;

fig. 2 is a schematic flowchart of a picture-based information search method according to a first aspect of embodiment 1 of the present disclosure;

fig. 3 is a schematic diagram of a part of language text information according to embodiment 1 of the present disclosure;

fig. 4 is a schematic diagram of a part of a knowledge graph according to embodiment 1 of the present disclosure;

fig. 5 is a detailed flowchart of a picture-based information search method according to the first aspect of embodiment 1 of the present disclosure;

fig. 6 is a schematic diagram of an apparatus for searching information based on pictures according to embodiment 2 of the present disclosure; and

fig. 7 is a schematic diagram of an apparatus for searching information based on pictures according to embodiment 3 of the present disclosure.

Detailed Description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. It is to be understood that the described embodiments are merely some, and not all, of the embodiments of the present disclosure. All other embodiments that a person skilled in the art can derive from the embodiments disclosed herein without creative effort shall fall within the protection scope of the present disclosure.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Embodiment 1

According to the present embodiment, a method embodiment of a picture-based information search method is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one shown here.

The method embodiments provided by the present embodiment may be executed in a mobile terminal, a computer terminal, a server or a similar computing device. Fig. 1 illustrates a hardware configuration block diagram of a computing device for implementing a picture-based information search method. As shown in fig. 1, the computing device may include one or more processors (which may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory for storing data, and a transmission device for communication functions. In addition, the computing device may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the computing device. For example, the computing device may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.

It should be noted that the one or more processors and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computing device. As referred to in the disclosed embodiments, the data processing circuit acts as a processor control (e.g., selection of a variable resistance termination path connected to the interface).

The memory may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the picture-based information search method in the embodiments of the present disclosure, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, implements the picture-based information search method of the application software. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory may further include memory located remotely from the processor, which may be connected to the computing device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device is used for receiving or transmitting data via a network. Specific examples of such networks may include wireless networks provided by the communication provider of the computing device. In one example, the transmission device includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices through a base station to communicate with the internet. In one example, the transmission device may be a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.

The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computing device.

It should be noted here that in some alternative embodiments, the computing device shown in fig. 1 described above may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 1 is only one specific example and is intended to illustrate the types of components that may be present in a computing device as described above.

Under the operating environment described above, according to a first aspect of the present embodiment, there is provided a picture-based information search method implemented by the computing device shown in fig. 1. Fig. 2 shows a flow diagram of the method, which, with reference to fig. 2, comprises:

S202: acquiring a source picture serving as a search clue;

S204: acquiring language text information related to the source picture from an information source related to the source picture;

S206: determining a first keyword set related to the source picture according to the language text information; and

S208: searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

Specifically, when a user needs to search based on a picture, the user may input a source picture serving as a search clue to the computing device shown in fig. 1. The computing device thus acquires the source picture input by the user (S202). The picture may be, for example, a picture of a certain product, such as a picture of a mobile phone.

The computing device then searches the internet based on the source picture to determine web pages and web addresses (i.e., information sources) associated with the source picture. These web pages and web addresses contain language text information related to the source picture, such as explanatory text for a certain mobile phone product, or text describing a news item involving a mobile phone product. Accordingly, the computing device may obtain language text information associated with the source picture from the web page or web address (S204). Fig. 3 shows a schematic diagram of a part of the language text information. Referring to fig. 3, the computing device obtains the relevant language text information based on a source picture of a mobile phone. Also, when the computing device finds multiple web pages and web addresses (i.e., multiple information sources) based on the source picture, the computing device may merge the language text information obtained from them and then further process the merged language text information.

Then, the computing device determines keywords (i.e., a first keyword set) related to the source picture from the acquired language text information (S206). For example, the keywords may be extracted from the acquired language text information or generated from the language text information according to a preset natural language processing model. For example, when the source picture is a picture of a mobile phone, the keywords determined by the computing device to be related to the source picture may include "cell phone", "mobile phone", etc., and may also include keywords of mobile phone components such as "radio frequency circuit", "antenna", "4G module", or "5G module". The method of determining keywords related to the source picture is described in detail below.

The computing device then performs a search using the keywords (S208). For example, the computing device may search a preset product database using the keywords to obtain search results related to the source picture. The search results include, for example, search results related to mobile phone products and search results for the respective components of the mobile phone. The computing device then outputs the search results to the user for viewing.
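The flow of steps S202 to S208 can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the three callables (`find_sources`, `extract_keywords`, `search`), the toy texts, and the `PRODUCT_DB` dictionary are all hypothetical stand-ins introduced only so that the example runs end to end.

```python
from typing import Callable, Iterable, List


def picture_search(
    source_picture: bytes,
    find_sources: Callable[[bytes], Iterable[str]],
    extract_keywords: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Run steps S204 to S208 on an already acquired source picture (S202)."""
    # S204: collect the text of every information source and merge it.
    merged_text = "\n".join(find_sources(source_picture))
    # S206: determine the first keyword set from the merged text.
    keywords = extract_keywords(merged_text)
    # S208: search with each keyword, keeping each result only once.
    results: List[str] = []
    for keyword in keywords:
        for hit in search(keyword):
            if hit not in results:
                results.append(hit)
    return results


# Hypothetical stand-ins for picture search, keyword extraction and the product database.
def toy_sources(picture: bytes) -> List[str]:
    return ["This mobile phone ships with a 5G module.", "The phone antenna was redesigned."]


def toy_keywords(text: str) -> List[str]:
    vocabulary = ["mobile phone", "antenna", "5G module"]
    return [word for word in vocabulary if word in text]


PRODUCT_DB = {
    "mobile phone": ["Phone X product page"],
    "antenna": ["Antenna A datasheet", "Phone X product page"],
    "5G module": ["5G module M datasheet"],
}


def toy_search(keyword: str) -> List[str]:
    return PRODUCT_DB.get(keyword, [])


results = picture_search(b"fake-image-bytes", toy_sources, toy_keywords, toy_search)
print(results)
```

Passing the components in as callables keeps the sketch independent of any particular picture search engine or database, which the text leaves open.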

As described in the background, existing picture search technology mostly returns the information of the information source containing the picture directly to the user and lacks further processing of that information, and therefore cannot provide more accurate and comprehensive information to the user.

In view of this technical problem, in the method described in this embodiment, after the language text information related to the source picture is acquired from the web page and the web address, the language text information is not simply output to the user. Instead, the keywords (i.e., the first keyword set) related to the source picture are further determined according to the language text information, and the keywords are then used for searching to obtain search results related to the source picture. In this way, the computing device can break through the limitations of the information sources associated with the source picture and perform further searches based on the keywords newly determined from the language text information. Information related to the source picture can therefore be searched comprehensively and accurately, which solves the technical problem that existing picture search technology lacks further processing of the information of the information source containing the picture and therefore cannot provide the user with more accurate and comprehensive information.

In particular, at least a part of the keywords determined according to the language text information are keywords that are not included in the language text information, so that the search range can be further expanded in this way, and a more comprehensive search can be realized.

Optionally, the operation of determining a first keyword set related to the source picture according to the language text information includes: generating a second keyword set related to the language text information by using a preset natural language processing model; and expanding the second keyword set according to a preset knowledge graph to generate a first keyword set.

Specifically, after acquiring the language text information related to the source picture, the computing device may first generate keywords (i.e., a second keyword set) related to the language text information according to a preset natural language processing model (described in detail later). The computing device then expands the keywords using a preset knowledge graph. Fig. 4 shows a schematic diagram of a part of the knowledge graph.

For example, after the computing device first generates keywords such as "mobile phone" according to the language text information related to the source picture, the keywords may be further expanded according to the knowledge graph shown in fig. 4, so as to obtain expanded keywords (i.e., the first keyword set), which may include keywords such as "radio frequency circuit", "antenna", "4G", or "5G". The knowledge graph shown in fig. 4 may, for example, be designed in advance by a worker according to the related knowledge.

Thus, the method described in this embodiment combines a natural language processing model with a knowledge graph. Therefore, the keywords related to the language text information can be accurately generated by using the natural language processing model, and then the keywords are expanded according to the knowledge graph. Therefore, the keywords associated with the source picture can be determined more accurately and comprehensively, and accurate and comprehensive search can be realized.
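The expansion step can be sketched as a lookup of direct neighbours in a graph. The adjacency dictionary below is a hypothetical fragment, not the knowledge graph of fig. 4, and expanding by one hop only is an assumption; the text does not state how far the expansion reaches.

```python
from typing import Dict, List

# Hypothetical fragment of a knowledge graph: each entity maps to related entities.
KNOWLEDGE_GRAPH: Dict[str, List[str]] = {
    "mobile phone": ["radio frequency circuit", "antenna", "4G module", "5G module"],
    "antenna": ["radio frequency circuit"],
}


def expand_keywords(second_set: List[str], graph: Dict[str, List[str]]) -> List[str]:
    """Return the first keyword set: the input keywords plus their direct neighbours."""
    first_set = list(dict.fromkeys(second_set))  # keep order, drop duplicates
    for keyword in second_set:
        for neighbour in graph.get(keyword, []):
            if neighbour not in first_set:
                first_set.append(neighbour)
    return first_set


first_keyword_set = expand_keywords(["mobile phone"], KNOWLEDGE_GRAPH)
print(first_keyword_set)
```

The expanded set contains keywords such as "antenna" that need not appear in the language text information at all, which is what widens the search range.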

Optionally, the operation of generating a second keyword set related to the language text information by using a preset natural language processing model includes: generating a first word sequence related to language text information by using a preset semantic analysis model; generating a second word sequence corresponding to the second keyword set according to the first word sequence by using a preset sequence processing model; and generating a second keyword set according to the second word sequence.

Specifically, after acquiring the language text information, the computing device generates abstract text information and keywords (i.e., a third keyword set) related to the language text information by using a preset semantic analysis model.

The computing device may then further generate new keywords (i.e., a second set of keywords) based on the generated summary text information and the keywords. In other words, the computing device considers the generated abstract text information and the third keyword set as a whole, and further generates a new keyword (i.e., the second keyword set) capable of representing the emphasis thereof.

Specifically, the computing device generates a corresponding word vector for each word in the abstract text information and a word vector for each keyword in the third keyword set, and then combines the word vectors of the abstract text information with the word vectors of the third keyword set to generate a word vector sequence (i.e., a first word sequence). Because the first word sequence is generated by combining the word vectors of the abstract text information and the word vectors of the third keyword set, it can represent the semantics expressed by the abstract text information and the third keyword set as a whole.
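A minimal sketch of building the first word sequence follows. The two-dimensional vectors in `EMBEDDINGS`, the whitespace tokenization, and the zero vector for unknown words are all assumptions made for illustration; the text does not name the word vector model, and a real system would use trained embeddings.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical 2-dimensional word vectors standing in for a trained embedding.
EMBEDDINGS: Dict[str, Vector] = {
    "phone": [0.9, 0.1],
    "supports": [0.2, 0.7],
    "5g": [0.8, 0.6],
    "antenna": [0.7, 0.3],
}
UNKNOWN: Vector = [0.0, 0.0]  # assumption: unknown words map to a zero vector


def build_first_word_sequence(abstract_text: str, third_keywords: List[str]) -> List[Vector]:
    """Concatenate the word vectors of the abstract text and of the keywords."""
    abstract_words = abstract_text.lower().split()
    keyword_words = [w for kw in third_keywords for w in kw.lower().split()]
    return [EMBEDDINGS.get(word, UNKNOWN) for word in abstract_words + keyword_words]


first_word_sequence = build_first_word_sequence("Phone supports 5G", ["antenna"])
print(len(first_word_sequence))
```

The resulting list is what would be fed to the sequence processing model as its input sequence.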

Then, the computing device inputs the word vector sequence into the preset sequence processing model, so that the word vector sequence output by the sequence processing model (i.e., the second word sequence) represents the key points of the semantics expressed by the abstract text information and the third keyword set as a whole. The sequence processing model therefore outputs a word vector sequence containing fewer word vectors than the input word vector sequence. For example, the word vector sequence input to the sequence processing model may contain 300 word vectors, while the sequence processing model outputs fewer than 30 word vectors. The output word vectors can nevertheless characterize the key points of the semantics characterized by the 300 input word vectors.

The computing device then generates a corresponding keyword (i.e., a second set of keywords) from the output sequence of word vectors. Wherein the keywords in the second keyword set are used for representing the key points of the semantics expressed by the abstract text information and the third keyword set as a whole.

The semantic analysis model is described in detail later. As for the sequence processing model, a sequence-to-sequence (seq2seq) model may be used, for example, to output a word vector sequence corresponding to the second keyword set from the input word vector sequence (i.e., the first word sequence), where the output word vector sequence can represent the key points of the semantics expressed by the input word vector sequence.

Therefore, the method of the embodiment further generates the abstract text information corresponding to the language text information on the basis of the generated keywords for the language text information, and combines the abstract text information with the generated keywords to generate new keywords. Therefore, compared with a method for directly extracting keywords (namely the third keyword set) aiming at the language text information, the method of the embodiment can more effectively and accurately correct and adjust the keywords by using the abstract text information, thereby ensuring that the finally generated keywords (namely the second keyword set) can more accurately reflect the semantics of the language text information.

Optionally, the operation of generating the abstract text information related to the language text information by using a preset semantic analysis model includes: generating corresponding sentence vectors for the sentences in the language text information by using a preset sentence vector model; clustering the sentences according to the sentence vectors of the sentences so as to generate a plurality of sentence categories related to the sentences; selecting, from each of the sentence categories, the sentence closest to the centroid of that category; and generating the abstract text information according to the selected sentences.

Specifically, when generating abstract text information related to language text information, the computing device first generates a corresponding sentence vector for a sentence in the language text information by using a preset sentence vector model. For example, the computing device may generate sentence vectors corresponding to individual sentences in the textual information using the Doc2Vec model.

The computing device then clusters the sentences according to the sentence vector of each sentence, for example using a K-means clustering algorithm, resulting in K sentence categories corresponding to the sentences in the language text information. If each sentence is regarded as the point given by its sentence vector, the computing device may select, from each of the K categories, the sentence closest to the centroid of that category. The computing device then generates the abstract text information associated with the language text information from the selected sentences.

Therefore, in this embodiment, first, relevant abstract text information is generated according to the language text information, and then, a corresponding word vector sequence (i.e., a part of the first word sequence) is generated according to the abstract text information. Thus, in this way, the present application can perform deduplication processing on sentences in the language text information. Namely, the abstract text information is generated by selecting the sentences closest to the centroid from the sentences with similar meanings in all the categories, and other sentences with the same or similar meanings are excluded. Therefore, the data volume processed by the subsequent sequence processing model is reduced, the processing efficiency of the subsequent sequence processing model is improved, meanwhile, the sequence processing model is prevented from processing a word vector sequence formed by a large number of similar word vectors, and the accuracy of the word vector sequence output by the sequence processing model is improved.
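The clustering-based selection can be sketched in plain Python as follows. The sentence vectors are hand-made toy values rather than Doc2Vec output, and seeding the centroids with the first K vectors and using Euclidean distance are simplifications assumed here, not details taken from the text.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors: List[Vector]) -> List[float]:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def k_means(vectors: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Plain K-means; the first k vectors seed the centroids (a simplification)."""
    if not 0 < k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean(members)
    return labels


def summarize(sentences: List[str], vectors: List[Vector], k: int) -> List[str]:
    """Pick, for each cluster, the sentence closest to the cluster centroid."""
    labels = k_means(vectors, k)
    summary = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if not members:
            continue
        centroid = mean([vectors[i] for i in members])
        best = min(members, key=lambda i: distance(vectors[i], centroid))
        summary.append(sentences[best])
    return summary


# Toy sentences with hand-made 2-dimensional "sentence vectors" (hypothetical values).
sentences = [
    "The phone has a 5G module.",
    "A 5G module is built into the phone.",
    "This handset includes 5G.",
    "The antenna was redesigned.",
    "Its antenna has a new design.",
    "A new antenna design is used.",
]
vectors = [[1.0, 0.0], [0.9, 0.1], [1.1, 0.1], [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]

summary = summarize(sentences, vectors, k=2)
print(summary)
```

With these toy vectors the six sentences fall into two groups of near-duplicates, and one representative per group survives, which illustrates the deduplication effect described above.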

Optionally, the operation of generating a third keyword set related to the language text information by using a preset semantic analysis model includes: generating the third keyword set related to the language text information by using a preset topic model. For example, the computing device may generate keywords (i.e., the third keyword set) related to the language text information by using a preset LDA topic model. Through the LDA topic model, the method of this embodiment can thus determine the keywords of the language text information according to its topics.

Optionally, the operation of obtaining language text information related to the source picture from an information source related to the source picture includes: determining an approximate picture that approximates the source picture; and obtaining the language text information from information sources related to the source picture and the approximate picture.

Specifically, when searching for a web page or a web address related to the source picture, the computing device determines an approximate picture that approximates the source picture. The computing device then performs a search based on both the source picture and the approximate picture, so as to obtain language text information related to the source picture from the web pages or web addresses (i.e., information sources) related to the source picture and the approximate picture. In this way, the search is more comprehensive: more comprehensive language text information related to the source picture is obtained, and the final search result is more comprehensive as well.

Optionally, the operation of determining an approximate picture that approximates the source picture includes: extracting picture features of the source picture; determining the similarity between the source picture and each picture in a preset picture set according to the picture features of the source picture and the picture features of the pictures in the picture set; and determining the approximate picture according to the similarity between the source picture and each picture in the picture set.

Specifically, when determining an approximate picture that approximates the source picture, the computing device may first extract picture features of the source picture. The method for extracting the picture features is not limited; for example, the picture features of the source picture may be extracted by using trained convolutional layers (e.g., the convolutional layers of a convolutional neural network).

Then, the computing device determines the similarity between the source picture and each picture in the preset picture set according to the picture features of the source picture and the picture features of each picture in the picture set. For example, the similarity may be expressed as a feature distance calculated from the picture features of the source picture and the picture features of the respective picture.

Then, the computing device determines the approximate picture according to the similarity between the source picture and each picture in the picture set. For example, the computing device may select, as an approximate picture, each picture whose feature distance from the source picture is less than a predetermined threshold.

In this way, an approximate picture that approximates the source picture can be determined based on the picture features.
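The threshold-based selection can be sketched as follows. This is a minimal illustration under stated assumptions: picture features are taken to be already-extracted fixed-length vectors, Euclidean distance is used as the feature distance (the application does not fix a particular metric), and the function name, the dictionary layout and the file names are hypothetical.

```python
import math


def find_approximate_pictures(source_features, picture_set, threshold):
    """Return ids of pictures whose feature distance to the source is below threshold.

    picture_set maps a picture id to its feature vector (hypothetical layout).
    """
    matches = []
    for picture_id, features in picture_set.items():
        distance = math.dist(source_features, features)  # Euclidean distance
        if distance < threshold:
            matches.append((distance, picture_id))
    # Most similar pictures (smallest distance) come first.
    return [picture_id for _, picture_id in sorted(matches)]


# Toy 3-dimensional features; real features would come from a trained CNN.
source = (0.9, 0.1, 0.3)
picture_set = {
    "a.jpg": (0.8, 0.2, 0.3),
    "b.jpg": (0.1, 0.9, 0.7),
    "c.jpg": (0.9, 0.1, 0.6),
}
approximate = find_approximate_pictures(source, picture_set, threshold=0.5)
```

A smaller distance means a higher similarity, so the threshold acts as the cut-off between approximate and unrelated pictures.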

The method of this embodiment is described in detail below, in flow order, with reference to Fig. 5, which shows a specific flow of the method according to this embodiment. Referring to Fig. 5:

first, a computing device obtains a source picture input by a user (S502);

then, the computing device extracts the picture features of the source picture and determines an approximate picture that approximates the source picture according to the picture features (S504);

then, the computing device obtains relevant language text information from information sources (e.g., web pages, web sites, etc.) related to the source picture and the approximate picture (S506);

then, the computing device generates abstract text information and keywords (i.e., a third keyword set) related to the language text information by using a preset semantic analysis model (S508);

then, the computing device generates a corresponding word sequence (i.e. a first word sequence) according to the abstract text information and the keywords (S510);

then, the computing device inputs the generated word sequence into a preset sequence processing model, obtains an output word sequence (i.e. a second word sequence), and generates keywords (i.e. a second keyword set) related to the language text information according to the second word sequence (S512);

then, the computing device expands the keywords (i.e., the second keyword set) by using the preset knowledge graph to generate expanded keywords (i.e., the first keyword set) (S514); and

finally, the computing device performs a search using the expanded keywords to obtain a search result related to the source picture (S516).
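The flow S502 to S516 can be read as a chain of stages, which the following sketch wires together. It is only an outline under stated assumptions: every stage is a pluggable callable, the `Stages` container and all of its field names are hypothetical, S512 is collapsed into one callable that returns the second keyword set directly, and the stub stages exist solely to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Stages:
    """Hypothetical pluggable components; none of these names come from the source."""
    find_similar: Callable[[str], List[str]]                 # S504
    fetch_text: Callable[[Sequence[str]], str]               # S506
    analyze: Callable[[str], Tuple[str, List[str]]]          # S508
    to_word_sequence: Callable[[str, List[str]], List[str]]  # S510
    sequence_model: Callable[[List[str]], List[str]]         # S512
    expand: Callable[[List[str]], List[str]]                 # S514
    search: Callable[[List[str]], List[str]]                 # S516


def search_by_picture(source_picture: str, stages: Stages) -> List[str]:
    """Run the stages in the order of Fig. 5, starting from the source picture (S502)."""
    similar = stages.find_similar(source_picture)
    text = stages.fetch_text([source_picture, *similar])
    abstract, third_keywords = stages.analyze(text)
    first_sequence = stages.to_word_sequence(abstract, third_keywords)
    second_keywords = stages.sequence_model(first_sequence)
    first_keywords = stages.expand(second_keywords)
    return stages.search(first_keywords)


# Stub stages that only demonstrate the data flow, not real models.
stub = Stages(
    find_similar=lambda pic: [pic + "-similar"],
    fetch_text=lambda pics: "eiffel tower paris",
    analyze=lambda text: (text, ["tower"]),
    to_word_sequence=lambda abstract, kws: abstract.split() + kws,
    sequence_model=lambda seq: sorted(set(seq)),
    expand=lambda kws: kws + ["france"],
    search=lambda kws: ["result for " + kw for kw in kws],
)
results = search_by_picture("photo.jpg", stub)
```

Keeping each stage behind a callable makes it possible to swap, for example, the sentence vector model or the knowledge graph without touching the overall flow.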

Further, referring to Fig. 1, according to a second aspect of the present embodiment, there is provided a storage medium. The storage medium comprises a stored program, wherein, when the program runs, a processor performs any of the methods described above.

Thus, according to the present embodiment, after the language text information related to the source picture is acquired from web pages and web addresses, the language text information is not simply output to the user. Instead, keywords related to the source picture are further determined according to the language text information, and a search is then performed using these keywords to acquire a search result related to the source picture. In this way, the computing device can break through the limitations of the information sources related to the source picture and perform a further search based on the keywords determined from the language text information. The information related to the source picture can therefore be searched comprehensively and accurately. This solves the technical problem that existing picture search technology lacks further processing of the information from the information sources containing the picture and therefore cannot provide more accurate and comprehensive information to the user.

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

Example 2

Fig. 6 shows a picture-based information search apparatus 600 according to the present embodiment, the apparatus 600 corresponding to the method according to the first aspect of embodiment 1. Referring to Fig. 6, the apparatus 600 includes: a picture acquiring module 610, configured to acquire a source picture serving as a search clue; a language text information obtaining module 620, configured to obtain language text information related to the source picture from an information source related to the source picture; a keyword determining module 630, configured to determine, according to the language text information, a first keyword set related to the source picture; and a search module 640, configured to perform a search by using the keywords in the first keyword set to obtain a search result related to the source picture.

Optionally, the keyword determination module 630 includes: the first keyword generation submodule is used for generating a second keyword set related to the language text information by utilizing a preset natural language processing model; and the second keyword generation submodule is used for expanding the second keyword set according to a preset knowledge graph to generate the first keyword set.
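The expansion performed by the second keyword generation submodule can be sketched as a one-hop lookup. This is a minimal illustration only: it assumes the knowledge graph can be represented as a dictionary mapping an entity to its directly related entities, and the function name and sample entries are hypothetical. The application does not specify how far, or along which relations, the graph is traversed.

```python
def expand_keywords(second_keywords, knowledge_graph):
    """Add entities directly linked to each keyword, keeping order and dropping duplicates."""
    expanded = []
    seen = set()
    for keyword in second_keywords:
        # Each keyword is kept, followed by its one-hop neighbours in the graph.
        for term in [keyword, *knowledge_graph.get(keyword, [])]:
            if term not in seen:
                seen.add(term)
                expanded.append(term)
    return expanded


# Toy knowledge graph: entity -> directly related entities (illustrative data).
knowledge_graph = {
    "Eiffel Tower": ["Gustave Eiffel", "Paris"],
    "Paris": ["France", "Eiffel Tower"],
}
first_keywords = expand_keywords(["Eiffel Tower", "Paris"], knowledge_graph)
```

Here the second keyword set of two entries grows into a first keyword set of four, which gives the final search more entry points than the text alone provided.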

Optionally, the first keyword generation sub-module includes: a semantic analysis unit, configured to generate abstract text information and a third keyword set related to the language text information by using a preset semantic analysis model; a first word sequence generating unit, configured to generate a first word sequence corresponding to the abstract text information and the third keyword set; a second word sequence generating unit, configured to generate a second word sequence corresponding to the second keyword set according to the first word sequence by using a preset sequence processing model; and a keyword generation unit, configured to generate the second keyword set according to the second word sequence.

Optionally, the semantic analysis unit includes: a sentence vector generating subunit, configured to generate corresponding sentence vectors for the sentences in the language text information by using a preset sentence vector model; a clustering subunit, configured to cluster the sentences according to their sentence vectors so as to generate a plurality of sentence categories; a sentence selection subunit, configured to select, from each of the plurality of sentence categories, the sentence closest to the centroid of that category; and an abstract generating subunit, configured to generate the abstract text information according to the selected sentences.

Optionally, the semantic analysis unit includes a keyword generation subunit, configured to generate a third keyword set related to the language text information by using a preset topic model.

Optionally, the language text information obtaining module 620 includes: the approximate picture determining submodule is used for determining an approximate picture approximate to the source picture; and a language text information acquisition sub-module for acquiring language text information from the information source related to the source picture and the approximate picture.

Optionally, the approximate picture determination sub-module includes: the picture characteristic extraction unit is used for extracting the picture characteristics of the source picture; the similarity determining unit is used for determining the similarity between the source picture and each picture in the picture set according to the picture characteristics of the source picture and the picture characteristics of the pictures in the preset picture set; and the approximate picture determining unit is used for determining the approximate picture according to the similarity between the source picture and each picture in the picture set.

Example 3

Fig. 7 shows a picture-based information search apparatus 700 according to the present embodiment, the apparatus 700 corresponding to the method according to the first aspect of embodiment 1. Referring to fig. 7, the apparatus 700 includes: a processor 710; and a memory 720, coupled to the processor 710, for providing instructions to the processor 710 to process the following steps: acquiring a source picture serving as a search clue; acquiring language text information related to a source picture from an information source related to the source picture; determining a first keyword set related to the source picture according to the language text information; and searching by using the keywords in the first keyword set to obtain a search result related to the source picture.

Optionally, the operation of generating a second keyword set related to the language text information by using a preset natural language processing model includes: generating abstract text information and a third keyword set related to the language text information by using a preset semantic analysis model; generating a first word sequence corresponding to the abstract text information and the third keyword set; generating a second word sequence corresponding to the second keyword set according to the first word sequence by using a preset sequence processing model; and generating the second keyword set according to the second word sequence.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, which can store program codes.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
