Method and system for carrying out multi-dimensional combined retrieval on enterprise information

Document number: 1937435 Publication date: 2021-12-07

Reading note: this technology, "Method and system for carrying out multi-dimensional combined retrieval on enterprise information" (一种用于对企业信息进行多维度组合检索的方法及其系统), was created by 黄永辉 and 袁石良 on 2021-09-07. Abstract: The invention discloses a method and a system for multi-dimensional combined retrieval of enterprise information, comprising the following steps: step one, basic construction; step two, learning and refinement; step three, retrieval and display; step four, retrieval optimization. The invention relates to the technical field of information retrieval. In the method and system, enterprise-related information is collected from the Internet, a dictionary is first built with NLP techniques, and a bert + bilstm + crf model architecture is then used to learn and refine the recognition method, forming a technical framework of a fastapi-based python micro-service framework + docker + NLP and deep learning. Enterprise information is recognized by an intelligent recognition algorithm, and the retrieval service is provided as multi-dimensional combined retrieval, which requires no extensive search skills, meets the needs of a large number of users, improves user experience, reduces the website bounce rate, and helps the website's ranking.

1. A method for multi-dimensional combined retrieval of enterprise information, characterized by: the method specifically comprises the following steps:

step one, basic construction: constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information;

step two, learning and refinement: importing the initial retrieval tag library generated in step one into a bert model, inputting enterprise information sentences into a bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary built in step one has been refined, adding the adjusted retrieval tags to the initial retrieval tag library of step one to form a fine-tuned retrieval tag library;

step three, retrieval and display: receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface;

step four, retrieval optimization: collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the operation of step three to achieve retrieval optimization.

2. The method of claim 1, wherein: in the sentence input in step two, each word is represented by a word vector comprising a word embedding and a character embedding, the word embedding being trained in advance and the character embedding being randomly initialized; the word vector is input into the bilstm + crf model architecture, which outputs a predicted label for each word; and the corresponding initial retrieval tags in the initial retrieval tag library generated in step one are adjusted according to the predicted labels.

3. The method of claim 1, wherein the determination in step four of whether the user's requirement is met is specifically: when the browsing frequency is low, the user's requirement is met; conversely, when the browsing frequency is high, the user's requirement is not met.

4. A system for multi-dimensional combined retrieval of enterprise information, characterized by: the system comprises a basic construction unit for constructing a service framework and generating initial retrieval tags for enterprise information on the Internet; the basic construction unit is connected to a learning refinement unit for determining the accuracy of the initial retrieval tags; the basic construction unit is connected to a retrieval display unit for obtaining multi-dimensional retrieval tags, integrating the retrieval results corresponding to the multi-dimensional retrieval tags, and displaying to the user the information required behind the user's search; and the retrieval display unit is connected to a retrieval optimization unit for judging, from the frequency with which the user browses the retrieval result information, whether the retrieval results meet the user's requirement and for making corresponding adjustments to optimize the retrieval results.

5. The system of claim 1, wherein: the basic construction unit is used for constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information.

6. The system of claim 1, wherein: the learning refinement unit further comprises a word vector input module, a result output module and a comparison adjustment module, and is used for importing the initial retrieval tag library of the corresponding enterprise into the bert model, inputting enterprise information sentences into the bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary has been refined, adding the adjusted retrieval tags to the initial retrieval tag library to form a fine-tuned retrieval tag library.

7. The system of claim 1, wherein: the retrieval display unit further comprises a semantic analysis module, an index module and an information integration display module, and is used for receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface.

8. The system of claim 7, wherein: the retrieval optimization unit further comprises a frequency statistics module and a sentence integration module, and is used for collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the retrieval and display operation to achieve retrieval optimization.

Technical Field

The invention relates to the technical field of information retrieval, in particular to a method and a system for carrying out multi-dimensional combined retrieval on enterprise information.

Background

A method and a system for multi-dimensional combined retrieval of enterprise information are therefore provided, meeting user requirements through multi-dimensional combined retrieval.

Disclosure of Invention

(I) technical problem to be solved

Aiming at the defects of the prior art, the invention provides a method and a system for carrying out multi-dimensional combined retrieval on enterprise information, which solve the problems.

(II) technical scheme

In order to achieve the purpose, the invention provides the following technical scheme: a method for carrying out multi-dimensional combined retrieval on enterprise information specifically comprises the following steps:

step one, basic construction: constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information;

step two, learning and refinement: importing the initial retrieval tag library generated in step one into a bert model, inputting enterprise information sentences into a bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary built in step one has been refined, adding the adjusted retrieval tags to the initial retrieval tag library of step one to form a fine-tuned retrieval tag library;

step three, retrieval and display: receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface;

step four, retrieval optimization: collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the operation of step three to achieve retrieval optimization.

By adopting the above technical scheme, enterprise-related information is collected from the Internet, a dictionary is first built with NLP techniques, and a bert + bilstm + crf model architecture is then used to learn and refine the recognition method, forming a technical framework of a fastapi-based python micro-service framework + docker + NLP and deep learning; enterprise information is recognized by an intelligent recognition algorithm, and the retrieval service is provided as multi-dimensional combined retrieval.

The invention is further configured as follows: in the sentence input in step two, each word is represented by a word vector comprising a word embedding and a character embedding, the word embedding being trained in advance and the character embedding being randomly initialized; the word vector is input into the bilstm + crf model architecture, which outputs a predicted label for each word; and the corresponding initial retrieval tags in the initial retrieval tag library generated in step one are adjusted according to the predicted labels.

By adopting the technical scheme, the initial retrieval tag is adjusted, and the accuracy of the retrieval result is ensured.
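The word vector described above can be illustrated with a minimal sketch. It only shows how a pre-trained word embedding and a randomly initialized character embedding might be joined into one input vector; the class name `TokenEmbedder`, the zero-vector fallback for unknown words, and the averaging of character vectors are assumptions made for illustration (a real bilstm + crf pipeline would typically encode characters with a learned sub-network).

```python
import random


class TokenEmbedder:
    """Hypothetical helper: joins a pre-trained word embedding with a character embedding."""

    def __init__(self, word_table, char_dim=4, seed=0):
        self.word_table = word_table                       # pre-trained: word -> vector
        self.word_dim = len(next(iter(word_table.values())))
        self.char_dim = char_dim
        self.char_table = {}                               # filled lazily, random init
        self._rng = random.Random(seed)

    def _char_vector(self, ch):
        # Character embeddings are randomly initialized on first use, then reused.
        if ch not in self.char_table:
            self.char_table[ch] = [self._rng.uniform(-0.1, 0.1) for _ in range(self.char_dim)]
        return self.char_table[ch]

    def embed(self, word):
        # Assumption: unknown words fall back to a zero word embedding.
        word_vec = self.word_table.get(word, [0.0] * self.word_dim)
        chars = [self._char_vector(ch) for ch in word]
        if chars:
            # Assumption: average the character vectors into one fixed-size feature.
            char_vec = [sum(col) / len(chars) for col in zip(*chars)]
        else:
            char_vec = [0.0] * self.char_dim
        return list(word_vec) + char_vec
```

Each call to `embed` returns a vector of length `word_dim + char_dim`, which is the shape a sequence-labelling model would consume token by token.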

The invention is further configured as follows: the determination in step four of whether the user's requirement is met is specifically: when the browsing frequency is low, the user's requirement is met; conversely, when the browsing frequency is high, the user's requirement is not met.

By adopting the above technical scheme, and on the premise that a user rarely browses other information once the required information has been obtained, whether the retrieval result meets the user's requirement is determined from how frequently the user browses the retrieval results.
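A minimal sketch of this judgment follows, assuming browsing is logged as `(query_id, item_id)` events and that a comparison threshold is supplied by the operator. The function names and the event format are hypothetical, and treating a count equal to the threshold as "low" is an assumption.

```python
from collections import Counter


def browse_counts(events):
    """Count how many displayed items were opened for each query.

    events: iterable of (query_id, item_id) pairs (hypothetical log format).
    """
    return Counter(query_id for query_id, _item in events)


def meets_requirement(count, threshold):
    # Low browsing frequency (at or below the threshold) is read as "requirement met".
    return count <= threshold
```

For example, with a threshold of 2, a query whose results were opened three times would be judged as not meeting the requirement and would trigger the optimization step.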

The invention also discloses a system for multi-dimensional combined retrieval of enterprise information, comprising a basic construction unit for constructing a service framework and generating initial retrieval tags for enterprise information on the Internet; the basic construction unit is connected to a learning refinement unit for determining the accuracy of the initial retrieval tags; the basic construction unit is connected to a retrieval display unit for obtaining multi-dimensional retrieval tags, integrating the retrieval results corresponding to the multi-dimensional retrieval tags, and displaying to the user the information required behind the user's search; and the retrieval display unit is connected to a retrieval optimization unit for judging, from the frequency with which the user browses the retrieval result information, whether the retrieval results meet the user's requirement and for making corresponding adjustments to optimize the retrieval results.

By adopting the above technical scheme, after a user inputs a retrieval instruction, the instruction is analyzed in terms of higher-level features, achieving multi-dimensional combined retrieval; the retrieval results are integrated to obtain a result that meets the user's requirement; and the degree to which the requirement is satisfied can be judged, so that the retrieval is optimized.

The invention is further configured as follows: the basic construction unit is used for constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information.

By adopting the technical scheme, the integration of enterprise related information on the Internet is realized, and a dictionary and an initial retrieval tag are constructed.

The invention is further configured as follows: the learning refinement unit further comprises a word vector input module, a result output module and a comparison adjustment module, and is used for importing the initial retrieval tag library of the corresponding enterprise into the bert model, inputting enterprise information sentences into the bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary has been refined, adding the adjusted retrieval tags to the initial retrieval tag library to form a fine-tuned retrieval tag library.
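To make the "output results" of a bilstm + crf architecture concrete, the sketch below shows Viterbi decoding, the standard way a CRF layer turns per-token scores into one label sequence. The source does not describe its decoding procedure, so this is an illustration only; the scores are supplied by hand here rather than produced by a trained network, and the label names are hypothetical.

```python
def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions:   list of dicts, one per token, mapping label -> score
    transitions: dict mapping (previous_label, label) -> score (missing pairs count as 0.0)
    labels:      list of all label names
    """
    if not emissions:
        return []
    # best[i][label] = best score of any path that ends in `label` at position i
    best = [{lab: emissions[0][lab] for lab in labels}]
    back = []
    for scores in emissions[1:]:
        cur, ptr = {}, {}
        for lab in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, lab), 0.0))
            cur[lab] = best[-1][prev] + transitions.get((prev, lab), 0.0) + scores[lab]
            ptr[lab] = prev
        best.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

The transition scores are what let the CRF layer reject inconsistent sequences, for instance an inside-entity label that directly follows an outside label.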

The invention is further configured as follows: the retrieval display unit further comprises a semantic analysis module, an index module and an information integration display module, and is used for receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface.

The invention is further configured as follows: the retrieval optimization unit further comprises a frequency statistics module and a sentence integration module, and is used for collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the retrieval and display operation to achieve retrieval optimization.

(III) advantageous effects

The invention provides a method and a system for carrying out multi-dimensional combined retrieval on enterprise information. The method has the following beneficial effects:

(1) In the method and system for multi-dimensional combined retrieval of enterprise information, enterprise-related information is collected from the Internet, a dictionary is first built with NLP techniques, and a bert + bilstm + crf model architecture is then used to learn and refine the recognition method, forming a technical framework of a fastapi-based python micro-service framework + docker + NLP and deep learning. Enterprise information is recognized by an intelligent recognition algorithm, and the retrieval service is provided as multi-dimensional combined retrieval, which requires no extensive search skills, meets the needs of a large number of users, improves user experience, reduces the website bounce rate, and helps the website's ranking.

(2) In the method and system for multi-dimensional combined retrieval of enterprise information, the initial retrieval tags are adjusted through learning and refinement with the bert + bilstm + crf model architecture, which ensures the accuracy of the retrieval results.

(3) In the method and system for multi-dimensional combined retrieval of enterprise information, on the premise that a user rarely browses other information once the required information has been obtained, whether the retrieval result meets the user's requirement is determined from the browsing frequency of the retrieval results; in subsequent searches, the keywords of the input sentences are concatenated so that a more detailed retrieval result is obtained, achieving retrieval optimization.

Drawings

FIG. 1 is a schematic system block diagram of the basic construction unit of the present invention;

FIG. 2 is a schematic system block diagram of the learning refinement unit of the present invention;

FIG. 3 is a schematic system block diagram of the retrieval display unit of the present invention;

FIG. 4 is a schematic system block diagram of the retrieval optimization unit of the present invention.

In the figures: 1. basic construction unit; 2. learning refinement unit; 3. retrieval display unit; 4. retrieval optimization unit.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1-4, the embodiment of the present invention provides the following two technical solutions:

Embodiment one

A method for carrying out multi-dimensional combined retrieval on enterprise information specifically comprises the following steps:

step one, basic construction: constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information;

step two, learning and refinement: importing the initial retrieval tag library generated in step one into a bert model, inputting enterprise information sentences into a bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary built in step one has been refined, adding the adjusted retrieval tags to the initial retrieval tag library of step one to form a fine-tuned retrieval tag library;

step three, retrieval and display: receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface;

step four, retrieval optimization: collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the operation of step three to achieve retrieval optimization. By way of further explanation, enterprise-related information is collected from the Internet, a dictionary is first built with NLP techniques, and a bert + bilstm + crf model architecture is then used to learn and refine the recognition method, forming a technical framework of a fastapi-based python micro-service framework + docker + NLP and deep learning. Enterprise information is recognized by an intelligent recognition algorithm, and the retrieval service is provided as multi-dimensional combined retrieval, which requires no extensive search skills, meets the needs of a large number of users, improves user experience, reduces the website bounce rate, and helps the website's ranking.

Embodiment two

As an improvement of the previous embodiment, the present embodiment is a method for performing multidimensional combination retrieval on enterprise information, and specifically includes the following steps:

step one, basic construction: constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP (natural language processing) techniques, wherein the NLP techniques specifically comprise: text retrieval, for searching large-scale data; machine translation, for cross-language translation; text classification; information extraction, for extracting the desired information from unstructured text; sequence labelling, for assigning a corresponding label to each character or word in the text; and text summarization, for focusing on the core of a given text and automatically generating a summary; thereby generating an initial retrieval tag library corresponding to the enterprise information, the initial retrieval tag library containing the initial retrieval tags corresponding to the enterprise information;

step two, learning and refinement: importing the initial retrieval tag library generated in step one into a bert model, and inputting enterprise information sentences, wherein each word in a sentence is represented by a word vector comprising a word embedding and a character embedding, the word embedding being trained in advance and the character embedding being randomly initialized; inputting the word vector into a bilstm + crf model architecture; and fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output result, it being noted that the output result is the predicted label corresponding to each word;

step three, retrieval and display: receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface;

step four, retrieval optimization: collecting the frequency with which the user browses the information shown on the display interface, and judging from that frequency whether the analysis of the user's input sentence meets the user's requirement: when the browsing frequency is low, the requirement is met; conversely, when it is high, the requirement is not met. It should be noted that the criterion for low and high is a standard frequency set as a comparison threshold: a browsing frequency lower than or equal to the comparison threshold is low, and one higher than the comparison threshold is high. When the requirement is judged not to be met, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the operation of step three to achieve retrieval optimization.

The advantages of the second embodiment over the first embodiment are as follows: on the premise that a user rarely browses other information once the required information has been obtained, whether the retrieval result meets the user's requirement is determined from the browsing frequency of the retrieval results; in subsequent searches, the keywords of the input sentences are concatenated so that a more detailed retrieval result is obtained, achieving retrieval optimization.
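The keyword concatenation in step four can be sketched as a small session loop. This is an illustration under stated assumptions rather than the patented implementation: `search` and `browse_count` are hypothetical callables standing in for step three and for the frequency statistics, duplicates are dropped when keywords are concatenated, and the carried keywords are cleared once a query is judged satisfactory.

```python
def concatenate_keywords(carried, new_keywords):
    """Join earlier unsatisfied keywords with the new ones, keeping order, dropping repeats."""
    merged = []
    for kw in [*carried, *new_keywords]:
        if kw not in merged:
            merged.append(kw)
    return merged


def retrieval_session(queries, search, browse_count, threshold):
    """Run successive queries, carrying keywords forward while the user is unsatisfied.

    queries:      list of keyword lists, in the order the user entered them
    search:       callable(keywords) -> list of results (stand-in for step three)
    browse_count: callable(results) -> int, how many results the user browsed
    threshold:    comparison threshold for the browsing frequency
    """
    carried, history = [], []
    for keywords in queries:
        combined = concatenate_keywords(carried, keywords)
        results = search(combined)
        history.append((combined, results))
        if browse_count(results) <= threshold:
            carried = []          # requirement met: start fresh next time
        else:
            carried = combined    # requirement not met: keep keywords for the next query
    return history
```

With each unsatisfied query the keyword set grows, so the next search is narrower, which matches the "more detailed retrieval result" described above.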

The invention also discloses a system for multi-dimensional combined retrieval of enterprise information, comprising a basic construction unit 1 for constructing a service framework and generating initial retrieval tags for enterprise information on the Internet. Specifically, the basic construction unit 1 is used for constructing a fastapi + python micro-service framework, collecting enterprise information from the Internet through a docker engine, and building a dictionary from the collected enterprise information using NLP techniques, so as to generate an initial retrieval tag library corresponding to the enterprise information.
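As a rough illustration of dictionary construction, the sketch below turns collected enterprise text into an initial tag library that maps each term to the enterprises it describes. Everything specific here is an assumption: the function names, the tiny stop-word list, and the regular-expression tokenizer, which only handles Latin-script text (Chinese text would need a word segmenter instead).

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "and", "of", "in", "a", "for"}


def tokenize(text):
    """Lower-case the text and keep alphanumeric terms that are not stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def build_tag_library(records, min_df=1):
    """Build tag -> set of enterprise names.

    records: dict mapping enterprise name -> collected description text
    min_df:  keep only tags that occur for at least this many enterprises
    """
    library = defaultdict(set)
    for name, text in records.items():
        for term in set(tokenize(text)):
            library[term].add(name)
    return {tag: names for tag, names in library.items() if len(names) >= min_df}
```

A library built this way is only a starting point; the learning refinement unit is what the source relies on to correct and extend the tags.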

As a preferred scheme, the basic construction unit 1 is connected to the learning refinement unit 2, which is used for determining the accuracy of the initial retrieval tags. Specifically, as shown in fig. 2, the learning refinement unit 2 further comprises a word vector input module, a result output module and a comparison adjustment module, and is used for importing the initial retrieval tag library of the corresponding enterprise into a bert model, inputting enterprise information sentences into a bilstm + crf model architecture, fine-tuning the retrieval tags stored in the initial retrieval tag library according to the output results, and, once the dictionary has been refined, adding the adjusted retrieval tags to the initial retrieval tag library to form a fine-tuned retrieval tag library.
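One way to picture how predicted labels could expand the tag library is sketched below. The source does not specify the label scheme, so the BIO convention (`B-`, `I-`, `O`), the function names, and the choice to key tags by lower-cased span text are all assumptions.

```python
def extract_spans(tokens, labels):
    """Collect (text, type) spans from BIO labels, e.g. B-ORG I-ORG O."""
    spans, current, cur_type = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                spans.append((" ".join(current), cur_type))
            current, cur_type = [tok], lab[2:]
        elif lab.startswith("I-") and current and lab[2:] == cur_type:
            current.append(tok)
        else:
            # "O" or an inconsistent "I-" label closes the open span.
            if current:
                spans.append((" ".join(current), cur_type))
            current, cur_type = [], None
    if current:
        spans.append((" ".join(current), cur_type))
    return spans


def refine_tag_library(initial, enterprise, tokens, labels):
    """Return a copy of the library with the predicted spans added as tags for `enterprise`."""
    refined = {tag: set(names) for tag, names in initial.items()}
    for text, _span_type in extract_spans(tokens, labels):
        refined.setdefault(text.lower(), set()).add(enterprise)
    return refined
```

The initial library is copied rather than modified, so the initial and fine-tuned libraries can be compared afterwards.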

As a preferred scheme, the basic construction unit 1 is connected to the retrieval display unit 3, which is used for obtaining multi-dimensional retrieval tags, integrating the retrieval results corresponding to the multi-dimensional retrieval tags, and displaying those results to the user. Specifically, as shown in fig. 3, the retrieval display unit 3 further comprises a semantic analysis module, an index module and an information integration display module, and is used for receiving user input in natural language form; using a deep learning algorithm to derive higher-level features, namely the multi-level requirements behind the user's search, from the keywords of the input sentence; comparing these requirements with the retrieval tag library to obtain the retrieval tags corresponding to the multi-level requirements; retrieving the enterprise information corresponding to those retrieval tags; integrating the information; and placing the integrated information directly on a display interface.
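The combination of several retrieval tags can be sketched as a set intersection over the tag library. Note what is swapped in: the source derives the user's requirements with a deep learning algorithm, whereas this sketch uses plain exact matching of keywords against tags; the function name and the AND semantics of "combined" are assumptions.

```python
def combined_search(query_keywords, tag_library):
    """Return (matched_tags, enterprises) for a multi-keyword query.

    tag_library: dict mapping tag -> set of enterprise names
    Keywords that match no tag are ignored; matched tags are combined with AND.
    """
    matched = [kw for kw in query_keywords if kw in tag_library]
    if not matched:
        return matched, set()
    result = set(tag_library[matched[0]])   # copy so the library is not modified
    for kw in matched[1:]:
        result &= tag_library[kw]
    return matched, result
```

Each matched tag acts as one retrieval dimension, so adding a tag can only narrow the result set.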

The retrieval display unit 3 is connected to the retrieval optimization unit 4, which is used for judging, from the frequency with which the user browses the retrieval result information, whether the retrieval results meet the user's requirement and for making corresponding adjustments to optimize the retrieval results. Specifically, as shown in fig. 4, the retrieval optimization unit 4 further comprises a frequency statistics module and a sentence integration module, and is used for collecting the frequency with which the user browses the information shown on the display interface; judging from that frequency whether the analysis of the user's input sentence meets the user's requirement; and, when it does not, collecting the keywords of the user's subsequent input sentence, concatenating them with the keywords of the sentences that did not meet the requirement, and repeating the retrieval and display operation to achieve retrieval optimization.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
