Search input information error correction method and device, electronic equipment and storage medium

Document No.: 1889931; Published: 2021-11-26

Reading note: This technology, "Search input information error correction method and device, electronic equipment and storage medium", was designed by Sun Jian (孙健) on 2021-08-31. Main content: The present disclosure provides a method and an apparatus for error correction of search input information, an electronic device, and a storage medium, and relates to the technical field of computers. The method comprises: performing detection processing on search input information to determine whether the search input information needs error correction, wherein the detection processing comprises determining understanding confusion (perplexity) information of the search input information; and, if error correction is needed, performing error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performing corresponding search processing, wherein the error correction processing comprises generating the input error correction information based on a preset dictionary and/or generating the input error correction information according to error correction score information of the search input information. The method, apparatus, electronic device, and storage medium can improve error correction accuracy, reduce the corpus required for model training, incur little delay in the online prediction stage, and are suitable for scenarios such as commercial queries.

1. An error correction method for search input information, comprising:

performing detection processing on search input information to determine whether the search input information needs error correction, wherein the detection processing comprises: determining understanding confusion information of the search input information;

if error correction is needed, performing error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performing corresponding search processing, wherein the error correction processing comprises: generating the input error correction information based on a preset dictionary and/or generating the input error correction information according to error correction score information of the search input information.

2. The method of claim 1, wherein the performing detection processing on the search input information to determine whether the search input information requires error correction comprises:

acquiring a first understanding confusion value corresponding to the search input information;

if the first understanding confusion value is within a preset range or smaller than the lower limit value of the preset range, the error correction processing on the search input information is not needed;

and if the first understanding confusion value is larger than the upper limit value of the preset range, carrying out error correction processing on the search input information.

3. The method of claim 2, wherein said performing error correction processing on said search input information to generate input error correction information corresponding to said search input information, performing corresponding search processing comprises:

carrying out error correction processing on the search input information to generate first input error correction information;

detecting the first input error correction information to obtain a second understanding confusion value corresponding to the first input error correction information;

and if the second understanding confusion value is within the preset range or smaller than the lower limit value of the preset range, using the first input error correction information to perform corresponding search operation.

4. The method of claim 3, wherein said error correcting said search input information to generate first input error correction information comprises:

performing word segmentation processing on the search input information to obtain search words;

querying in a preset dictionary by using the search word, and if the query result is not empty, determining the search word as a first reserved word; if the query result is empty, performing error correction processing on the search word based on the preset dictionary to generate a first replacement word;

generating the first input error correction information based on the first reserved word and/or the first replacement word.

5. The method of claim 4, wherein the error correction processing comprises: sound-like, homophonic or homomorphic error correction processing; the performing error correction processing on the search word based on the preset dictionary and generating a replacement word comprises:

determining the position of a word to be corrected in a search word needing to be subjected to error correction processing, and acquiring the search word according to the position;

acquiring a sound-like word table, a homophone word table or a homomorphic word table in the preset dictionary, and acquiring a sound-like replacement word, a homophonic replacement word or a homomorphic replacement word corresponding to the search word based on the sound-like word table, the homophone word table or the homomorphic word table;

generating a candidate replacement word set corresponding to the search word according to the sound-like replacement word, the homophonic replacement word or the homomorphic replacement word and the search word;

and selecting a candidate replacement word from the candidate replacement word set as the replacement word according to the use frequency of each candidate replacement word in the candidate replacement word set.

6. The method of claim 3, wherein generating input error correction information corresponding to the search input information, performing a corresponding search process further comprises:

if the second understanding confusion value is larger than the upper limit value of the preset range, acquiring an error correction subsequent information set corresponding to the search input information;

determining difference information between each input candidate information in the error correction subsequent information set and the search input information, and calculating an error correction score of each input candidate information according to the difference information;

and selecting input candidate information as second input error correction information based on the error correction score, wherein the second input error correction information is used for carrying out corresponding search operation.

7. The method of claim 6, wherein the error correction subsequent information set comprises: a first error correction subsequent set and a second error correction subsequent set; the acquiring the error correction subsequent information set corresponding to the search input information includes:

acquiring query string information corresponding to the search input information, and performing query processing in a position positioning module based on the query string to acquire a first error correction subsequent information set corresponding to the search input information;

matching the search word with preset error-prone word pair information or similar word pair information to obtain a second reserved word and a second replacement word;

and generating error correction input information corresponding to the search input information based on the second reserved words, the second replacement words and the keywords, and generating the second error correction subsequent information set based on the error correction input information.

8. The method of claim 6, wherein the difference information includes an edit distance; the calculating an error correction score of each input candidate information according to the difference information includes:

calculating an error correction score of the input candidate information based on the edit distance;

wherein, for the same input candidate information in the first error correction subsequent set and the second error correction subsequent set, the weighting coefficients of the same input candidate information corresponding to the first error correction subsequent set and the second error correction subsequent set respectively are obtained, and the error correction score of the same input candidate information is calculated based on the edit distance and the weighting coefficients.

9. The method of claim 8, wherein the selecting input candidate information based on the error correction score as second input error correction information comprises:

and selecting, based on the error correction scores, input candidate information from the first error correction subsequent set and the second error correction subsequent set as the second input error correction information.

10. The method of claim 7, wherein the position positioning module comprises: a Finite State Transducer (FST) module, the method further comprising:

acquiring words or phrases with use frequency exceeding a use frequency threshold, splitting the words or phrases to obtain word strings corresponding to the words or phrases;

constructing an FST branch based on the word strings and the words or phrases, and constructing the FST module according to the FST branch.

11. The method of claim 7, further comprising:

generating the error-prone word pair information or the similar word pair information based on search log information and search input modification information.

12. The method of claim 3, further comprising:

and detecting search input information and first input error correction information by using a trained language detection model to obtain the first understanding confusion value and the second understanding confusion value.

13. The method of any of claims 1 to 12, further comprising:

preprocessing user input information, removing preset interference information, performing format adjustment processing, and generating the search input information, wherein the interference information includes one or more of the following: emoticons, tabs, and nonsense characters.

14. An error correction apparatus for search input information, comprising:

a detection module, configured to perform detection processing on search input information to determine whether the search input information needs error correction, wherein the detection processing comprises: determining understanding confusion information of the search input information;

an error correction module, configured to, if the search input information is determined to need error correction, perform error correction processing on the search input information to generate input error correction information corresponding to the search input information, and perform corresponding search processing, wherein the error correction processing comprises: generating the input error correction information based on a preset dictionary and/or generating the input error correction information according to error correction score information of the search input information.

15. An electronic device, characterized in that the electronic device comprises:

a processor; a memory for storing the processor-executable instructions;

the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1-13.

16. A computer-readable storage medium, characterized in that the storage medium stores a computer program for performing the method of any of the preceding claims 1-13.

17. A computer program comprising computer readable code, characterized in that when the computer readable code is run on a device, a processor in the device executes instructions for implementing the method of any of claims 1-13.

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for error correction of search input information, an electronic device, and a storage medium.

Background

Currently, search engines provide search services that return the information users need. When a user searches, errors in the search input information can be corrected to help the user express the intended query correctly and to reduce irrelevant or empty results. Commonly used Chinese error correction methods for search input information mainly target sentences or passages with strong coherence, and their error correction mechanism is mainly based on a deep neural network, for example, a seq2seq-based encoder-decoder mechanism. Error correction methods based on a deep neural network architecture have low error correction accuracy, require a huge training corpus, incur high training cost, and suffer serious delay in the online error correction stage, so they are not suitable for scenarios such as commercial queries. Therefore, a new technical solution for error correction of search input information is needed.

Disclosure of Invention

The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides a method and a device for correcting errors of search input information, an electronic device and a storage medium.

According to a first aspect of the embodiments of the present disclosure, there is provided a search input information error correction method, including: performing detection processing on search input information to determine whether the search input information needs error correction, wherein the detection processing comprises: determining understanding confusion information of the search input information; and, if error correction is needed, performing error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performing corresponding search processing, wherein the error correction processing comprises: generating the input error correction information based on a preset dictionary and/or generating the input error correction information according to error correction score information of the search input information.

Optionally, the detecting the search input information to determine whether the search input information needs error correction includes: acquiring a first understanding confusion value corresponding to the search input information; if the first understanding confusion value is within a preset range or smaller than the lower limit value of the preset range, the error correction processing on the search input information is not needed; and if the first understanding confusion value is larger than the upper limit value of the preset range, carrying out error correction processing on the search input information.
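The threshold rule above can be sketched in a few lines. This is a minimal illustration only: the `ConfusionRange` type, the function name, and the numeric bounds are hypothetical and do not come from the disclosure, which does not state concrete values for the preset range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionRange:
    """Preset range for the understanding confusion value (bounds are hypothetical)."""
    lower: float
    upper: float


def needs_correction(confusion_value: float, preset: ConfusionRange) -> bool:
    """Return True only when the value exceeds the upper limit of the preset range."""
    # Values inside the range, or below its lower limit, need no error correction.
    return confusion_value > preset.upper


preset = ConfusionRange(lower=20.0, upper=80.0)
print(needs_correction(15.0, preset))   # below the lower limit
print(needs_correction(50.0, preset))   # inside the preset range
print(needs_correction(120.0, preset))  # above the upper limit
```

Only the upper limit affects the decision, because both "within the range" and "below the lower limit" lead to the same outcome.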

Optionally, the performing error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performing corresponding search processing includes: carrying out error correction processing on the search input information to generate first input error correction information; detecting the first input error correction information to obtain a second understanding confusion value corresponding to the first input error correction information; and if the second understanding confusion value is within the preset range or smaller than the lower limit value of the preset range, using the first input error correction information to perform corresponding search operation.

Optionally, the performing error correction processing on the search input information, and generating first input error correction information includes: performing word segmentation processing on the search input information to obtain search words; querying in a preset dictionary by using the search word, and if the query result is not empty, determining the search word as a first reserved word; if the query result is empty, performing error correction processing on the search word based on the preset dictionary to generate a first replacement word; generating the first input error correction information based on the first reserved word and/or the first replacement word.
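As a rough sketch of this step, the following assumes that word segmentation can be approximated by whitespace splitting (a real system would likely use a Chinese word segmenter) and that the reserved and replacement words are re-joined with spaces. The function name, the toy dictionary, and the `correct_word` callback are hypothetical.

```python
from typing import Callable, List, Set


def build_first_correction(
    query: str,
    dictionary: Set[str],
    correct_word: Callable[[str], str],
    segment: Callable[[str], List[str]] = str.split,
) -> str:
    """Keep words found in the preset dictionary; replace the others."""
    result = []
    for word in segment(query):
        if word in dictionary:
            # Query result is not empty: the word becomes a first reserved word.
            result.append(word)
        else:
            # Query result is empty: correct the word to get a first replacement word.
            result.append(correct_word(word))
    return " ".join(result)


# Toy data, purely illustrative.
toy_dictionary = {"cheap", "flights", "paris"}
toy_fixes = {"flihgts": "flights"}

first_correction = build_first_correction(
    "cheap flihgts paris",
    toy_dictionary,
    correct_word=lambda w: toy_fixes.get(w, w),
)
print(first_correction)
```

Passing the segmenter and the per-word corrector as parameters keeps the dictionary lookup separate from whichever segmentation and correction strategies are actually used.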

Optionally, the error correction processing includes: sound-like, homophonic or homomorphic error correction processing; the performing error correction processing on the search word based on the preset dictionary and generating a replacement word comprises: determining the position of a word to be corrected in a search word needing to be subjected to error correction processing, and acquiring the search word according to the position; acquiring a sound-like word table, a homophone word table or a homomorphic word table in the preset dictionary, and acquiring a sound-like replacement word, a homophonic replacement word or a homomorphic replacement word corresponding to the search word based on the sound-like word table, the homophone word table or the homomorphic word table; generating a candidate replacement word set corresponding to the search word according to the sound-like replacement word, the homophonic replacement word or the homomorphic replacement word and the search word; and selecting a candidate replacement word from the candidate replacement word set as the replacement word according to the use frequency of each candidate replacement word in the candidate replacement word set.
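The candidate selection described above might look like the sketch below. It simplifies the disclosure by looking up whole words rather than locating a position inside the word, and it assumes the word tables are plain mappings and that "use frequency" is a simple count. All names and data are hypothetical.

```python
from typing import Dict, List


def pick_replacement(
    word: str,
    similar_tables: List[Dict[str, List[str]]],
    frequency: Dict[str, int],
) -> str:
    """Build a candidate set from the word tables and return the most frequent candidate."""
    # The candidate set contains the search word itself plus every table entry for it.
    candidates = {word}
    for table in similar_tables:
        candidates.update(table.get(word, []))
    # Highest use frequency wins; on a tie the original word is preferred,
    # then alphabetical order (sorting makes the result deterministic).
    return max(sorted(candidates), key=lambda c: (frequency.get(c, 0), c == word))


# Toy tables standing in for the sound-like and homomorphic word tables.
sound_alike = {"wether": ["weather", "whether"]}
look_alike = {"wether": ["nether"]}
use_frequency = {"weather": 900, "whether": 700, "nether": 5, "wether": 1}

print(pick_replacement("wether", [sound_alike, look_alike], use_frequency))
```

Including the original word in the candidate set means a word that is already the most frequent option is left unchanged.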

Optionally, the generating input error correction information corresponding to the search input information, and performing corresponding search processing further includes: if the second understanding confusion value is larger than the upper limit value of the preset range, acquiring an error correction subsequent information set corresponding to the search input information; determining difference information between each input candidate information in the error correction subsequent information set and the search input information, and calculating an error correction score of each input candidate information according to the difference information; and selecting input candidate information as second input error correction information based on the error correction score, wherein the second input error correction information is used for carrying out corresponding search operation.

Optionally, the set of error correction subsequent information includes: a first error correction subsequent set and a second error correction subsequent set; the acquiring the error correction subsequent information set corresponding to the search input information includes: acquiring query string information corresponding to the search input information, and performing query processing in a position positioning module based on the query string to acquire a first error correction subsequent information set corresponding to the search input information; matching the search word with preset error-prone word pair information or similar word pair information to obtain a second reserved word and a second replacement word; and generating error correction input information corresponding to the search input information based on the second reserved words, the second replacement words and the keywords, and generating the second error correction subsequent information set based on the error correction input information.

Optionally, the difference information includes an edit distance; the calculating an error correction score of each input candidate information according to the difference information includes: calculating an error correction score of the input candidate information based on the edit distance; wherein, for the same input candidate information in the first error correction subsequent set and the second error correction subsequent set, the weighting coefficients of the same input candidate information corresponding to the first error correction subsequent set and the second error correction subsequent set respectively are obtained, and the error correction score of the same input candidate information is calculated based on the edit distance and the weighting coefficients.
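A minimal sketch of edit-distance scoring follows. The Levenshtein distance is standard, but the score formula (`weight / (1 + distance)`), the way weights are summed when a candidate appears in both sets, and the default weights are assumptions made for illustration; the disclosure does not give a formula.

```python
from typing import Dict, Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score_candidates(
    query: str,
    first_set: Iterable[str],
    second_set: Iterable[str],
    w_first: float = 0.6,   # hypothetical weight for the first subsequent set
    w_second: float = 0.4,  # hypothetical weight for the second subsequent set
) -> Dict[str, float]:
    """Score each candidate; a candidate found in both sets gets both weights."""
    first, second = set(first_set), set(second_set)
    scores = {}
    for cand in first | second:
        weight = (w_first if cand in first else 0.0) + (w_second if cand in second else 0.0)
        # Assumed formula: a smaller edit distance gives a higher score.
        scores[cand] = weight / (1 + edit_distance(query, cand))
    return scores


scores = score_candidates(
    "iphnoe case",
    first_set={"iphone case", "iphone cases"},
    second_set={"iphone case", "phone case"},
)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Under these assumptions a candidate proposed by both sources and close to the original input ranks highest, which matches the intent of weighting candidates shared by the two sets.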

Optionally, the selecting the input candidate information based on the error correction score as the second input error correction information includes: selecting, based on the error correction scores, input candidate information from the first error correction subsequent set and the second error correction subsequent set as the second input error correction information.

Optionally, the position positioning module comprises: a Finite State Transducer (FST) module, the method further comprising: acquiring words or phrases with use frequency exceeding a use frequency threshold, splitting the words or phrases to obtain word strings corresponding to the words or phrases; constructing an FST branch based on the word strings and the words or phrases, and constructing the FST module according to the FST branch.
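The construction can be approximated with a character-level trie in which each path (a "branch") spells a high-frequency phrase and the end of the path emits that phrase. This is a simplified stand-in rather than a full finite state transducer, and the class, function names, threshold, and data below are hypothetical.

```python
from typing import Dict, List, Optional


class Node:
    """One state; `output` holds the phrase emitted when a branch ends here."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.output: Optional[str] = None


def build_branches(phrase_freq: Dict[str, int], threshold: int) -> Node:
    """Add one branch per phrase whose use frequency exceeds the threshold."""
    root = Node()
    for phrase, freq in phrase_freq.items():
        if freq <= threshold:
            continue
        node = root
        for ch in phrase:  # split the phrase into its character string
            node = node.children.setdefault(ch, Node())
        node.output = phrase
    return root


def lookup(root: Node, query_string: str) -> List[str]:
    """Follow the query string, then collect every phrase reachable from that state."""
    node = root
    for ch in query_string:
        nxt = node.children.get(ch)
        if nxt is None:
            return []
        node = nxt
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        if current.output is not None:
            found.append(current.output)
        stack.extend(current.children.values())
    return sorted(found)


fst_root = build_branches({"red shoes": 120, "red shirt": 90, "rare book": 3}, threshold=10)
print(lookup(fst_root, "red s"))
print(lookup(fst_root, "rare"))
```

A production system would more likely use a dedicated FST library with shared suffixes and weighted arcs; the trie only illustrates how frequent phrases become branches that a query string can be matched against.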

Optionally, the error-prone word pair information or the similar word pair information is generated based on search log information and search input modification information.

Optionally, a trained language detection model is used to detect search input information and first input error correction information, and the first understanding confusion value and the second understanding confusion value are obtained.
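If the understanding confusion value is read as language model perplexity (an interpretation, since the disclosure does not specify the model), it can be illustrated with a toy bigram model. The class name, the add-one smoothing, and the tiny corpus are assumptions for demonstration only; perplexity here is the exponential of the negative mean log probability per token.

```python
import math
from collections import Counter
from typing import List


class BigramModel:
    """Toy bigram language model with add-one smoothing (illustrative only)."""

    def __init__(self, corpus: List[List[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocab = set()
        for sentence in corpus:
            tokens = ["<s>"] + sentence + ["</s>"]
            vocab.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens[:-1], tokens[1:]))
        self.vocab_size = len(vocab)

    def perplexity(self, sentence: List[str]) -> float:
        """exp of the negative average log probability over all predicted tokens."""
        tokens = ["<s>"] + sentence + ["</s>"]
        log_prob = 0.0
        for prev, cur in zip(tokens[:-1], tokens[1:]):
            # Add-one smoothing keeps unseen bigrams at a small non-zero probability.
            p = (self.bigram_counts[(prev, cur)] + 1) / (
                self.context_counts[prev] + self.vocab_size
            )
            log_prob += math.log(p)
        return math.exp(-log_prob / (len(tokens) - 1))


model = BigramModel([
    ["cheap", "flights"],
    ["cheap", "hotels"],
    ["cheap", "flights", "paris"],
])
clean_value = model.perplexity(["cheap", "flights"])
noisy_value = model.perplexity(["cheap", "flihgts"])
print(clean_value < noisy_value)
```

The misspelled query scores a higher value than the clean one, which is the signal the detection step compares against the preset range.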

Optionally, preprocessing user input information, removing preset interference information, and performing format adjustment processing to generate the search input information; wherein the interference information comprises one or more of: emoticons, tabs, and nonsense characters.
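One possible shape for this preprocessing step is sketched below. The set of "nonsense characters", the use of Unicode category `So` to catch emoticons, and the choice of NFKC normalization plus lowercasing as "format adjustment" are all assumptions; the disclosure does not list specific characters or formats.

```python
import unicodedata

# Hypothetical set of characters treated as nonsense; not taken from the disclosure.
NOISE_CHARS = set("~^*#|")


def preprocess(user_input: str) -> str:
    """Remove interference characters and normalize the format of the input."""
    # NFKC folds full-width letters and digits into their ordinary forms.
    text = unicodedata.normalize("NFKC", user_input)
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        if ch.isspace() or category.startswith("C"):
            kept.append(" ")  # tabs and other control characters become spaces
        elif category == "So" or ch in NOISE_CHARS:
            continue          # drop emoji-like symbols and nonsense characters
        else:
            kept.append(ch)
    # Collapse runs of whitespace and lowercase the result.
    return " ".join("".join(kept).split()).lower()


print(preprocess("  Cheap\tflights \U0001F600 ~~to Paris  "))
```

Replacing control characters with a space, rather than deleting them, avoids gluing together two words that were separated only by a tab.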

According to a second aspect of the embodiments of the present disclosure, there is provided a search input information error correction apparatus including: a detection module, configured to perform detection processing on search input information to determine whether the search input information needs error correction, wherein the detection processing comprises: determining understanding confusion information of the search input information; and an error correction module, configured to, if the search input information is determined to need error correction, perform error correction processing on the search input information to generate input error correction information corresponding to the search input information, and perform corresponding search processing, wherein the error correction processing comprises: generating the input error correction information based on a preset dictionary and/or generating the input error correction information according to error correction score information of the search input information.

According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; the processor is used for executing the method.

According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-mentioned method.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program comprising computer readable code, wherein, when the computer readable code is run on a device, a processor in the device executes instructions for implementing the above method.

Based on the search input information error correction method and apparatus, the electronic device, and the storage medium provided by the embodiments of the present disclosure, the search input information is detected by using a language detection model to obtain a first understanding confusion value. If the first understanding confusion value is larger than the upper limit value of a preset range, first input error correction information is generated, and the first input error correction information is detected by using the language detection model to obtain a second understanding confusion value. If the second understanding confusion value is larger than the upper limit value of the preset range, an error correction subsequent information set is acquired, distance information between each input candidate information and the search input information is determined, an error correction score is calculated, second input error correction information is determined based on the error correction score, and a corresponding search operation is performed. The method can provide various error correction information for the user, correct the user's search input in time, improve error correction accuracy, and reduce the user's search cost. It also reduces the corpus required for model training and the associated cost, allows the online prediction stage to run in real time with little delay, is suitable for scenarios such as commercial queries, and effectively improves user experience.
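The two-stage flow summarized above can be sketched as a single function with the individual steps injected as callables. Everything here is a hypothetical illustration: the function and parameter names, the toy confusion values, and the fallback of returning the original query when no candidate exists are assumptions not stated in the disclosure.

```python
from typing import Callable, Dict


def correct_query(
    query: str,
    confusion: Callable[[str], float],
    upper_limit: float,
    dictionary_correct: Callable[[str], str],
    candidate_scores: Callable[[str], Dict[str, float]],
) -> str:
    """Return the text that should be used for the search operation."""
    # Stage 0: a first confusion value at or below the upper limit needs no correction.
    if confusion(query) <= upper_limit:
        return query
    # Stage 1: dictionary-based correction, accepted if the second value is low enough.
    first = dictionary_correct(query)
    if confusion(first) <= upper_limit:
        return first
    # Stage 2: fall back to the candidate with the best error correction score.
    scores = candidate_scores(query)
    if not scores:
        return query  # assumption: keep the original input when nothing is available
    return max(scores, key=scores.get)


# Toy stand-ins for the language detection model, dictionary step, and scoring step.
TOY_CONFUSION = {"red shoes": 12.0, "rde shoes": 90.0, "red shoos": 95.0, "rde shoos": 99.0}


def toy_confusion(text: str) -> float:
    return TOY_CONFUSION.get(text, 50.0)


def toy_dictionary_correct(text: str) -> str:
    return text.replace("rde", "red")


def toy_candidate_scores(text: str) -> Dict[str, float]:
    return {"red shoes": 0.5, "red shows": 0.2}


for raw in ("red shoes", "rde shoes", "rde shoos"):
    print(raw, "->", correct_query(raw, toy_confusion, 80.0,
                                   toy_dictionary_correct, toy_candidate_scores))
```

Running the cheap dictionary step first and the candidate scoring step only on failure is what keeps the common case fast.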

The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure, are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure, and together with the description serve to explain the principles of the disclosure, not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.

FIG. 1 is a flow chart of one embodiment of a search input information error correction method of the present disclosure;

FIG. 2 is a flow chart of another embodiment of a search input information error correction method of the present disclosure;

FIG. 3 is a flow chart of generating first input error correction information in one embodiment of a search input information error correction method of the present disclosure;

FIG. 4 is a schematic diagram of generating alternative words in an embodiment of a method for error correction of search input information according to the present disclosure;

FIG. 5 is a flowchart of obtaining a subsequent set of error corrections in one embodiment of a method of error correction for search input information according to the present disclosure;

FIG. 6 is a schematic block diagram of an embodiment of a method for error correction of search input information according to the present disclosure in a practical application scenario;

FIG. 7 is a schematic structural diagram of an embodiment of an apparatus for error correction of search input information according to the present disclosure;

FIG. 8 is a schematic structural diagram of another embodiment of a search input information error correction apparatus according to the present disclosure;

FIG. 9 is a schematic structural diagram of a detection module in an embodiment of the apparatus for error correction of search input information according to the present disclosure;

FIG. 10 is a schematic structural diagram of an error correction module in an embodiment of the apparatus for correcting search input information according to the present disclosure;

FIG. 11 is a schematic structural diagram of an error correction detection module in an embodiment of the apparatus for error correction of search input information according to the present disclosure;

FIG. 12 is a schematic structural diagram of an error correction information generating module in an embodiment of the apparatus for error correction of search input information according to the present disclosure;

FIG. 13 is a block diagram of one embodiment of an electronic device of the present disclosure.

Detailed Description

Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.

It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.

It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.

It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more than two and "at least one" may refer to one, two or more than two.

It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.

In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.

It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.

Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

Embodiments of the present disclosure may be implemented in electronic devices such as terminal devices, computer systems, and servers, which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with an electronic device, such as a terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.

Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

Summary of the application

In the process of implementing the present disclosure, the inventor found that existing search input information error correction methods have low error correction accuracy, require a huge training corpus, incur high training cost, and suffer from serious delay in the online error correction stage.

In the search input information error correction method provided by the present disclosure, detection processing is performed on the search input information to judge whether it needs error correction, wherein the detection processing comprises determining understanding confusion information of the search input information. If error correction is needed, error correction processing is performed on the search input information to generate input error correction information corresponding to the search input information, and the corresponding search processing is performed, wherein the error correction processing comprises: generating input error correction information based on a preset dictionary and/or generating input error correction information according to error correction score information of the search input information. The method can provide multiple kinds of error correction information for the user, correct the user's search input in time, improve error correction accuracy and reduce the user's search cost. It can also reduce the corpus required for model training and thus the training cost, can run in real time in the online prediction stage with small delay, and is suitable for scenarios such as commercial queries.

Exemplary method

Fig. 1 is a flowchart of an embodiment of a method for error correction of search input information according to the present disclosure, where the method shown in fig. 1 includes the steps of: S101-S102. The following describes each step.

S101, detecting the search input information to judge whether the search input information needs error correction; wherein the detection process includes determining understanding confusion information of the search input information.

In one embodiment, the search input information is the information, including a search keyword (query), that a user enters at the entrance of a search engine in order to perform a search. The entrance of the search engine may be a dialog box or the like, and the user may enter the search input information through a mobile phone, a PC, etc. The search input information may be typed manually, or the user may copy information from a short message, WeChat or a webpage and paste it into the entrance of the search engine. The understanding confusion information of the search input information may be an understanding confusion value or the like, which represents whether the search input information (input text) is "understandable".

S102, if error correction is needed, performing error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performing the corresponding search processing; wherein the error correction processing includes: generating input error correction information based on a preset dictionary and/or generating input error correction information according to the error correction score information of the search input information.

In one embodiment, several variants are possible. First input error correction information of the search input information may be generated based on the preset dictionary alone. Alternatively, the first input error correction information may be generated based on the preset dictionary, and second input error correction information may then be generated according to the error correction score information of the search input information. The first input error correction information may also be generated directly from the error correction score information of the search input information. The corresponding search processing is performed using the first input error correction information or the second input error correction information. A single error correction result may be applied directly, or a plurality of error correction results may be output with a prompt so that the user selects one of them.

Fig. 2 is a flowchart of another embodiment of the error correction method for search input information according to the present disclosure, where the method shown in fig. 2 includes the steps of: S201-S206. The following describes each step.

S201, the trained language detection model is used for detecting the search input information, and a first understanding confusion value corresponding to the search input information is obtained.

In one embodiment, in a commercial search scenario the search input information comprises brand phrases, industry vocabulary, personal names, place names and the like. Most search input information entered by users for commercial search consists of certain entities or phrases, and its sentence structure is simple and uniform.

The language detection model may be any of a variety of models. The search input information is input into the language detection model, which calculates the probability of the search input information appearing in the search scenario and thereby determines its understanding confusion value. The understanding confusion value represents whether the search input information (input text) is "understandable".

The training corpus of the language detection model may consist of historical search input information with a high click frequency, etc. A language model module may be used to determine the understanding confusion value of both the search input information and the corrected first input error correction information. For example, when the search input information is "hangzhou bird network", the first understanding confusion value is 100; when the search input information is "hangzhou cuiguange", the first understanding confusion value is 1000. The larger the understanding confusion value, the harder the search input information is to understand.

S202, if the first understanding confusion value is larger than a preset range, carrying out error correction processing on the search input information to generate first input error correction information.

In one embodiment, a range is set in advance, and it is judged whether the first understanding confusion value is larger than the preset range. If so, error correction processing is performed on the search input information, and any of a plurality of error correction processing methods may be used to generate the first input error correction information. If the first understanding confusion value is smaller than the preset range, the search input information is understandable, and the corresponding search operation is performed using the search input information. The search operation may be an existing search operation, such as providing the user with search results corresponding to the search input information.

S203, the language detection model is used for detecting the first input error correction information, and a second understanding confusion value corresponding to the first input error correction information is obtained.

S204, if the second understanding confusion value is larger than the preset range, acquiring an error correction subsequent information set corresponding to the search input information.

In one embodiment, after the first input error correction information is generated, the first input error correction information is input to the language detection model, and a second understanding confusion value output by the language detection model is obtained. Judging whether the second understanding confusion value is larger than a preset range, if so, acquiring an error correction subsequent information set corresponding to the search input information, and acquiring the error correction subsequent information set by adopting various methods; and if the second understanding confusion value is smaller than the preset range, the first input error correction information can be understood, and the first input error correction information is used for carrying out corresponding searching operation. The search operation may be an existing search operation, such as prompting the user to correct the search input information, providing the user with search results corresponding to the search input information, and so forth.

S205, determining distance information between each input candidate information in the error correction subsequent information set and the search input information, and calculating the error correction score of each input candidate information according to the distance information.

S206, selecting the input candidate information as second input error correction information based on the error correction score, and performing corresponding search operation by using the second input error correction information.

In one embodiment, after the second understanding confusion value is judged to be larger than the preset range, the most likely error correction subsequent information set is obtained by a plurality of methods. Distance information between each input candidate information in the error correction subsequent information set and the search input information is determined; the distance may be an edit distance or the like. The error correction score of each input candidate information is then calculated according to the distance information. For example, the input candidate information with the maximum error correction score is selected as the second input error correction information, and the corresponding search operation is performed using the second input error correction information. The search operation may be an existing search operation, such as prompting the user to correct the search input information, providing the user with search results corresponding to the search input information, and so forth.
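The control flow of steps S201-S206 can be sketched as follows. This is only a minimal illustration: the function name `correct_query`, its callable parameters and the single `upper_limit` threshold are assumptions introduced here, and the fallback used when no candidate is available is likewise an assumption rather than something the disclosure specifies.

```python
from typing import Callable, Iterable


def correct_query(
    query: str,
    perplexity: Callable[[str], float],
    dictionary_correct: Callable[[str], str],
    candidates_for: Callable[[str], Iterable[str]],
    score: Callable[[str, str], float],
    upper_limit: float,
) -> str:
    """Return the text that should be used for the search operation."""
    # S201/S202: a query whose confusion value does not exceed the limit is used as-is.
    if perplexity(query) <= upper_limit:
        return query
    # S202: dictionary-based correction gives the first input error correction information.
    first = dictionary_correct(query)
    # S203/S204: accept the first correction if it is understandable.
    if perplexity(first) <= upper_limit:
        return first
    # S205/S206: otherwise score every candidate and keep the highest-scoring one.
    candidates = list(candidates_for(query))
    if not candidates:
        # Assumption: fall back to the first correction when nothing was recalled.
        return first
    return max(candidates, key=lambda cand: score(cand, query))
```

Passing the language model, the dictionary corrector and the candidate recall as callables keeps the sketch independent of how each of those components is actually built.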

Various methods may be used to generate the first input error correction information. Fig. 3 is a flowchart of generating first input error correction information in an embodiment of the method for error correcting search input information according to the present disclosure, where the method shown in fig. 3 includes the steps of: S301-S303. The following describes each step.

S301, performing word segmentation processing on the search input information to obtain search words.

In one embodiment, any of the various existing word segmentation methods may be used. For example, when the search input information is "Hangzhou vegetable bird branch of the family public and private", word segmentation yields the search words "Hangzhou", "vegetable bird", "branch of the family" and "public and private".

S302, using the search word to perform query in a preset dictionary, and if the query result is not empty, determining the search word as a first reserved word; and if the query result is null, performing error correction processing on the search word based on a preset dictionary to generate a first replacement word.

In one embodiment, a preset dictionary is provided. The preset dictionary is a categorized set of dictionaries that stores high-frequency, high-quality words and phrases and provides a quick query function. It comprises a name dictionary covering social celebrities, entrepreneurs, stars, politicians and the like, with names such as 'thank you front' and 'martial heaven'; a common Chinese brand dictionary, with Chinese brands such as 'sky eye searching', 'Tengchong video' and 'byte jumping'; and a common industry vocabulary dictionary, with industry vocabulary such as human resources, concrete and glass curtain walls.

For example, the search words "Hangzhou", "vegetable bird", "branch of the family" and "public and private" are each queried in the preset dictionary. The query results for "Hangzhou" and "vegetable bird" are not empty, so they are determined to be first reserved words. The query results for "branch of the family" and "public and private" are null, so error correction processing is performed on "branch of the family" and "public and private" based on the preset dictionary to generate first replacement words. The error correction processing includes homophone or homomorph (similar-shape) error correction processing and the like.

S303, generating first input error correction information based on the first reserved word and the first replacement word.
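Steps S301-S303 can be sketched as a dictionary lookup over the segmented search words. The function name `correct_with_dictionary`, the `find_replacement` callable and the English toy words are assumptions made for illustration; word segmentation itself is assumed to have been done already.

```python
from typing import Callable, Iterable, List, Optional, Set


def correct_with_dictionary(
    words: Iterable[str],
    dictionary: Set[str],
    find_replacement: Callable[[str], Optional[str]],
) -> List[str]:
    """Keep words found in the dictionary and replace the others where possible."""
    corrected = []
    for word in words:
        if word in dictionary:
            # Query result is not empty: the word is a first reserved word.
            corrected.append(word)
        else:
            # Query result is empty: try to generate a first replacement word.
            replacement = find_replacement(word)
            # Assumption: keep the original word when no replacement is found.
            corrected.append(replacement if replacement is not None else word)
    return corrected


# Toy data standing in for the preset dictionary and its correction step.
toy_dictionary = {"hangzhou", "cainiao", "technology", "company"}
toy_fixes = {"technolgy": "technology", "compny": "company"}
result = correct_with_dictionary(
    ["hangzhou", "cainiao", "technolgy", "compny"], toy_dictionary, toy_fixes.get
)
```

The returned word list corresponds to the first input error correction information; how the words are joined back into a query string depends on the language and is left open here.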

Fig. 4 is a schematic diagram of generating a replacement word in an embodiment of the error correction method for search input information of the present disclosure, where the method shown in fig. 4 includes the steps of: S401-S404. The following describes each step.

S401, determining each search character forming the search word that needs error correction processing.

S402, a homophonic character table or a homomorphic character table is obtained in a preset dictionary, and homophonic replacement characters or homomorphic replacement characters corresponding to the search characters are obtained based on the homophonic character table or the homomorphic character table.

S403, generating a candidate replacement word set corresponding to the search word according to the homophonic replacement word or the homomorphic replacement word and the search word.

S404, determining candidate replacement words in the candidate replacement word set according to the use frequency of each candidate replacement word in the candidate replacement word set, wherein the candidate replacement words are used as the replacement words.

In one embodiment, the search words that need error correction processing include search words produced by homophone selection errors, pinyin spelling errors, glyph input errors and the like. The search characters forming the search words "branch of the family" and "public and private" are determined to be "section", "branch", "public" and "private". A homophone character table or a homomorph character table is preset in the preset dictionary, and the homophone or homomorph replacement characters corresponding to each search character are obtained from it. For example, the homophone replacement character of "section" is "ok" and its homomorph replacement character is "material"; the homophone replacement character of "branch" is "the letter" and its homomorph replacement character is "skill". The candidate replacement words in the candidate replacement word set generated for "branch of the family" include "science and technology", etc.

The frequency with which all users use each candidate replacement word when searching may be stored in advance; the use frequency may be the number of times the candidate replacement word is used within a period such as one week or one month. For example, if the use frequencies of the candidate replacement words "science", "technology", "available", "material skill" and the like are determined to be 0, 100, 0, 0 and the like, respectively, the candidate replacement word with the highest use frequency is taken as the replacement word; that is, the replacement word of "branch of the family" is "technology". By the same method, the replacement word of "public and private" is determined to be "company". The first input error correction information generated for the search input information "Hangzhou vegetable bird branch of the family public and private" is therefore "Hangzhou vegetable bird science and technology company".
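Steps S401-S404 can be sketched as follows: every character of the word is expanded with its similar characters, all combinations form the candidate replacement word set, and the most frequently used candidate is chosen. The function name `best_replacement`, the merged `similar_chars` table and the Latin-letter toy data (standing in for homophone and homomorph characters) are assumptions; returning `None` when no candidate has ever been used is also an assumption.

```python
from itertools import product
from typing import Dict, List, Optional


def best_replacement(
    word: str,
    similar_chars: Dict[str, List[str]],
    usage_frequency: Dict[str, int],
) -> Optional[str]:
    """Pick the most frequently used candidate built from similar characters."""
    # S401/S402: each character may stay as it is or be swapped for a similar one.
    options = [[ch] + similar_chars.get(ch, []) for ch in word]
    # S403: every combination of choices is a candidate replacement word.
    candidates = {"".join(combo) for combo in product(*options)}
    candidates.discard(word)
    if not candidates:
        return None
    # S404: choose by use frequency; sorting first makes ties deterministic.
    best = max(sorted(candidates), key=lambda cand: usage_frequency.get(cand, 0))
    return best if usage_frequency.get(best, 0) > 0 else None


# Toy tables: "o" may be confused with "a", and "t" with "r".
toy_similar = {"o": ["a"], "t": ["r"]}
toy_frequency = {"cat": 100, "car": 40}
choice = best_replacement("cot", toy_similar, toy_frequency)
```

The candidate set grows multiplicatively with word length, which is tolerable for the short entity words described above but would need pruning for longer strings.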

In one embodiment, the error correction subsequent information set may be obtained using a variety of methods. The error correction subsequent information set includes a first error correction subsequent set and a second error correction subsequent set. Fig. 5 is a flowchart of obtaining an error correction subsequent set in an embodiment of the error correction method for search input information of the present disclosure, where the method shown in fig. 5 includes the steps of: S501-S503. The following describes each step.

S501, acquiring query string information corresponding to the search input information, performing query processing in a preset Finite State Transducer (FST) module based on the query string, and acquiring a first error correction subsequent information set corresponding to the search input information.

In one embodiment, the structure of an FST (Finite State Transducer) is similar to that of a prefix matching tree (trie), so the position of a query string in the data set can be located quickly. The FST module may be any of the various existing FST modules and provides a query function in the form FST<Key, Value>. Query string information corresponding to the search input information is acquired and used as the Key, and the Key is input into the FST module to obtain the corresponding Value. In this way, query processing is performed in the preset FST module based on the query string to obtain the first error correction subsequent information set corresponding to the search input information.

For example, the search input information is "Tencent share score wired public silk". When the second understanding confusion value of the first input error correction information corresponding to "Tencent share score wired public silk" is larger than the preset range, query string information corresponding to it, such as "Tencent", is acquired. Query processing is performed in the preset FST module based on the query string "Tencent" and the like to obtain the first error correction subsequent information set corresponding to the search input information. The input candidate information contained in the first error correction subsequent information set comprises "Tencent share limited company", "Tencent science and technology limited company" and the like.
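The Key-to-candidates lookup of S501 can be illustrated with a plain prefix tree. Note that this is a stand-in chosen because the text compares the FST to a trie; it is not an actual finite state transducer, and the class name `PrefixIndex`, its methods and the sample phrases are assumptions.

```python
from typing import Dict, List

_END = object()  # sentinel key marking the end of a stored phrase


class PrefixIndex:
    """A prefix tree that recalls stored phrases starting with a query string."""

    def __init__(self) -> None:
        self.root: Dict = {}

    def add(self, phrase: str) -> None:
        node = self.root
        for ch in phrase:
            node = node.setdefault(ch, {})
        node[_END] = phrase

    def completions(self, prefix: str) -> List[str]:
        # Walk down to the node for the prefix; a missing edge means no match.
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []
        # Collect every phrase stored below that node.
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key is _END:
                    found.append(child)
                else:
                    stack.append(child)
        return sorted(found)


index = PrefixIndex()
for phrase in ["tencent technology co", "tencent holdings ltd", "alibaba group"]:
    index.add(phrase)
recalled = index.completions("tencent")
```

A real FST additionally shares suffixes and maps keys to output values, which makes it more compact than this tree for large phrase sets.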

S502, matching the search word with preset error-prone word pair information or shape-similar word pair information to obtain a second reserved word and a second replacement word.

S503, generating error correction input information corresponding to the search input information based on the second reserved words, the second replacement words and the keywords, and generating a second error correction subsequent information set based on the error correction input information.

In one embodiment, the error-prone word pair information or the shape-similar word pair information is preset, and the search words are matched against it. For example, word segmentation is performed on "Tencent share score wired public silk" to generate the search words "Tencent", "share score", "wired" and "public silk", and these search words are matched against the preset error-prone word pair information or shape-similar word pair information. The second reserved word "Tencent" is obtained; the second replacement word corresponding to "share score" is "share", the second replacement word corresponding to "wired" is "limited", and the second replacement word corresponding to "public silk" is "company". Error correction input information corresponding to "Tencent share score wired public silk" is generated based on the second reserved word "Tencent", the second replacement words and the keywords, and a second error correction subsequent information set is generated based on the error correction input information. The error correction input information in the second error correction subsequent information set includes "Tencent share limited company" and the like.

In one embodiment, the distance information may be edit distance information or the like. Calculating an error correction score of the input candidate information based on the edit distance information; if the first error correction subsequent set and the second error correction subsequent set have the same input candidate information, determining the weighting coefficient of the same input candidate information, and calculating the error correction score of the same input candidate information based on the edit distance information and the weighting coefficient.

The Minimum Edit Distance (MED), also called the Levenshtein distance, is the minimum number of editing operations required to change one character string into another. The distance information between each input candidate information in the error correction subsequent information set and the search input information may be calculated using various existing methods.

For example, the edit distances between the search input information "Tencent share score wired public silk" and each input candidate information in the first error correction subsequent set and the second error correction subsequent set are calculated. The two sets share the input candidate information "Tencent share limited company", so its weighting coefficient is determined to be 2; the edit distance between "Tencent share limited company" and the search input information is multiplied by the weighting coefficient 2 to obtain the error correction score of "Tencent share limited company".

If the input candidate information "Tencent science and technology limited company" appears only in the first error correction subsequent set, the edit distance between it and the search input information "Tencent share score wired public silk" is taken as its error correction score. The input candidate information with the maximum error correction score is then selected from the first error correction subsequent set and the second error correction subsequent set as the second input error correction information.
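The scoring of candidates from the two sets can be sketched as below. The Levenshtein routine is the standard dynamic-programming formulation. One point is an assumption that departs from the literal wording above: the text multiplies the edit distance by the weighting coefficient directly, whereas this sketch first maps the distance d to a similarity 1/(1+d), so that the maximum score corresponds to the closest candidate. The names `edit_distance`, `pick_candidate` and `shared_weight` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def pick_candidate(
    query: str,
    first_set: Iterable[str],
    second_set: Iterable[str],
    shared_weight: float = 2.0,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Score all candidates and return the best one together with all scores."""
    first, second = set(first_set), set(second_set)
    scores = {}
    for cand in first | second:
        # Candidates recalled by both methods receive the weighting coefficient.
        weight = shared_weight if cand in first and cand in second else 1.0
        # Assumption: similarity 1/(1+d) rather than the raw distance d.
        scores[cand] = weight / (1 + edit_distance(cand, query))
    if not scores:
        return None, scores
    return max(sorted(scores), key=scores.get), scores


best, all_scores = pick_candidate(
    "tencent co",
    ["tencent corp", "tencent tech corp"],
    ["tencent corp"],
)
```

With the default coefficient of 2, a candidate found by both recall paths outranks a single-path candidate at the same distance, which matches the intent of the weighting described above.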

In one embodiment, words or phrases whose usage frequency exceeds a usage frequency threshold are obtained and split to obtain the word strings corresponding to the words or phrases. An FST branch is constructed based on the word strings and the words or phrases, and an FST module is constructed from the FST branches. Existing methods may be used for both construction steps; for example, the FST module may be constructed using the OpenFst toolkit.

For example, high-quality, high-frequency words and phrases are mined offline, and those whose usage frequency exceeds a usage frequency threshold are obtained; the threshold may be set according to the number of uses within a period such as one week or one month. The obtained words or phrases are split to obtain the corresponding word strings, FST branches are constructed based on the word strings and the words or phrases, and the FST module is constructed from the FST branches. The FST module maintains a large number of high-frequency, high-quality candidate words and phrases and returns the candidate string most similar to the query string based on edit distance.

Error-prone word pair information or shape-similar word pair information is generated based on search log information and search input modification information. For example, error-prone word pairs and shape-similar word pairs are mined offline from large-scale click and search sequence logs, and the error-prone word pair information or shape-similar word pair information is generated and updated in real time. Chinese has many cases of one pronunciation shared by several characters, one character with several pronunciations, and characters of similar shape, such as 'laundry wood-washing machine', 'Su you month-Su you pun', etc.

In one embodiment, the language detection model may be a variety of models, such as an n-gram model or the like; and detecting the search input information and the first input error correction information by using the trained n-gram model to obtain a first understanding confusion value and a second understanding confusion value. The N-gram is an algorithm based on a statistical language model, and is characterized in that the content in the text is subjected to sliding window operation with the size of N according to bytes to form a byte fragment sequence with the length of N. The n-gram model can be trained using existing n-gram models and using existing training methods.
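As an illustration of how an n-gram model can produce an understanding confusion value, the following sketch trains a bigram model and reports perplexity. Several choices are assumptions made here for brevity: the model works on characters rather than bytes, n is fixed at 2, add-one smoothing is used, and the class name `BigramModel` and the tiny corpus are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


class BigramModel:
    """Character bigram language model with add-one smoothing."""

    def __init__(self, corpus: Iterable[str]) -> None:
        self.unigrams: Counter = Counter()
        self.bigrams: Counter = Counter()
        self.vocab = set()
        for text in corpus:
            chars = ["<s>"] + list(text) + ["</s>"]
            self.vocab.update(chars)
            # Count each character as a context and each adjacent pair.
            self.unigrams.update(chars[:-1])
            self.bigrams.update(zip(chars, chars[1:]))

    def perplexity(self, text: str) -> float:
        chars = ["<s>"] + list(text) + ["</s>"]
        size = len(self.vocab) + 1  # one extra slot for unseen characters
        log_prob = 0.0
        for prev, cur in zip(chars, chars[1:]):
            p = (self.bigrams[(prev, cur)] + 1) / (self.unigrams[prev] + size)
            log_prob += math.log(p)
        # Perplexity is the inverse geometric mean of the bigram probabilities.
        return math.exp(-log_prob / (len(chars) - 1))


model = BigramModel(["cainiao network", "cainiao tech", "network company"])
seen = model.perplexity("cainiao network")
typo = model.perplexity("cainiao netwrok")
```

A query containing character sequences that never occur in the corpus receives a higher perplexity than a familiar one, which is the property the detection step relies on when it compares the value against the preset range.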

In one embodiment, the user input information is preprocessed to generate the search input information: preset interference information is removed and format adjustment processing is performed. The interference information comprises emoticons, tabs, meaningless characters and the like. Preprocessing can remove emoticons, tabs, meaningless non-Chinese characters and the like, and can split long sentences or segments that have obvious boundaries.
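A minimal preprocessing sketch is shown below. What counts as "meaningless" is not defined in the text, so the rule used here is an assumption: Chinese characters in the basic CJK block, ASCII letters, digits and spaces are kept, and everything else (including emoticons and punctuation) is treated as interference. The function name `preprocess` is hypothetical, and sentence splitting is not covered.

```python
import re

# Assumption: keep basic CJK characters, ASCII letters, digits and spaces only.
_INTERFERENCE = re.compile(r"[^0-9A-Za-z\u4e00-\u9fff ]+")
_SPACES = re.compile(r" +")


def preprocess(raw: str) -> str:
    """Strip interference characters and normalise whitespace."""
    text = raw.replace("\t", " ")           # tabs become ordinary spaces
    text = _INTERFERENCE.sub(" ", text)     # emoticons, punctuation, etc.
    return _SPACES.sub(" ", text).strip()   # collapse runs of spaces


cleaned = preprocess("  hangzhou\tcainiao \U0001F600!!  ")
```

Replacing interference with a space rather than deleting it avoids accidentally gluing two neighbouring words together.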

In one embodiment, as shown in fig. 6, the query is the search input information and the Preprocessor is the preprocessing module for the query. The preprocessing module preprocesses the search input information entered by the user: it removes preset interference information such as emoticons, tabs and meaningless non-Chinese characters, performs format adjustment processing, and can split long sentences or segments with obvious boundaries, so as to provide a more regular query for subsequent processing.

The DictDataStore is a categorized dictionary set used for storing high-frequency, high-quality words and phrases and for providing a quick query function to the other modules. The preset dictionary comprises a name dictionary covering social celebrities, entrepreneurs, stars, politicians and the like, with names such as 'thank you front' and 'martial heaven'; a common Chinese brand dictionary, with Chinese brands such as 'sky eye searching', 'Tengchong video' and 'byte jumping'; and a common industry vocabulary dictionary, with industry vocabulary such as human resources, concrete and glass curtain walls.

The LanguageModel is a language detection model used for calculating the probability of the input text appearing in a search scenario and outputting a numerical confusion value that expresses whether the input text is "understandable". The training corpus of the model consists of queries with a high click frequency, and its scale can be on the order of tens of millions. In the error correction system, the language detection model is responsible for giving the understanding confusion value of the original query and of the corrected candidates, so as to assist the error detection module (Detector) and the error correction module (Corrector) in making decisions. For example, when the query is "hangzhou bird network" the understanding confusion value is 100; when the query is "hangzhou vegetable birdnet chuck" the understanding confusion value is 1000.

The Detector is an error detection module. Its functional goal is to correctly identify erroneous queries and give the error positions while keeping the false detection rate (correct queries identified as erroneous) as low as possible. The Corrector is an error correction module. Its functional goal is as follows: after the Detector reports that the query contains an error, the Corrector tries several methods to obtain the most likely error correction subsequent set, computes and sorts the scores, and takes the highest-scoring candidate as the final error correction result.

The Ranker is a ranking module used for calculating probability scores for the candidates generated by the Corrector. It considers the features of each candidate and its degree of similarity to the original query from multiple angles; for example, a noisy channel model, w = argmax p(x), or the like may be used. FST (Finite State Transducer) is a data structure similar to a prefix matching tree (trie) and can quickly locate the position of a query string in a data set. The FST-Store is an FST storage module that maintains a large number of high-frequency, high-quality candidate words and phrases, serves as an independent recall path for candidates, and returns the candidate string most similar to the query string based on edit distance. The CorrectCandidateWriter is responsible for generating the candidate words and phrases and continually updating the candidate set.

The CandidateGenerator is a candidate generator module. Because Chinese has many cases of one pronunciation shared by several characters, one character with several pronunciations, and characters of similar shape, this module works with the Detector module to generate possible candidates by replacing homophones and shape-similar characters. Homophone pairs, shape-similar pairs and common error pairs are maintained within the CandidateGenerator module, such as "laundry wood-washing machine" and "sovereiyue-sovereikone". The CorrectCandidateWriter is a correction candidate writing module responsible for mining high-quality, high-frequency words and phrases offline and updating them into the FST-Store module in real time. SearchLogMining is a search log mining module responsible for mining error-prone pairs and shape-similar words offline from large-scale click and search sequence logs and updating them into the CandidateGenerator module in real time.

Exemplary devices

In one embodiment, as shown in fig. 7, the present disclosure provides a search input information error correction apparatus, including: a detection module 51 and an error correction module 52. The detection module 51 performs detection processing on the search input information to determine whether the search input information requires error correction, the detection processing including determining understanding confusion information of the search input information and the like. If the search input information needs to be corrected, the error correction module 52 performs error correction processing on the search input information to generate input error correction information corresponding to the search input information, and performs corresponding search processing, where the error correction processing includes at least one of: generating input error correction information based on a preset dictionary, and generating input error correction information according to the error correction score information of the search input information.

In one embodiment, as shown in fig. 9, the detection module 51 includes: a confusion determination module 501 and an error correction detection module 502. The confusion determination module 501 acquires a first understanding confusion value corresponding to the search input information. If the first understanding confusion value is within the preset range or less than the lower limit value of the preset range, the error correction detection module 502 determines that error correction processing is not required for the search input information. If the first understanding confusion value is larger than the upper limit value of the preset range, the error correction detection module 502 determines that the error correction processing needs to be performed on the search input information.
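The threshold decision made by modules 501 and 502 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes the understanding confusion value is a language-model perplexity computed from per-token log probabilities, and the names `perplexity`, `needs_correction`, and `upper_limit` are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities (assumed metric)."""
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def needs_correction(confusion_value, upper_limit):
    """Correction is required only when the value exceeds the upper limit.

    Values inside the preset range, or below its lower limit, are treated
    as not requiring correction, so only the upper limit is compared.
    """
    return confusion_value > upper_limit
```

Because values inside the range and values below its lower limit lead to the same outcome, the sketch needs only the upper limit of the preset range.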

As shown in fig. 10, the error correction module 52 includes: an error correction information generation module 503, an error correction score determination module 504 and an error correction information selection module 505. The error correction information generation module 503 performs error correction processing on the search input information to generate first input error correction information; the confusion determination module 501 detects the first input error correction information and obtains a second understanding confusion value corresponding to the first input error correction information. If the second understanding confusion value is within the preset range or smaller than the lower limit value of the preset range, the error correction information generation module 503 performs a corresponding search operation using the first input error correction information.

In one embodiment, as shown in fig. 11, the error correction detection module 502 includes a word segmentation processing unit 5021, an error correction processing unit 5022, and an information generation unit 5023. The word segmentation processing unit 5021 performs word segmentation processing on the search input information to obtain search words. The error correction processing unit 5022 performs a query in a preset dictionary using the search word, and determines the search word as a first reserved word if the query result is not empty. If the query result is null, the error correction processing unit 5022 performs error correction processing on the search word based on a preset dictionary to generate a first replacement word. The information generating unit 5023 generates first input error correction information based on the first reserved word and/or the first replacement word.
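The reserve-or-replace flow of units 5021 to 5023 might look like the sketch below. It assumes word segmentation has already produced a list of search words, and that a word with no available replacement is kept unchanged (the source does not state this fallback). The function name and the `replace` callback are hypothetical.

```python
def correct_by_dictionary(search_words, dictionary, replace):
    """Keep words found in the dictionary; try to replace the others.

    search_words: words produced by a prior segmentation step.
    dictionary:   a set-like container of known words.
    replace:      callable returning a replacement word or None.
    """
    corrected = []
    for word in search_words:
        if word in dictionary:
            corrected.append(word)  # reserved word
        else:
            replacement = replace(word)
            # Assumption: fall back to the original word if nothing is found.
            corrected.append(replacement if replacement is not None else word)
    return corrected
```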

The error correction processing comprises similar-sound, homophonic, or homomorphic (similar-shape) error correction processing. The error correction processing unit 5022 determines the individual search characters that make up the search word to be corrected. The error correction processing unit 5022 obtains a homophonic character table or a homomorphic character table in the preset dictionary, and obtains the homophonic replacement characters or homomorphic replacement characters corresponding to the search characters based on that table. The error correction processing unit 5022 generates a candidate replacement word set corresponding to the search word from the homophonic or homomorphic replacement characters and the search word. The error correction processing unit 5022 then selects a candidate replacement word from the candidate replacement word set according to the frequency of use of each candidate replacement word, and uses it as the replacement word.
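A rough sketch of the character-table replacement and frequency-based selection is given below. The table and frequency data are toy Latin-letter stand-ins rather than real homophonic or homomorphic tables; replacing one character at a time and breaking frequency ties alphabetically are assumptions made for the illustration.

```python
def candidate_words(word, similar_chars):
    """Build candidates by replacing one character at a time."""
    candidates = set()
    for i, ch in enumerate(word):
        for alt in similar_chars.get(ch, ()):
            candidates.add(word[:i] + alt + word[i + 1:])
    return candidates


def pick_replacement(word, similar_chars, word_freq):
    """Return the most frequently used known candidate, or None."""
    known = [c for c in candidate_words(word, similar_chars) if c in word_freq]
    if not known:
        return None
    # Ties on frequency are broken by the word itself (an arbitrary choice).
    return max(known, key=lambda c: (word_freq[c], c))


# Toy data: "s" is treated as confusable with "a" and "e".
table = {"s": ["a", "e"]}
freq = {"than": 80, "then": 100}
best = pick_replacement("thsn", table, freq)
```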

In one embodiment, if the second understanding confusion value is greater than the upper limit value of the preset range, the error correction information generation module 503 acquires the error correction subsequent information set corresponding to the search input information; the error correction score determining module 504 determines difference information between each input candidate information in the error correction subsequent information set and the search input information, and calculates an error correction score of each input candidate information according to the difference information; the error correction information selection module 505 selects the input candidate information as second input error correction information based on the error correction score, so as to perform a corresponding search operation using the second input error correction information.

The error correction subsequent information set comprises a first error correction subsequent set and a second error correction subsequent set; as shown in fig. 12, the error correction information generation module 505 includes a first set acquisition unit 5051 and a second set acquisition unit 5052. The first set acquisition unit 5051 acquires query string information corresponding to search input information, performs query processing in a preset FST module based on the query string, and acquires a first error correction subsequent information set corresponding to the search input information.

The second set acquisition unit 5052 matches the search word against preset error-prone word pair information or shape-similar word pair information to obtain a second reserved word and a second replacement word. The second set acquisition unit 5052 generates error correction input information corresponding to the search input information based on the second reserved word, the second replacement word, and the keyword, and generates a second error correction subsequent information set based on the error correction input information.

The difference information includes edit distance information, and the error correction score determination module 504 calculates an error correction score of the input candidate information based on the edit distance information. If the first error correction subsequent set and the second error correction subsequent set contain the same input candidate information, the error correction score determination module 504 determines a weighting factor for that shared input candidate information, and calculates its error correction score based on the edit distance information and the weighting factor. The error correction information selection module 505 selects, from the first error correction subsequent set and the second error correction subsequent set, the input candidate information corresponding to the maximum error correction score as the second input error correction information.
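The scoring and selection step can be illustrated with the sketch below. The source states only that the score depends on edit distance and on a weighting factor for candidates present in both sets; the formula `weight / (1 + distance)`, the default `shared_weight`, and all function names are assumptions chosen for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correction_score(query, candidate, weight=1.0):
    """Assumed scoring: a smaller distance gives a larger score."""
    return weight / (1.0 + edit_distance(query, candidate))


def select_best(query, first_set, second_set, shared_weight=1.5):
    """Pick the candidate with the maximum score across both sets."""
    best, best_score = None, float("-inf")
    for cand in sorted(set(first_set) | set(second_set)):
        shared = cand in first_set and cand in second_set
        score = correction_score(query, cand, shared_weight if shared else 1.0)
        if score > best_score:
            best, best_score = cand, score
    return best
```

With this scoring, a candidate that appears in both sets outranks an equally distant candidate that appears in only one.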

In one embodiment, as shown in fig. 8, the search input information error correction apparatus of the present disclosure further includes a module construction module 506, an error correction replacement information construction module 507, and an information preprocessing module 508. The module construction module 506 obtains the words or phrases with usage frequency exceeding the usage frequency threshold, and performs splitting processing on the words or phrases to obtain word strings corresponding to the words or phrases. Module construction module 506 constructs an FST branch based on the word strings and words or phrases and constructs an FST module from the FST branch.
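As a simplified stand-in for the FST module, the sketch below uses a plain character trie: each high-frequency word is split into its characters, the characters form a branch, and the branch end stores the word. A real finite-state transducer would also share suffixes and carry outputs on transitions, which this sketch does not attempt. The names `build_branches` and `lookup_prefix` and the prefix-lookup query are hypothetical.

```python
_END = None  # marker key for "a word ends here"; never equal to a character


def build_branches(freq_by_word, threshold):
    """Insert every word whose usage frequency exceeds the threshold."""
    root = {}
    for word, freq in freq_by_word.items():
        if freq <= threshold:
            continue
        node = root
        for ch in word:  # split the word into its character string
            node = node.setdefault(ch, {})
        node[_END] = word
    return root


def lookup_prefix(root, prefix):
    """Return all stored words that start with the given query string."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key is _END:
                found.append(child)
            else:
                stack.append(child)
    return sorted(found)
```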

The error correction replacement information construction module 507 generates error-prone word pair information or shape-similar word pair information based on the search log information and the search input modification information. The information preprocessing module 508 preprocesses the user input information by removing preset interference information and performing format adjustment processing, and thereby generates the search input information; the interference information includes emoticons, tab characters, meaningless characters, and the like.
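The preprocessing performed by module 508 could be approximated as follows. The specific choices here (NFKC normalization, dropping Unicode "symbol, other" and control or format characters, collapsing whitespace, lowercasing) are assumptions about what interference removal and format adjustment might involve; the source does not specify them.

```python
import re
import unicodedata

# Unicode categories treated as interference in this sketch.
_DROPPED = {"So", "Cc", "Cf", "Cs", "Co", "Cn"}


def preprocess(user_input):
    """Strip emoticons and control characters, then normalize the format."""
    text = unicodedata.normalize("NFKC", user_input)  # e.g. full-width to ASCII
    kept = []
    for ch in text:
        if ch.isspace():
            kept.append(" ")  # tabs and newlines become single spaces
        elif unicodedata.category(ch) in _DROPPED:
            continue          # emoticons, control and format characters
        else:
            kept.append(ch)
    return re.sub(r" +", " ", "".join(kept)).strip().lower()
```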

Fig. 13 is a block diagram of one embodiment of an electronic device of the present disclosure. As shown in fig. 13, the electronic device 131 includes one or more processors 1311 and a memory 1312.

The processor 1311 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 131 to perform desired functions.

The memory 1312 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 1311 to implement the search input information error correction methods of the various embodiments of the disclosure above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.

In one example, the electronic device 131 may further include an input device 1313, an output device 1314, and the like, interconnected by a bus system and/or another form of connection mechanism (not shown). The input device 1313 may include, for example, a keyboard, a mouse, and the like. The output device 1314 may output various information to the outside, and may include, for example, a display, speakers, a printer, and a communication network and the remote output devices connected to it, among others.

Of course, for simplicity, only some of the components of the electronic device 131 relevant to the present disclosure are shown in fig. 13, omitting components such as buses, input/output interfaces, and the like. In addition, electronic device 131 may include any other suitable components, depending on the particular application.

In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the search input information error correction method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.

The program code of the computer program product, which performs the operations of embodiments of the present disclosure, may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, or the like, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a search input information error correction method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.

The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In one embodiment, the present disclosure provides a computer program comprising computer readable code, characterized in that when the computer readable code is run on a device, a processor in the device executes instructions for implementing the search input information error correction method of any of the above embodiments.

The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. However, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.

In the search input information error correction method and apparatus, the electronic device, and the storage medium of the above embodiments, the search input information is detected by using the language detection model to obtain a first understanding confusion value. If the first understanding confusion value is larger than the preset range, first input error correction information is generated, and the first input error correction information is detected by using the language detection model to obtain a second understanding confusion value. If the second understanding confusion value is larger than the preset range, an error correction subsequent information set is acquired, distance information between each input candidate information and the search input information is determined, and an error correction score is calculated. Second input error correction information is then determined based on the error correction score, and a corresponding search operation is performed. In this way, fine-grained search guidance can be provided for the user, the search path is shortened, and search efficiency is improved. The corpus required for model training can be reduced, which lowers cost; the online prediction stage can be carried out in real time; and the method is suitable for scenes such as commercial query. Various error correction information can be provided to the user and the user's search input can be corrected in time, which improves error correction accuracy, reduces the user's search cost, and effectively improves the user experience.

In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.
