Statement processing method, device, equipment and medium

Document No.: 1043309; Publication date: 2020-10-09

Reading notes: This technology, "Statement processing method, device, equipment and medium", was designed by 史斌斌 and 颜水成 on 2020-06-24. Main content: The application discloses a sentence processing method, apparatus, device and medium, applied in the technical field of natural language processing, to address the low recall rate and poor accuracy of grammar error correction methods in the prior art. Specifically: a sentence to be detected is acquired; grammar rule errors in the sentence to be detected are obtained based on a grammar rule error detection model and corrected to obtain a first corrected sentence; word use errors in the first corrected sentence are obtained based on a word use error detection classifier and corrected to obtain a second corrected sentence; and a target corrected sentence is obtained based on a global grammar error correction model. In this way, three layers of grammar error detection are performed on the sentence to be detected through the grammar rule error detection model, the word use error detection classifier and the global grammar error correction model, which achieves comprehensive detection of the grammar errors in the sentence to be detected and improves the recall rate and accuracy of grammar error detection.

1. A sentence processing method, comprising:

acquiring a sentence to be detected;

based on a grammar rule error detection model, carrying out grammar rule error detection on the sentence to be detected to obtain a grammar rule error in the sentence to be detected, and correcting the grammar rule error in the sentence to be detected to obtain a first corrected sentence;

performing word use error detection on the first corrected sentence based on a word use error detection classifier to obtain a word use error in the first corrected sentence, and correcting the word use error in the first corrected sentence to obtain a second corrected sentence;

and performing global grammar error correction on the second corrected sentence based on a global grammar error correction model to obtain a target corrected sentence.

2. The sentence processing method of claim 1, wherein performing grammar rule error detection on the sentence to be detected based on a grammar rule error detection model to obtain a grammar rule error in the sentence to be detected comprises:

based on a fault-tolerant grammar rule, carrying out grammar analysis on the sentence to be detected to obtain grammar structure data of the sentence to be detected;

and inputting the grammar structure data into the grammar rule error detection model to obtain grammar rule errors in the sentences to be detected.

3. The sentence processing method of claim 1, wherein correcting the grammatical rule error in the sentence to be detected to obtain a first corrected sentence comprises:

acquiring an error correction rule of the grammar rule errors in the sentence to be detected based on the error type of the grammar rule errors in the sentence to be detected;

and correcting the grammar rule errors in the sentence to be detected according to the error correction rule of the grammar rule errors in the sentence to be detected to obtain the first corrected sentence.

4. The sentence processing method of claim 1 wherein performing word usage error detection on the first corrected sentence based on a word usage error detection classifier to obtain a word usage error in the first corrected sentence comprises:

performing word segmentation on the first corrected sentence to obtain the segmented words of the first corrected sentence;

and inputting each segmented word of the first corrected sentence into word use error detection classifiers that are respectively established for the various types of word use errors, so as to obtain the word use errors in the first corrected sentence.

5. The sentence processing method of claim 1 wherein correcting the word usage error in the first corrected sentence to obtain a second corrected sentence comprises:

acquiring an error correction rule of the word use error in the first correction statement based on the error type of the word use error in the first correction statement;

and correcting the word use errors in the first corrected sentence according to the error correction rule of the word use errors in the first corrected sentence to obtain the second corrected sentence.

6. The sentence processing method of claim 1 wherein performing global syntax error correction on the second corrected sentence based on a global syntax error correction model to obtain a target corrected sentence comprises:

performing word segmentation on the second corrected sentence to obtain the segmented words of the second corrected sentence;

and inputting each segmented word of the second corrected sentence into the global grammar error correction model to obtain the target corrected sentence.

7. The sentence processing method of any one of claims 1-6, wherein after performing global syntax error correction on the second correction sentence based on a global syntax error correction model to obtain a target correction sentence, further comprising:

obtaining each sentence difference between the second correction sentence and the target correction sentence by adopting a minimum edit distance algorithm;

and acquiring the grammar error corresponding to each sentence difference based on the association between sentence differences and grammar errors, and determining the grammar errors corresponding to the sentence differences as the global grammar errors in the second corrected sentence.

8. A sentence processing apparatus, comprising:

the sentence acquisition unit is used for acquiring a sentence to be detected;

the first detection unit is used for carrying out grammar rule error detection on the sentence to be detected based on a grammar rule error detection model to obtain a grammar rule error in the sentence to be detected;

the first correcting unit is used for correcting grammatical rule errors in the sentences to be detected to obtain first corrected sentences;

the second detection unit is used for carrying out word use error detection on the first correction statement based on the word use error detection classifier to obtain word use errors in the first correction statement;

the second correction unit is used for correcting the word use errors in the first correction statement to obtain a second correction statement;

and the third correcting unit is used for carrying out global syntax error correction on the second correcting statement based on the global syntax error correcting model to obtain a target correcting statement.

9. A sentence processing device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the sentence processing method according to any of claims 1-7 when executing the computer program.

10. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement the sentence processing method of any of claims 1-7.

Technical Field

The present application relates to the field of natural language processing technologies, and in particular, to a method, an apparatus, a device, and a medium for processing a sentence.

Background

Grammar error correction refers to detecting and correcting grammar errors in a language text, and is widely applied to a plurality of fields such as composition scoring, grammar learning and the like.

However, current grammar error correction methods have a low recall rate and low accuracy, and how to improve the recall rate and accuracy of grammar error correction is a problem that needs to be addressed.

Disclosure of Invention

The embodiment of the application provides a sentence processing method, apparatus, device and medium, which are used for solving the problems of low recall rate and poor accuracy of grammar error correction methods in the prior art.

The technical scheme provided by the embodiment of the application is as follows:

in one aspect, an embodiment of the present application provides a statement processing method, including:

acquiring a sentence to be detected;

on the basis of the grammar rule error detection model, carrying out grammar rule error detection on the sentence to be detected to obtain a grammar rule error in the sentence to be detected, and correcting the grammar rule error in the sentence to be detected to obtain a first corrected sentence;

performing word use error detection on the first corrected sentence based on the word use error detection classifier to obtain a word use error in the first corrected sentence, and correcting the word use error in the first corrected sentence to obtain a second corrected sentence;

and performing global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain a target corrected statement.
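The three stages above can be read as a chain of functions, each consuming the previous stage's output. The following is a minimal sketch under stated assumptions: the stage functions `correct_rule_errors`, `correct_word_errors` and `correct_globally`, and their toy string replacements, are hypothetical placeholders and not the models described in this application.

```python
from typing import Callable, List

Stage = Callable[[str], str]


def correct_rule_errors(sentence: str) -> str:
    # Hypothetical stand-in for the grammar rule error detection model plus correction.
    return sentence.replace("a apple", "an apple")


def correct_word_errors(sentence: str) -> str:
    # Hypothetical stand-in for the word use error detection classifier plus correction.
    return sentence.replace("eated", "ate")


def correct_globally(sentence: str) -> str:
    # Hypothetical stand-in for the global grammar error correction model.
    return sentence.replace("yesterday I", "Yesterday I")


def process_sentence(sentence: str, stages: List[Stage]) -> List[str]:
    """Run the stages in order and return every intermediate result."""
    results = []
    current = sentence
    for stage in stages:
        current = stage(current)
        results.append(current)
    return results


first, second, target = process_sentence(
    "yesterday I eated a apple.",
    [correct_rule_errors, correct_word_errors, correct_globally],
)
```

Here `first`, `second` and `target` play the roles of the first corrected sentence, the second corrected sentence and the target corrected sentence, respectively.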

In a possible implementation, the obtaining a sentence to be detected includes:

carrying out sentence end punctuation identification on the language text to be detected to obtain sentence end punctuation in the language text to be detected;

dividing the language text to be detected by taking the sentence end punctuation in the language text to be detected as a dividing point to obtain at least one sentence;

and respectively determining each statement in at least one statement as a statement to be detected.

In a possible implementation manner, based on the grammar rule error detection model, performing grammar rule error detection on the sentence to be detected to obtain a grammar rule error in the sentence to be detected, including:

on the basis of a fault-tolerant grammar rule, performing grammar analysis on the sentence to be detected to obtain grammar structure data of the sentence to be detected;

and inputting the grammar structure data into a grammar rule error detection model to obtain grammar rule errors in the sentences to be detected.

In a possible implementation manner, correcting a syntax rule error in a sentence to be detected to obtain a first corrected sentence includes:

acquiring error correction rules of grammar rule errors in the sentences to be detected based on the error types of the grammar rule errors in the sentences to be detected;

and correcting the grammatical rule errors in the sentence to be detected according to the error correction rule of the grammatical rule errors in the sentence to be detected to obtain a first corrected sentence.
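Selecting a correction rule by error type can be illustrated with a lookup table. This is only a sketch: the `DetectedError` structure, the error type labels `ARTICLE_AN` and `DUPLICATE_WORD`, and the rules in `CORRECTION_RULES` are assumptions made for illustration and do not come from this application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DetectedError:
    error_type: str  # hypothetical label, e.g. "ARTICLE_AN"
    start: int       # index of the first affected token
    end: int         # index one past the last affected token


# Hypothetical table: error type -> rule that rewrites the affected tokens.
CORRECTION_RULES: Dict[str, Callable[[List[str]], List[str]]] = {
    "ARTICLE_AN": lambda span: ["an"],
    "DUPLICATE_WORD": lambda span: span[:1],
}


def apply_corrections(tokens: List[str], errors: List[DetectedError]) -> List[str]:
    corrected = list(tokens)
    # Apply from right to left so that earlier indices stay valid.
    for err in sorted(errors, key=lambda e: e.start, reverse=True):
        rule = CORRECTION_RULES.get(err.error_type)
        if rule is None:
            continue  # no rule known for this error type; leave the span untouched
        corrected[err.start:err.end] = rule(corrected[err.start:err.end])
    return corrected


tokens = ["She", "ate", "a", "apple", "and", "and", "left"]
errors = [DetectedError("ARTICLE_AN", 2, 3), DetectedError("DUPLICATE_WORD", 4, 6)]
first_corrected = apply_corrections(tokens, errors)
```

Keeping the rules in a table keyed by error type means a new error type only needs a new table entry, which matches the "acquire the rule, then apply it" split described above.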

In one possible implementation, performing word usage error detection on the first corrected sentence based on the word usage error detection classifier to obtain a word usage error in the first corrected sentence, includes:

performing word segmentation on the first corrected sentence to obtain the segmented words of the first corrected sentence;

and inputting each segmented word of the first corrected sentence into word use error detection classifiers that are respectively established for the various types of word use errors, so as to obtain the word use errors in the first corrected sentence.

In one possible implementation, correcting the word usage error in the first corrected statement to obtain a second corrected statement includes:

acquiring an error correction rule of the word use errors in the first correction statement based on the error type of the word use errors in the first correction statement;

and correcting the word use errors in the first corrected sentence according to the error correction rule of the word use errors in the first corrected sentence to obtain a second corrected sentence.

In a possible implementation manner, performing global syntax error correction on the second corrected sentence based on the global syntax error correction model to obtain a target corrected sentence, includes:

performing word segmentation on the second corrected sentence to obtain the segmented words of the second corrected sentence;

and inputting each segmented word of the second corrected sentence into the global grammar error correction model to obtain a target corrected sentence.

In a possible implementation manner, after performing global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain a target corrected statement, the method further includes:

obtaining each sentence difference between the second correction sentence and the target correction sentence by adopting a minimum edit distance algorithm;

and acquiring the grammar error corresponding to each sentence difference based on the association between sentence differences and grammar errors, and determining the grammar errors corresponding to the sentence differences as the global grammar errors in the second corrected sentence.
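One way to obtain the sentence differences is a token-level minimum edit distance (Levenshtein) table followed by a backtrace. The sketch below assumes whitespace tokenization, and the `DIFFERENCE_TO_ERROR` table is a hypothetical example of an association between differences and grammar errors; neither detail is specified in this application.

```python
from typing import Dict, List, Tuple

Edit = Tuple[str, str, str]  # (operation, source token, target token)


def edit_differences(source: List[str], target: List[str]) -> List[Edit]:
    n, m = len(source), len(target)
    # dist[i][j] = minimum number of edits turning source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a source token
                dist[i][j - 1] + 1,         # insert a target token
                dist[i - 1][j - 1] + cost,  # keep or substitute
            )
    # Walk back from the bottom-right corner to recover one minimal edit script.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            edits.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            edits.append(("delete", source[i - 1], ""))
            i -= 1
        else:
            edits.append(("insert", "", target[j - 1]))
            j -= 1
    edits.reverse()
    return edits


# Hypothetical association between a difference and a grammar error label.
DIFFERENCE_TO_ERROR: Dict[Tuple[str, str], str] = {
    ("go", "went"): "verb tense error",
    ("", "a"): "missing article",
}


def global_errors(edits: List[Edit]) -> List[str]:
    return [DIFFERENCE_TO_ERROR.get((src, tgt), "unclassified") for _, src, tgt in edits]


tense_edits = edit_differences("He go to school".split(), "He went to school".split())
article_edits = edit_differences("She is student".split(), "She is a student".split())
```

When several minimal edit scripts exist, the backtrace above returns one of them (it prefers substitution over deletion, and deletion over insertion); a real system may need a different tie-breaking policy.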

In a possible implementation manner, the statement processing method provided in the embodiment of the present application further includes:

and marking grammar rule errors, word use errors and global grammar errors in the sentences to be detected according to a set marking mode.

In a possible implementation manner, the statement processing method provided in the embodiment of the present application further includes:

acquiring error detailed information corresponding to the grammar rule error, the word use error and the global grammar error respectively;

and displaying error detailed information corresponding to the grammar rule error, the word use error and the global grammar error in the sentence to be detected according to a set display mode.

In another aspect, an embodiment of the present application provides a sentence processing apparatus, including:

the sentence acquisition unit is used for acquiring a sentence to be detected;

the first detection unit is used for carrying out grammar rule error detection on the sentence to be detected based on the grammar rule error detection model to obtain grammar rule errors in the sentence to be detected;

the first correcting unit is used for correcting grammatical rule errors in the sentence to be detected to obtain a first corrected sentence;

the second detection unit is used for carrying out word use error detection on the first correction statement based on the word use error detection classifier to obtain word use errors in the first correction statement;

the second correction unit is used for correcting the word use errors in the first correction statement to obtain a second correction statement;

and the third correcting unit is used for carrying out global syntax error correction on the second correcting statement based on the global syntax error correcting model to obtain a target correcting statement.

In a possible implementation manner, when obtaining a sentence to be detected, the sentence obtaining unit is specifically configured to:

carrying out sentence end punctuation identification on the language text to be detected to obtain sentence end punctuation in the language text to be detected;

dividing the language text to be detected by taking the sentence end punctuation in the language text to be detected as a dividing point to obtain at least one sentence;

and respectively determining each statement in at least one statement as a statement to be detected.

In a possible implementation manner, when performing syntax rule error detection on a to-be-detected statement based on a syntax rule error detection model to obtain a syntax rule error in the to-be-detected statement, the first detection unit is specifically configured to:

on the basis of a fault-tolerant grammar rule, performing grammar analysis on the sentence to be detected to obtain grammar structure data of the sentence to be detected;

and inputting the grammar structure data into a grammar rule error detection model to obtain grammar rule errors in the sentences to be detected.

In a possible implementation manner, when a syntax rule error in a to-be-detected statement is corrected to obtain a first corrected statement, the first correcting unit is specifically configured to:

acquiring error correction rules of grammar rule errors in the sentences to be detected based on the error types of the grammar rule errors in the sentences to be detected;

and correcting the grammatical rule errors in the sentence to be detected according to the error correction rule of the grammatical rule errors in the sentence to be detected to obtain a first corrected sentence.

In a possible implementation manner, when the word usage error detection is performed on the first corrected sentence based on the word usage error detection classifier, and a word usage error in the first corrected sentence is obtained, the second detection unit is specifically configured to:

performing word segmentation on the first corrected sentence to obtain the segmented words of the first corrected sentence;

and inputting each segmented word of the first corrected sentence into word use error detection classifiers that are respectively established for the various types of word use errors, so as to obtain the word use errors in the first corrected sentence.

In a possible implementation manner, when a word usage error in a first corrected sentence is corrected to obtain a second corrected sentence, the second correcting unit is specifically configured to:

acquiring an error correction rule of the word use errors in the first correction statement based on the error type of the word use errors in the first correction statement;

and correcting the word use errors in the first corrected sentence according to the error correction rule of the word use errors in the first corrected sentence to obtain a second corrected sentence.

In a possible implementation manner, when performing global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain a target corrected statement, the third correcting unit is specifically configured to:

performing word segmentation on the second corrected sentence to obtain the segmented words of the second corrected sentence;

and inputting each segmented word of the second corrected sentence into the global grammar error correction model to obtain a target corrected sentence.

In a possible implementation manner, the statement processing apparatus provided in an embodiment of the present application further includes:

and the third detection unit is used for, after the third correcting unit performs global grammar error correction on the second corrected sentence based on the global grammar error correction model to obtain the target corrected sentence, obtaining each sentence difference between the second corrected sentence and the target corrected sentence by adopting a minimum edit distance algorithm, obtaining the grammar error corresponding to each sentence difference based on the association between sentence differences and grammar errors, and determining the grammar errors corresponding to the sentence differences as the global grammar errors in the second corrected sentence.

In a possible implementation manner, the statement processing apparatus provided in an embodiment of the present application further includes:

and the error marking unit is used for marking grammar rule errors, word use errors and global grammar errors in the sentences to be detected according to the set marking mode.

In a possible implementation manner, the statement processing apparatus provided in an embodiment of the present application further includes:

and the error interpretation unit is used for acquiring error detailed information corresponding to the grammar rule error, the word use error and the global grammar error respectively, and displaying the error detailed information corresponding to the grammar rule error, the word use error and the global grammar error respectively in the sentence to be detected according to a set display mode.

In another aspect, an embodiment of the present application provides a sentence processing device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the sentence processing method provided by the embodiment of the application.

In another aspect, an embodiment of the present application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the sentence processing method provided in the embodiment of the present application.

The beneficial effects of the embodiment of the application are as follows:

In the embodiment of the application, three layers of grammar error detection are performed on the sentence to be detected through the grammar rule error detection model, the word use error detection classifier and the global grammar error correction model. This makes the detection of grammar errors in the sentence to be detected both comprehensive and accurate, which improves the recall rate and accuracy of grammar error detection and, in turn, the accuracy of the finally obtained target corrected sentence.

Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a system architecture diagram of a statement processing system according to an embodiment of the present application;

FIG. 2 is a schematic flow chart illustrating an overview of a sentence processing method in an embodiment of the present application;

FIG. 3 is a schematic flow chart of a sentence processing method in an embodiment of the present application;

FIG. 4 is a functional structure diagram of a sentence processing apparatus in an embodiment of the present application;

FIG. 5 is a schematic hardware configuration diagram of a sentence processing device in an embodiment of the present application.

Detailed Description

In order to make the purpose, technical solution and advantages of the present application clearer, the technical solution in the embodiments of the present application is described below clearly and completely with reference to the accompanying drawings. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.

To facilitate a better understanding of the present application by those skilled in the art, a brief description of the technical terms involved in the present application will be given below.

1. A grammar rule error is an error in a sentence that does not conform to the grammar rules.

2. The grammar rule error detection model is a model that is built based on fault-tolerant grammar rules and is used for detecting grammar rule errors in a sentence.

3. A word use error is an improper use of a word in a sentence. In the present application, word use errors include, but are not limited to, word errors and part-of-speech errors.

4. The word use error detection classifier is a classifier that is built based on a deep neural network and is used for detecting word use errors in a sentence.

5. Global grammar errors are the grammar errors remaining in a sentence other than the detected grammar rule errors and word use errors.

6. The global grammar error correction model is a model that is built based on a deep neural network and is used for detecting the grammar errors in a sentence other than the detected grammar rule errors and word use errors.

7. The grammar error correction client is an application that can be installed on terminal devices such as a mobile phone, a computer or a Personal Digital Assistant (PDA), can perform grammar error correction on each sentence in a language text to be detected, and supports user interaction.

8. The grammar error correction server is a background device that can provide the grammar error correction client with services such as database services and model building and optimization services.

It should be noted that, in the present application, the terms "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or sequence. It is to be understood that such terms are interchangeable under appropriate circumstances such that the embodiments described herein are capable of operation in sequences other than those illustrated or otherwise described herein.

After introducing the technical terms related to the present application, the following briefly introduces the application scenarios and design ideas of the embodiments of the present application.

To address the low recall rate and poor accuracy of grammar error correction methods in the prior art, in the embodiment of the present application, referring to FIG. 1, a grammar error correction client 101 may be installed on a terminal device 102 and communicates with a grammar error correction server 103 through the terminal device 102 over a communication network. In practical application, the grammar error correction server 103 may build three models in advance according to grammar error correction requirements, namely a grammar rule error detection model, a word use error detection classifier and a global grammar error correction model, and configure the three models into the grammar error correction client 101. When receiving a language text to be detected submitted by a user, the grammar error correction client 101 segments the language text into at least one sentence, performs three-layer grammar error detection and correction on each sentence based on the three models configured by the grammar error correction server 103 to obtain the grammar errors and the target corrected sentence corresponding to each sentence, and then displays the grammar errors and the target corrected sentences to the user.
In this way, three layers of grammar error detection are performed on the sentence to be detected through the grammar rule error detection model, the word use error detection classifier and the global grammar error correction model. This makes the detection of grammar errors in the sentence to be detected both comprehensive and accurate, which improves the recall rate and accuracy of grammar error detection and, in turn, the accuracy of the finally obtained target corrected sentence.

After introducing the application scenario and the design concept of the embodiment of the present application, the following describes in detail the technical solution provided by the embodiment of the present application.

The embodiment of the present application provides a statement processing method, which is applied to a syntax error correction client 101 in a statement processing system shown in fig. 1, and referring to fig. 2, an overview flow of the statement processing method provided by the embodiment of the present application is as follows:

Step 201: acquire the sentence to be detected.

In practical application, a user can issue a grammar error correction request to the grammar error correction client 101 by submitting a language text to be detected on the grammar error correction client 101, and when the grammar error correction client 101 receives the grammar error correction request initiated by the user, the grammar error correction client 101 can obtain a sentence to be detected by performing sentence segmentation on the language text to be detected submitted by the user.

Specifically, the grammar error correction client 101 may segment the language text to be detected submitted by the user into sentences to be detected in, but not limited to, the following manner:

First, the grammar error correction client 101 performs sentence-end punctuation recognition on the language text to be detected to obtain the sentence-end punctuation marks in it, where the sentence-end punctuation marks include, but are not limited to, periods, question marks and exclamation marks.

Then, the grammar error correction client 101 uses the sentence-end punctuation marks in the language text to be detected as segmentation points to segment the language text into at least one sentence.

Finally, the grammar error correction client 101 determines each of the at least one sentence as a sentence to be detected.
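The segmentation steps above can be sketched with a regular expression. The set of sentence-end punctuation marks used here (ASCII and full-width period, question mark and exclamation mark) and the function name `split_sentences` are assumptions for illustration; the application leaves the exact set open.

```python
import re
from typing import List

# A run of non-terminator characters followed by any sentence-end punctuation.
# The terminator set is an assumption made for this sketch.
SENTENCE_PATTERN = re.compile(r"[^.?!。？！]+[.?!。？！]*")


def split_sentences(text: str) -> List[str]:
    """Split text at sentence-end punctuation, keeping the punctuation."""
    sentences = []
    for match in SENTENCE_PATTERN.finditer(text):
        sentence = match.group().strip()
        if sentence:
            sentences.append(sentence)
    return sentences


to_detect = split_sentences("He go to school. Is it right? Yes!")
```

A purely punctuation-based split also breaks at abbreviations such as "Dr.", so a practical implementation would likely need extra handling for such cases.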

Step 202: based on the grammar rule error detection model, perform grammar rule error detection on the sentence to be detected to obtain a grammar rule error in the sentence to be detected, and correct the grammar rule error in the sentence to be detected to obtain a first corrected sentence.

In practical application, in order to implement detection of a grammar rule error in a sentence to be detected, in the embodiment of the present application, the grammar error correction server 103 may set up a grammar rule error detection model in advance. Specifically, when building the grammar rule error detection model, the grammar error correction server 103 may adopt, but is not limited to, the following modes:

The first mode: machine learning.

First, the syntax error correction server 103 may collect a sample sentence set.

In practical application, the syntax error correction server 103 may collect various types of sample statements with syntax rule errors from the website and compose a sample statement set.

Next, the grammar error correction server 103 may perform grammar analysis on each sample sentence in the sample sentence set based on the fault-tolerant grammar rule to obtain grammar structure data of each sample sentence.

The grammar error correction server 103 may then input the grammar structure data of each sample sentence into the grammar rule error detection model to be trained to obtain a predicted grammar rule error of each sample sentence.

After that, the grammar error correction server 103 may use a loss function to train the grammar rule error detection model to be trained, based on the predicted grammar rule error and the actual grammar rule error of each sample sentence, to obtain the model parameters, where the actual grammar rule error is obtained by labeling each grammar rule error in the sample sentence in advance.

Finally, the grammar error correction server 103 may generate the grammar rule error detection model based on the model parameters.

The second mode: manual construction.

First, a model builder may set in advance a detection rule for each grammar rule error, and configure the rule to the grammar error correction server 103.

Then, the syntax error correction server 103 may generate a syntax rule error detection model according to the configured detection rule of each syntax rule error.
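As an illustration of the manual construction mode, a rule-based detection model can be sketched as a list of configured detection rules. The sketch below is an assumption-laden simplification: the rules operate on the raw sentence with regular expressions rather than on grammar structure data, and the names `DetectionRule`, `RULES`, and `detect_rule_errors`, as well as the two example rules, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionRule:
    """One manually configured detection rule for one grammar rule error type."""
    error_type: str
    pattern: re.Pattern


# Hypothetical rules configured by a model builder (illustrative only).
RULES = [
    # "a" directly followed by a word starting with a vowel letter.
    DetectionRule("article_a_before_vowel", re.compile(r"\ba\s+(?=[aeiouAEIOU])")),
    # The same word written twice in a row.
    DetectionRule("repeated_word", re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)),
]


def detect_rule_errors(sentence: str) -> list[tuple[str, int, int]]:
    """Return (error_type, start, end) for every rule match in the sentence."""
    errors = []
    for rule in RULES:
        for match in rule.pattern.finditer(sentence):
            errors.append((rule.error_type, match.start(), match.end()))
    return errors


if __name__ == "__main__":
    print(detect_rule_errors("She ate a apple."))
```

A letter-based vowel check is only an approximation (for example, it would wrongly flag "a university"), which is one reason a real rule set would work on parsed grammar structure data instead.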

Further, after the syntax error correction server 103 builds the syntax rule error detection model, the syntax rule error detection model may be configured to the syntax error correction client 101, so that the syntax error correction client 101 may detect the syntax rule error in the to-be-detected statement based on the syntax rule error detection model. In addition, in this embodiment of the application, the syntax error correction server 103 may further optimize the syntax rule error detection model according to the latest collected sample statement set, and update the optimized syntax rule error detection model to the syntax error correction client 101, so that the syntax error correction client 101 may detect a syntax rule error in a statement to be detected based on the latest syntax rule error detection model.

In practical application, when detecting a syntax rule error in a statement to be detected based on the syntax rule error detection model, the syntax error correction client 101 may adopt, but is not limited to, the following modes:

firstly, the syntax error correction client 101 may perform syntax analysis on the to-be-detected sentence based on the fault-tolerant syntax rule to obtain the syntax structure data of the to-be-detected sentence.

Then, the syntax error correction client 101 may input the syntax structure data into the syntax rule error detection model to obtain a syntax rule error in the to-be-detected sentence.

Further, the grammar error correction client 101 can correct the grammar rule errors in the sentences to be detected after detecting the grammar rule errors in the sentences to be detected based on the grammar rule error detection model. Specifically, when the syntax error correction client 101 corrects the syntax rule error in the to-be-detected statement, the following manners may be adopted, but are not limited to:

first, the syntax error correction client 101 may obtain an error correction rule of a syntax rule error in a sentence to be detected based on an error type of the syntax rule error in the sentence to be detected.

Then, the syntax error correction client 101 corrects the syntax rule error in the sentence to be detected according to the error correction rule of the syntax rule error in the sentence to be detected, so as to obtain a first corrected sentence.
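The lookup of an error correction rule by error type can be sketched as a table from error type to correction function. The table `CORRECTION_RULES`, its two entries, and the function `correct_rule_errors` are hypothetical names introduced for this example; the embodiment does not specify how correction rules are stored.

```python
import re
from typing import Callable

# Hypothetical table: error type -> error correction rule (a function on sentences).
CORRECTION_RULES: dict[str, Callable[[str], str]] = {
    # Replace "a" with "an" before a word starting with a vowel letter.
    "article_a_before_vowel": lambda s: re.sub(
        r"\ba(\s+)(?=[aeiouAEIOU])", r"an\1", s
    ),
    # Collapse a run of repeated words into a single occurrence.
    "repeated_word": lambda s: re.sub(
        r"\b(\w+)(\s+\1\b)+", r"\1", s, flags=re.IGNORECASE
    ),
}


def correct_rule_errors(sentence: str, error_types: list[str]) -> str:
    """Apply the correction rule of each detected error type, in order."""
    for error_type in error_types:
        rule = CORRECTION_RULES.get(error_type)
        if rule is not None:  # Unknown error types leave the sentence unchanged.
            sentence = rule(sentence)
    return sentence


if __name__ == "__main__":
    print(correct_rule_errors("She ate a apple.", ["article_a_before_vowel"]))
```

The returned string plays the role of the first corrected sentence.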

Step 203: based on the word use error detection classifier, perform word use error detection on the first corrected sentence to obtain a word use error in the first corrected sentence, and correct the word use error in the first corrected sentence to obtain a second corrected sentence.

In practical application, in order to detect word use errors in the first corrected sentence, in the embodiment of the present application, the syntax error correction server 103 may respectively build a word use error detection classifier for each type of word use errors, so that one word use error detection classifier can detect one type of word use errors. Specifically, when the grammar error correction server 103 builds word use error detection classifiers for various word use errors, the following methods can be adopted, but are not limited to:

first, the syntax error correction server 103 may collect a correct sentence set, and replace correct words in each correct sentence included in the correct sentence set with erroneous words according to a probability distribution set in advance according to the use errors of the various words, to obtain various sample sentences having the use errors of the words.

In practical application, the grammar error correction server 103 may collect each correct sentence from the website and compose a correct sentence set.
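The generation of sample sentences from correct sentences can be sketched as follows. The confusion table `CONFUSIONS`, its probabilities, the per-word replacement probability `replace_prob`, and the function `make_error_sample` are all assumptions made for illustration; the embodiment only states that correct words are replaced with erroneous words according to a preset probability distribution.

```python
import random

# Hypothetical confusion table:
# correct word -> list of (wrong word, error type, probability of that wrong word).
CONFUSIONS = {
    "in": [("on", "preposition", 0.5), ("at", "preposition", 0.5)],
    "the": [("a", "article", 1.0)],
}


def make_error_sample(tokens, rng, replace_prob=0.3):
    """Corrupt a correct sentence; return (corrupted tokens, per-token labels).

    labels[i] is the error type injected at position i, or None if unchanged.
    """
    corrupted, labels = [], []
    for token in tokens:
        options = CONFUSIONS.get(token.lower())
        if options and rng.random() < replace_prob:
            weights = [prob for _, _, prob in options]
            wrong, error_type, _ = rng.choices(options, weights=weights, k=1)[0]
            corrupted.append(wrong)
            labels.append(error_type)
        else:
            corrupted.append(token)
            labels.append(None)
    return corrupted, labels


if __name__ == "__main__":
    rng = random.Random(0)  # Fixed seed so repeated runs give the same sample.
    print(make_error_sample("the cat sat in the box".split(), rng, replace_prob=1.0))
```

Because the injected error type is recorded alongside each replacement, the labels needed for training are obtained without manual annotation, and the samples can be grouped by error type as described in the next step.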

Then, the syntax error correction server 103 may classify each sample statement according to the error type of the word use error to obtain a sample statement set of various word use errors, and obtain a participle of each sample statement in the sample statement set of the word use errors for each word use error.

Secondly, for each type of word use error, the syntax error correction server 103 may input the participles of each sample sentence in the sample sentence set of that word use error into the to-be-trained word use error detection classifier for that word use error, so as to obtain the predicted word use error of each sample sentence in that sample sentence set.

Thirdly, for each type of word use error, the grammar error correction server 103 may use a loss function to train the to-be-trained word use error detection classifier for that word use error, based on the predicted word use error and the real word use error of each sample sentence in the sample sentence set of that word use error, so as to obtain the model parameters for that word use error, where the real word use errors are obtained by labeling each word use error in the sample sentences in advance.

Finally, the grammar error correction server 103 may generate a word usage error detection classifier for the various types of word usage errors based on the model parameters of the word usage errors.

Further, after the grammar error correction server 103 builds the word use error detection classifiers for the various word use errors, these classifiers can be configured in the grammar error correction client 101, so that the grammar error correction client 101 can detect the word use errors in the first corrected sentence based on them. In addition, in this embodiment of the application, the syntax error correction server 103 may further optimize the word use error detection classifier for each type of word use error according to the most recently collected sample sentence set, and update the optimized classifiers to the syntax error correction client 101, so that the syntax error correction client 101 may detect the word use errors in the first corrected sentence based on the latest word use error detection classifiers.

In practical application, when the syntax error correction client 101 detects a word use error in a first corrected sentence based on a word use error detection classifier for various word use errors, the following methods can be adopted, but are not limited to:

first, the syntax error correction client 101 may perform word segmentation processing on the first correction sentence to obtain each word segmentation of the first correction sentence.

Then, the syntax error correction client 101 may input each participle of the first corrected sentence into a word use error detection classifier respectively established for each type of word use error, so as to obtain a word use error in the first corrected sentence.

Further, after detecting the word use error in the first corrected sentence based on the word use error detection classifier of each type of word use error, the syntax error correction client 101 may correct the word use error in the first corrected sentence. Specifically, when the syntax error correction client 101 corrects the word use error in the first corrected sentence, the following manners may be adopted, but are not limited to:

first, the syntax error correction client 101 may obtain an error correction rule for a word usage error in the first correction sentence based on the error type of the word usage error in the first correction sentence.

Then, the syntax error correction client 101 may correct the word use error in the first corrected sentence according to the error correction rule of the word use error in the first corrected sentence, so as to obtain a second corrected sentence.

Step 204: based on the global syntax error correction model, perform global syntax error correction on the second corrected statement to obtain a target corrected statement.

In practical application, in order to implement global detection on a syntax error in the second corrected statement, in the embodiment of the present application, the syntax error correction server 103 may set up a global syntax error correction model in advance. Specifically, when constructing the global syntax error correction model, the syntax error correction server 103 may adopt, but is not limited to, the following manners:

first, the syntax error correction server 103 may collect a sample sentence set and a correction sentence set.

In practical application, the syntax error correction server 103 may collect various sample statements with syntax errors from a website and form a sample statement set, and collect real correction statements of each sample statement in the sample statement set and form a correction statement set, where the real correction statements are obtained by manually correcting each syntax error in the sample statement respectively.

Then, the syntax error correction server 103 may obtain the participles of each sample sentence in the sample sentence set.

Secondly, the syntax error correction server 103 may input the participle of each sample statement in the sample statement set into the global syntax error correction model to be trained, and obtain the predicted correction statement of each sample statement in the sample statement set.

Thirdly, the syntax error correction server 103 may train the global syntax error correction model to be trained by using a loss function based on the predicted correction statement of each sample statement in the sample statement set and the actual correction statement of each sample statement in the correction statement set, so as to obtain each model parameter.

Finally, the syntax error correction server 103 may generate a global syntax error correction model based on the respective model parameters.

Further, after building the global syntax error correction model, the syntax error correction server 103 may configure the global syntax error correction model into the syntax error correction client 101, so that the syntax error correction client 101 may perform global detection on syntax errors in the second correction statement based on the global syntax error correction model. In addition, in this embodiment of the application, the syntax error correction server 103 may further optimize the global syntax error correction model according to the latest acquired sample statement set, and update the optimized global syntax error correction model to the syntax error correction client 101, so that the syntax error correction client 101 may perform global detection on a syntax error in the second corrected statement based on the latest global syntax error correction model.

In practical application, when the syntax error correction client 101 performs global detection on a syntax error in the second corrected statement based on the global syntax error correction model, the following manners may be adopted, but are not limited to:

first, the syntax error correction client 101 may perform word segmentation processing on the second corrected sentence to obtain each word segmentation of the second corrected sentence.

Then, the syntax error correction client 101 may input each participle of the second corrected sentence into the global syntax error correction model, resulting in a target corrected sentence.

Further, after performing global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain the target corrected statement, the syntax error correction client 101 may also locate the global syntax errors in the second corrected statement based on the differences between the second corrected statement and the target corrected statement. Specifically, when the syntax error correction client 101 locates the global syntax errors in the second corrected statement based on the differences between the second corrected statement and the target corrected statement, the following manners may be adopted, but are not limited to:

first, the syntax error correction client 101 may obtain each sentence difference between the second correction sentence and the target correction sentence by using a minimum edit distance algorithm.

Then, the syntax error correction client 101 may obtain syntax errors corresponding to the respective sentence differences based on the association relationship between the sentence differences and the syntax errors.

Finally, the syntax error correction client 101 may determine the syntax error corresponding to each sentence difference as the global syntax error in the second corrected sentence.
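The localization steps above can be sketched with a word-level minimum edit distance (Levenshtein) alignment. The function names `edit_operations` and `locate_global_errors` and the association table `DIFFERENCE_TO_ERROR` are hypothetical; in particular, the embodiment does not say how the association between sentence differences and syntax errors is represented, so a plain lookup table is assumed here.

```python
def edit_operations(source, target):
    """Align two token lists by minimum edit distance.

    Returns the non-matching operations as (kind, source_position, old, new),
    where kind is "replace", "delete", or "insert".
    """
    n, m = len(source), len(target)
    # dist[i][j] = minimum number of edits turning source[:i] into target[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete source[i-1]
                dist[i][j - 1] + 1,         # insert target[j-1]
                dist[i - 1][j - 1] + cost,  # match or replace
            )
    # Walk back from the bottom-right corner to recover the operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", i - 1, source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", i, None, target[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Hypothetical association between sentence differences and syntax errors.
DIFFERENCE_TO_ERROR = {
    ("replace", "have", "has"): "subject-verb agreement",
    ("insert", None, "am"): "missing auxiliary verb",
}


def locate_global_errors(second_corrected, target_corrected):
    """Map each difference to a syntax error, keyed by its position in the source."""
    return [
        (pos, DIFFERENCE_TO_ERROR.get((kind, old, new), "unclassified"))
        for kind, pos, old, new in edit_operations(second_corrected, target_corrected)
    ]


if __name__ == "__main__":
    print(locate_global_errors("He have went home".split(), "He has gone home".split()))
```

The positions refer to tokens of the second corrected sentence, so they can be used directly when marking the located errors for the user.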

Further, in order to facilitate a user to check the grammar errors in a sentence to be detected, in the embodiment of the present application, after the grammar error correction client 101 detects a grammar rule error, a word use error, and a global grammar error in the sentence to be detected, the grammar rule error, the word use error, and the global grammar error may be marked in the sentence to be detected according to a set marking manner. Specifically, the syntax error correction client 101 may use marking manners such as highlighting and/or bold type to mark the grammar rule error, the word use error, and the global grammar error in the sentence to be detected.

In addition, in order to improve the interpretability of syntax error correction, in the embodiment of the present application, after the syntax error correction client 101 detects a syntax rule error, a word use error, and a global syntax error in a sentence to be detected, it may also obtain the error detailed information corresponding to the syntax rule error, the word use error, and the global syntax error, and display this error detailed information in the sentence to be detected according to a set display manner. Specifically, the syntax error correction client 101 may create annotation boxes for the syntax rule errors, word use errors, and global syntax errors in the sentence to be detected, and display the error detailed information of the corresponding syntax error in each annotation box, where the error detailed information may include but is not limited to: error type, error cause, modification mode, etc.

In practical application, the sentence processing method provided by the embodiment of the present application can be applied to sentence processing of texts in multiple languages, such as English, Chinese, and French. Taking English text as an example, the sentence processing method provided by the embodiment of the present application is described in further detail below. Referring to fig. 3, the specific flow of the sentence processing method is as follows:

step 301: when receiving a grammar error correction request initiated by submitting an English text by a user, the grammar error correction client 101 identifies sentence end punctuation symbols of the English text submitted by the user to obtain the sentence end punctuation symbols in the English text.

Step 302: the grammar error correction client 101 uses the sentence end punctuation in the English text as a segmentation point to segment the English text to obtain at least one English sentence.

In practical application, in order to improve the accuracy of sentence segmentation, in the process of identifying the sentence end punctuation in the English text, the syntax error correction client 101 may treat detected periods that are decimal points and periods that occur in English names as invalid sentence end punctuation and exclude them, so that the validity of sentence end punctuation detection may be improved, and the accuracy of sentence segmentation may be further improved.
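A minimal sketch of this exclusion is shown below. The abbreviation set `NAME_ABBREVIATIONS`, the treatment of single capital letters as name initials, and the function `split_english_sentences` are assumptions made for the example; the embodiment does not specify how periods in names are recognized.

```python
import re

# Hypothetical list of title abbreviations that precede English names.
NAME_ABBREVIATIONS = {"Mr.", "Mrs.", "Ms.", "Dr.", "Prof."}


def split_english_sentences(text: str) -> list[str]:
    """Split English text at sentence-end punctuation, skipping invalid periods."""
    sentences, start = [], 0
    for match in re.finditer(r"[.?!]", text):
        i = match.start()
        if text[i] == ".":
            # A period with a digit on both sides is a decimal point.
            if 0 < i < len(text) - 1 and text[i - 1].isdigit() and text[i + 1].isdigit():
                continue
            # A period ending a title abbreviation or a single-letter initial
            # is assumed to belong to an English name.
            words = text[start:i + 1].split()
            last_word = words[-1] if words else ""
            if last_word in NAME_ABBREVIATIONS or re.fullmatch(r"[A-Z]\.", last_word):
                continue
        sentence = text[start:i + 1].strip()
        if sentence:
            sentences.append(sentence)
        start = i + 1
    tail = text[start:].strip()
    if tail:  # Keep trailing text that has no sentence-end punctuation.
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    print(split_english_sentences("Dr. Smith paid 3.5 dollars. J. Doe left!"))
```

The single-letter heuristic is deliberately crude: a sentence that genuinely ends in a single capital letter would not be split, so a production system would need a more careful name recognizer.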

Step 303: the syntax error correction client 101 determines each english sentence in the at least one english sentence as an english sentence to be detected, respectively.

Step 304: the grammar error correction client 101 performs grammar parsing on the english sentence to be detected based on the fault-tolerant grammar rule to obtain grammar structure data of the english sentence to be detected.

In practical application, the syntax error correction client 101 may perform syntax analysis on the English sentence to be detected based on error-tolerant English Resource Grammar (ERG) grammar rules, so as to obtain the syntax structure data of the English sentence to be detected.

Step 305: and the grammar error correction client 101 inputs the grammar structure data into the grammar rule error detection model to obtain grammar rule errors in the English sentences to be detected.

Step 306: the grammar error correction client 101 obtains an error correction rule of grammar rule errors in the english sentence to be detected based on the error type of the grammar rule errors in the english sentence to be detected.

Step 307: the grammar error correction client 101 corrects the grammar rule errors in the english sentence to be detected according to the error correction rule of the grammar rule errors in the english sentence to be detected, so as to obtain a first corrected english sentence.

Step 308: the grammar error correction client 101 performs word segmentation processing on the first corrected english sentence to obtain each english word segmentation of the first corrected english sentence.

Step 309: the grammar error correction client 101 inputs each english participle of the first corrected english sentence into a word use error detection classifier respectively established for each word use error, and obtains a word use error in the first corrected english sentence.

Step 310: the syntax error correction client 101 obtains an error correction rule for the word use error in the first corrected English sentence based on the error type of the word use error in the first corrected English sentence.

Step 311: the syntax error correction client 101 corrects the word use error in the first corrected english sentence according to the error correction rule for the word use error in the first corrected english sentence, so as to obtain a second corrected english sentence.

Step 312: the grammar error correction client 101 performs word segmentation processing on the second corrected english sentence to obtain each english word segmentation of the second corrected english sentence.

Step 313: the grammar error correction client 101 inputs each english participle of the second corrected english sentence into the global grammar error correction model to obtain a target corrected english sentence.

Step 314: the syntax error correction client 101 obtains each sentence difference between the second correction english sentence and the target correction english sentence by using the minimum edit distance algorithm.

Step 315: the syntax error correction client 101 obtains syntax errors corresponding to the sentence differences based on the association relationship between the sentence differences and the syntax errors.

Step 316: the syntax error correction client 101 determines the syntax error corresponding to each sentence difference as a global syntax error in the second corrected english sentence.

Step 317: the grammar error correction client 101 marks the grammar rule errors, word use errors and global grammar errors in the English sentence to be detected by using marking manners such as highlighting and/or bold type.

Step 318: the grammar error correction client 101 obtains error detailed information such as error types, error reasons, modification modes and the like corresponding to grammar rule errors, word use errors and global grammar errors in English sentences to be detected.

Step 319: the grammar error correction client 101 creates annotation frames for grammar rule errors, word use errors and global grammar errors in the English sentences to be detected respectively, and displays error detailed information of corresponding grammar errors in the annotation frames.

Based on the foregoing embodiments, an embodiment of the present application provides a sentence processing apparatus, and referring to fig. 4, a sentence processing apparatus 400 provided by the embodiment of the present application at least includes:

a sentence acquisition unit 401, configured to acquire a sentence to be detected;

the first detection unit 402 is configured to perform syntax rule error detection on the sentence to be detected based on the syntax rule error detection model to obtain a syntax rule error in the sentence to be detected;

a first correcting unit 403, configured to correct a syntax rule error in a to-be-detected statement to obtain a first corrected statement;

a second detecting unit 404, configured to perform word use error detection on the first corrected sentence based on the word use error detection classifier, so as to obtain a word use error in the first corrected sentence;

a second correcting unit 405, configured to correct a word usage error in the first corrected sentence to obtain a second corrected sentence;

and a third correcting unit 406, configured to perform global syntax error correction on the second corrected statement based on the global syntax error correction model, so as to obtain a target corrected statement.
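The data flow between these units can be sketched as a three-stage pipeline. The sketch only shows how the outputs are chained; the function `process_sentence` and the stand-in stages used in the usage example are hypothetical and do not implement the actual models.

```python
from typing import Callable

# Each stage takes a sentence and returns a corrected sentence.
Stage = Callable[[str], str]


def process_sentence(
    sentence: str,
    correct_rule_errors: Stage,   # first detection unit + first correcting unit
    correct_word_errors: Stage,   # second detection unit + second correcting unit
    correct_globally: Stage,      # third correcting unit
) -> tuple[str, str, str]:
    """Run the three layers in order; each layer consumes the previous output."""
    first_corrected = correct_rule_errors(sentence)
    second_corrected = correct_word_errors(first_corrected)
    target_corrected = correct_globally(second_corrected)
    return first_corrected, second_corrected, target_corrected


if __name__ == "__main__":
    # Stand-in stages for illustration only.
    result = process_sentence(
        "She ate a apple on the morning",
        lambda s: s.replace(" a apple", " an apple"),
        lambda s: s.replace(" on the morning", " in the morning"),
        lambda s: s if s.endswith(".") else s + ".",
    )
    print(result)
```

Keeping the intermediate sentences is useful because the later localization step compares the second corrected sentence with the target corrected sentence.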

In a possible implementation manner, when obtaining the sentence to be detected, the sentence obtaining unit 401 is specifically configured to:

carrying out sentence end punctuation identification on the language text to be detected to obtain sentence end punctuation in the language text to be detected;

dividing the language text to be detected by taking the sentence end punctuation in the language text to be detected as a dividing point to obtain at least one sentence;

and respectively determining each statement in at least one statement as a statement to be detected.

In a possible implementation manner, when performing syntax rule error detection on a to-be-detected statement based on a syntax rule error detection model to obtain a syntax rule error in the to-be-detected statement, the first detecting unit 402 is specifically configured to:

on the basis of a fault-tolerant grammar rule, performing grammar analysis on the sentence to be detected to obtain grammar structure data of the sentence to be detected;

and inputting the grammar structure data into a grammar rule error detection model to obtain grammar rule errors in the sentences to be detected.

In a possible implementation manner, when a syntax rule error in a to-be-detected sentence is corrected to obtain a first corrected sentence, the first correcting unit 403 is specifically configured to:

acquiring error correction rules of grammar rule errors in the sentences to be detected based on the error types of the grammar rule errors in the sentences to be detected;

and correcting the grammatical rule errors in the sentence to be detected according to the error correction rule of the grammatical rule errors in the sentence to be detected to obtain a first corrected sentence.

In a possible implementation manner, when performing word usage error detection on the first corrected sentence based on the word usage error detection classifier to obtain a word usage error in the first corrected sentence, the second detecting unit 404 is specifically configured to:

performing word segmentation processing on the first correction sentence to obtain each word segmentation of the first correction sentence;

and inputting each participle of the first corrected sentence into a word use error detection classifier which is respectively established aiming at the use errors of various words, so as to obtain the use errors of the words in the first corrected sentence.

In a possible implementation manner, when a word usage error in a first corrected sentence is corrected to obtain a second corrected sentence, the second correcting unit 405 is specifically configured to:

acquiring an error correction rule of the word use errors in the first correction statement based on the error type of the word use errors in the first correction statement;

and correcting the word use errors in the first corrected sentence according to the error correction rule of the word use errors in the first corrected sentence to obtain a second corrected sentence.

In a possible implementation manner, when performing global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain a target corrected statement, the third correcting unit 406 is specifically configured to:

performing word segmentation processing on the second correction sentence to obtain each word segmentation of the second correction sentence;

and inputting each participle of the second corrected sentence into the global grammar error correction model to obtain a target corrected sentence.

In a possible implementation manner, the sentence processing apparatus 400 provided in the embodiment of the present application further includes:

a third detecting unit 407, configured to, after the third correcting unit 406 performs global syntax error correction on the second corrected statement based on the global syntax error correction model to obtain a target corrected statement, obtain, by using a minimum edit distance algorithm, each statement difference between the second corrected statement and the target corrected statement, obtain, based on an association relationship between the statement difference and the syntax error, a syntax error corresponding to each statement difference, and determine, as a global syntax error in the second corrected statement, a syntax error corresponding to each statement difference.

In a possible implementation manner, the sentence processing apparatus 400 provided in the embodiment of the present application further includes:

and the error labeling unit 408 is configured to label a syntax rule error, a word use error, and a global syntax error in the sentence to be detected according to the set labeling manner.

In a possible implementation manner, the sentence processing apparatus 400 provided in the embodiment of the present application further includes:

the error interpretation unit 409 is configured to obtain error detailed information corresponding to the syntax rule error, the word use error, and the global syntax error, and display the error detailed information corresponding to the syntax rule error, the word use error, and the global syntax error in the to-be-detected sentence according to a set display manner.

It should be noted that the principle of the statement processing apparatus 400 provided in the embodiment of the present application for solving the technical problem is similar to that of the statement processing method provided in the embodiment of the present application, and therefore, for implementation of the statement processing apparatus 400 provided in the embodiment of the present application, reference may be made to implementation of the statement processing method provided in the embodiment of the present application, and repeated details are not described again.

After the statement processing method and apparatus provided by the embodiment of the present application are introduced, a brief introduction is made to the statement processing device provided by the embodiment of the present application.

Referring to fig. 5, a sentence processing device 500 provided in an embodiment of the present application at least includes: a processor 501, a memory 502, and a computer program stored on the memory 502 and capable of running on the processor 501. When the processor 501 executes the computer program, the statement processing method provided by the embodiments of the present application is implemented.

It should be noted that the sentence processing device 500 shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.

The sentence processing device 500 provided by the embodiment of the present application may further include a bus 503 connecting different components (including the processor 501 and the memory 502). Bus 503 represents one or more of any of several types of bus structures, including a memory bus, a peripheral bus, a local bus, and the like.

The Memory 502 may include readable media in the form of volatile Memory, such as Random Access Memory (RAM) 5021 and/or cache Memory 5022, and may further include Read Only Memory (ROM) 5023.

The memory 502 may also include a program tool 5025 having a set (at least one) of program modules 5024, the program modules 5024 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

The sentence processing device 500 may also communicate with one or more external devices 504 (e.g., keyboard, remote control, etc.), with one or more devices that enable a user to interact with the sentence processing device 500 (e.g., cell phone, computer, etc.), and/or with any device that enables the sentence processing device 500 to communicate with one or more other sentence processing devices 500 (e.g., router, modem, etc.). Such communication may be through an Input/Output (I/O) interface 505. Also, the sentence processing device 500 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) through the network adapter 506. As shown in fig. 5, the network adapter 506 communicates with the other modules of the sentence processing device 500 through the bus 503. It should be understood that although not shown in fig. 5, other hardware and/or software modules may be used in conjunction with the sentence processing device 500, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, Redundant Array of Independent Disks (RAID) subsystems, tape drives, and data backup storage subsystems, to name a few.

The following describes a computer-readable storage medium provided by embodiments of the present application. The computer-readable storage medium provided by the embodiment of the present application stores computer instructions, and the computer instructions, when executed by a processor, implement the statement processing method provided by the embodiment of the present application. Specifically, the computer instructions may be built in or installed in the sentence processing device 500 as an executable program, so that the sentence processing device 500 may implement the sentence processing method provided in the embodiment of the present application by executing the built-in or installed executable program.

Furthermore, the sentence processing method provided by the embodiments of the present application may also be implemented as a program product that includes program code. When the program product runs on the sentence processing device 500, the program code causes the sentence processing device 500 to execute the sentence processing method provided by the embodiments of the present application.

The program product provided by the embodiments of the present application may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a RAM, a ROM, an Erasable Programmable Read-Only Memory (EPROM), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The program product provided by the embodiments of the present application may take the form of a CD-ROM that contains program code and may run on a computing device. However, the program product is not limited thereto. In the embodiments of the present application, the readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.

It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the present application, the features and functions of two or more of the units described above may be embodied in a single unit. Conversely, the features and functions of one unit described above may be further divided among, and embodied by, a plurality of units.

Further, while the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the illustrated operations must be performed, to achieve the desired results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step for execution, and/or one step may be broken down into multiple steps for execution.
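As a purely illustrative sketch of this flexibility, the three correction layers summarized in the abstract can be modeled as interchangeable stages. The stage functions below are hypothetical string-replacement stand-ins, not the trained models of the application, and the names `run_pipeline` and `combine` are assumptions introduced only for this example.

```python
from typing import Callable, List

# A stage takes a sentence and returns a (possibly corrected) sentence.
Stage = Callable[[str], str]


def fix_grammar_rules(sentence: str) -> str:
    # Hypothetical stand-in for the grammar rule error detection model.
    return sentence.replace("a apple", "an apple")


def fix_word_usage(sentence: str) -> str:
    # Hypothetical stand-in for the word use error detection classifier.
    return sentence.replace("eated", "ate")


def fix_globally(sentence: str) -> str:
    # Hypothetical stand-in for the global grammar error correction model.
    return sentence[:1].upper() + sentence[1:]


def run_pipeline(sentence: str, stages: List[Stage]) -> str:
    # Apply each stage in order; the output of one stage feeds the next.
    for stage in stages:
        sentence = stage(sentence)
    return sentence


def combine(*stages: Stage) -> Stage:
    # Merge several stages into a single step.
    def combined(sentence: str) -> str:
        return run_pipeline(sentence, list(stages))
    return combined


text = "i eated a apple"

# All three stages, executed one after another.
full = run_pipeline(text, [fix_grammar_rules, fix_word_usage, fix_globally])

# The first two stages combined into one step; the result is unchanged.
merged = combine(fix_grammar_rules, fix_word_usage)
same = run_pipeline(text, [merged, fix_globally])

# The word-usage stage omitted; the remaining stages still run.
partial = run_pipeline(text, [fix_grammar_rules, fix_globally])
```

Here `full` and `same` hold the same corrected sentence, while `partial` keeps the word-usage error because that stage was left out.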

While the preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the application.

It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, to the extent that such modifications and variations of the embodiments of the present application fall within the scope of the claims and their equivalents, it is intended that the present application also encompass such modifications and variations.
