Western text error correction method and device, electronic equipment and storage medium

Document No.: 1043311 Publication date: 2020-10-09

Reading note: This technology, "Western text error correction method and device, electronic equipment and storage medium", was designed by 潘旭, 崔路男, 李云聪 and 尹存祥 on 2020-06-29. Its main content is as follows: The application discloses a western text error correction method and device, an electronic device and a storage medium, and relates to the technical fields of artificial intelligence and natural language processing. The specific implementation is as follows: the words in a western sentence to be distinguished are converted into lower case and input to a trained case recognition model, which yields the case type label that each word has in the standard expression of the sentence; the case of the words in the sentence is then corrected according to these case type labels. The case recognition model is trained on western standard corpus of the same language type as the western sentence to be distinguished. The scheme improves the accuracy of case recognition and case correction for western sentences.

1. A method for correcting Western text, comprising:

converting words in the western sentence to be distinguished into lower case, inputting the lower-case words to a trained case recognition model, and obtaining case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence;

correcting the case of the word in the western sentence to be distinguished according to the case type label of the word in the western sentence to be distinguished;

wherein the case recognition model is obtained by training based on the western standard corpus of the same language type as the western sentence to be distinguished.

2. The method of claim 1, wherein the method further comprises:

training the case recognition model based on the western language standard corpus with the same language type as the western language sentence to be distinguished, comprising the following steps:

performing word segmentation on the sentences in the western language standard corpus, and labeling each word according to the case type of each word;

converting words contained in sentences in the western language standard corpus into words with all letters in lower case to obtain normalized western language training data;

training the case recognition model based on the normalized western training data and the labels of each word in the corresponding sentence.

3. The method of claim 1, wherein the case recognition model comprises a word embedding layer;

the word embedding layer is used for embedding the serial number, in a word list, of each word in the western word sequence obtained by converting the input western sentence to obtain a first embedding vector, embedding the letter sequence contained in each word of the western word sequence to obtain a second embedding vector, and splicing the first embedding vector and the second embedding vector to obtain the feature vector of the western sentence input into the case recognition model.

4. The method of claim 3, wherein the case recognition model further comprises an encoding layer and a classification layer;

the encoding layer performs bidirectional recurrent encoding on the feature vectors of the western sentence, and the classification layer classifies the case type of each word in the western sentence.

5. The method according to any one of claims 1 to 4, wherein the correcting the case of the word in the Western sentence to be discriminated according to the case type tag of the word in the Western sentence to be discriminated comprises:

marking the position of a word with the case inconsistent with the corresponding case type label in the sentence to be distinguished, and correcting the word according to the case type label.

6. An apparatus for correcting western text, comprising:

the prediction unit is configured to convert words in the western sentence to be distinguished into lower case and input the lower-case words to the trained case recognition model to obtain case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence;

the correcting unit is configured to correct the case of the word in the western sentence to be distinguished according to the case type label of the word in the western sentence to be distinguished;

wherein the case recognition model is obtained by training based on the western standard corpus of the same language type as the western sentence to be distinguished.

7. The apparatus of claim 6, wherein the apparatus further comprises:

a training unit configured to train the case recognition model based on a western standard corpus of the same language type as the western sentence to be discriminated;

the training unit includes:

a labeling module configured to perform word segmentation on the sentences in the western language standard corpus and label each word according to the case type of each word;

a conversion module configured to convert words contained in the sentences in the western language standard corpus into words with all letters in lower case, so as to obtain normalized western language training data;

a training submodule configured to train the case recognition model based on the normalized Western training data and the labels of each word in the corresponding sentence.

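8. The apparatus of claim 6, wherein the case recognition model comprises a word embedding layer;

the word embedding layer is used for embedding the serial number, in a word list, of each word in the western word sequence obtained by converting the input western sentence to obtain a first embedding vector, embedding the letter sequence contained in each word of the western word sequence to obtain a second embedding vector, and splicing the first embedding vector and the second embedding vector to obtain the feature vector of the western sentence input into the case recognition model.

9. The apparatus of claim 8, wherein the case recognition model further comprises an encoding layer and a classification layer;

the encoding layer performs bidirectional recurrent encoding on the feature vectors of the western sentence, and the classification layer classifies the case type of each word in the western sentence.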

10. The apparatus according to any one of claims 6-9, wherein the correction unit comprises:

a marking module configured to mark the position, in the sentence to be distinguished, of a word whose case is inconsistent with the corresponding case type label, and to correct the word according to the case type label.

11. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.

Technical Field

The present disclosure relates to the field of computer technologies, in particular to natural language processing technology, and more particularly to a method and an apparatus for correcting western text, an electronic device, and a storage medium.

Background

Natural language processing is a branch of artificial intelligence that studies how computers understand and process human language. Its object is human-language text; specific applications include machine translation, text classification, Chinese word segmentation, entity recognition and the like, and it can be applied to scenarios such as public opinion monitoring and intelligent conversation.

Text error correction is important for a machine to understand meaning accurately and to classify text correctly. In texts whose basic character units are letters, such as English and German, case errors in words are a common type of error. Case errors cover a wide range, involving for example the capitalization of personal names, place names and other proper nouns. The current approach identifies case errors with rules and dictionaries: if a sentence contains a word that hits the dictionary, case identification is performed.

Disclosure of Invention

The disclosure provides a western text error correction method and device, an electronic device and a storage medium.

According to a first aspect of the present disclosure, there is provided a method for correcting western text, including: converting the words in the western sentence to be distinguished into lower case and inputting them to the trained case recognition model to obtain the case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence; correcting the case of the words in the western sentence to be distinguished according to the case type labels of the words; wherein the case recognition model is trained on western standard corpus of the same language type as the western sentence to be distinguished.

According to a second aspect of the present disclosure, there is provided a western text error correction apparatus, comprising:

a prediction unit configured to convert the words in the western sentence to be distinguished into lower case and input them to the trained case recognition model to obtain the case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence; and a correcting unit configured to correct the case of the words in the western sentence to be distinguished according to the case type labels of the words; wherein the case recognition model is trained on western standard corpus of the same language type as the western sentence to be distinguished.

According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method of correcting western text provided in the first aspect.

According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method for correcting an error of western text provided in the first aspect.

The technology according to the application realizes the correction of word case in western text.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:

FIG. 1 is a schematic flow chart diagram of one embodiment of a method for correcting western text in accordance with the present application;

FIG. 2 is a schematic diagram of a training process for a case recognition model according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a case recognition model according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an embodiment of a Western text correction device according to the present application;

FIG. 5 is a block diagram of an electronic device for implementing a method for correcting western text according to an embodiment of the present application.

Detailed Description

The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The method for correcting western text can be applied to a system architecture comprising a client, a network and a server. In this architecture, the client transmits the western sentence to be distinguished to the server through the network. The server runs the western case recognition model: after segmenting the western sentence sent by the client into words, it uses the model to recognize the case of each word in the standard expression of the sentence. The server can return the recognition result to the client, or correct the western sentence based on the recognition result and return the sentence with correctly cased words to the client.

The server can also obtain western corpus in standard expression and train the case recognition model with it.

The client may be software or hardware. When the client is implemented as hardware, it may be a terminal device such as a mobile phone and a computer. When the client is implemented as software, it may be a dedicated client installed in an electronic device such as a mobile phone or a computer to implement the western text error correction method, or may be a program embedded in the client, or may be implemented as a plurality of distributed software modules.

The server may also be implemented as hardware or software. When the server is implemented as hardware, it may be a server. When the server is implemented as software, it may be implemented as a plurality of software modules providing distributed services.

Generally, the western text error correction method according to the embodiment of the present application may be applied to the server or the client.

Please refer to fig. 1, which shows a flowchart of an embodiment of the method for correcting the western text of the present application. As shown in fig. 1, a flow 100 of the western text error correction method of this embodiment includes the following steps:

Step 101, converting the words in the western sentence to be distinguished into lower case, and inputting the lower-case words to the trained case recognition model, so as to obtain the case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence.

In this embodiment, an execution subject (e.g., a client or a server) of the western text error correction method may first obtain a western sentence to be distinguished. The western sentences to be distinguished can be text sentences of languages such as English, French, German and the like. The execution main body may receive a western language sentence to be discriminated, which is sent by a terminal, or the execution main body may acquire a western language sentence designated by a user as the western language sentence to be discriminated.

In an actual scenario, for example, in an inspection scenario of a news manuscript, a sentence in an edited western news manuscript may be used as a western sentence to be distinguished. And performing case correction on each western sentence in the news manuscript in turn.

The execution body may first convert all words in the western sentence to be distinguished into lower case, so as to normalize the words into a format that carries no case information. In this way, the case recognition model is not influenced by the original case of the words when it processes the sentence, and incorrect case in the sentence cannot affect the recognition result.

The case recognition model can be trained on western standard corpus of the same language type as the western sentence to be distinguished. Here, the western standard corpus may be obtained by collecting sentence texts of the corresponding language type from the news text corpora of authoritative media, standard dictionaries or reference books. A separate case recognition model can be constructed for each language type and trained on the standard corpus of that language type. Alternatively, a single case recognition model can be constructed to recognize the case of words in sentences of two or more language types, in which case the training data must include standard corpus of each of those language types.

The case recognition model can first split the sentence to be distinguished into a word sequence and, for each word in the sequence, recognize from the context the case type the word has in the standard expression of the sentence, thereby obtaining the case type label of each word. The case type labels may include a first-letter-uppercase label, an all-letters-uppercase label and an all-letters-lowercase label, for example FirstC (first letter uppercase), AllC (all letters uppercase) and O (all letters lowercase). Optionally, an "other" label may also be included for the remaining types, such as words whose middle letters are uppercase while the first and last letters are lowercase.

As an example, after the sentence to be distinguished "lily goes home" is input into the case recognition model, the labels obtained for the three words in the sentence are, in order: FirstC, O, O.

Step 102, correcting the case of the words in the western sentence to be distinguished according to the case type labels of the words in the western sentence to be distinguished.

After the case type of the word in the western sentence to be distinguished is identified by using the case identification model, the case of the word in the western sentence to be distinguished can be corrected according to the case type label of each word.

Specifically, it may be determined for each word in turn whether its case format in the western sentence to be distinguished is consistent with the case type label recognized for it by the case recognition model in step 101; if not, the word is corrected according to the case type label. For example, the first word "lily" in the sentence "lily goes home" to be distinguished is in all-lowercase format, while the case recognition model recognizes the case type label of the word as first-letter capitalized (FirstC). The sentence after correcting this word is "Lily goes home".
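The comparison and correction in step 102 can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function names are hypothetical, the label names FirstC, AllC and O follow the example given for step 101, and "Other" is an assumed name for the optional catch-all label.

```python
def apply_case_label(word, label):
    """Rewrite a word so that its case matches the predicted label."""
    if label == "FirstC":
        return word[:1].upper() + word[1:].lower()
    if label == "AllC":
        return word.upper()
    if label == "O":
        return word.lower()
    # Assumed behaviour for "Other": no single rule applies, keep the word.
    return word


def correct_sentence(words, labels):
    """Return the word sequence with every word brought in line with its label."""
    return [apply_case_label(word, label) for word, label in zip(words, labels)]


words = ["lily", "goes", "home"]
labels = ["FirstC", "O", "O"]  # labels as they would come from the model
print(" ".join(correct_sentence(words, labels)))  # prints: Lily goes home
```

Words whose case already agrees with the label come out unchanged, so the consistency check and the correction collapse into one step here.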

By this method, the trained case recognition model identifies the standard case of the words in the western sentence to be distinguished more comprehensively and accurately, so that the case of the words can be corrected more accurately, non-standard western expressions are converted into standard expressions, and automatic correction of western text is realized. In addition, the case recognition model is trained on western standard corpus and can learn the context information of each word in a western sentence, so the case of each word can be recognized more accurately. For example, the first letter of "Rose" is capitalized when the word is used as a person's name, whereas all letters are lowercase when the word denotes the flower "rose" and is not at the beginning of a sentence. The case recognition model of this embodiment can learn the semantics of each word from its context, so it corrects words that a word-list-based approach is likely to confuse more effectively, and the recall of case correction in western sentences can be improved.

In some embodiments, the method further comprises the step of training the case recognition model based on the western standard corpus of the same language type as the western sentence to be discriminated. In order to improve the training efficiency and the effectiveness of the training data, the training corpus of the case recognition model can be processed into data suitable for the case recognition model.

Specifically, please refer to FIG. 2, which illustrates a schematic diagram of a process of training a case recognition model. As shown in FIG. 2, the process 200 for training the case recognition model includes the following steps:

Step 201, performing word segmentation on sentences in the western language standard corpus, and labeling each word according to the case type of each word.

The execution main body may segment the sentences obtained from the western standard corpus into words based on the separators between words (e.g., spaces, hyphens "-") to obtain the word sequence corresponding to each sentence.

Each word in the sequence of words obtained in step 201 may then be tagged according to its case type. Where labels characterize the case type of the word, alternative labels may include "first capital", "all lowercase", "other". The label of each word is one of the above-mentioned optional labels.

The execution body may identify the case type of each word by checking the letters of the word one by one against the case formats, thereby labeling each word automatically. Alternatively, the words may be labeled by annotators, and the execution body then acquires the labels. An annotator only needs to select one of the optional labels for each word, so labeling is efficient.
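Step 201 could be sketched as below, assuming that spaces and hyphens are the only separators and that the label names are FirstC, AllC, O and Other. The functions `segment` and `case_type` are hypothetical names introduced for illustration; punctuation is left attached to the words and ignored when the case type is determined.

```python
import re


def segment(sentence):
    """Split a sentence into words on whitespace and hyphens."""
    return [token for token in re.split(r"[\s\-]+", sentence) if token]


def case_type(word):
    """Label a word by checking the case of its letters one by one."""
    letters = [ch for ch in word if ch.isalpha()]
    if all(ch.islower() for ch in letters):
        return "O"  # also covers tokens without letters, such as "2020"
    if letters[0].isupper() and all(ch.islower() for ch in letters[1:]):
        return "FirstC"
    if all(ch.isupper() for ch in letters):
        return "AllC"
    return "Other"


words = segment("Lily visited the NASA Goddard center in mid-July")
labels = [case_type(word) for word in words]
for word, label in zip(words, labels):
    print(word, label)
```

A one-letter capital such as "I" satisfies both the FirstC and the AllC pattern; this sketch resolves the tie in favour of FirstC, which is a design choice and not something the text specifies.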

Step 202, converting the words contained in the sentences in the western language standard corpus into words with all letters in lower case, and obtaining normalized western language training data.

In this embodiment, all words included in a sentence in the western language standard corpus may be converted into all lower-case words, that is, all upper-case letters in the sentence are converted into corresponding lower-case letters, thereby implementing normalization of training data to obtain normalized western language training data.

The normalized western training data contains no case type information of the words, so the case recognition model to be trained is not influenced during training by the case of the words in the original sentence and can accurately learn to determine the case type of a word from its context.
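The effect of step 202 can be illustrated with two already labeled sentences. The sentences, the variable names and the label names below are assumptions made for the sketch; the point is that after normalization the name "Rose" and the flower "rose" become the same input token and differ only in their labels, so the model has to rely on context.

```python
def normalize(words):
    """Convert every word to lower case, discarding all case information."""
    return [word.lower() for word in words]


# (words, labels) pairs as they would come out of step 201.
labeled_corpus = [
    (["Rose", "goes", "home"], ["FirstC", "O", "O"]),
    (["The", "rose", "is", "red"], ["FirstC", "O", "O", "O"]),
]

# Normalized training data: lower-case inputs paired with the original labels.
training_data = [(normalize(words), labels) for words, labels in labeled_corpus]

for inputs, labels in training_data:
    print(inputs, labels)
```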

Step 203, training a case recognition model based on the normalized western training data and the labels of each word in the corresponding sentence.

The normalized western training data may be input to the case recognition model to be trained, which performs word segmentation on the data and then predicts the case label of each word. Here, the case recognition model to be trained may be an attention-based neural network model, such as an attention-based recurrent neural network model. The case type labels predicted by the model for the words of each sentence in the western training data may be compared with the labels assigned to the words at the corresponding positions in step 201 to determine the prediction error of the model, and the model is then iteratively updated under supervision based on the prediction error.

Fig. 3 shows a schematic structural diagram of a case recognition model according to an embodiment of the present application. As shown in FIG. 3, the case recognition model may include a word embedding layer. The word embedding layer embeds the serial number, in a word list, of each word in the western word sequence obtained by converting the input western sentence to obtain a first embedding vector, embeds the letter sequence contained in each word of that sequence to obtain a second embedding vector, and splices the first embedding vector and the second embedding vector to obtain the feature vector of the western sentence input to the case recognition model. The second embedding vector can be obtained by processing the corresponding word with a character convolutional neural network (char CNN). Based on the embeddings of the individual letters, the character convolutional neural network may generate a second embedding vector that represents the letters combined in sequence into the corresponding word. The first and second embedding vectors are concatenated to form the embedding vector of the corresponding word.

The word embedding layer embeds both the serial number of a word and the sequence of letters the word contains, so the resulting embedding vector captures the overall features of the word as well as the relations among its letters, which helps the case recognition model learn and recognize the case type of the word more accurately.
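The structure of the word embedding layer can be sketched in plain Python as follows. This is an untrained toy under stated assumptions: the vector sizes, the tiny word list, the random initialisation and the use of max-over-time pooling after the character convolution are all illustrative choices that the text does not specify.

```python
import random

random.seed(0)  # fixed seed so the randomly initialised sketch is repeatable

WORD_DIM, CHAR_DIM, NUM_FILTERS, KERNEL = 4, 3, 5, 2  # hypothetical sizes

vocab = {"<unk>": 0, "lily": 1, "goes": 2, "home": 3}  # word -> serial number
alphabet = "abcdefghijklmnopqrstuvwxyz"


def rand_vec(size):
    return [random.uniform(-0.1, 0.1) for _ in range(size)]


word_table = [rand_vec(WORD_DIM) for _ in vocab]
char_table = {ch: rand_vec(CHAR_DIM) for ch in alphabet}
unknown_char = [0.0] * CHAR_DIM  # used for characters outside `alphabet`
# Each filter spans KERNEL consecutive letter vectors.
filters = [rand_vec(KERNEL * CHAR_DIM) for _ in range(NUM_FILTERS)]


def first_embedding(word):
    """Embed the serial number of the word in the word list."""
    return word_table[vocab.get(word, vocab["<unk>"])]


def second_embedding(word):
    """Character CNN: convolve over letter vectors, then max-pool per filter."""
    chars = [char_table.get(ch, unknown_char) for ch in word]
    while len(chars) < KERNEL:  # pad short words so one window exists
        chars.append([0.0] * CHAR_DIM)
    windows = [sum(chars[i:i + KERNEL], []) for i in range(len(chars) - KERNEL + 1)]
    return [max(sum(w * x for w, x in zip(f, window)) for window in windows)
            for f in filters]


def embed_sentence(words):
    """Splice the first and second embedding vectors of every word."""
    return [first_embedding(word) + second_embedding(word) for word in words]


features = embed_sentence(["lily", "goes", "home"])
print(len(features), len(features[0]))  # 3 words, WORD_DIM + NUM_FILTERS values each
```

In a trained model the tables and filters would be learned parameters; only the data flow (lookup, character convolution, concatenation) is meant to carry over.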

Further, the case recognition model further comprises an encoding layer and a classification layer. The encoding layer performs bidirectional recurrent encoding on the feature vectors of the western sentence, and the classification layer classifies the case type of each word in the western sentence. Fig. 3 takes a bidirectional LSTM (Long Short-Term Memory) encoder as an example of the encoding layer. The encoding layer further encodes the word embedding results and extracts the features of the words that are useful for case recognition. The bidirectional LSTM models the word embedding results in both directions, so the resulting representation covers the language information of the whole sentence. Based on the output of the encoding layer, the classification layer classifies the case type of each word in the input word sequence and outputs the case type label of each word. In fig. 3, a CRF (Conditional Random Field) layer is taken as an example of the classification layer. The CRF layer models the output of the bidirectional LSTM layer with a conditional-random-field-based method to obtain the case type labels of the words.
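How a CRF layer turns per-word scores into a label sequence can be sketched with Viterbi decoding, which is the usual decoding procedure for linear-chain CRFs; the text does not state which procedure is used, so this is an assumption. The emission scores (standing in for the bidirectional LSTM output) and the transition scores below are invented numbers for illustration only.

```python
LABELS = ["FirstC", "AllC", "O"]


def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][j]   score of label j for the word at position t
    transitions[i][j] score of label i being followed by label j
    """
    n = len(LABELS)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n):
            # Best previous label for ending in label j at this position.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [LABELS[j] for j in path]


# Invented scores for the three words of "lily goes home".
emissions = [[2.0, 0.5, 1.0], [0.1, 0.2, 3.0], [0.3, 0.1, 2.5]]
transitions = [[-1.0, -1.0, 0.5], [-1.0, 0.5, 0.0], [0.0, -1.0, 0.5]]
print(viterbi(emissions, transitions))  # ['FirstC', 'O', 'O']
```

The transition scores let the classification layer take neighbouring labels into account instead of choosing each label independently.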

As can be seen from fig. 3, by adopting a case recognition model structure based on sequence processing and modeling the features of the word sequence hierarchically with the word embedding layer and the encoding layer, more accurate word and context features can be extracted, which helps improve the accuracy of the recognition results of the case recognition model.

When the performance (such as precision, latency and the like) of the case recognition model, iteratively adjusted on the normalized western training data, meets a given condition, or the number of iterations reaches a threshold, training can be stopped, and the model obtained is the trained case recognition model.

In the training process of this embodiment, a standard training corpus is constructed and processed so that the training data are suitable for input to the case recognition model.

In some optional implementations of the foregoing embodiment, the step of correcting the case of the word in the western sentence to be distinguished may be implemented as follows: marking the position of a word with the case inconsistent with the corresponding case type label in the sentence to be distinguished, and correcting the word according to the case type label.

The position of a word whose case is inconsistent with the corresponding case type label can be marked by the serial number of the word in the sentence to be distinguished, and the correction result of the word is the standard expression of the word in that sentence.

For example, for the sentence "lily goes home" above, the correction result is: ("lily", 0, "Lily"), where "Lily" is the standard expression of "lily" after correction, and "0" identifies the position of "lily" in the whole sentence.

This implementation marks, simply and unambiguously, the position of each word with incorrect case in the sentence together with its correct expression, so that when the whole sentence is corrected the erroneous word can be located accurately by its position mark and replaced with the corresponding correct expression, and the case correction of the western sentence is completed quickly.
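The marking scheme can be sketched as a list of (word, position, standard expression) records that is produced first and applied afterwards. The function names and the rule table are hypothetical; labels without a single rewriting rule, such as an "other" label, are simply skipped in this sketch.

```python
# Assumed mapping from label to the rule that produces the standard expression.
CASE_RULES = {
    "FirstC": lambda word: word[:1].upper() + word[1:].lower(),
    "AllC": str.upper,
    "O": str.lower,
}


def mark_corrections(words, labels):
    """List (word, position, standard expression) for each inconsistent word."""
    records = []
    for index, (word, label) in enumerate(zip(words, labels)):
        rule = CASE_RULES.get(label)
        if rule is None:
            continue  # no single rule for this label, nothing to mark
        standard = rule(word)
        if standard != word:
            records.append((word, index, standard))
    return records


def apply_corrections(words, records):
    """Replace the marked words by position and return the corrected words."""
    corrected = list(words)
    for _, index, standard in records:
        corrected[index] = standard
    return corrected


words = ["lily", "goes", "to", "nasa"]
labels = ["FirstC", "O", "O", "AllC"]
records = mark_corrections(words, labels)
print(records)  # [('lily', 0, 'Lily'), ('nasa', 3, 'NASA')]
print(" ".join(apply_corrections(words, records)))  # Lily goes to NASA
```

Keeping the records separate from the corrected sentence lets a caller either return only the recognition result or apply the fixes, matching the two options described for the server.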

Referring to fig. 4, as an implementation of the method for correcting the western text, the present disclosure provides an embodiment of a device for correcting the western text, where the device embodiment corresponds to the above method embodiments, and the device may be applied to various electronic devices.

As shown in fig. 4, the western text error correction apparatus 400 of the present embodiment includes a prediction unit 401 and a correction unit 402. The prediction unit 401 converts the words in the western sentence to be distinguished into lower case and inputs them to the trained case recognition model to obtain the case type labels of the words in the western sentence to be distinguished in the standard expression of the western sentence; the correcting unit 402 corrects the case of the words in the western sentence to be distinguished according to the case type labels of the words; the case recognition model is trained on western standard corpus of the same language type as the western sentence to be distinguished.

In some embodiments, the above apparatus further comprises: a training unit configured to train a case recognition model based on a western standard corpus of the same language type as the western sentence to be distinguished; the training unit comprises: a labeling module configured to perform word segmentation on sentences in the western language standard corpus and label each word according to the case type of each word; a conversion module configured to convert words contained in sentences in the western language standard corpus into words with all letters in lower case, so as to obtain normalized western language training data; and a training submodule configured to train a case recognition model based on the normalized western training data and the labels of each word in the corresponding sentence.

In some embodiments, the case recognition model includes a word embedding layer. The input western sentence is converted into a western word sequence; the word embedding layer embeds the serial number of each word of that sequence in a word list to obtain a first embedding vector, embeds the letter sequence contained in each word of that sequence to obtain a second embedding vector, and concatenates the first embedding vector and the second embedding vector to obtain the feature vector of the western sentence input to the case recognition model.
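A minimal sketch of such an embedding layer is given below, using plain Python lists and randomly initialized (untrained) tables. The dimensions, the `<unk>` entries, and in particular the mean pooling of letter vectors are assumptions; the application does not state here how the letter sequence is reduced to a single vector.

```python
import random
from typing import Dict, List, Sequence

WORD_DIM = 4  # size of the first (word-level) embedding vector, chosen arbitrarily
CHAR_DIM = 3  # size of the second (letter-level) embedding vector, chosen arbitrarily


def random_vector(dim: int, rng: random.Random) -> List[float]:
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embed_sentence(
    words: Sequence[str],
    vocab: Dict[str, int],
    word_matrix: List[List[float]],
    alphabet: Dict[str, int],
    char_matrix: List[List[float]],
) -> List[List[float]]:
    """Return one concatenated feature vector per (lowercased) word."""
    features = []
    for word in words:
        # First embedding: look up the word's serial number in the word list.
        first = word_matrix[vocab.get(word, vocab["<unk>"])]
        # Second embedding: embed each letter, then mean-pool (an assumption).
        char_vecs = [char_matrix[alphabet.get(ch, alphabet["<unk>"])] for ch in word]
        if char_vecs:
            second = [sum(col) / len(char_vecs) for col in zip(*char_vecs)]
        else:
            second = [0.0] * len(char_matrix[0])
        # Concatenate the two embeddings into the feature vector.
        features.append(first + second)
    return features


rng = random.Random(0)
vocab = {"<unk>": 0, "the": 1, "nasa": 2, "team": 3}
word_matrix = [random_vector(WORD_DIM, rng) for _ in vocab]
alphabet = {"<unk>": 0}
alphabet.update({ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")})
char_matrix = [random_vector(CHAR_DIM, rng) for _ in alphabet]

features = embed_sentence(["the", "nasa", "launch"], vocab, word_matrix, alphabet, char_matrix)
print(len(features), len(features[0]))  # 3 7
```

The letter-level part gives out-of-vocabulary words such as "launch" in this toy setup a distinctive representation even though their word-level vector falls back to `<unk>`.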

In some embodiments, the case recognition model further comprises an encoding layer and a classification layer. The encoding layer performs bidirectional recurrent encoding on the feature vectors of the western sentence, and the classification layer classifies the case type of each word in the western sentence.
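The data flow through the encoding and classification layers can be sketched with a toy forward pass. This is an assumption-laden illustration: a plain tanh recurrent cell is swapped in for whatever recurrent unit the model actually uses, the weights are random and untrained, and the label names are hypothetical, so the predicted labels are meaningless; only the shapes and the two reading directions are of interest.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]
LABELS = ["LOWER", "CAP", "UPPER"]  # hypothetical case type labels


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_pass(inputs: Sequence[Vector], w_in: Matrix, w_rec: Matrix) -> List[Vector]:
    """Run a simple tanh recurrent cell over the inputs in the given order."""
    state = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, state))]
        state = [math.tanh(p) for p in pre]
        states.append(state)
    return states


def bidirectional_encode(inputs, fwd_in, fwd_rec, bwd_in, bwd_rec) -> List[Vector]:
    """Concatenate left-to-right and right-to-left states for each word."""
    forward = rnn_pass(inputs, fwd_in, fwd_rec)
    backward = rnn_pass(list(inputs)[::-1], bwd_in, bwd_rec)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def classify(states: Sequence[Vector], w_out: Matrix) -> List[str]:
    """Pick the highest-scoring case type label for each word."""
    result = []
    for state in states:
        scores = matvec(w_out, state)
        result.append(LABELS[scores.index(max(scores))])
    return result


rng = random.Random(0)
feature_dim, hidden = 7, 5
sentence = [[rng.uniform(-1.0, 1.0) for _ in range(feature_dim)] for _ in range(4)]
fwd_in, fwd_rec = rand_matrix(hidden, feature_dim, rng), rand_matrix(hidden, hidden, rng)
bwd_in, bwd_rec = rand_matrix(hidden, feature_dim, rng), rand_matrix(hidden, hidden, rng)
w_out = rand_matrix(len(LABELS), 2 * hidden, rng)

encoded = bidirectional_encode(sentence, fwd_in, fwd_rec, bwd_in, bwd_rec)
predicted = classify(encoded, w_out)
print(len(encoded), len(encoded[0]), len(predicted))  # 4 10 4
```

Reading the sentence in both directions lets the label for each word depend on the words after it as well as the words before it, which matters for cases such as a name that is only recognizable from its second word.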

In some embodiments, the correction unit 402 includes a marking module configured to mark, in the sentence to be distinguished, the position of each word whose case is inconsistent with its case type label, together with the correction result of that word according to the case type label.

The above-described apparatus 400 corresponds to the steps in the foregoing method embodiments. Thus, the operations, features and technical effects described above for the error correction method for the western text are also applicable to the apparatus 400 and the units included therein, and are not described herein again.

According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.

Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.

As shown in fig. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories and multiple types of memory, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.

Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for correcting western text provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the method of correcting western text provided herein.

The memory 502, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the prediction unit 401 and the correction unit 402 shown in fig. 4) corresponding to the western text error correction method in the embodiments of the present application. The processor 501 executes various functional applications of the server and data processing, i.e., implements the western text error correction method in the above method embodiment, by running non-transitory software programs, instructions, and modules stored in the memory 502.

The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of an electronic device that performs an error correction method of western text, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, which may be connected over a network to an electronic device that performs the method for correcting western text. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device that performs the western text error correction method may further include an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or in other ways; connection by a bus 505 is taken as an example in fig. 5.

The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that performs the western text error correction method. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, and the like. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

According to the technical solution of the embodiments of the present application, non-standard capitalization in western sentences is accurately recognized and corrected.

It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and no limitation is imposed herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.

The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
