System and method for providing natural language processing service

Document No.: 1544891 Publication date: 2020-01-17

Reading note: This technology, "System and method for providing natural language processing service", was designed by 林廷懋, 钟伊妮, 郭泽颖, 柯颖, 陈铭新, 李晓敦 and 赵世辉 on 2019-09-27. Main content: The invention discloses a system and a method for providing a natural language processing service, relating to the technical field of natural language processing. One embodiment of the system comprises a labeling platform, a training platform, a natural language processing application platform and a data platform. The labeling platform stores labeled data to the data platform. The training platform trains on the labeled data to generate a natural language processing model. The natural language processing application platform uses the natural language processing model to provide a label for a text to be recognized, and stores the generated service log containing the text to be recognized to the data platform, so that the labeling platform acquires new data to be labeled from the service log. This embodiment can continuously expand the data to be labeled and thereby provide a continuous natural language processing service to the user.

1. A system for providing natural language processing services, comprising: a labeling platform, a training platform, a natural language processing application platform and a data platform; wherein,

the labeling platform is used for acquiring data to be labeled from the data platform, labeling the data to be labeled, and storing the labeled data to the data platform;

the training platform is used for acquiring the labeled data from the data platform, training the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform;

the natural language processing application platform is used for acquiring the natural language processing model from the data platform, providing a label for a text to be recognized by using the natural language processing model, and storing the generated service log containing the text to be recognized to the data platform, so that the labeling platform acquires new data to be labeled from the service log;

the data platform is used for storing the data to be labeled, the labeled data, the natural language processing model and the service log.

2. The system for providing natural language processing services of claim 1 wherein the natural language processing application platform is configured to receive a natural language processing task from a model caller, the natural language processing task indicating the text to be recognized.

3. The system for providing natural language processing services of claim 2, wherein

the natural language processing application platform is configured to send the label corresponding to the text to be recognized to the model caller, and to receive the label corresponding to the text to be recognized as calibrated by the model caller.

4. The system for providing natural language processing services of claim 3 wherein said training platform evaluates and optimizes said natural language processing model using calibrated tags.

5. The system for providing natural language processing services of claim 1, wherein

the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; and

the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

6. A method of providing natural language processing services, comprising:

acquiring data to be labeled from a data platform, labeling the data to be labeled, and storing the labeled data to the data platform;

acquiring the labeled data from the data platform, training the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform;

and acquiring a file corresponding to the natural language processing service model from the data platform, providing a label for a text to be recognized by using the natural language processing model according to the natural language processing service model, and storing a generated service log containing the text to be recognized to the data platform so as to acquire new data to be labeled from the service log.

7. The method of providing natural language processing services of claim 6 wherein a natural language processing task of a model caller is received, the natural language processing task indicating the text to be recognized.

8. The method of claim 7, wherein the tag corresponding to the text to be recognized is sent to the model caller, and the tag corresponding to the text to be recognized calibrated by the model caller is received.

9. The method of providing natural language processing services of claim 8 wherein the natural language processing model is evaluated and optimized using calibrated tags.

10. The method of claim 6, wherein the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; and

the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

11. A server for providing natural language processing services, comprising:

one or more processors;

a storage device for storing one or more programs which,

when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 6-10.

12. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 6-10.

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a system and a method for providing a natural language processing service.

Background

In practical deployment, natural language processing (NLP) faces many problems, such as scarce labeled data, numerous business scenarios, and frequent adjustment of those scenarios. Therefore, how to accumulate business knowledge during deployment and application, and how to keep using that accumulated knowledge to improve a natural language processing model built from a small amount of existing prior knowledge, has become a key research question.

Currently, solutions exist that provide a natural language processing service based on SaaS (Software-as-a-Service), that is, as a software service delivered over a network. However, they cannot meet the need to provide users with a continuous, high-quality natural language processing service.

Disclosure of Invention

In view of this, the present invention provides a system and a method for providing a natural language processing service. They can provide a natural language processing service to a user and, based on the service log containing the text to be recognized that is generated while the service is provided, continuously obtain new data to be labeled or new training data. The trained natural language processing model can thus be continuously updated or improved, so that the user is provided with a continuous, high-quality natural language processing service.

To achieve the above object, according to one aspect of the present invention, there is provided a system for providing a natural language processing service, including: a labeling platform, a training platform, a natural language processing application platform and a data platform. The labeling platform is used for acquiring data to be labeled from the data platform, labeling the data to be labeled, and storing the labeled data to the data platform. The training platform is used for acquiring the labeled data from the data platform, training on the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform. The natural language processing application platform is used for acquiring the natural language processing model from the data platform, providing a label for a text to be recognized by using the natural language processing model, and storing the generated service log containing the text to be recognized to the data platform, so that the labeling platform acquires new data to be labeled from the service log. The data platform is used for storing the data to be labeled, the labeled data, the natural language processing model and the service log.

Optionally, the natural language processing application platform is configured to receive a natural language processing task of a model caller, where the natural language processing task indicates the text to be recognized.

Optionally, the natural language processing application platform is configured to send the tag corresponding to the text to be recognized to the model caller, and receive the tag corresponding to the text to be recognized after calibration by the model caller.

Optionally, the training platform evaluates and optimizes the natural language processing model using the calibrated tags.

Optionally, the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

To achieve the above object, according to another aspect of the present invention, there is provided a method of providing a natural language processing service, including: acquiring data to be labeled from a data platform, labeling the data to be labeled, and storing the labeled data to the data platform; acquiring the labeled data from the data platform, training on the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform; and acquiring a file corresponding to the natural language processing service model from the data platform, providing a label for a text to be recognized by using the natural language processing model according to the natural language processing service model, and storing a generated service log containing the text to be recognized to the data platform so as to acquire new data to be labeled from the service log.

Optionally, a natural language processing task of a model caller is received, the natural language processing task indicating the text to be recognized.

Optionally, the label corresponding to the text to be recognized is sent to the model caller, and the label corresponding to the text to be recognized after calibration by the model caller is received.

Optionally, the natural language processing model is evaluated and optimized using the calibrated tags.

Optionally, the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

To achieve the above object, according to still another aspect of the present invention, there is provided a server for providing a natural language processing service, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the methods of providing natural language processing services as described above.

To achieve the above object, according to still another aspect of the present invention, there is provided a computer readable medium having stored thereon a computer program characterized in that the program implements any one of the methods of providing a natural language processing service as described above when executed by a processor.

The technical solution provided by the invention has the following advantages or beneficial effects: because the service log containing the text to be recognized, which is generated when the natural language processing service is provided, is stored in the data platform, the system can continuously acquire new data to be labeled from the service log. Knowledge is thereby accumulated, the natural language processing model can be continuously updated or improved according to the new data to be labeled, and a continuous and efficient natural language processing service can be provided to users. In addition, the natural language processing model can be evaluated or optimized according to the label of the text to be recognized as calibrated by the model caller, so that the updated natural language processing model is of higher quality and the user experience is improved.

Further effects of the optional implementations mentioned above are described below in connection with the specific embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of a main structure of a system for providing a natural language processing service according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a main flow of a method of providing natural language processing services according to an embodiment of the present invention;

FIG. 3 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 4 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 is a schematic diagram of a main structure of a system for providing a natural language processing service according to an embodiment of the present invention. As shown in Fig. 1, the system 100 for providing a natural language processing service includes: a labeling platform 101, a training platform 102, a natural language processing application platform 103 and a data platform 104. The labeling platform 101 is configured to obtain data to be labeled from the data platform, label the data to be labeled, and store the labeled data to the data platform. The training platform 102 is configured to obtain the labeled data from the data platform, train on the labeled data to generate a natural language processing model, and store the natural language processing model to the data platform. The natural language processing application platform 103 is configured to obtain the natural language processing model from the data platform, provide a label for a text to be recognized using the natural language processing model, and store a generated service log containing the text to be recognized to the data platform, so that the labeling platform obtains new data to be labeled from the service log. The data platform 104 is configured to store the data to be labeled, the labeled data, the natural language processing model, and the service log.

It is understood that the data platform 104 may store any data obtained from other business platforms or databases, in addition to the data to be labeled, the labeled data, the natural language processing model, and the service log. In addition, where the data to be labeled is preprocessed or labeled with the help of a knowledge graph, custom rules (such as regular expressions or a domain-specific language, DSL), an industry thesaurus, a dictionary (for example, a library of national, city and county place names), and the like, the data platform is further used for storing these rules, thesauri, dictionaries and knowledge graphs, so that the labeling platform 101 can obtain them from the data platform 104 as needed to improve the efficiency and quality of data labeling. Furthermore, to make indexing easier and to allow data to be retrieved quickly, the data in the data platform 104 is stored in the form of files and directories. For example, the labeling platform 101 stores the labeled data to the data platform 104 as a single file. The data platform 104 can therefore provide the labeling platform 101, the training platform 102, and the natural language processing application platform 103 with a file-and-directory-based API for accessing the data platform 104.
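The file-and-directory access pattern described above can be illustrated with a minimal sketch. The class name `DataPlatform`, its method names, the directory layout and the JSON encoding are all assumptions made for illustration; the source does not specify the API of the data platform 104.

```python
import json
import tempfile
from pathlib import Path


class DataPlatform:
    """Hypothetical file-and-directory store standing in for data platform 104."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, directory, name, payload):
        """Store one JSON-serializable payload as a single file under `directory`."""
        target = self.root / directory
        target.mkdir(parents=True, exist_ok=True)
        path = target / name
        path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
        return path

    def get(self, directory, name):
        """Read back a payload previously stored with `put`."""
        path = self.root / directory / name
        return json.loads(path.read_text(encoding="utf-8"))

    def list_files(self, directory):
        """List file names in a directory (empty list if it does not exist)."""
        target = self.root / directory
        if not target.is_dir():
            return []
        return sorted(p.name for p in target.iterdir() if p.is_file())


def demo():
    # A temporary directory stands in for the platform's real storage root.
    with tempfile.TemporaryDirectory() as tmp:
        platform = DataPlatform(tmp)
        # The labeling platform stores one batch of labeled data as a single file.
        platform.put("labeled/task-001", "batch-0.json",
                     [{"text": "hello world", "label": "greeting"}])
        # A place-name dictionary is stored alongside, for use during labeling.
        platform.put("rules", "place_names.json", ["Beijing", "Shanghai"])
        return (platform.list_files("labeled/task-001"),
                platform.get("rules", "place_names.json"))
```

Keeping each labeled batch in its own file under a per-task directory lets the labeling, training and application platforms locate data by path alone, which is one plausible reading of the indexing benefit mentioned above.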

In an alternative embodiment, the natural language processing application platform 103 is configured to receive a natural language processing task of a model caller, where the natural language processing task indicates the text to be recognized.

It will be appreciated that the system 100 providing natural language processing services can handle a variety of natural language processing tasks, such as machine translation, language generation, language understanding, and the like. Different types of natural language processing service are provided by different trained natural language processing models. Therefore, the model caller can call the natural language processing model that fits the actual requirement and, when calling it, send a natural language processing task containing the text to be recognized to the natural language processing application platform 103, so that the natural language processing model gives a label for that text. For example, to translate the text to be recognized "hello world" into Chinese, the corresponding English-to-Chinese machine translation model is called, and the label given for "hello world" is its Chinese translation, "你好，世界".
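As a rough sketch of how a model caller's task might be routed to the selected model, the example below uses a registry keyed by model identifier. The names `NlpTask`, `ApplicationPlatform`, the model identifiers, and the two toy models are hypothetical stand-ins; real models would be trained and loaded from the data platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class NlpTask:
    model_id: str  # which model the caller wants to invoke (assumed field)
    text: str      # the text to be recognized


class ApplicationPlatform:
    """Hypothetical dispatcher standing in for application platform 103."""

    def __init__(self):
        self._models: Dict[str, Callable[[str], str]] = {}

    def register(self, model_id, model):
        self._models[model_id] = model

    def handle(self, task):
        """Run the model named by the task and return the label it gives."""
        if task.model_id not in self._models:
            raise KeyError(f"unknown model: {task.model_id}")
        return self._models[task.model_id](task.text)


def toy_en_zh(text):
    # Toy stand-in for an English-to-Chinese translation model.
    table = {"hello world": "你好，世界"}
    return table.get(text.lower(), "")


def toy_sentiment(text):
    # Toy stand-in for a language-understanding model.
    return "positive" if "good" in text.lower() else "neutral"


platform = ApplicationPlatform()
platform.register("en-zh", toy_en_zh)
platform.register("sentiment", toy_sentiment)

translation = platform.handle(NlpTask(model_id="en-zh", text="hello world"))
sentiment = platform.handle(NlpTask(model_id="sentiment", text="A good day"))
```

Here the "label" is whatever the chosen model returns: a translation for the translation model, a class name for the sentiment model.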

In an optional implementation manner, the natural language processing application platform 103 is configured to send the tag corresponding to the text to be recognized to the model caller, and receive the tag corresponding to the text to be recognized after calibration by the model caller.

Because the training data available when the natural language processing model is generated is limited, the model cannot be guaranteed to give the correct label for every text to be recognized indicated by a natural language processing task from the model caller. Therefore, when the model caller receives the label returned by the natural language processing model, the caller can manually calibrate the label according to the actual situation.

It can be understood that, although the label given by the natural language processing model for the text to be recognized may contain errors, the text to be recognized and its label can still be used as data to be labeled and, after calibration on the labeling platform, used to extend the training data of the natural language processing model.

In an alternative embodiment, the training platform 102 uses calibrated tags to evaluate and optimize the natural language processing model.

Because a manually calibrated label for the text to be recognized is highly reliable, the training platform 102 of the system 100 can, after receiving the text to be recognized and the calibrated label returned by the model caller (for example, by feeding the service log back to the data platform), use them to optimize the natural language processing model. A high-quality natural language processing model that better meets the needs of the user or model caller can thus be obtained, and a continuous, high-quality natural language processing service can be provided to the model caller.
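One simple way to use calibrated labels for evaluation is to compare each label the model gave with the label the caller calibrated it to. The sketch below assumes accuracy as the metric and assumes the corrected pairs are what gets fed back as extra training data; the source does not say which metric or optimization procedure the training platform uses.

```python
def evaluate_with_calibration(records):
    """Compare model labels with caller-calibrated labels.

    `records` is a list of (text, predicted_label, calibrated_label) tuples.
    Returns (accuracy, corrections), where `corrections` holds the
    (text, calibrated_label) pairs on which the model was wrong.
    """
    if not records:
        return 0.0, []
    corrections = [(text, calibrated)
                   for text, predicted, calibrated in records
                   if predicted != calibrated]
    correct = len(records) - len(corrections)
    return correct / len(records), corrections


# Hypothetical feedback returned by a model caller.
feedback = [
    ("refund my order", "complaint", "complaint"),
    ("great product", "praise", "praise"),
    ("where is my parcel", "praise", "inquiry"),   # caller corrected this one
    ("love it", "praise", "praise"),
]

accuracy, corrections = evaluate_with_calibration(feedback)
```

With this feedback, three of the four labels agree with the calibrated ones, and the single corrected pair is the natural candidate to add to the training data.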

In an optional implementation manner, the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

It can be appreciated that, in order to improve the user experience and make it easier for the labeling platform or the training platform to obtain the data in the service log from the data platform 104, the model identifier, the labeling task identifier, the text to be recognized, and the label are stored in association with one another in the data platform 104. Specifically, as shown in Table 1 below, after receiving a service log containing a text to be recognized and a calibrated label returned by a model caller, or after generating a service log containing a text to be recognized and a label, the natural language processing application platform 103 may store the model identifier, the labeling task identifier, the text to be recognized and the label in the data platform 104. It does so according to the model identifier indicated by the service log and the correspondence between model identifiers and labeling task identifiers stored in the natural language processing application platform 103. The labeling platform 101 and the training platform 102 may then obtain the text to be recognized and/or the label from the data platform 104 according to the stored labeling task identifier.

Table 1 Data storage format of a service log containing the text to be recognized

Model identifier | Labeling task identifier | Text to be recognized | Label | Others

It is to be noted that, in order to save storage space on the data platform 104, the natural language processing application platform 103 may instead store only the labeling task identifier, the text to be recognized, and the label in the data platform 104 (see Table 2 for details), relying on the correspondence between model identifiers and labeling task identifiers that it already holds. The labeling platform 101 and the training platform 102 may then obtain the text to be recognized and/or the label from the data platform 104 according to the stored labeling task identifier.

Table 2 Another data storage format of a service log containing the text to be recognized

Labeling task identifier | Text to be recognized | Label | Others

It can be understood that the service log containing the text to be recognized stored in the data platform 104 may also store other information such as a timestamp according to actual requirements, in addition to the model identifier, the labeling task identifier, the text to be recognized and the tag.
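The two storage formats can be sketched as a record type plus a compact projection. The class and field names below are illustrative assumptions that mirror the columns of Table 1 and Table 2; the timestamp stands in for the "Others" column, and the identifiers are made up.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class ServiceLogRecord:
    """One row in the Table 1 format."""
    model_id: str
    labeling_task_id: str
    text: str
    label: str
    timestamp: Optional[str] = None  # example of the "Others" column


class ServiceLogStore:
    """Hypothetical in-memory stand-in for service logs kept on data platform 104."""

    def __init__(self, model_to_task: Dict[str, str]):
        # Correspondence between model identifiers and labeling task
        # identifiers, held by the application platform.
        self.model_to_task = dict(model_to_task)
        self.records: List[ServiceLogRecord] = []

    def append(self, model_id, text, label, timestamp=None):
        task_id = self.model_to_task[model_id]
        record = ServiceLogRecord(model_id, task_id, text, label, timestamp)
        self.records.append(record)
        return record

    def by_task(self, labeling_task_id):
        """Fetch the rows the labeling or training platform needs for one task."""
        return [r for r in self.records if r.labeling_task_id == labeling_task_id]


def to_compact(record):
    """Project a record to the Table 2 format by dropping the model identifier."""
    row = asdict(record)
    row.pop("model_id")
    return row


store = ServiceLogStore({"en-zh-v1": "task-translate", "sentiment-v2": "task-sentiment"})
store.append("en-zh-v1", "hello world", "你好，世界", "2019-09-27T10:00:00")
store.append("sentiment-v2", "great product", "praise", "2019-09-27T10:01:00")
store.append("en-zh-v1", "good morning", "早上好", "2019-09-27T10:02:00")

translate_rows = store.by_task("task-translate")
compact_row = to_compact(translate_rows[0])
```

Dropping the model identifier loses nothing as long as the application platform keeps the model-to-task correspondence, which is the space saving that the Table 2 format relies on.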

According to the above embodiment, the system for providing the natural language processing service stores in the data platform the service log containing the text to be recognized that is generated when the service is provided, so that the system can continuously acquire new data to be labeled from the service log. Knowledge is thereby accumulated, the natural language processing model can be continuously updated or improved according to the new data to be labeled, and a continuous and efficient natural language processing service can be provided to users. In addition, the natural language processing model can be evaluated or optimized according to the label of the text to be recognized as calibrated by the model caller, so that the updated natural language processing model is of higher quality and the user experience is improved.

On the basis of the foregoing embodiments, an embodiment of the present invention provides a method for providing a natural language processing service, which may specifically include the following steps:

Step S201, obtaining data to be labeled from a data platform, labeling the data to be labeled, and storing the labeled data to the data platform.

Step S202, obtaining the labeled data from the data platform, training the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform.

Step S203, acquiring a file corresponding to the natural language processing service model from the data platform, providing a label for a text to be recognized by using the natural language processing model according to the natural language processing service model, and storing a generated service log containing the text to be recognized to the data platform so as to acquire new data to be labeled from the service log.
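Steps S201 to S203, and the way the service log feeds new data back into labeling, can be sketched as a small loop. Everything here is a toy under stated assumptions: the data platform is a plain dictionary, the "model" is a memorized lookup with a majority-label fallback, and `label_step`, `train_step`, `serve_step`, `harvest_step` and `toy_labeler` are hypothetical names, not the method's actual training procedure.

```python
from collections import Counter


def label_step(platform, labeler):
    """S201: label the pending data and store the labeled data."""
    platform["labeled"].extend((text, labeler(text)) for text in platform["to_label"])
    platform["to_label"] = []


def train_step(platform):
    """S202: 'train' a toy model: memorized lookup plus majority-label fallback."""
    labeled = platform["labeled"]
    fallback = Counter(label for _, label in labeled).most_common(1)[0][0]
    platform["model"] = {"lookup": dict(labeled), "fallback": fallback}


def serve_step(platform, text):
    """S203: give a label for the text and write a service log entry."""
    model = platform["model"]
    label = model["lookup"].get(text, model["fallback"])
    platform["service_log"].append({"text": text, "label": label})
    return label


def harvest_step(platform):
    """Turn unseen texts from the service log into new data to be labeled."""
    known = {text for text, _ in platform["labeled"]}
    platform["to_label"].extend(
        entry["text"] for entry in platform["service_log"] if entry["text"] not in known
    )
    platform["service_log"] = []


def toy_labeler(text):
    # Stand-in for a human annotator working on the labeling platform.
    return "complaint" if "refund" in text else "praise"


platform = {
    "to_label": ["refund my order", "great product", "love it"],
    "labeled": [], "model": None, "service_log": [],
}

# Round 1: the model has never seen this text and falls back to the majority label.
label_step(platform, toy_labeler)
train_step(platform)
first_answer = serve_step(platform, "refund please")

# The service log yields new data to be labeled, and the model is retrained.
harvest_step(platform)
label_step(platform, toy_labeler)
train_step(platform)
second_answer = serve_step(platform, "refund please")
```

In this toy run the first answer is the majority fallback, while the second answer reflects the label added after the logged text went through labeling again, which illustrates the continuous-improvement loop the method describes.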

In an alternative embodiment, a natural language processing task of a model caller is received, the natural language processing task indicating the text to be recognized.

In an optional implementation manner, the label corresponding to the text to be recognized is sent to the model caller, and the label corresponding to the text to be recognized after calibration by the model caller is received.

In an alternative embodiment, the natural language processing model is evaluated and optimized using calibrated tags.

In an optional implementation manner, the natural language processing model has a model identifier, and the data to be labeled that is used to generate the natural language processing model has a labeling task identifier; the model identifier, the labeling task identifier, the text to be recognized and the label are stored in association with one another.

Fig. 3 illustrates an exemplary system architecture 300 to which the method of providing natural language processing services of embodiments of the present invention may be applied.

As shown in fig. 3, the system architecture 300 may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 serves as a medium for providing communication links between the terminal devices 301, 302, 303 and the server 305. Network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user may use the terminal device 301, 302, 303 to interact with the server 305 via the network 304 to receive or send messages or the like. The terminal devices 301, 302, 303 may have various communication client applications installed thereon, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like.

The terminal devices 301, 302, 303 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 305 may be a server providing various services, such as a background management server providing support for shopping websites browsed by the user using the terminal devices 301, 302, 303. The background management server can analyze and process the received data such as the product information query request and feed back the processing result (such as the label of the text to be recognized) to the terminal device.

It should be noted that the method for providing natural language processing service provided by the embodiment of the present invention is generally executed by the server 305.

It should be understood that the number of terminal devices, networks, and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 4, a block diagram of a computer system 400 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 4, the computer system 400 includes a Central Processing Unit (CPU)401 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the system 400 are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.

The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 410 as necessary, so that a computer program read from it can be installed into the storage section 408 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the system of the present invention when executed by a Central Processing Unit (CPU) 401.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: acquiring data to be labeled from a data platform, labeling the data, and storing the labeled data to the data platform; acquiring the labeled data from the data platform, training on the labeled data to generate a natural language processing model, and storing the natural language processing model to the data platform; and acquiring a file corresponding to the natural language processing service model from the data platform, providing a label for a text to be recognized by using the natural language processing model according to the natural language processing service model, and storing a generated service log containing the text to be recognized to the data platform, so that new data to be labeled can be acquired from the service log.
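The three program steps above form a closed loop, which can be sketched as follows. This is a minimal illustration only, not the implementation of the invention: the names `DataPlatform`, `label_step`, `train_step` and `serve_step` are hypothetical, the data platform is replaced by an in-memory object, and the token-vote "model" is a toy stand-in for whatever training method an embodiment actually uses.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DataPlatform:
    """Hypothetical in-memory stand-in for the data platform."""
    to_label: list[str] = field(default_factory=list)
    labeled: list[tuple[str, str]] = field(default_factory=list)
    model: Optional[dict[str, Counter]] = None
    service_log: list[dict] = field(default_factory=list)


def label_step(dp: DataPlatform, annotate: Callable[[str], str]) -> None:
    """Labeling: fetch pending texts, label them, store the results."""
    while dp.to_label:
        text = dp.to_label.pop(0)
        dp.labeled.append((text, annotate(text)))


def train_step(dp: DataPlatform) -> None:
    """Training: build a toy token-vote model from the labeled data."""
    votes: dict[str, Counter] = defaultdict(Counter)
    for text, label in dp.labeled:
        for token in text.lower().split():
            votes[token][label] += 1
    dp.model = dict(votes)


def serve_step(dp: DataPlatform, text: str) -> str:
    """Application: label a text, then log it as future labeling data."""
    if dp.model is None:
        raise RuntimeError("no model has been trained yet")
    tally: Counter = Counter()
    for token in text.lower().split():
        tally.update(dp.model.get(token, Counter()))
    label = tally.most_common(1)[0][0] if tally else "unknown"
    dp.service_log.append({"text": text, "label": label})
    dp.to_label.append(text)  # the service log feeds the next labeling round
    return label
```

In this sketch, every call to `serve_step` both answers the request and appends the served text to the pending-label queue, so a later `label_step` and `train_step` can pick it up. That is the mechanism by which the data to be labeled keeps growing.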

According to the technical solution of the embodiments of the invention, the service log that is generated while the natural language processing service is provided, and that contains the text to be recognized, is stored in the data platform. The system for providing the natural language processing service can therefore continuously acquire new data to be labeled from the service log, which accumulates knowledge and allows the natural language processing model to be continuously updated or improved based on the new data to be labeled, thereby providing continuous and efficient natural language processing service to users. In addition, the natural language processing model can be evaluated or optimized according to the labels of the texts to be recognized after those labels have been calibrated by the model caller, so that the updated model is of higher quality and the user experience can be improved.
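One simple way to evaluate a model against caller-calibrated labels is to compute accuracy over the log entries that have been calibrated. The sketch below assumes a hypothetical log format in which each entry is a dictionary with a predicted `label` and an optional `calibrated_label`; neither field name comes from the specification, and accuracy is only one of many possible evaluation metrics.

```python
from typing import Optional


def accuracy_against_calibration(log_entries: list[dict]) -> Optional[float]:
    """Share of predictions that match the caller-calibrated label.

    Entries without a calibrated label are skipped. Returns None when
    no entry has been calibrated, since accuracy is undefined then.
    """
    checked = [e for e in log_entries if e.get("calibrated_label") is not None]
    if not checked:
        return None
    correct = sum(e["label"] == e["calibrated_label"] for e in checked)
    return correct / len(checked)
```

A falling score on recently calibrated entries could serve as a trigger for retraining on the newly labeled data.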

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
