Entity extraction method, device, equipment and computer readable storage medium

Document No.: 1889898    Publication date: 2021-11-26

Note: this entity extraction method, device, equipment and computer readable storage medium was designed and created by Wang Ming on 2021-03-23. The application provides an entity extraction method, device, equipment, and a computer-readable storage medium. The method includes: acquiring at least one character vector and at least one expansion word vector contained in a text to be extracted, where the at least one expansion word vector comprises at least one preset entity vector, and the at least one preset entity vector is the vector information of an entity in a preset entity dictionary corresponding to the text to be extracted; and performing encoding and decoding transformation based on the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted, where the at least one target entity is used for natural language processing of the text to be extracted. Through the method and the device, the efficiency of entity extraction can be improved while the accuracy of entity extraction is ensured.

1. An entity extraction method, comprising:

acquiring at least one character vector and at least one expansion word vector contained in a text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary;

performing coding and decoding transformation on the basis of the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted; the at least one target entity is used for realizing natural language processing of the text to be extracted.
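As an illustrative sketch only (not part of the claimed subject matter, and with all names invented for illustration), the two steps of claim 1 can be outlined as follows, with raw characters standing in for character vectors, dictionary matches standing in for expansion word vectors, and a trivial "decoder" returning the matched spans:

```python
def extract_entities(text, entity_dictionary):
    """Toy outline of claim 1: build character-level units plus
    dictionary-matched expansion words, then 'decode' target entities."""
    char_units = list(text)  # stand-in for per-character vectors
    # preset entity vectors: dictionary entities that occur in the text
    expansion_units = [e for e in entity_dictionary if e in text]
    # a real system would jointly encode char_units + expansion_units and
    # decode a position-label sequence; here the matches are the entities
    return expansion_units

entities = extract_entities("play song1 by singer1", ["song1", "singer1", "movie3"])
```

A real implementation would replace the substring checks with embeddings and an attention encoder-decoder, as the later claims describe.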

2. The method of claim 1, wherein the at least one expansion word vector further comprises at least one word segmentation vector, and the acquiring of at least one character vector and at least one expansion word vector contained in the text to be extracted comprises:

obtaining the at least one character vector from the vector information of each single character among the at least one character contained in the text to be extracted;

performing word segmentation processing on the text to be extracted to obtain at least one word segmentation vector;

performing relevance matching in the preset entity dictionary by using the text to be extracted to obtain at least one piece of preset entity information matched with the text to be extracted; each preset entity information in the at least one preset entity information comprises at least one of a preset entity and a preset entity alias;

and taking the vector corresponding to the at least one preset entity information as the at least one preset entity vector.
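The three vector sources of claim 2 can be sketched as follows (a toy illustration with assumed names; whitespace splitting stands in for word segmentation, and raw strings stand in for vectors):

```python
def build_inputs(text, entity_dict):
    """Toy sketch of claim 2's three vector sources; a real system would
    embed these units rather than return raw strings."""
    char_units = list(text)        # one unit per single character
    seg_units = text.split()       # whitespace split stands in for word segmentation
    # relevance matching: an entry matches via its canonical name or any alias
    entity_units = [name for name, aliases in entity_dict.items()
                    if name in text or any(a in text for a in aliases)]
    return char_units, seg_units, entity_units

chars, segs, ents = build_inputs("play hello by andy",
                                 {"hello": ["hi"], "andy liu": ["andy"]})
```

Note how "andy liu" is matched through its alias "andy" even though the canonical name does not appear in the text, mirroring the preset-entity-alias matching in the claim.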

3. The method according to claim 1 or 2, wherein the performing coding/decoding transformation based on the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted includes any one of:

performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector to obtain at least one first entity; and treating the at least one first entity as the at least one target entity;

obtaining boundary information corresponding to each character vector and each expansion word vector from the at least one character vector and the at least one expansion word vector; combining the boundary information, and performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector to obtain at least one second entity; and treating the at least one second entity as the at least one target entity.

4. The method of claim 3, wherein the boundary information comprises: first boundary information corresponding to each expansion word vector and second boundary information corresponding to each character vector; the obtaining of the boundary information corresponding to each character vector and each expansion word vector includes:

obtaining the first boundary information according to the head and tail character positions in each expanded word vector;

and taking the position of each character vector as the second boundary information.
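The boundary information of claim 4 can be sketched as follows (illustrative only; positions are character indices, and `str.find` stands in for however a real system records match spans):

```python
def boundary_info(text, expansion_words):
    """First boundary info: (head, tail) character positions per expansion
    word; second boundary info: each character's own position."""
    first = []
    for w in expansion_words:
        i = text.find(w)
        if i >= 0:
            first.append((i, i + len(w) - 1))  # head and tail character positions
    second = list(range(len(text)))            # one position per character vector
    return first, second

first, second = boundary_info("forget love water", ["forget", "water"])
```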

5. The method of claim 3, wherein the performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector to obtain the at least one first entity comprises:

performing attention coding on the at least one character vector and the at least one expansion word vector to obtain a first attention coding vector set;

decoding and predicting the first attention coding vector set to obtain a first position prediction sequence; the first position prediction sequence is used to indicate, in the at least one character vector, the positions of the characters belonging to the at least one first entity;

and obtaining the at least one first entity according to the first position prediction sequence.
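The final step of claim 5 — turning a position prediction sequence into entities — can be sketched as below, assuming (not stated in the claim) that the sequence uses BIO-style labels like those described later in the specification:

```python
def labels_to_entities(text, labels):
    """Join contiguous B/I-labeled characters into entity strings."""
    entities, current = [], ""
    for ch, lab in zip(text, labels):
        if lab == "B":                 # begin: close any open entity, start a new one
            if current:
                entities.append(current)
            current = ch
        elif lab == "I" and current:   # inside: extend the open entity
            current += ch
        else:                          # other: close any open entity
            if current:
                entities.append(current)
            current = ""
    if current:
        entities.append(current)
    return entities

found = labels_to_entities("play hello", ["O"] * 5 + ["B", "I", "I", "I", "I"])
```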

6. The method according to claim 4, wherein said performing a codec transformation on the at least one character vector and the at least one expanded word vector in combination with the boundary information to obtain the at least one second entity comprises:

identifying the at least one character vector and the at least one expansion word vector, respectively, to obtain at least one coding identifier;

obtaining, according to the first boundary information and the second boundary information, the length of the vector to be coded corresponding to each coding identifier in the at least one coding identifier;

performing attention coding on the character vector or the expansion word vector corresponding to each coding identifier by combining the length of the vector to be coded to obtain a second attention coding vector set;

intercepting the second attention coding vector set according to the sentence length of the request sentence in the text to be extracted to obtain an intercepted coding vector;

decoding and predicting the intercepted coding vector to obtain a second position prediction sequence; the second position prediction sequence is a sequence formed by at least one second position prediction tag; the at least one second position prediction tag is used to indicate, in the at least one character vector, the positions of the characters belonging to the at least one second entity;

and connecting the character vectors at the positions indicated by the at least one second position prediction tag to obtain the at least one second entity.
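The interception step of claim 6 can be sketched as follows (a speculative toy illustration assuming the joint encoding lays out one vector per character followed by vectors for expansion words, so decoding only needs the character positions):

```python
def truncate_encodings(encoded, sentence_len):
    """Keep only the encoded positions covering the request sentence itself;
    the trailing expansion-word positions are dropped before decoding."""
    return encoded[:sentence_len]

encoded = [[0.1], [0.2], [0.3], [0.9], [0.8]]  # 3 character positions + 2 expansion-word positions
kept = truncate_encodings(encoded, 3)
```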

7. The method of claim 6, wherein the performing a codec transformation on the at least one character vector and the at least one expanded word vector in combination with the boundary information to obtain the at least one second entity comprises:

performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector by using an attention coding and decoding transformation model and combining the boundary information to obtain at least one second entity; the attention coding and decoding transformation model is obtained by performing data mining processing on a real interaction log to obtain a training sample set and performing model training on an initial attention coding and decoding transformation model by using the training sample set;

wherein the data mining process comprises at least one of log data mining, label replacement process and auxiliary synonym replacement process.

8. An entity extraction apparatus, comprising:

the acquisition module is used for acquiring at least one character vector and at least one expansion word vector contained in the text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary;

the encoding and decoding transformation module is used for performing encoding and decoding transformation on the basis of the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted; the at least one target entity is used for realizing natural language processing of the text to be extracted.

9. An electronic device, comprising:

a memory for storing executable instructions;

a processor for implementing the method of any one of claims 1 to 7 when executing executable instructions stored in the memory.

10. A computer-readable storage medium having stored thereon executable instructions for, when executed by a processor, implementing the method of any one of claims 1 to 7.

Technical Field

The present application relates to artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for entity extraction.

Background

Currently, with the advent of Transformer technology based on the self-attention mechanism, which allows the training process to be parallelized, the Transformer-based Bidirectional Encoder Representations (BERT) model has gradually replaced the earlier Long Short-Term Memory (LSTM) network and is widely used in Natural Language Processing (NLP) tasks. The BERT model is pre-trained on a very large corpus; when extracting entities from a text, it extracts from the text the word vector information related to the semantic context, then applies a 12-layer network to perform Transformer encoding and decoding on that word vector information, and outputs the entities in the text. Because of the complex structure of the BERT model, both the training time in the model training stage and the time consumed by online inference on real user interaction sentences in the model application stage are long, which reduces the efficiency of entity extraction.
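As a toy, pure-Python illustration of the self-attention weighting mentioned above (all numbers are made up; real models use matrix operations over many heads and layers), every token's query is scored against every key in one step, which is what makes training parallelizable:

```python
import math

def attention(query, keys, values):
    """Single-query scaled dot-product attention over toy Python lists."""
    # scaled dot-product scores of the query against every key
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(len(query))
              for key in keys]
    # numerically stable softmax over the scores
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # weighted sum of the value vectors
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Here the query is most similar to the first key, so the output leans toward the first value vector.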

Disclosure of Invention

The embodiments of the application provide an entity extraction method, apparatus, device, and computer-readable storage medium, which can improve the efficiency of entity extraction while ensuring the accuracy of entity extraction.

The technical scheme of the embodiment of the application is realized as follows:

the embodiment of the application provides an entity extraction method, which comprises the following steps:

acquiring at least one character vector and at least one expansion word vector contained in a text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary;

performing coding and decoding transformation on the basis of the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted; the at least one target entity is used for realizing natural language processing of the text to be extracted.

An embodiment of the present application provides an entity extraction apparatus, including:

the acquisition module is used for acquiring at least one character vector and at least one expansion word vector contained in the text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary;

the encoding and decoding transformation module is used for performing encoding and decoding transformation on the basis of the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted; the at least one target entity is used for realizing natural language processing of the text to be extracted.

In the above apparatus, the at least one expansion word vector further includes at least one word segmentation vector; the acquisition module is further configured to obtain the at least one character vector from the vector information of each single character among the at least one character contained in the text to be extracted; perform word segmentation processing on the text to be extracted to obtain at least one word segmentation vector; perform relevance matching in the preset entity dictionary using the text to be extracted, to obtain at least one piece of preset entity information matched with the text to be extracted, where each piece of preset entity information comprises at least one of a preset entity and a preset entity alias; and take the vector corresponding to the at least one piece of preset entity information as the at least one preset entity vector.

In the above apparatus, the encoding/decoding transformation module is further configured to perform encoding/decoding transformation on the at least one character vector and the at least one expanded word vector to obtain the at least one first entity; taking the at least one first entity as the at least one target entity; or acquiring boundary information corresponding to each character vector and each expansion word vector from the at least one character vector and the at least one expansion word vector; combining the boundary information, and performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector to obtain at least one second entity; the at least one second entity is taken as the at least one target entity.

In the above apparatus, the boundary information includes: first boundary information corresponding to each expansion word vector and second boundary information corresponding to each character vector; the obtaining module is further configured to obtain the first boundary information according to the head and tail character positions in each extended word vector; and taking the position of each character vector as the second boundary information.

In the above apparatus, the encoding and decoding module is further configured to perform attention coding on the at least one character vector and the at least one expanded word vector to obtain a first attention coding vector set; decoding and predicting the first attention coding vector set to obtain a first position prediction sequence; the first position prediction sequence is used to indicate in the at least one character vector the position of the character belonging to the at least one first entity; and obtaining the at least one first entity according to the first position prediction sequence.

In the above apparatus, the encoding and decoding transformation module is further configured to identify the at least one character vector and the at least one expansion word vector, respectively, to obtain at least one coding identifier; obtain, according to the first boundary information and the second boundary information, the length of the vector to be coded corresponding to each coding identifier in the at least one coding identifier; perform attention coding on the character vector or expansion word vector corresponding to each coding identifier in combination with the length of the vector to be coded, to obtain a second attention coding vector set; intercept the second attention coding vector set according to the sentence length of the request sentence in the text to be extracted, to obtain an intercepted coding vector; decode and predict the intercepted coding vector to obtain a second position prediction sequence, where the second position prediction sequence is a sequence formed by at least one second position prediction tag, and the at least one second position prediction tag is used to indicate, in the at least one character vector, the positions of the characters belonging to the at least one second entity; and connect the character vectors at the positions indicated by the at least one second position prediction tag to obtain the at least one second entity.

In the above apparatus, the entity extracting apparatus further includes an attention coding/decoding transformation model, where the attention coding/decoding transformation model is configured to perform coding/decoding transformation on the at least one character vector and the at least one expanded word vector in combination with the boundary information to obtain the at least one second entity; the attention coding and decoding transformation model is obtained by performing data mining processing on a real interaction log to obtain a training sample set and performing model training on an initial attention coding and decoding transformation model by using the training sample set; wherein the data mining process comprises at least one of log data mining, label replacement process and auxiliary synonym replacement process.

An embodiment of the present application provides an electronic device, including:

a memory for storing executable instructions;

and the processor is used for realizing the method provided by the embodiment of the application when executing the executable instructions stored in the memory.

Embodiments of the present application provide a computer-readable storage medium, which stores executable instructions for causing a processor to implement the method provided by the embodiments of the present application when the processor executes the executable instructions.

The embodiment of the application has the following beneficial effects:

in the embodiments of the present application, at least one expansion word vector and at least one character vector are fused for encoding and decoding transformation, so that more of the latent word-vector information in the text to be extracted can be introduced during encoding. Using this latent word-vector information further increases the accuracy of entity extraction, so that at least one target entity can be extracted from the text to be extracted with fewer network layers and fewer processing steps, while the accuracy of entity extraction is preserved. The method in the embodiments of the present application can therefore reduce the workload and the time consumption of network training and network computation, achieve a balance between accuracy and efficiency, and improve the efficiency of entity extraction while ensuring the accuracy of entity extraction.

Drawings

FIG. 1 is a schematic diagram of the corpus organization of current CRF++ feature engineering;

FIG. 2 is a schematic diagram of the current process of entity extraction by the BERT model;

FIG. 3 is an alternative architecture diagram of the voice interaction system 100 based on entity extraction provided in the embodiment of the present application;

fig. 4 is an alternative structural diagram of the server 200 provided in the embodiment of the present application;

FIG. 5 is a schematic flow chart of an alternative entity extraction method provided in the embodiments of the present application;

fig. 6 is an alternative flow chart of the entity extraction method provided in the embodiment of the present application;

FIG. 7 is a schematic flow chart diagram illustrating an alternative entity extraction method according to an embodiment of the present disclosure;

FIG. 8 is a schematic flow chart diagram illustrating an alternative entity extraction method according to an embodiment of the present disclosure;

FIG. 9 is a process diagram of an entity extraction method according to an embodiment of the present application;

FIG. 10 is a schematic diagram illustrating an alternative effect of a configuration interface of a preset entity dictionary according to an embodiment of the present application;

fig. 11 is a schematic diagram of a network structure and an application processing flow of a coding/decoding transformation model provided in an embodiment of the present application.

Detailed Description

In order to make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

Where terms such as "first", "second", and "third" appear in the specification, they are used merely to distinguish between similar items and do not indicate a particular ordering of items. It should be understood that "first", "second", and "third" may be interchanged in a particular order or sequence where appropriate, so that the embodiments of the application described herein can be practiced in an order other than that illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.

Before further detailed description of the embodiments of the present application, the terms and expressions used in the embodiments of the present application are explained below.

1) Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision-making.

Artificial intelligence technology is a comprehensive discipline that spans a wide range of fields, involving both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning.

2) Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science, and mathematics; research in this field involves natural language, i.e. the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

3) Request sentence: a short text request (query) entered by a user in an intelligent assistant, which typically contains only one intent expectation of the user. For example: "Play Song 2 by Singer 1"; "I want to watch Movie 3"; and so on.

4) Entity: a basic concept in the field of natural language processing, generally a basic word in a certain field. In a task-based dialog system, entities express the important information in the query input by the user. In a query such as "Play Song 2 by Singer 1", the query itself carries the intent that the user wants to listen to a song (music.play), while entities such as "Singer 1" and "Song 2" indicate the specific important information in the query, so that a downstream service can use the structured information obtained through semantic understanding to respond to the user's query. An entity is allowed to have aliases; for example, "Liu somebody" is an entity in the music field representing a singer's name, and "guo", "Andy Liu", and the pinyin string corresponding to "Liu somebody" can all be its aliases. An entity may belong to multiple domains, so "Liu somebody" may also be an entity in the video domain, representing an actor's name.
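The entity-with-aliases notion above can be captured in a simple data shape (the field names here are illustrative, not from the specification):

```python
# Illustrative record for an entity that carries aliases and belongs
# to multiple domains (singer name in music, actor name in video).
entity = {
    "name": "Andy Liu",
    "aliases": ["Liu somebody", "guo"],
    "domains": ["music", "video"],
}

def matches(entity, mention):
    """A mention matches via the canonical name or any alias."""
    return mention == entity["name"] or mention in entity["aliases"]
```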

5) Skill: similar to an APP, a skill refers to the capability of providing one or more specific functions and services to a user through voice interaction. Different skills provide different services; for example, music, weather, joke, and news skills can provide the user with functions such as listening to songs, checking the weather, and telling jokes. To enable voice interaction and understanding, a skill needs to build the required dialogue model; to provide a particular service, a skill needs to perform the relevant service acquisition and configuration.

6) Media asset entity: defined similarly to the entities above; for example, the sys.music.song entity in the music skill, the sys.video.file, sys.video.tvseries, and sys.video.carton entities in the video skill, and the sys.fm.album entity in the fm skill can all represent media-information entities. Media-information entities have a certain similarity to one another, their contents intersect, and users query them in similar ways, so they can be collectively defined as media asset entities.

7) Entity dictionary: when a domain-design expert in task-based dialog designs a new skill intent, a collection of entity instances is typically provided that defines the boundaries and rules of the set of entities involved in the new skill. This is very important predefined feature information for entity extraction. Entity instances of the same nature may constitute an entity library, such as a singer library or an actor library. An entity dictionary may include at least one entity library.
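A minimal sketch of such a dictionary (library and entity names are invented for the example) as a mapping from entity libraries to entity instances:

```python
# Illustrative entity dictionary: each entity library groups entity
# instances of the same nature.
entity_dictionary = {
    "singer_library": ["singer1", "singer2"],
    "song_library": ["song1", "song2"],
}

def lookup(dictionary, text):
    """Return (library, entity) pairs whose entity appears in the text."""
    return [(lib, e) for lib, entries in dictionary.items()
            for e in entries if e in text]

hits = lookup(entity_dictionary, "play song1 by singer2")
```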

8) The BERT algorithm: a pre-trained language model proposed by Google in 2018, implemented based on Transformer technology and often applied to general NLP tasks because it takes context information into account. In the context of entity extraction, the pre-training results of BERT are typically used as a feature-extraction component to improve the effect of a Named Entity Recognition (NER) model.

9) Conditional Random Field (CRF) algorithm: proposed by John Lafferty in 2001, it is commonly used in NLP scenarios such as word segmentation and entity extraction.

10) Corpus: in order to express a clear intent, a user may speak certain commonly used sentences, for example "How is the weather today", "Tomorrow's temperature in Shenzhen", or "What is today's air quality index"; each of these sentences expresses a certain intent. Such sentences are called corpora.

11) Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.

At present, one entity extraction technique in natural language processing is realized by manually constructing CRF++ feature engineering, with the CRF algorithm implemented in the C++ language. The execution flow of the CRF++ method is as follows: the corpus is organized into the example form shown in fig. 1, where the first column is the word feature, the second column is the bigram feature, the third column is the part-of-speech feature, the fourth column is the entity-information feature, and the fifth column is the prediction label predicted by the CRF++ feature engineering from the first four columns of data. The prediction label uses the BIO labeling scheme, where B represents "begin", I represents "inside", and O represents "other".
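An illustrative reconstruction of such BIO-labeled corpus rows (all column values here are invented for the example; real CRF++ data uses one token per line with tab- or space-separated columns):

```python
# each row: word, bigram, part-of-speech, entity-dictionary feature, BIO label
corpus_rows = [
    ("hel", "hel/lo", "n", "B-dictionary", "B"),  # begin of an entity
    ("lo",  "lo/now", "n", "I-dictionary", "I"),  # inside the same entity
    ("now", "now/-",  "d", "O",            "O"),  # other: not part of any entity
]

# recover the entity by joining the B/I-labeled tokens
extracted = "".join(row[0] for row in corpus_rows if row[4] in ("B", "I"))
```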

Currently, another implementation of entity extraction, based on the BERT model, is shown in fig. 2. In fig. 2, the bottom layer is the BERT feature extraction layer 20; its inputs tok1, tok2, and tokn are the word IDs of the current request sentence (query), and [cls] is a preset special token marking the beginning of the sentence. In the entity-recognition scenario, the BERT feature extraction layer 20 generally comprises 6 encoding layers and 6 decoding layers, and is configured to process the word vector of each token, that is, the vector information of each single character, layer by layer, outputting 768-dimensional information for each token's word vector and for the [cls] portion. Then, in the intermediate layer 21, 40-dimensional custom dictionary information for each token is spliced onto the 768-dimensional output of the feature extraction layer 20 (corresponding to the 4th column of features in fig. 1; for example, if the three characters of "forgetting water" exist in a song-type entity dictionary, then the B-dictionary and I-dictionary features exist for the three characters "forgetting", "feeling", and "water", which can accordingly be converted into three 40-dimensional vectors). This completes the splicing of the BERT output with the custom dictionary features, and each spliced token is input into the CRF decoding layer. The CRF decoding layer performs probability labeling on each token by considering the vector information of each token and the transition-matrix information of each prediction label, predicting the probability that each token is a particular character of the entity to be extracted. Labels are then assigned according to each token's probability to obtain the finally extracted entities.
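The per-token splicing step described above can be sketched as follows (toy dimensions stand in for the 768-dimensional BERT outputs and 40-dimensional dictionary features):

```python
def splice(bert_vecs, dict_vecs):
    """Concatenate each token's BERT vector with its dictionary-feature
    vector, mirroring the intermediate layer 21 in fig. 2."""
    return [b + d for b, d in zip(bert_vecs, dict_vecs)]

bert_vecs = [[0.1, 0.2], [0.3, 0.4]]   # stand-in for 768-d BERT outputs per token
dict_vecs = [[1.0], [0.0]]             # stand-in for 40-d B/I dictionary features
spliced = splice(bert_vecs, dict_vecs)
```

Each spliced vector would then be fed, token by token, into the CRF decoding layer.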

It can be seen that, in the current mainstream technical solutions, the CRF++ feature engineering method cannot construct features automatically, or its capability of automatically capturing features is limited, and a user needs to determine through repeated tests which features to use and construct the feature engineering manually. In the process of model development and tuning, this feature engineering is very time-consuming and places certain threshold requirements on model tuners, so the efficiency of entity extraction through CRF++ feature engineering is low. Although introducing the BERT model in front of the CRF algorithm can reduce the workload of feature engineering, since BERT processes single characters and mainly models the relation between tokens, the relation between word sequences is lost, and accuracy can only be ensured by increasing the number of layers. However, too many layers result in low training and prediction efficiency, so the whole BERT network is time-consuming to compute and the efficiency of entity extraction is low. In summary, the existing entity extraction methods are inefficient and difficult to deploy widely in real high-concurrency, high-traffic scenarios.

Embodiments of the present application provide an entity extraction method, apparatus, device, and computer-readable storage medium, which can improve the efficiency of entity extraction on the basis of ensuring the accuracy of entity extraction. An exemplary application of the electronic device provided in the embodiments of the present application is described below, taking the case where the device is implemented as a server.

Referring to fig. 3, fig. 3 is an optional architecture diagram of the voice interaction system 100 based on entity extraction according to the embodiment of the present application, in order to support a voice interaction application, a terminal, such as a smart speaker 400, is connected to the server 200 through a network 300, where the network 300 may be a wide area network or a local area network, or a combination of the two.

In an application scenario where the smart speaker is instructed by voice to play a corresponding song, a user may initiate a voice command or voice request "play a song of singer 1" through smart speaker 400, and after receiving the request sentence, smart speaker 400 forwards it to server 200. The server 200 performs voice-to-text conversion on the request sentence to obtain a text to be extracted, and further obtains at least one character vector and at least one expansion word vector contained in the text to be extracted; the at least one expansion word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary; and coding and decoding transformation is performed based on the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted. In some embodiments, the at least one target entity may include a skill type entity and an intention type entity, and the server 200 may identify, according to the skill type entity and the intention type entity, a play song (play) intention under the music (music) skill, and extract the singer information "singer 1" from a singer type entity in the at least one target entity, thereby implementing parsing of the request sentence based on the at least one target entity. The server 200 obtains corresponding target song audio and target song information from the music database 500 according to the identified at least one target entity, wherein the target song information may include information such as song pictures, albums, lyrics, and song playing addresses.
Server 200 may reorganize and convert the target song information into a streaming media card that may be played on a smart speaker with a screen, or into a reply language suitable for natural human-computer interaction played by a smart speaker without a screen, and finally feed back the song audio and the song information to smart speaker 400 for the user.

In some embodiments, the server 200 may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a server 200 according to an embodiment of the present application, where the server 200 shown in fig. 4 includes: at least one processor 410, memory 450, at least one network interface 420, and a user interface 430. The various components in server 200 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable communications among the components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 440 in fig. 4.

The Processor 410 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.

The user interface 430 includes one or more output devices 431 that enable the presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, a mouse, a microphone, a touch screen display, a camera, and other input buttons and controls.

The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 410.

The memory 450 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 450 described in embodiments herein is intended to comprise any suitable type of memory.

In some embodiments, memory 450 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.

An operating system 451, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;

a network communication module 452 for communicating with other computing devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: Bluetooth, wireless fidelity (WiFi), Universal Serial Bus (USB), etc.;

a presentation module 453 for enabling presentation of information (e.g., user interfaces for operating peripherals and displaying content and information) via one or more output devices 431 (e.g., display screens, speakers, etc.) associated with user interface 430;

an input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.

In some embodiments, the apparatus provided in the embodiments of the present application may be implemented in software, and fig. 4 illustrates an entity extracting apparatus 455 stored in the memory 450, which may be software in the form of programs and plug-ins, and includes the following software modules: an acquiring module 4551 and an encoding module 4552, which are logical and thus may be arbitrarily combined or further split depending on the functions implemented.

The functions of the respective modules will be explained below.

In other embodiments, the apparatus provided in the embodiments of the present Application may be implemented in hardware, and for example, the apparatus provided in the embodiments of the present Application may be a processor in the form of a hardware decoding processor, which is programmed to execute the entity extraction method provided in the embodiments of the present Application, for example, the processor in the form of the hardware decoding processor may be one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.

The entity extraction method provided by the embodiment of the present application will be described in conjunction with exemplary applications and implementations of the server provided by the embodiment of the present application.

Referring to fig. 5, fig. 5 is an alternative flowchart of the entity extraction method provided in the embodiment of the present application, and will be described with reference to the steps shown in fig. 5.

S101, obtaining at least one character vector and at least one expansion word vector contained in a text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; and at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in the preset entity dictionary.

The entity extraction method provided by the embodiment of the application can be applied to natural language processing scenes based on artificial intelligence, such as voice interaction, voice synthesis, Internet of things interaction, system feedback, voice awakening, far-field voice recognition and the like, and is specifically selected according to actual conditions, and the embodiment of the application is not limited.

In the embodiment of the application, the text to be extracted may be text information input into the electronic device by the user in a text input manner, or may be a text obtained by performing text recognition on the user voice by the voice-to-text conversion system. The text to be extracted contains at least one character.

In this embodiment of the application, the entity extraction device may perform word segmentation on the text to be extracted and perform relevance matching on the text to be extracted in a preset entity dictionary, and then use the vector information (such as embedding information of semantic representation) of the characters and word segments obtained by word segmentation, together with that of the at least one preset entity matched in the preset entity dictionary, as the at least one character vector and the at least one expanded word vector. Wherein the at least one expanded word vector comprises at least one preset entity vector.

In some embodiments, the preset entity dictionary includes at least one preset entity, and the at least one preset entity may be a character string such as a keyword, an indicator word, a direction word, a position word (such as a tail word), a central word, and the like designed by a linguistic expert aiming at various natural language processing scenes according to statistical information under specific scenes. For example, the preset entity dictionary may include general preset entities that can be directly used by various skills such as commonly used quantity words, dates, times, regions, and the like; the system can also include preset entity vectors corresponding to specific skills, such as a "telephone number library", "room names", "buddy lists", and the like. And at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in the preset entity dictionary.

S102, performing coding and decoding transformation based on at least one character vector and at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted; and the at least one target entity is used for realizing natural language processing of the text to be extracted.

In the embodiment of the application, the electronic device may perform encoding and decoding transformation based on the text content of the at least one character vector and the at least one expansion word vector to obtain at least one target entity corresponding to the text to be extracted.

In some embodiments, through the at least one expanded word vector, the potential phrases and potential preset entities contained in the text to be extracted can be introduced into the encoding and decoding process. Therefore, when the entity extraction device performs coding and decoding transformation on the at least one character vector and the at least one expansion word vector, such as transformer coding and decoding based on a multi-head self-attention mechanism, the coded input information is richer and more comprehensive. According to the potential separable phrases in the input text to be extracted and their corresponding preset entity vectors in the preset entity dictionary, the association between characters can be identified more accurately in the encoding and decoding process, so that the at least one target entity extracted after the coding and decoding transformation is more accurate, and the accuracy of entity extraction is ensured without stacking a multilayer network.

In some embodiments, to further improve the accuracy and avoid resource consumption of the multi-layer encoding and decoding process, the entity extracting apparatus may also obtain boundary information of each character vector and each expansion word vector from at least one character vector and at least one expansion word vector. When the encoding and decoding transformation is carried out on at least one character vector and at least one expansion word vector, the lengths of the corresponding character vector and the corresponding expansion word vector can be obtained according to the boundary information of each character vector and each expansion word vector, so that the accuracy of the encoding and decoding transformation can be improved through richer input information, the accurate entity extraction can be realized through fewer processing procedures, and at least one target entity can be obtained.

In some embodiments, the boundary information may be the position where each character vector is located, and the boundary information of the vector indicated by the positions of the first character and the last character in each expansion word vector.

In the embodiment of the application, at least one target entity may include entities with different skill types, and the electronic device or the downstream device for entity extraction may implement a natural language processing process for a text to be extracted through the at least one target entity.

In some embodiments, for task-oriented demand scenarios, i.e. where the dialogue carries an obvious intention and a specific service is expected in return, such as "i want XXX", "give me XXX", etc., the at least one target entity may include entities of task-type skills, so that the specific demand task can be implemented based on these entities, for example, reminders, alarm clocks, taxi hailing, air ticket booking, hotel booking, music playing, media control, etc.

In some embodiments, for question-answer scenarios, i.e. where the dialogue carries no obvious intention tendency and the demander mainly wants to consult some questions and expects an answer in return, such as general knowledge questions: "what is the highest mountain in the world", or personalized question-answers configured by the user: "where is your hometown", "what is your mom's name", the at least one target entity may contain entities of question-answer-type skills. The natural language processing method can utilize the entities of question-answer-type skills to realize voice interaction in a question-answer scene.

In some embodiments, for content-consumption scenarios, such as listening to books, listening to stories, or news reports, the at least one target entity may comprise an entity of content-briefing-type skills specifically tailored to the content.

In some embodiments, for smart home scenarios, home devices are controlled and device modes are adjusted through voice conversations. For example, for a smart light bulb, the user may say "turn on the living room light", "turn the brightness up a bit", or "turn on sleep mode", and the at least one target entity may include an entity of smart-home-type skills.

It can be understood that, in the embodiment of the present application, by fusing the at least one expanded word vector and the at least one character vector together for coding and decoding transformation, more potential word vector information in the text to be extracted can be introduced in the encoding process, and this potential word vector information in turn increases the accuracy of entity extraction, so that the at least one target entity can be extracted from the text to be extracted through fewer network layers and fewer processing steps while the accuracy of entity extraction is ensured. The method in the embodiment of the application can reduce the workload and time consumption of network training and network computing, achieve a balance between accuracy and efficiency, improve the efficiency of entity extraction on the basis of ensuring its accuracy, and has practical value in real high-traffic application scenarios.

In some embodiments, referring to fig. 6, fig. 6 is an optional flowchart of the entity extraction method provided in the embodiment of the present application, where the at least one expanded word vector further includes: at least one word segmentation vector, S101 shown in fig. 5, can be implemented by S1011-S1014, which will be described in conjunction with the steps.

S1011, in at least one character contained in the text to be extracted, at least one character vector is obtained according to the vector information of each single character.

In some embodiments, the entity extraction device may divide the text to be extracted into single characters, and obtain one character vector according to each character, thereby obtaining at least one character vector.

Illustratively, the entity extraction device divides "我爱中国" ("I love China") into four single-character vectors (unigrams), "我", "爱", "中" and "国", as the at least one character vector.

And S1012, performing word segmentation on the text to be extracted to obtain at least one word segmentation vector.

In the embodiment of the application, the entity extraction device may also perform word segmentation on the text to be extracted, such as binary word segmentation, forming every two adjacent characters in the text to be extracted, from beginning to end, into a two-character word (bigram) to obtain at least one word segmentation vector; or forming every three adjacent characters in a sentence of the text to be extracted, from beginning to end, into a three-character word (trigram) to obtain at least one word segmentation vector. The embodiment of the application does not limit the specific division mode and division granularity of the word segmentation processing.
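The unigram/bigram/trigram segmentation described in S1011 and S1012 can be sketched with a simple head-to-tail n-gram split (a minimal illustration; a real system would further map each piece to its embedding vector):

```python
def ngrams(text, n):
    """Split a sentence into n-grams head-to-tail, as described above:
    n=1 gives single-character vectors (unigram), n=2 pairs every two
    adjacent characters (bigram), n=3 every three (trigram)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

sentence = "我爱中国"  # "I love China"
print(ngrams(sentence, 1))  # ['我', '爱', '中', '国']
print(ngrams(sentence, 2))  # ['我爱', '爱中', '中国']
print(ngrams(sentence, 3))  # ['我爱中', '爱中国']
```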

S1013, performing relevance matching in a preset entity dictionary by using the text to be extracted to obtain at least one piece of preset entity information matched with the text to be extracted; each preset entity information in the at least one preset entity information comprises at least one of a preset entity and a preset entity alias.

In this embodiment of the application, the entity extraction device may perform association degree matching on the text to be processed in at least one preset entity included in the preset entity dictionary to determine whether the text to be processed includes a potential preset entity, so as to obtain at least one piece of preset entity information matched with the text to be extracted. Each preset entity information in the at least one preset entity information comprises at least one of a preset entity and a preset entity alias.

In some embodiments, the method for association matching may be a suffix dictionary matching method, or may also be a forward maximum matching, which is specifically selected according to the actual situation, and the embodiment of the present application is not limited.
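As a sketch of the forward maximum matching variant mentioned above (the dictionary contents and window size are hypothetical; the suffix dictionary method would differ in detail):

```python
def forward_max_match(text, dictionary, max_len=4):
    """Forward maximum matching against a preset entity dictionary:
    at each position, try the longest window first and record any
    dictionary hit together with its head/tail character positions."""
    matches = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if piece in dictionary:
                matches.append((piece, i, i + n - 1))  # entity, head, tail
                i += n
                break
        else:
            i += 1  # no dictionary entry starts here; advance one character
    return matches

song_dict = {"忘情水", "中国"}  # hypothetical preset entity dictionary
print(forward_max_match("播放忘情水", song_dict))  # [('忘情水', 2, 4)]
```

The recorded head/tail positions are exactly the kind of boundary information the later encoding steps rely on.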

And S1014, taking the vector corresponding to the at least one piece of preset entity information as the at least one piece of preset entity vector.

In the embodiment of the application, the entity extraction device takes vector information corresponding to at least one piece of preset entity information matched with the text to be extracted in the preset entity dictionary as at least one preset entity vector.

It can be understood that, in the embodiment of the present application, through matching the word segmentation processing with the association degree of the preset entity dictionary, the entity extraction device can obtain and fully utilize the potential word segmentation phrases and the potential preset entities in the text to be extracted, so that the input information in the encoding and decoding process is richer and more comprehensive, the current multilayer processing flow is simplified on the basis of ensuring the accuracy of entity extraction, the time consumed for calculation is reduced, and the efficiency of entity extraction is improved.

In some embodiments, referring to fig. 7, fig. 7 is an optional flowchart of the entity extraction method provided in the embodiment of the present application, and S102 shown in fig. 5 or fig. 6 may be implemented by executing any one of method flows in S201-S202 or S301-S303, which will be described with reference to the steps.

S201, performing coding and decoding transformation on at least one character vector and at least one expansion word vector to obtain at least one first entity.

In this embodiment, the entity extraction device may perform encoding and decoding transformation on at least one character vector and at least one expansion word vector to obtain at least one first entity.

In some embodiments, S201 may be implemented by performing the processes of S2011-S2013, which will be described in conjunction with the steps.

And S2011, performing attention coding on at least one character vector and at least one expansion word vector to obtain a first attention coding vector set.

In this embodiment, the entity extraction device may organize the at least one character vector and the at least one expansion word vector into a feature vector matrix, perform linear transformation on the feature vector matrix to generate three matrices, i.e., a request matrix Q (query), a key matrix K (key), and a value matrix V (value), and then obtain an attention matrix from the request matrix Q and the key matrix K. Here, the attention matrix represents a probability distribution of attention weights, and each row of the attention matrix refers to the correlation probability between the character vector or expanded word vector corresponding to that row number and each of the other character vectors or expanded word vectors. The entity extraction device may further weight the value matrix V using each attention weight included in the attention matrix, and then perform softmax normalization on the weighting result, so that for each character vector or expanded word vector the attention weights over all the character vectors and expanded word vectors sum to 1, thereby obtaining a first attention coding vector set.

Here, each row of the value matrix V represents the mathematical expression of one character vector or expanded word vector, and a weighted linear combination of these expressions is performed using the attention weights, so that each character vector or expanded word vector can contain information of all the character vectors and expansion word vectors in the current sentence of the text to be extracted.

In some embodiments, before weighting the value matrix V with the attention matrix, the entity extraction device may further standardize the attention matrix (e.g., towards a standard normal distribution) to make the result after the softmax normalization more stable.
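The attention computation in S2011 can be sketched as follows. This minimal example skips the learned linear transformations that produce Q, K and V from the feature vector matrix and works directly on tiny hypothetical vectors; it only illustrates that each output row is a softmax-weighted combination of the rows of V:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Each query row is scored against every key row, the scores are
    softmax-normalized so each row of attention weights sums to 1, and
    the weights are used to combine the rows of V."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)  # one row of the attention matrix
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Two hypothetical token vectors of dimension 2
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
coded = attention(Q, K, V)
print(len(coded), len(coded[0]))  # 2 2
```

Because the softmax weights in each row sum to 1, every encoded vector mixes information from all the character and expanded word vectors of the sentence, as described above.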

S2012, decoding and predicting the first attention coding vector set to obtain a first position prediction sequence; the first position prediction sequence is used to indicate the position of the character belonging to the at least one first entity in the at least one character vector.

In this embodiment of the application, the entity extraction device may perform decoding prediction on the attention coding vector set, predict, in at least one character vector included in the text to be extracted, a probability that each character vector belongs to a character at a certain position in at least one first entity, and obtain a first position prediction sequence according to a probability corresponding to each character vector.

In an embodiment of the application, the first position prediction sequence is a sequence of first position prediction tags, and is used to indicate, in the at least one character vector, the positions of the characters belonging to the at least one first entity. In some embodiments, when labeling using the BIO tag system, the first position prediction sequence may include at least one start tag (B) for marking the start character of each of the at least one first entity; the first position prediction sequence may further comprise at least one end tag (E) for marking the end character of each first entity, or an intermediate tag (I) for marking the intermediate characters of each first entity.

In some embodiments, the entity extraction device may perform decoding prediction on the first attention coding vector set by using a CRF decoding method to obtain the first position prediction sequence, or may use other decoding methods, which are specifically selected according to actual situations, and the embodiments of the present application are not limited thereto.
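The CRF-style decoding prediction mentioned above, which considers both per-token scores and the transition matrix between prediction labels, can be sketched with a minimal Viterbi search (all scores below are hypothetical, for illustration only):

```python
def viterbi_decode(emissions, transitions, tags):
    """Pick the tag sequence maximizing emission scores plus transition
    scores between adjacent prediction labels. emissions[t][i] scores
    tag i at step t; transitions[i][j] scores moving from tag i to j."""
    n_tags = len(tags)
    score = list(emissions[0])
    back = []
    for em in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + em[j])
        score, back = new_score, back + [pointers]
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(back):  # trace back the best path
        best = pointers[best]
        path.append(best)
    return [tags[i] for i in reversed(path)]

tags = ["B", "I", "O"]
# Hypothetical scores for a 3-character span
emissions = [[2.0, 0.0, 0.1], [0.1, 2.0, 0.2], [0.2, 1.5, 0.3]]
transitions = [[0.0, 1.0, -1.0],   # B -> I favored
               [-1.0, 1.0, 0.0],   # I -> I allowed
               [1.0, -2.0, 0.5]]   # O -> I penalized
print(viterbi_decode(emissions, transitions, tags))  # ['B', 'I', 'I']
```

The transition scores are what let the decoder rule out illegal label sequences (such as an I tag with no preceding B).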

S2013, obtaining at least one first entity according to the first position prediction sequence.

S202, taking at least one first entity as at least one target entity.

In this embodiment of the application, the entity extraction device may perform character vector combination according to the character vector and the position order marked by each first position prediction tag in the first position prediction sequence to obtain at least one first entity, and use the at least one first entity as at least one target entity.
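Combining characters according to the position order marked by the prediction tags, as in S2013, can be sketched as follows (the tag names and example sentence are hypothetical, following the BIO system described earlier):

```python
def tags_to_entities(chars, tags):
    """Assemble entities from a position prediction sequence:
    B marks an entity start, I continues it, O is outside."""
    entities, current = [], []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B"):
            if current:
                entities.append("".join(current))
            current = [ch]
        elif tag.startswith("I") and current:
            current.append(ch)
        else:
            if current:
                entities.append("".join(current))
            current = []
    if current:
        entities.append("".join(current))
    return entities

chars = list("播放忘情水")  # hypothetical request "play 忘情水"
tags = ["O", "O", "B-song", "I-song", "I-song"]
print(tags_to_entities(chars, tags))  # ['忘情水']
```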

It can be understood that, in the embodiment of the present application, after the at least one expanded word vector is introduced, the entity extraction device may extract the at least one target entity from the at least one character vector through fewer processing flows, without performing multi-layer encoding and decoding conversion, thereby improving the efficiency of entity extraction while ensuring its accuracy.

S301, obtaining boundary information corresponding to each character vector and each expansion word vector from at least one character vector and at least one expansion word vector.

In this embodiment, the entity extraction device may obtain boundary information corresponding to each character vector and each expansion word vector, where for a single character vector, the boundary information may be the position where the character appears, and for an expansion word vector containing a plurality of characters, the boundary information may be the boundary of the expansion word vector defined by the positions where its first (head) character and last (tail) character respectively appear.

In some embodiments, the entity extraction device may obtain the first boundary information according to the head and tail character positions in each expanded word vector; and the position of each character vector is taken as second boundary information.
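The first and second boundary information described above can be sketched as simple (head, tail) position pairs; the span representation here is a hypothetical one for illustration:

```python
def boundary_info(text, word_spans):
    """Second boundary information: each single character vector gets
    its own position (head == tail). First boundary information: each
    expanded word vector is bounded by the positions of its head and
    tail characters. word_spans is a list of (start, end) index pairs
    for the expanded words, end inclusive."""
    char_bounds = [(i, i) for i in range(len(text))]   # second boundary info
    word_bounds = [(s, e) for (s, e) in word_spans]    # first boundary info
    return char_bounds, word_bounds

chars, words = boundary_info("我爱中国", [(0, 1), (2, 3)])
print(chars)  # [(0, 0), (1, 1), (2, 2), (3, 3)]
print(words)  # [(0, 1), (2, 3)]
```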

S302, combining the boundary information, performing coding and decoding transformation on at least one character vector and at least one expansion word vector to obtain at least one second entity.

In this embodiment of the application, the entity extraction device may further enrich input information of encoding and decoding transformation in combination with the boundary information, and perform encoding and decoding transformation on at least one character vector and at least one extension word vector to obtain at least one second entity.

In some embodiments, referring to fig. 8, fig. 8 is an optional flowchart of the entity extraction method provided in the embodiment of the present application, and S302 shown in fig. 7 may be implemented by executing the processes of S3021 to S3026, which will be described with reference to the steps.

S3021, respectively identifying at least one character vector and at least one expansion word vector to obtain at least one coding identifier.

In the embodiment of the application, the entity extraction device may identify each character vector and each expansion word vector in an ID or token manner, so as to obtain at least one coding identifier corresponding to the at least one character vector and the at least one expansion word vector, and perform the subsequent coding and decoding transformation processing according to the at least one coding identifier.

And S3022, obtaining the length of the vector to be coded corresponding to each code identifier in the at least one code identifier according to the first boundary information and the second boundary information.

In this embodiment of the application, in order to utilize the vector length corresponding to each ID or token in the encoding/decoding transformation process, the entity extraction device may obtain the vector length to be encoded corresponding to each encoding identifier according to the first boundary information and the second boundary information.

Illustratively, when the encoding identifier corresponds to the second boundary information, that is, the head and tail positions included in the boundary information are the same, the length of the vector to be encoded corresponding to the encoding identifier is 1. When the encoding identifier corresponds to the first boundary information, the entity extraction device may obtain the length of the vector to be encoded corresponding to the encoding identifier according to the total number of characters delimited by the head and tail characters. Illustratively, when the first boundary information is [3,5], the length of the vector to be encoded is 3.
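The length computation in the example above reduces to a difference of the head and tail positions, tail inclusive:

```python
def vector_length(boundary):
    """Length of the vector to be encoded from its boundary
    information (head, tail), tail inclusive."""
    head, tail = boundary
    return tail - head + 1

print(vector_length((3, 3)))  # 1  (single character: head == tail)
print(vector_length((3, 5)))  # 3  (expanded word covering three characters)
```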

S3023, performing attention encoding on the character vector or expanded word vector corresponding to each encoding identifier, in combination with the length of the vector to be encoded, to obtain a second attention encoding vector set.

In the embodiment of the application, by combining the length of the vector to be encoded during encoding, the entity extraction device can better capture the relations within the word sequence, and then perform attention encoding on the character vector or expanded word vector corresponding to each encoding identifier to obtain a second attention encoding vector set.

In some embodiments, the entity extraction device may treat the character vector or expanded word vector corresponding to each encoding identifier, together with its first boundary information or second boundary information, as one column of data, thereby forming an input vector matrix, and perform attention encoding on the input vector matrix to obtain the second attention encoding vector set.
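The column construction above can be sketched as follows. The exact layout of the input vector matrix is not specified in the text; plain concatenation of each vector with its boundary pair is assumed here for illustration.

```python
def build_input_columns(vectors, boundaries):
    """Form the input matrix fed to attention encoding by pairing each
    character/expanded word vector with its [head, tail] boundary
    information. Concatenation into one column per identifier is an
    illustrative assumption."""
    return [list(vec) + list(bound) for vec, bound in zip(vectors, boundaries)]
```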

S3024, intercepting the second attention encoding vector set according to the sentence length of the request sentence in the text to be extracted, to obtain an intercepted encoding vector.

In the embodiment of the application, in a voice instruction interaction scenario, the text to be extracted usually includes a request sentence, and the entity extraction device may intercept the second attention encoding vector set according to the sentence length of the request sentence to obtain an intercepted encoding vector.

In the embodiment of the application, when the text to be extracted includes a plurality of request sentences, the entity extraction device may process the request sentences in multiple passes, and in each pass intercept the second attention encoding vector set according to the sentence length of the current request sentence to obtain an intercepted encoding vector.
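The multi-sentence interception above can be sketched as follows. The assumption that encoded positions for consecutive request sentences are laid out back to back is illustrative, not stated in the text.

```python
def intercept_per_sentence(encoded_vectors, sentence_lengths):
    """Intercept the encoded vector set once per request sentence,
    consuming `sentence_lengths` positions in order (back-to-back
    layout assumed)."""
    segments, start = [], 0
    for length in sentence_lengths:
        segments.append(encoded_vectors[start:start + length])
        start += length
    return segments
```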

S3025, performing decoding prediction on the intercepted encoding vector to obtain a second position prediction sequence; the second position prediction sequence is a sequence formed by at least one second position prediction tag; the at least one second position prediction tag is used to indicate, in the at least one character vector, the position of the characters belonging to the at least one second entity.

In the embodiment of the application, the entity extraction device performs decoding prediction on the intercepted encoding vector to obtain, for each character, the probability of belonging to each position of a second entity, and labels each character with a prediction tag according to these probabilities to obtain the second position prediction sequence.

Here, the second position prediction sequence is a sequence of at least one second position prediction tag, each tag indicating the position, in the at least one character vector, of a character belonging to the at least one second entity.
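The per-character labelling step can be sketched as a greedy selection over the predicted probabilities. Greedy argmax is only one possible strategy; the text does not fix how the prediction tags are chosen from the probabilities.

```python
def predict_position_tags(char_probabilities, tag_set):
    """Label each character with its highest-probability position tag.

    `char_probabilities[i]` maps tag -> probability for character i.
    Greedy per-character selection is an illustrative assumption; a CRF
    would instead pick the best whole sequence."""
    return [max(tag_set, key=lambda tag: probs.get(tag, 0.0))
            for probs in char_probabilities]
```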

S3026, connecting the character vectors at the positions indicated by the at least one second position prediction tag, to obtain the at least one second entity.

In this embodiment, since the at least one second position prediction tag already indicates the position of each character belonging to the at least one second entity in the at least one character vector, the entity extraction device may connect the character vectors at the positions indicated by the tags to obtain the at least one second entity.

In some embodiments, referring to fig. 9, fig. 9 is a schematic processing procedure diagram of the entity extraction method provided in the embodiment of the present application. For "Chongqing people and pharmacy", the current BERT model only uses the information of the six characters "Chong", "Qing", "people", "and", "medicine" and "shop" when performing encoding and decoding transformation, whereas the entity extraction apparatus in the embodiment of the present application may introduce, through at least one expanded word vector, more phrases that may appear in "Chongqing people and pharmacy", such as "Chongqing", "people and pharmacy" and "pharmacy", as well as an English alias of each character vector and expanded word vector. Further, the embodiment of the application also introduces position information corresponding to each character vector and expanded word vector. For example, for the request sentence "Chongqing people and pharmacy", the position of the character vector "people" is 3, so the first boundary information corresponding to the character vector "people" is [3,3]; in the expanded word vector "Chongqing", the position of the head character "Chong" is 1 and the position of the tail character "Qing" is 2, so the second boundary information corresponding to the expanded word vector "Chongqing" is [1,2]. The entity extraction device performs encoding and decoding transformation on each character vector with its first boundary information and on each expanded word vector with its second boundary information, to obtain the second position prediction sequence shown in fig. 9. B-LOC, I-LOC and E-LOC are second position prediction tags: B-LOC indicates the starting character of a second entity, the E-LOC after a B-LOC indicates the ending character of that second entity, and I-LOC indicates an intermediate character of the second entity.
According to the second position prediction sequence shown in fig. 9, the entity extraction device may extract two second entities, "Chongqing" and "people and pharmacy".
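The connection step of S3026 on the fig. 9 example can be sketched as follows, following the B-/I-/E- labelling (B-LOC starts an entity, I-LOC continues it, E-LOC ends it):

```python
def connect_entities(chars, tags):
    """Connect the characters at the positions indicated by B-/I-/E-
    second position prediction tags into entities."""
    entities, current = [], []
    for char, tag in zip(chars, tags):
        if tag.startswith("B-"):
            current = [char]          # starting character of a second entity
        elif tag.startswith("I-") and current:
            current.append(char)      # intermediate character
        elif tag.startswith("E-") and current:
            current.append(char)      # ending character: emit the entity
            entities.append("".join(current))
            current = []
        else:
            current = []              # any other tag breaks the open span
    return entities
```

For the fig. 9 example, `connect_entities(list("重庆人和药店"), ["B-LOC", "E-LOC", "B-LOC", "I-LOC", "I-LOC", "E-LOC"])` yields the two second entities "重庆" ("Chongqing") and "人和药店" ("people and pharmacy").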

S303, taking at least one second entity as at least one target entity.

In this embodiment, the entity extracting apparatus may use at least one second entity obtained by introducing the boundary information as at least one target entity.

It can be understood that, in the embodiment of the present application, introducing the boundary information enables the self-attention mechanism in the encoding and decoding process to better capture the relations within the word sequence, so that, while the accuracy of the encoding and decoding processing is ensured, multi-layer processing can be reduced and the efficiency of entity extraction improved.

In some embodiments, the performing of S302 may be implemented by using an attention codec transform model, wherein the attention codec transform model may be obtained by model training an initial attention codec transform model using a training sample set mined from a real interaction log.

In some embodiments, for scenarios where deep learning requires a large number of training samples, the data mining processing may include at least one of log data mining, label replacement processing, and auxiliary synonym replacement processing. Log data mining extracts representative interaction data as training samples based on analysis and statistics over the real interaction log; label replacement processing swaps the entity-labeled parts of training samples with one another to obtain more training samples; auxiliary synonym replacement processing replaces the corpus parts other than the entity labels with synonyms, such as replacing "television play" with the synonyms "episode", "series play" and the like, to obtain more training samples. Through at least one of these data mining processing methods, the entity extraction device can obtain a training sample set containing a large amount of real data, so as to further improve the effect of the encoding and decoding transformation model trained on it. Moreover, obtaining the training sample set through automatic collection and data mining further reduces the workload of sample collection and improves the efficiency of training the encoding and decoding transformation model.
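The auxiliary synonym replacement processing can be sketched as follows. The synonym table is hypothetical, built from the text's own example of replacing "television play" with "episode" and "series play"; the real processing would additionally keep entity-labeled spans untouched.

```python
# Hypothetical synonym table based on the example in the text.
SYNONYMS = {"television play": ["episode", "series play"]}

def synonym_augment(sample, synonyms=SYNONYMS):
    """Auxiliary synonym replacement: expand one training sample into
    several by swapping non-entity corpus words for synonyms (a sketch)."""
    augmented = [sample]
    for word, alternatives in synonyms.items():
        if word in sample:
            augmented.extend(sample.replace(word, alt) for alt in alternatives)
    return augmented
```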

Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.

In some embodiments, when a designer of the preset entity dictionary, such as a skill expert, designs a skill intention in an actual application scenario, the preset entity dictionary may be created and updated through the configuration interface shown in fig. 10. Fig. 10 is a schematic interface diagram of adding an entity sample to the animated cartoon entity type sys. On the configuration interface 10, the skill expert may import a preset entity set related to the relevant skill intention through the control 12, or manually add one through the control 11, according to actual requirements; the skill expert may also set the characters contained in a preset entity through the control 13 and set a corresponding alias for the preset entity through the control 14, so as to cover the diversity of entity expressions.

In some embodiments, the network structure and the application processing flow of the encoding and decoding transformation model provided in the embodiments of the present application may be as shown in fig. 11. In some embodiments, a network model implemented with the PyTorch framework, comprising a Transformer encoding layer 21 and a CRF decoding layer 22, may be used as the initial encoding and decoding transformation model. The entity extraction device may use the collected entity corpus as the training corpus and the log data of real users as the test corpus, and then label both to obtain, respectively, a training sample set and a test sample set for training the initial encoding and decoding transformation model. For example, the quantity distribution of the training corpora and the test corpora may be as shown in table 1:

TABLE 1

Entity name          Training corpus    Test corpus
fm.album.txt         36692              528
music.song.txt       207001             4623
video.cartoon.txt    13239              2349
video.film.txt       26186              874
video.tvseries.txt   21733              1619

The entity extraction device performs offline training of the initial encoding and decoding transformation model with the training sample set and the test sample set. In each round of training, each training sample in the training sample set comprises at least one training character vector and at least one training expanded word vector, where the at least one training expanded word vector comprises at least one training word segmentation vector obtained by segmenting the training sample and at least one preset entity vector obtained by relevance matching of the training text in the preset entity dictionary. The entity extraction device further obtains the boundary information corresponding to each training character vector and each training expanded word vector, inputs the boundary information together with the corresponding training character vector or training expanded word vector into the encoding layer 21 for attention encoding, and performs decoding prediction on the attention encoding result through the decoding layer 22 to obtain a training prediction position sequence.
The entity extraction device may obtain at least one predicted entity from the training prediction position sequence, compare the at least one predicted entity with the entity label of each training sample, obtain a training loss from the comparison result, and adjust the network parameters of the initial encoding and decoding transformation model according to the training loss. Iterative training with the training sample set proceeds in this way until a preset training cut-off condition is reached; the entity extraction result of the candidate encoding and decoding transformation model obtained from the last round of training is then verified with the test sample set, and when every preset verification index is met, training ends and the encoding and decoding transformation model 20 is obtained.
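The decoding prediction performed by the CRF decoding layer 22 scores whole tag sequences rather than individual characters. A minimal pure-Python Viterbi decode sketches the idea (illustrative scores only; the real layer is a trained PyTorch module with learned emission and transition scores):

```python
def viterbi_decode(emissions, transitions, tags):
    """Best tag sequence under a linear-chain CRF.

    emissions[t][tag]        -- score of `tag` at position t
    transitions[(prev, tag)] -- score of moving from `prev` to `tag`
                                (missing pairs default to 0.0)
    """
    # best[tag] = (score of the best path ending in `tag`, that path)
    best = {tag: (emissions[0][tag], [tag]) for tag in tags}
    for t in range(1, len(emissions)):
        new_best = {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[p][0]
                       + transitions.get((p, tag), 0.0))
            score = (best[prev][0] + transitions.get((prev, tag), 0.0)
                     + emissions[t][tag])
            new_best[tag] = (score, best[prev][1] + [tag])
        best = new_best
    return max(best.values(), key=lambda pair: pair[0])[1]
```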

In some embodiments, the entity extraction device may perform the network training process offline. After the coding and decoding transformation model is obtained, the entity extraction device can convert the coding and decoding transformation model into a script mode suitable for online deployment, and then the coding and decoding transformation model of the script mode is deployed and loaded into a natural language processing system in a real scene so as to analyze and process a text to be extracted in the real scene, and the method for extracting the entity in the embodiment of the application is realized.

Here, a comparison of entity extraction metrics between the entity extraction method provided in the embodiment of the present application, a Base1 model (CRF++ with feature engineering), and a Base2 model (a BERT model) may be as shown in table 2:

TABLE 2

Wherein an epoch is one pass of training over all training samples. The P value is the precision of the entity extraction result, the R value is the recall of the entity extraction result, and the F value is the harmonic mean of the two; in some embodiments, the F value may be calculated by the formula F = 2 × P × R / (P + R).
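The F value formula above computes directly:

```python
def f_value(p, r):
    """Harmonic mean of precision P and recall R: F = 2 * P * R / (P + R)."""
    return 2 * p * r / (p + r)
```

For example, `f_value(0.9, 0.8)` gives 1.44 / 1.7 ≈ 0.847.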

Compared with CRF++, the method in the embodiment of the application obviously improves the overall F value and consumes less time in online analysis, which improves the efficiency of entity extraction and makes the method easy to roll out widely in actual engineering. Compared with BERT, the online analysis time consumption is greatly reduced, achieving a balance between effect and time consumption: the online analysis time can be well controlled on the premise of preserving the effect, which is of practical significance for engineering, and the offline training time of the method is also shorter. The entity extraction method in the embodiment of the application can thus be fully released on high-traffic interaction data in real scenarios and has practical engineering value.

Continuing with the exemplary structure of the entity extraction device 455 provided by the embodiments of the present application as software modules, in some embodiments, as shown in fig. 4, the software modules stored in the entity extraction device 455 of the memory 450 may include:

an obtaining module 4551, configured to obtain at least one character vector and at least one expanded word vector included in a text to be extracted; the at least one expanded word vector comprises at least one preset entity vector; the at least one preset entity vector is vector information of an entity corresponding to the text to be extracted in a preset entity dictionary;

a coding and decoding transformation module 4552, configured to perform coding and decoding transformation based on the at least one character vector and the at least one expanded word vector to obtain at least one target entity corresponding to the text to be extracted; the at least one target entity is used for realizing natural language processing of the text to be extracted.

In some embodiments, the at least one expanded word vector further comprises at least one word segmentation vector; the obtaining module 4551 is further configured to obtain the at least one character vector from the vector information of each single character among the at least one character contained in the text to be extracted; perform word segmentation processing on the text to be extracted to obtain the at least one word segmentation vector; perform relevance matching in the preset entity dictionary using the text to be extracted to obtain at least one piece of preset entity information matched with the text to be extracted, wherein each piece of the at least one piece of preset entity information comprises at least one of a preset entity and a preset entity alias; and take the vectors corresponding to the at least one piece of preset entity information as the at least one preset entity vector.
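The relevance matching against the preset entity dictionary can be sketched as a brute-force scan over all substrings of the text. This is an illustrative assumption; the actual matching strategy (including alias lookup) is not specified in the text.

```python
def match_preset_entities(text, entity_dict):
    """Scan all substrings of `text` against the preset entity dictionary.

    Returns (entity, head_position, tail_position) triples, 0-indexed.
    Brute force, for illustration only."""
    matches = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in entity_dict:
                matches.append((text[start:end], start, end - 1))
    return matches
```

For the fig. 9 example, matching "重庆人和药店" against a dictionary containing "重庆", "人和药店" and "药店" recovers all three candidate entities with their boundary positions.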

In some embodiments, the codec transform module 4552 is further configured to perform codec transform on the at least one character vector and the at least one expansion word vector to obtain the at least one first entity; taking the at least one first entity as the at least one target entity; or acquiring boundary information corresponding to each character vector and each expansion word vector from the at least one character vector and the at least one expansion word vector; combining the boundary information, and performing coding and decoding transformation on the at least one character vector and the at least one expansion word vector to obtain at least one second entity; the at least one second entity is taken as the at least one target entity.

In some embodiments, the boundary information comprises: first boundary information corresponding to each character vector and second boundary information corresponding to each expanded word vector; the obtaining module 4551 is further configured to take the position of each character vector as the first boundary information, and to obtain the second boundary information according to the head and tail character positions in each expanded word vector.

In some embodiments, the encoding and decoding module 4552 is further configured to perform attention coding on the at least one character vector and the at least one expansion word vector to obtain a first attention coding vector set; decoding and predicting the first attention coding vector set to obtain a first position prediction sequence; the first position prediction sequence is used to indicate in the at least one character vector the position of the character belonging to the at least one first entity; and obtaining the at least one first entity according to the first position prediction sequence.

In some embodiments, the encoding and decoding module 4552 is further configured to identify the at least one character vector and the at least one expanded word vector respectively, so as to obtain at least one encoding identifier; obtain, according to the first boundary information and the second boundary information, the length of the vector to be encoded corresponding to each encoding identifier in the at least one encoding identifier; perform attention encoding on the character vector or expanded word vector corresponding to each encoding identifier, in combination with the length of the vector to be encoded, to obtain a second attention encoding vector set; intercept the second attention encoding vector set according to the sentence length of the request sentence in the text to be extracted to obtain an intercepted encoding vector; perform decoding prediction on the intercepted encoding vector to obtain a second position prediction sequence, the second position prediction sequence being a sequence formed by at least one second position prediction tag, and the at least one second position prediction tag being used to indicate, in the at least one character vector, the position of the characters belonging to the at least one second entity; and connect the character vectors at the positions indicated by the at least one second position prediction tag to obtain the at least one second entity.

In some embodiments, the entity extracting apparatus further includes an attention coding and decoding transformation model, where the attention coding and decoding transformation model is configured to perform coding and decoding transformation on the at least one character vector and the at least one expansion word vector in combination with the boundary information to obtain the at least one second entity; the attention coding and decoding transformation model is obtained by performing data mining processing on a real interaction log to obtain a training sample set and performing model training on an initial attention coding and decoding transformation model by using the training sample set;

wherein the data mining process comprises at least one of log data mining, label replacement process and auxiliary synonym replacement process.

It should be noted that the above description of the embodiment of the apparatus, similar to the above description of the embodiment of the method, has similar beneficial effects as the embodiment of the method. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.

Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the entity extraction method described in the embodiment of the present application.

Embodiments of the present application provide a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, the method as illustrated in fig. 5-8.

In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.

In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).

By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.

In summary, in the embodiment of the present application, encoding and decoding transformation is performed on at least one expanded word vector fused with at least one character vector, so that more potential word vector information in the text to be extracted can be introduced during encoding, and the accuracy of entity extraction is increased by this potential word vector information. The at least one target entity can thus be extracted from the text to be extracted with fewer network layers and fewer processing steps while the accuracy of entity extraction is preserved. The method in the embodiment of the application reduces the workload and time consumption of network training and network computing, achieves a balance between accuracy and efficiency, and improves the efficiency of entity extraction on the basis of ensuring its accuracy. Compared with CRF++, the method obviously improves the overall F value and consumes less time in online analysis, which improves the efficiency of entity extraction and makes the method easy to roll out widely in actual engineering. Compared with BERT, the online analysis time consumption is greatly reduced, achieving a balance between effect and time consumption: the online analysis time can be well controlled on the premise of preserving the effect, which is of practical significance for engineering, and the offline training time of the method is also shorter. The entity extraction method in the embodiment of the application can thus be fully released on high-traffic interaction data in real scenarios and has practical engineering value.

The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.
