Named entity recognition model, telephone exchange switching extension method and system

Document No.: 1170291    Publication date: 2020-09-18

Description: This technology, "Named entity recognition model, telephone exchange switching extension method and system", was designed and created by Shen Yan, Chen Yifeng, Dai Beirong, Lu Wei, Wang Yiteng, and Sun Lu on 2020-05-11. Its main content is as follows: The invention discloses a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field, comprising: an embedding layer holding the pre-trained word vectors used by the model; a bidirectional LSTM layer performing feature extraction, so that each word obtains a representation containing both forward and backward information; a self-attention layer capturing word dependencies within sentences; a fully connected layer mapping the output of the bidirectional LSTM layer and the self-attention layer into a vector whose dimension is the number of output labels; and a CRF layer used to learn the dependencies between labels. The invention also discloses a telephone exchange extension switching method and a telephone exchange extension switching system. The named entity recognition model can recognize entity information quickly and accurately. The telephone exchange switching method/system can accurately and quickly retrieve the extension number a client wishes to contact and switch the call according to the client's requirements, supports providing extension switching service to multiple clients simultaneously, and offers a high-quality, efficient switchboard service experience.

1. A named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field, comprising:

an embedding layer, which holds the pre-trained word vectors used by the model and is continuously updated as the model iterates;

a bi-directional LSTM layer adapted to perform feature extraction, obtaining for each word a representation containing both forward and backward information;

a self-attention layer adapted to capture word dependencies inside sentences;

a fully connected layer adapted to map the output of the bi-directional LSTM layer and the self-attention layer into a vector whose dimension is the number of output labels;

a CRF layer, adapted to learn the dependency relationships between tags, having two types of scores: an emission score and a transition score;

the emission score is the probability value of each word being mapped to a tag, namely the output of the fully connected layer;

the transition score is the probability of one tag transitioning to another tag.

2. The named entity recognition model of claim 1, wherein: the bi-directional LSTM layer obtains for each word a representation containing both forward and backward information as follows:

bidirectional LSTM is a two-layer neural network; the first layer reads the sequence from the right, taking the last word as its starting input and outputting bh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1});

the second layer reads the sequence from the left, taking the first word as its starting input and outputting fh_i at each time step i:

fh_i = LSTM(x_i, fh_{i-1});

the final representation of each word is the concatenation of the two hidden states:

h_i = [fh_i, bh_i].

3. The named entity recognition model of claim 2, wherein: the self-attention layer captures word dependencies within sentences in the following manner:

at each time step i, the similarity between the current hidden state h_i and all hidden states h = [h_1, h_2, ..., h_T] is calculated, where T is the sequence length; the similarities are then normalized to obtain attention weights α, and α is used to compute a weighted sum over h, yielding the context vector c_i:

α_{i,j} = exp(score(h_i, h_j)) / Σ_{k=1..T} exp(score(h_i, h_k)),

c_i = Σ_{j=1..T} α_{i,j} · h_j.

4. The named entity recognition model of claim 3, wherein: the fully connected layer outputs a vector that is the prediction score of the current time step i over all tags:

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all tags.

5. The named entity recognition model of claim 4, wherein: the CRF layer can add constraint conditions to improve the accuracy of the prediction result, and the constraint conditions are learned automatically by the CRF layer during training on the data.

6. The named entity recognition model of claim 5, wherein model training is performed by the following steps;

s1, preprocessing data, including removing designated useless symbols, segmenting text words, removing designated stop words and constructing a feature dictionary;

S2, input data construction, including converting the segmented text sequences into index sequences using the generated feature dictionary, dividing them into a training set and a validation set in proportion, and saving them as input files;

and S3, model training, including parameter setting, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result.

7. A telephone exchange extension switching method using the named entity recognition model of claim 1, comprising the steps of:

s4, converting the voice information into text;

s5, extracting entity information in the text based on the named entity recognition model;

s6, retrieving the extension number based on the similarity analysis;

and S7, selecting the candidate with the highest similarity and executing the switching.

8. The method of claim 7, wherein the following steps are used to extract entity information with the trained named entity recognition model:

s5.1, loading a model file generated by training;

s5.2, performing data processing on the text information of the client to generate a word index sequence;

and S5.3, inputting the generated word index sequence into the trained named entity recognition model, and returning the extracted entity information.

9. The method according to claim 7, wherein step S6 includes the following sub-steps:

s6.1, reading all department names in the database;

s6.2, calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

s6.3, calculating the similarity between the extracted names and all the names under the selected department;

s6.4, calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

S6.5, returning the extension number or switching to a preset voice script.

10. The method according to claim 7, wherein step S7 includes the following sub-steps:

S7.1, setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, guiding the client, with a preset voice script, to state the desired contact's information again, and returning to the voice-to-text conversion step;

and S7.2, if the number of returns to the voice-to-text conversion step exceeds the switching threshold, transferring the call to a human operator.

11. A telephone exchange extension system using the named entity recognition model of claim 1, comprising:

the voice recognition module is used for converting the user voice information into text;

an information extraction module that extracts entity information in the text based on the named entity recognition model;

the extension retrieval module is used for retrieving the extension number based on similarity analysis;

and the extension switching module is used for selecting the candidate with the highest similarity and executing the switching.

12. The telephone exchange switching extension system of claim 11, wherein: the information extraction module obtains a representation of each word containing both forward and backward information from the bi-directional LSTM layer of the named entity recognition model as follows:

bidirectional LSTM is a two-layer neural network; the first layer reads the sequence from the right, taking the last word as its starting input and outputting bh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1});

the second layer reads the sequence from the left, taking the first word as its starting input and outputting fh_i at each time step i:

fh_i = LSTM(x_i, fh_{i-1});

the final representation of each word is the concatenation of the two hidden states:

h_i = [fh_i, bh_i].

13. The telephone exchange switching extension system of claim 11, wherein: the information extraction module's self-attention layer captures word dependencies within sentences in the following manner:

at each time step i, the similarity between the current hidden state h_i and all hidden states h = [h_1, h_2, ..., h_T] is calculated, where T is the sequence length; the similarities are then normalized to obtain attention weights α, and α is used to compute a weighted sum over h, yielding the context vector c_i.

14. The telephone exchange switching extension system of claim 11, wherein: the information extraction module defines that the output vector of the full-connection layer of the named entity recognition model is the prediction score of the current time step i for all the labels;

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all tags.

15. The telephone exchange switching extension system of claim 11, wherein: the information extraction module can add constraint conditions to a CRF layer of the entity recognition model to improve the accuracy of the prediction result, and the constraint conditions can be obtained by the automatic learning of the CRF layer during data training.

16. The telephone exchange switching extension system of claim 11, wherein: the information extraction module can carry out model training on the entity recognition model in the following way;

data preprocessing, including removing designated useless symbols, text word segmentation, removing designated stop words and constructing a feature dictionary;

the input data construction comprises the steps of converting the text sequence after word segmentation by using the generated feature dictionary, converting the word sequence into an index sequence, dividing a training set and a verification set according to a proportion, and storing the training set and the verification set as input files;

and model training, which comprises setting parameters, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result.

17. The telephone exchange switching extension system of claim 16, wherein: the information extraction module extracts department name and person name information using the trained named entity recognition model in the following way;

loading a model file generated by training;

performing data processing on the text information of the client to generate a word index sequence;

and inputting the generated word index sequence into a trained named entity recognition model, and returning the extracted department name and person name information.

18. The telephone exchange switching extension system of claim 11, wherein: the extension retrieval module completes extension retrieval in the following mode;

reading all department names in a database;

calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

calculating the similarity between the extracted names and all the names in the selected department;

calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

returning the extension number or switching to a preset voice script.

19. The telephone exchange switching extension system of claim 11, wherein: the extension switching module executes extension switching in the following mode;

setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, guiding the client, with a preset voice script, to state the desired contact's information again, and returning to the voice-to-text conversion step;

if the number of returns to the voice-to-text conversion step exceeds the switching threshold, the call is transferred to a human operator.

Technical Field

The invention relates to the field of communication, in particular to a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field. The invention also relates to a telephone exchange extension switching method and a telephone exchange extension switching system utilizing the named entity recognition model.

Background

The telephone system of a typical enterprise has a switchboard and extensions. The switchboard ensures that the enterprise publishes only one external telephone number; after a call comes in, each service is switched to a different extension for answering according to the voice navigation configured by the enterprise. Alternatively, when someone dials the switchboard to look up an extension, the operator can transfer the call directly to the corresponding extension, or simply tell the caller the extension number so that they can redial. In this process, the same service may correspond to multiple extension numbers (service personnel), which causes a problem: when a customer calls customer service several times about the same issue, they may fail to reach the extension of the person they want and may have to repeat the same matter many times, which greatly harms the customer experience, wastes enterprise resources, and reduces the enterprise's working efficiency.

Disclosure of Invention

In this summary, a series of concepts are introduced in simplified form that will be described in further detail in the detailed description. This summary of the invention is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

The invention aims to provide a novel named entity recognition model, based on an attention-based bidirectional long short-term memory unit-conditional random field, that can quickly and accurately recognize entities.

The invention also provides a telephone exchange extension switching method which can quickly and accurately retrieve the extension and complete the switching by using the named entity recognition model.

A further aim of the invention is to provide a telephone exchange switching extension system which can quickly and accurately retrieve the extension and complete the switching by using the named entity recognition model.

Named Entity Recognition (NER), also called "proper name recognition", refers to recognizing entities with specific meaning in text, mainly including names of people, places, and organizations, proper nouns, and the like. Simply put, it identifies the boundaries and categories of entity mentions in natural text. Named entity recognition is essentially a sequence labeling problem, in which each character of a given text is tagged.

In order to solve the above technical problems, the present invention provides a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field (Attention-Based BiLSTM-CRF), comprising:

an embedding layer, which holds the pre-trained word vectors used by the model and is continuously updated as the model iterates;

a bi-directional LSTM layer adapted to perform feature extraction, obtaining for each word a representation containing both forward and backward information. Bidirectional LSTM can be viewed as a two-layer neural network: the first layer reads the sequence from the right, which in text processing can be understood as taking the last word of the sentence as the starting input, outputting bh_i at each time step i, while the second layer reads the sequence from the left, i.e., the input starts from the beginning of the sentence, outputting fh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1}),

fh_i = LSTM(x_i, fh_{i-1});

the final output of the layer is the concatenation of the two LSTM hidden states:

h_i = [fh_i, bh_i];

a self-attention layer adapted to capture word dependencies inside sentences;

although the bidirectional LSTM can acquire forward and backward information and has longer distance dependence than RNN, the LSTM cannot well retain information with longer distance after passing through multiple layers when the sentence sequence is longer. The invention introduces a Self-Attention (Self-Attention) mechanism to capture the word dependency relationship in the sentence, and calculates the current hidden layer state h at each time step iiAnd all hidden layer states h ═ h1,h2,...hT]The similarity of (a) is obtained, T is the sequence length, then normalization is carried out to obtain a similarity score α, and α is used for carrying out weighted summation on h to obtain a context vector ciAs follows

a fully connected layer, which maps the output of the bidirectional LSTM layer and the self-attention layer into a vector whose dimension is the number of output labels; this vector is the prediction score of the current time step i over all labels:

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all labels;

the CRF layer, which includes two types of scores: an emission score and a transition score. The emission score is the probability value of each word being mapped to a tag, namely the output of the fully connected layer; let this output matrix be P, where P_{ij} represents the unnormalized probability of word x_i being mapped to tag_j, with j ranging from 0 to (number of tags − 1), analogous to the emission probability matrix in a CRF model. The transition score is the probability of tag_i transitioning to tag_j; let the tag transition matrix be A, where A_{ij} represents the transition probability from tag_i to tag_j. For all possible output tag sequences y corresponding to the input sequence X, a score is defined as:

score(X, y) = Σ_{i=1..T} P_{i, y_i} + Σ_{i=2..T} A_{y_{i−1}, y_i};

the goal is to learn a conditional probability distribution, i.e., to find a set of parameters θ that maximizes the probability of the true tag sequences in the training data:

p(y | X) = exp(score(X, y)) / Σ_{y'} exp(score(X, y')),

θ* = argmax_θ Σ_{(X, y)} log p(y | X; θ),

where the denominator S = Σ_{y'} exp(score(X, y')) is the normalization over the scores of all possible output tag sequences, y' ranges over every possible tag sequence, and θ* is the set of parameters that maximizes the probability of the true tag sequences;

at prediction time, the tag sequence y* with the highest score is computed:

y* = argmax_{y'} score(X, y'),

where y' ranges over every possible tag sequence.
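To make the two score types concrete, here is a small sketch that evaluates score(X, y) for one candidate tag sequence from an emission matrix P and a transition matrix A; start/end transitions are omitted for brevity, and the numbers are random stand-ins:

```python
import numpy as np

def sequence_score(P, A, y):
    """score(X, y) = sum_i P[i, y_i] + sum_i A[y_{i-1}, y_i].

    P: (T, K) emission scores (the fully connected layer output);
    A: (K, K) transition scores, A[j, k] = score of tag_j -> tag_k;
    y: length-T sequence of tag indices."""
    emission = sum(P[i, y[i]] for i in range(len(y)))
    transition = sum(A[y[i - 1], y[i]] for i in range(1, len(y)))
    return emission + transition

rng = np.random.default_rng(0)
P, A = rng.normal(size=(4, 5)), rng.normal(size=(5, 5))  # T = 4 words, K = 5 tags
print(sequence_score(P, A, [4, 4, 2, 0]))  # score of one candidate tag sequence
```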

Optionally, training a named entity recognition model by adopting the following steps;

s1, preprocessing data, including removing designated useless symbols, segmenting text words, removing designated stop words and constructing a feature dictionary;

optionally, designated useless symbols are removed: redundant spaces and other meaningless symbols in the input text are useless for the model and are removed in advance using regular expressions;

optionally, text segmentation: the text is segmented with the jieba word segmentation library, turning the input text into a word sequence. During segmentation, a user-defined dictionary is built for domain-specific terms that may appear, or for words that jieba should not split; fixed words in this dictionary are preserved during jieba segmentation;

optionally, removing designated stop words: the word sequence produced by segmentation contains many words that carry little meaning (for example, Chinese particles such as 的 and 了), called stop words; some words that are meaningless to the model can also be user-defined as stop words. A stop-word dictionary is built, and stop words are removed after segmentation;

optionally, constructing a dictionary: the word segmentation results over the training data are counted to build the feature dictionary;

S2, input data construction, including converting the segmented text sequences into index sequences using the generated feature dictionary, dividing them into a training set and a validation set in proportion, and saving them as input files;
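A sketch of steps S1–S2 under stated assumptions: jieba is the segmenter named by this document, while the stop words, vocabulary entries, and the <UNK> fallback for out-of-vocabulary words are illustrative choices, not specified by the text:

```python
import random
import jieba

STOPWORDS = {"的", "了"}  # illustrative stop words; the real list is user-defined

def to_index_sequence(text, vocab):
    """Segment the text, drop stop words, and map words to dictionary indices.
    Unknown words fall back to a reserved <UNK> index (an assumption; the
    document does not specify out-of-vocabulary handling)."""
    words = [w for w in jieba.lcut(text) if w not in STOPWORDS]
    return [vocab.get(w, vocab["<UNK>"]) for w in words]

vocab = {"<PAD>": 0, "<UNK>": 1, "帮我转": 2, "信息部": 3, "李红": 4}  # toy dictionary
samples = [to_index_sequence("帮我转信息部李红", vocab)]

random.shuffle(samples)            # divide into training and validation sets
split = int(0.8 * len(samples))    # an 80/20 ratio, chosen here for illustration
train_set, val_set = samples[:split], samples[split:]
```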

s3, model training, including setting parameters, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result;

optionally, the model parameters are set as follows: word embedding dimension: 300; LSTM parameters: 128 hidden states (i.e., the dimension of the LSTM layer's output for each word), 1 layer; fully connected layer output dimension: text sequence length × number of tags;
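The following sketch wires the five layers together with the parameters just listed (300-dimensional embeddings, 128 hidden states, 1 LSTM layer, and the 5 output tags described below), again assuming PyTorch. The CRF layer is represented only by its learnable transition matrix; the forward-algorithm loss and Viterbi decoding are omitted for brevity:

```python
import torch
import torch.nn as nn

class AttnBiLSTM(nn.Module):
    """Embedding -> BiLSTM -> self-attention -> fully connected layer,
    with a CRF transition matrix as a learnable parameter. A sketch of
    the layer stack described above, not a full training-ready model."""
    def __init__(self, vocab_size, embed_dim=300, hidden=128, num_tags=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, num_layers=1,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(4 * hidden, num_tags)  # input is [h_i, c_i]
        self.transitions = nn.Parameter(torch.randn(num_tags, num_tags))  # CRF A matrix

    def forward(self, idx):                                   # idx: (B, T) word indices
        h, _ = self.lstm(self.embed(idx))                     # (B, T, 2*hidden)
        alpha = torch.softmax(h @ h.transpose(1, 2), dim=-1)  # dot-product attention
        c = alpha @ h                                         # context vectors c_i
        return self.fc(torch.cat([h, c], dim=-1))             # emission scores p_i

model = AttnBiLSTM(vocab_size=10000)
emissions = model(torch.randint(0, 10000, (1, 8)))  # shape (1, 8, 5)
```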

optionally, the entity information recognized by the model comprises department names and person names, with 5 types of labels;

wherein the label is: a beginning part of a person name, a middle part of a person name, a beginning part of a department name, a middle part of a department name, and non-entity information.

Optionally, the labels are as follows:

B-Person: beginning part of a person name;

I-Person: middle part of a person name;

B-department: beginning part of a department name;

I-department: middle part of a department name;

O: non-entity information.

For example, "help me transfer information portion lisk", after word segmentation is "help me transfer information portion lisk", and after labeling by the named entity recognition model, the output is "O B-department B-Person". The department name and the person name which need to be extracted by the model are respectively 'information part' and 'plum red'.

Optionally, the CRF layer can add constraint conditions to improve the accuracy of the prediction result; the constraints are learned automatically by the CRF layer during training on the data. Possible constraints are:

1) the beginning of the sentence should be "B-" or "O" rather than "I-";

2) in the pattern "B-label1 I-label2 I-label3 ...", labels 1, 2, 3 should belong to the same entity category; for example, "B-Person I-Person" is correct, while "B-Person I-Hospital" is incorrect;

3) "O I-label" is invalid: the beginning of a named entity should be "B-" rather than "I-".

The invention provides a telephone exchange extension switching method using the named entity recognition model, which comprises the following steps:

s4, converting the voice information into text;

optionally, an existing intelligent voice interaction platform is used to convert the voice information into text information, for example, the Aliyun (Alibaba Cloud) intelligent voice interaction platform;

s5, extracting entity information in the text based on the named entity recognition model;

optionally, the following substeps are adopted to extract entity information (department name and person name information) from the trained named entity recognition model;

s5.1, loading a model file generated by training, wherein the model file comprises a dictionary, a label and a training model;

S5.2, performing data processing on the client's text information to generate a word index sequence; the data processing steps are similar to those of the model training stage, except that instead of constructing a dictionary, the loaded feature dictionary file is used;

and S5.3, inputting the generated word index sequence into the trained named entity recognition model, and returning the extracted entity information (department name and person name information).
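A sketch of S5.1–S5.3 under stated assumptions: the file names, the JSON storage format, and the use of torch.load for the saved model are all illustrative; the document only says the saved model file includes the dictionary, the labels, and the trained model. to_index_sequence is the helper sketched earlier:

```python
import json
import torch

# S5.1: load the artifacts produced by training (hypothetical file names).
vocab = json.load(open("vocab.json", encoding="utf-8"))
id2tag = json.load(open("tags.json", encoding="utf-8"))  # e.g. {"0": "B-Person", ...}
model = torch.load("ner_model.pt")
model.eval()

def extract_entities(client_text):
    """S5.2: index the client's text; S5.3: run the model and return tags."""
    idx = torch.tensor([to_index_sequence(client_text, vocab)])
    with torch.no_grad():
        emissions = model(idx)                   # (1, T, num_tags)
    best = emissions.argmax(dim=-1)[0].tolist()  # greedy decoding for brevity;
    return [id2tag[str(i)] for i in best]        # the real model decodes with the CRF
```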

S6, retrieving the extension number based on the similarity analysis;

optionally, the following substeps are adopted to complete the analysis and retrieval of the extension number based on the similarity;

s6.1, reading all department names in the database;

s6.2, calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

sim(pred, all_depart_i) = α · sim_sem(pred, all_depart_i) + β · sim_char(pred, all_depart_i) + γ · sim_pinyin(pred, all_depart_i),

where sim(pred, all_depart_i) is the similarity between the extracted department name pred and the i-th department name all_depart_i in the database, and the weights α, β, and γ are set empirically from multiple experiments. sim_sem(pred, all_depart_i) is the semantic similarity of the two, sim_char(pred, all_depart_i) is their Chinese character similarity, and sim_pinyin(pred, all_depart_i) is their pinyin similarity; the latter two are computed from the edit distance between the character strings and between their pinyin transcriptions, respectively. editDistance is an edit distance algorithm that casts the similarity of two strings as the cost of converting one string into the other: the higher the conversion cost, the lower the similarity of the two strings. A trigger-word mechanism is used when calculating the Chinese character similarity and the pinyin similarity: when the extracted department name can be matched directly in the database, its Chinese character similarity and pinyin similarity are directly set to the highest value, 1.
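A sketch of this similarity computation. The edit distance is the classic Levenshtein dynamic program; the normalization of distance into a [0, 1] similarity and the weight values are assumptions (the document states only that higher conversion cost means lower similarity and that the weights are set empirically). Pinyin is obtained here with the third-party pypinyin package, which the document does not name:

```python
from pypinyin import lazy_pinyin  # third-party pinyin library (an assumption)

def edit_distance(a, b):
    """Classic Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,      # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[len(b)]

def char_sim(a, b):
    """Normalize edit distance into a similarity in [0, 1] (normalization assumed)."""
    return 1 - edit_distance(a, b) / max(len(a), len(b), 1)

ALPHA, BETA, GAMMA = 0.4, 0.3, 0.3  # illustrative weights; set empirically per the text

def depart_sim(pred, cand, semantic_sim):
    """Weighted sum of semantic, character and pinyin similarity, with the
    trigger-word mechanism: an exact database match pins the latter two to 1."""
    if pred == cand:
        return ALPHA * semantic_sim + BETA + GAMMA
    pinyin_sim = char_sim("".join(lazy_pinyin(pred)), "".join(lazy_pinyin(cand)))
    return ALPHA * semantic_sim + BETA * char_sim(pred, cand) + GAMMA * pinyin_sim
```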

Optionally, the similarities sim(pred, all_depart_i) between the extracted department name and all department names in the database are sorted, and the 3 real department names (i.e., names present in the database) with the highest similarity are selected.

S6.3, calculating the similarity between the extracted names and all the names under the selected department;

The person name similarity does not include a semantic component; it consists only of the Chinese character similarity and the pinyin similarity. The trigger-word mechanism is likewise used: when the extracted name can be matched directly in the database, its Chinese character similarity and pinyin similarity are directly set to the highest value, 1.

The similarities of all names under each department name selected in step S6.2 are sorted, and the 3 names with the highest similarity are selected.

S6.4, calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

from the 3 × 3 = 9 department name and person name pairs thus selected, the overall similarity is calculated as follows:

sim_{i,j} = sim(depart, all_depart_i) + sim(name, all_depart_i_name_j);

that is, for each department name all_depart_i selected in step S6.2, the department name similarity obtained in step S6.2 and the similarity sim(name, all_depart_i_name_j) of each name under that department obtained in step S6.3 are summed to give sim_{i,j}. Finally, the department name and person name pair with the highest overall similarity is selected.
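A sketch of the S6.2–S6.4 candidate selection using the helpers above; for brevity the semantic similarity is stubbed to 0 and the person name similarity is reduced to the character similarity alone (the text combines character and pinyin similarity):

```python
def best_match(pred_depart, pred_name, db):
    """db: {department name: {person name: extension number}}.
    Keep the 3 most similar departments, then the 3 most similar names
    under each, and pick the pair maximizing
    sim_ij = department similarity + name similarity."""
    top_departs = sorted(((depart_sim(pred_depart, d, 0.0), d) for d in db),
                         reverse=True)[:3]
    best = (float("-inf"), None)
    for d_score, d in top_departs:
        top_names = sorted(((char_sim(pred_name, n), n) for n in db[d]),
                           reverse=True)[:3]
        for n_score, n in top_names:
            best = max(best, (d_score + n_score, (d, n)))
    return best  # (overall similarity, (department name, person name))
```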

S6.5, returning the extension number or switching to a preset voice script.

And S7, selecting the candidate with the highest similarity and executing the switching.

S7.1, setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, the client is guided, with a preset voice script, to state the desired contact's information again, and the method returns to the voice-to-text conversion step;

optionally, S7.2, if the number of returns to the voice-to-text conversion step exceeds the switching threshold, the call is transferred to a human operator. The switching threshold can be set according to actual conditions, for example, 3 times.
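A sketch of the S7 control flow; the threshold value and the session object bundling speech recognition, prompting, and switching are hypothetical stand-ins for components this document leaves abstract:

```python
SIM_THRESHOLD = 1.2  # illustrative value; the document leaves the threshold open
MAX_RETRIES = 3      # "for example, 3 times" per the text above

def transfer_call(session):
    """Transfer if the best match clears the threshold; otherwise re-prompt,
    and hand over to a human operator after too many retries."""
    for _ in range(MAX_RETRIES):
        text = session.speech_to_text()                       # S4
        depart, name = session.extract_entities(text)         # S5
        score, match = best_match(depart, name, session.db)   # S6
        if score >= SIM_THRESHOLD:
            d, n = match
            return session.switch_to(session.db[d][n])        # S7.1: transfer
        session.prompt("Please repeat the department and person you wish to reach.")
    return session.transfer_to_operator()                     # S7.2: manual switching
```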

The invention provides a telephone exchange switching extension system using the named entity recognition model, which comprises:

the voice recognition module is used for converting the user voice information into text;

an information extraction module that extracts entity information in the text based on the named entity recognition model;

the extension retrieval module is used for retrieving the extension number based on similarity analysis;

and the extension switching module is used for selecting the candidate with the highest similarity and executing the switching.
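As a structural sketch only, the four modules can be composed as a simple pipeline; each module is a hypothetical callable standing in for the components described above:

```python
class SwitchboardSystem:
    """Wiring of the four modules; the callables are illustrative stand-ins."""
    def __init__(self, asr, extractor, retriever, switcher):
        self.asr = asr              # voice recognition module: audio -> text
        self.extractor = extractor  # information extraction module: text -> entities
        self.retriever = retriever  # extension retrieval module: entities -> candidate
        self.switcher = switcher    # extension switching module: candidate -> transfer

    def handle_call(self, audio):
        text = self.asr(audio)
        depart, name = self.extractor(text)
        score, extension = self.retriever(depart, name)
        return self.switcher(score, extension)
```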

Optionally, the information extraction module obtains a representation of each word containing both forward and backward information from the bi-directional LSTM layer of the named entity recognition model as follows:

bidirectional LSTM is a two-layer neural network. The first layer reads the sequence from the right, taking the last word of the sentence as input and outputting bh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1});

the second layer reads the sequence from the left, i.e., the input starts from the beginning of the sentence, and outputs fh_i at each time step i:

fh_i = LSTM(x_i, fh_{i-1});

the final output of the layer is the concatenation of the two LSTM hidden states:

h_i = [fh_i, bh_i].

optionally, the information extraction module captures word dependency inside the sentence in the following manner for a self-attention layer of the named entity recognition model;

at each time step i, the similarity between the current hidden state h_i and all hidden states h = [h_1, h_2, ..., h_T] is calculated, where T is the sequence length; the similarities are then normalized to obtain attention weights α, and α is used to compute a weighted sum over h, yielding the context vector c_i:

α_{i,j} = exp(score(h_i, h_j)) / Σ_{k=1..T} exp(score(h_i, h_k)),

c_i = Σ_{j=1..T} α_{i,j} · h_j.

Optionally, the information extraction module defines that the full-connection layer output vector of the named entity recognition model is the prediction score of the current time step i for all the tags;

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all tags.

Optionally, the information extraction module can add a constraint condition to the CRF layer of the entity recognition model to improve the accuracy of the prediction result, and the constraint condition can be obtained by the automatic learning of the CRF layer during data training.

Optionally, the information extraction module can perform model training on the entity recognition model in the following manner;

data preprocessing, including removing designated useless symbols, text word segmentation, removing designated stop words and constructing a feature dictionary;

the input data construction comprises the steps of converting the text sequence after word segmentation by using the generated feature dictionary, converting the word sequence into an index sequence, dividing a training set and a verification set according to a proportion, and storing the training set and the verification set as input files;

and model training, which comprises setting parameters, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result.

Optionally, the entity information extracted by the information extraction module comprises department names and person names, with 5 types of tags;

wherein the label is: a beginning part of a person name, a middle part of a person name, a beginning part of a department name, a middle part of a department name, and non-entity information.

Optionally, the information extraction module extracts department name and person name information using the trained named entity recognition model in the following way;

loading a model file generated by training;

performing data processing on the text information of the client to generate a word index sequence;

and inputting the generated word index sequence into a trained named entity recognition model, and returning the extracted department name and person name information.

Optionally, the extension retrieval module completes extension retrieval in the following manner;

reading all department names in a database;

calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

calculating the similarity between the extracted names and all the names in the selected department;

calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

returning the extension number or switching to a preset voice script.

Optionally, the extension switching module performs extension switching in the following manner;

setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, guiding the client, with a preset voice script, to state the desired contact's information again, and returning to the voice-to-text conversion step;

if the number of returns to the voice-to-text conversion step exceeds the switching threshold, the call is transferred to a human operator.

The telephone exchange switching method/system of the invention converts the client's voice information into text in real time and uses named entity recognition to extract entity information from the text, such as person names and department names. Named entity recognition assigns each word in a continuous sequence to a corresponding semantic category label so as to recognize entity information; to improve its accuracy, the invention provides a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field (Attention-Based BiLSTM-CRF). Owing to dialects, poor call audio quality, speech-to-text errors and similar problems, the extracted information may not be accurate; for example, a similar-sounding word may be extracted instead of the intended "Fujian". If the wrong word is used directly to search the database, no match is found, and the extension switching can never be completed no matter how many dialogue turns the client goes through, making the system inflexible and hurting the client experience. The invention therefore retrieves the extension number by similarity analysis: it calculates the similarity between the extracted information and the corresponding information in the database, selects the best match, looks up the extension number, and switches automatically.

By adopting the technical scheme of the invention, the received client voice information can be converted into text in real time, department names and person names extracted from it, and the extension number retrieved accordingly and switched for the client, avoiding the situation where a client cannot reach the desired contact and must repeatedly restate the same problem through a tedious, inefficient process. The invention can accurately and quickly retrieve the extension number of the desired contact and switch the call according to the client's requirements, supports providing extension switching service to multiple clients simultaneously, is highly flexible and convenient for communication at any time, and, combined with the assistance of human switchboard operators, provides clients with a high-quality, efficient switchboard service experience.

The invention adopts a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field. The bidirectional long short-term memory unit (BiLSTM) extracts forward and backward information, a Self-Attention mechanism is introduced to capture long-distance word dependencies so that the model's semantic understanding is stronger, and the conditional random field (CRF) layer learns the dependencies between labels and constrains the label sequence, making the model's entity recognition more accurate. In addition, the invention retrieves the extension number based on similarity analysis, tolerating a certain degree of dialect, poor call audio quality, and speech-to-text error, with strong flexibility and high accuracy.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification. The drawings are not necessarily to scale, however, and may not be intended to accurately reflect the precise structural or performance characteristics of any given embodiment, and should not be construed as limiting or restricting the scope of values or properties encompassed by exemplary embodiments in accordance with the invention. The invention will be described in further detail with reference to the following detailed description and accompanying drawings:

FIG. 1 is a diagram of a named entity recognition model structure according to the present invention.

FIG. 2 is a schematic diagram of a training process of the named entity recognition model according to the present invention.

Fig. 3 is a flow chart of the telephone exchange extension switching method according to the present invention.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and technical effects of the present invention will be fully apparent to those skilled in the art from the disclosure in the specification. The invention is capable of other embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the general spirit of the invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. The following exemplary embodiments of the present invention may be embodied in many different forms and should not be construed as limited to the specific embodiments set forth herein. It is to be understood that these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the technical solutions of these exemplary embodiments to those skilled in the art.

In a first embodiment, the present invention provides a named entity recognition model based on an attention-based bidirectional long short-term memory unit-conditional random field (Attention-Based BiLSTM-CRF), where the entity information recognized by the model comprises department names and person names, with 5 types of labels;

wherein the labels are the beginning part of a person name, the middle part of a person name, the beginning part of a department name, the middle part of a department name, and non-entity information, labeled as follows:

B-Person: beginning part of a person name;

I-Person: middle part of a person name;

B-department: beginning part of a department name;

I-department: middle part of a department name;

O: non-entity information.

For example, "help me transfer information portion lisk", after word segmentation is "help me transfer information portion lisk", and after labeling by the named entity recognition model, the output is "O B-department B-Person". The department name and the person name which need to be extracted by the model are respectively 'information part' and 'plum red'.

As shown in FIG. 1, the named entity recognition model includes:

an embedding layer, which holds the pre-trained word vectors used by the model and is continuously updated as the model iterates;

a bi-directional LSTM layer adapted to perform feature extraction, obtaining for each word a representation containing both forward and backward information. Bidirectional LSTM can be viewed as a two-layer neural network: the first layer reads the sequence from the right, which in text processing can be understood as taking the last word of the sentence as the input, outputting bh_i at each time step i, while the second layer reads the sequence from the left, i.e., the input starts from the beginning of the sentence, outputting fh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1}),

fh_i = LSTM(x_i, fh_{i-1});

the final output of the layer is the concatenation of the two LSTM hidden states:

h_i = [fh_i, bh_i];

a self-attention layer adapted to capture word dependencies inside sentences;

Although bidirectional LSTM can capture forward and backward information and handles longer-range dependencies than a plain RNN, when the sentence is long the LSTM cannot retain distant information well after passing through multiple layers. The invention therefore introduces a Self-Attention mechanism to capture word dependencies within the sentence: at each time step i, the similarity between the current hidden state h_i and all hidden states h = [h_1, h_2, ..., h_T] is calculated, where T is the sequence length; the similarities are then normalized to obtain attention weights α, and α is used to compute a weighted sum over h, yielding the context vector c_i, as follows:

α_{i,j} = exp(score(h_i, h_j)) / Σ_{k=1..T} exp(score(h_i, h_k)),

c_i = Σ_{j=1..T} α_{i,j} · h_j;

a fully connected layer, which maps the output of the bidirectional LSTM layer and the self-attention layer into a vector whose dimension is the number of output labels; this vector is the prediction score of the current time step i over all labels:

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all labels;

the CRF layer, which includes two types of scores: an emission score and a transition score. The emission score is the probability value of each word being mapped to a tag, namely the output of the fully connected layer; let this output matrix be P, where P_{ij} represents the unnormalized probability of word x_i being mapped to tag_j, with j ranging from 0 to (number of tags − 1), analogous to the emission probability matrix in a CRF model. The transition score is the probability of tag_i transitioning to tag_j; let the tag transition matrix be A, where A_{ij} represents the transition probability from tag_i to tag_j. For the output tag sequence y corresponding to the input sequence X, the score is defined as:

score(X, y) = Σ_{i=1..T} P_{i, y_i} + Σ_{i=2..T} A_{y_{i−1}, y_i};

the goal is to learn a conditional probability distribution, i.e., to find a set of parameters θ that maximizes the probability of the true tag sequences in the training data:

p(y | X) = exp(score(X, y)) / Σ_{y'} exp(score(X, y')),

θ* = argmax_θ Σ_{(X, y)} log p(y | X; θ),

where the denominator S = Σ_{y'} exp(score(X, y')) is the normalization over the scores of all possible output tag sequences, y' ranges over every possible tag sequence, and θ* is the set of parameters that maximizes the probability of the true tag sequences;

at prediction time, the tag sequence y* with the highest score is computed:

y* = argmax_{y'} score(X, y'),

where y' ranges over every possible tag sequence.

A second embodiment further improves on the first embodiment by adding the training of the named entity recognition model; parts identical to the first embodiment are not described again. As shown in Fig. 2, the named entity recognition model is trained through the following steps;

S1, data preprocessing, including removing designated useless symbols, text word segmentation, removing designated stop words, and constructing a feature dictionary. Removing designated useless symbols means removing in advance, using regular expressions, the redundant spaces and other meaningless symbols in the input text that are useless for the model;

text word segmentation: the text is segmented with the jieba word segmentation library, turning the input text into a word sequence. During segmentation, a user-defined dictionary is built for domain-specific terms that may appear, or for words that jieba should not split; fixed words in this dictionary are preserved during jieba segmentation;

removing designated stop words: the word sequence produced by segmentation contains many words that carry little meaning (for example, Chinese particles such as 的 and 了), called stop words; some words that are meaningless to the model can also be user-defined as stop words. A stop-word dictionary is built, and stop words are removed after segmentation;

constructing a dictionary: the word segmentation results over the training data are counted to build the feature dictionary;

s2, inputting data construction, including converting the text sequence after word segmentation by using the generated feature dictionary, converting the word sequence into an index sequence, dividing a training set and a verification set according to a proportion, and storing the training set and the verification set as input files;

s3, model training, including setting parameters, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result;

setting model parameters: word embedding dimension: 300; LSTM parameters: 128 hidden states, 1 layer; fully connected layer output dimension: text sequence length × number of tags;

In a further improvement of the second embodiment, the CRF layer can add constraint conditions to improve the accuracy of the prediction result; the constraints are learned automatically by the CRF layer during training on the data. Possible constraints are:

1) the beginning of the sentence should be "B-" or "O" rather than "I-";

2) in the pattern "B-label1 I-label2 I-label3 ...", labels 1, 2, 3 should belong to the same entity category; for example, "B-Person I-Person" is correct, while "B-Person I-Hospital" is incorrect;

3) "O I-label" is invalid: the beginning of a named entity should be "B-" rather than "I-".

In a third embodiment, the present invention provides a telephone exchange extension switching method using the named entity recognition model of the first or second embodiment, comprising the steps of:

S4, converting the voice information into text, using an existing intelligent voice interaction platform, for example, the Aliyun (Alibaba Cloud) intelligent voice interaction platform;

s5, extracting entity information in the text based on the named entity recognition model, including the following substeps;

s5.1, loading a model file generated by training, wherein the model file comprises a dictionary, a label and a training model;

S5.2, performing data processing on the client's text information to generate a word index sequence; the data processing steps are similar to those of the model training stage, except that instead of constructing a dictionary, the loaded feature dictionary file is used;

and S5.3, inputting the generated word index sequence into the trained named entity recognition model, and returning the extracted entity information (department name and person name information).

S6, retrieving the extension number based on similarity analysis, including the following substeps;

s6.1, reading all department names in the database;

s6.2, calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

sim(pred, all_depart_i) = α · sim_sem(pred, all_depart_i) + β · sim_char(pred, all_depart_i) + γ · sim_pinyin(pred, all_depart_i),

where sim(pred, all_depart_i) is the similarity between the extracted department name pred and the i-th department name all_depart_i in the database, and the weights α, β, and γ are set empirically from multiple experiments. sim_sem(pred, all_depart_i) is the semantic similarity of the two, sim_char(pred, all_depart_i) is their Chinese character similarity, and sim_pinyin(pred, all_depart_i) is their pinyin similarity; the latter two are computed from the edit distance between the character strings and between their pinyin transcriptions, respectively. editDistance is an edit distance algorithm that casts the similarity of two strings as the cost of converting one string into the other: the higher the conversion cost, the lower the similarity of the two strings. A trigger-word mechanism is used when calculating the Chinese character similarity and the pinyin similarity: when the extracted department name can be matched directly in the database, its Chinese character similarity and pinyin similarity are directly set to the highest value, 1.

The similarities sim(pred, all_depart_i) between the extracted department name and all department names in the database are sorted, and the 3 real department names (i.e., names present in the database) with the highest similarity are selected.

S6.3, calculating the similarity between the extracted names and all the names under the selected department;

The person name similarity does not include a semantic component; it consists only of the Chinese character similarity and the pinyin similarity. The trigger-word mechanism is likewise used: when the extracted name can be matched directly in the database, its Chinese character similarity and pinyin similarity are directly set to the highest value, 1.

The similarities of all names under each department name selected in step S6.2 are sorted, and the 3 names with the highest similarity are selected.

S6.4, calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

from the 3 × 3 = 9 department name and person name pairs thus selected, the overall similarity is calculated as follows:

sim_{i,j} = sim(depart, all_depart_i) + sim(name, all_depart_i_name_j);

that is, for each department name all_depart_i selected in step S6.2, the department name similarity obtained in step S6.2 and the similarity sim(name, all_depart_i_name_j) of each name under that department obtained in step S6.3 are summed to give sim_{i,j}. Finally, the department name and person name pair with the highest overall similarity is selected.

S6.5, returning the extension number or switching to a preset voice script.

And S7, selecting the candidate with the highest similarity and executing the switching.

S7.1, setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, the client is guided, with a preset voice script, to state the desired contact's information again, and the method returns to the voice-to-text conversion step;

and S7.2, if the number of returns to the voice-to-text conversion step exceeds the switching threshold, the call is transferred to a human operator.

In a fourth embodiment, the present invention provides a telephone exchange extension switching system using the named entity recognition model, including:

the voice recognition module is used for converting the user voice information into text;

the information extraction module obtains a representation of each word containing both forward and backward information from the bi-directional LSTM layer of the named entity recognition model as follows:

bidirectional LSTM is a two-layer neural network. The first layer reads the sequence from the right, taking the last word of the sentence as input and outputting bh_i at each time step i:

bh_i = LSTM(x_i, bh_{i+1});

the second layer reads the sequence from the left, i.e., the input starts from the beginning of the sentence, and outputs fh_i at each time step i:

fh_i = LSTM(x_i, fh_{i-1});

the final output of the layer is the concatenation of the two LSTM hidden states:

h_i = [fh_i, bh_i].

the information extraction module captures word dependency inside sentences by adopting the following mode for a self-attention layer of the named entity recognition model;

at each time step i, the similarity between the current hidden state h_i and all hidden states h = [h_1, h_2, ..., h_T] is calculated, where T is the sequence length; the similarities are then normalized to obtain attention weights α, and α is used to compute a weighted sum over h, yielding the context vector c_i:

α_{i,j} = exp(score(h_i, h_j)) / Σ_{k=1..T} exp(score(h_i, h_k)),

c_i = Σ_{j=1..T} α_{i,j} · h_j.

The information extraction module defines that the output vector of the full-connection layer of the named entity recognition model is the prediction score of the current time step i for all the labels;

p_i = W_i([h_i, c_i]) + b_i,

where W_i and b_i are model parameters to be learned, initialized from a standard normal distribution, and p_i is the vector output by the fully connected layer, i.e., the prediction scores of the current time step i over all tags.

The extension retrieval module completes extension retrieval in the following mode;

reading all department names in a database;

calculating the similarity between the extracted department names and all department names in the database, wherein the department name similarity is the weighted sum of the text semantic similarity, the Chinese character similarity and the pinyin similarity;

calculating the similarity between the extracted names and all the names in the selected department;

calculating the overall similarity of the department names and the person names, and selecting the department names and the person names with the highest overall similarity;

wherein the overall similarity is the sum of the department name similarity and the person name similarity;

returning the extension number or switching to a preset voice script.

The extension switching module executes extension switching in the following mode;

setting an overall similarity threshold; if the calculated overall similarity is greater than or equal to the overall similarity threshold, the person's extension number is returned to the system;

if the calculated overall similarity is smaller than the overall similarity threshold, guiding the client, with a preset voice script, to state the desired contact's information again, and returning to the voice-to-text conversion step;

if the number of returns to the voice-to-text conversion step exceeds the switching threshold, the call is transferred to a human operator.

A fifth embodiment further improves on the fourth embodiment; parts identical to the fourth embodiment are not described again. The information extraction module can add constraint conditions to the CRF layer of the entity recognition model to improve the accuracy of the prediction result; the constraints are learned automatically by the CRF layer during training on the data. Possible constraints are:

1) the beginning of the sentence should be "B-" or "O" rather than "I-";

2) in the pattern "B-label1 I-label2 I-label3 ...", labels 1, 2, 3 should belong to the same entity category; for example, "B-Person I-Person" is correct, while "B-Person I-Hospital" is incorrect;

3) "O I-label" is invalid: the beginning of a named entity should be "B-" rather than "I-".

The information extraction module can carry out model training on the entity recognition model in the following way;

data preprocessing, including removing designated useless symbols, text word segmentation, removing designated stop words and constructing a feature dictionary;

the input data construction comprises the steps of converting the text sequence after word segmentation by using the generated feature dictionary, converting the word sequence into an index sequence, dividing a training set and a verification set according to a proportion, and storing the training set and the verification set as input files;

and model training, which comprises setting parameters, reading the training set and the verification set to perform model training and verification, storing the training result of the model, and returning the training and verification result.

The entity information extracted by the information extraction module comprises department names and person names, with 5 types of labels;

wherein the label is: a beginning part of a person name, a middle part of a person name, a beginning part of a department name, a middle part of a department name, and non-entity information.

The information extraction module extracts department name and person name information using the trained named entity recognition model in the following way;

loading a model file generated by training;

performing data processing on the text information of the client to generate a word index sequence;

and inputting the generated word index sequence into a trained named entity recognition model, and returning the extracted department name and person name information.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

The present invention has been described in detail with reference to the specific embodiments and examples, but these are not intended to limit the present invention. Many variations and modifications may be made by one of ordinary skill in the art without departing from the principles of the present invention, which should also be considered as within the scope of the present invention.
