Sentence-breaking method, device, equipment and storage medium based on natural language

Document No.: 661718 Publication date: 2021-04-27

Reading note: This technology, "Sentence-breaking method, device, equipment and storage medium based on natural language" (基于自然语言的断句方法、装置、设备及存储介质), was designed by 赵焕丽 (Zhao Huanli) and 徐国强 (Xu Guoqiang) on 2020-12-23. Main content: The invention relates to the technical field of big data and discloses a sentence-breaking method, device, equipment and storage medium based on natural language, which break sentences by means of a natural language processing algorithm and thereby improve the flexibility and accuracy of sentence breaking. The sentence-breaking method based on natural language comprises: acquiring voice data to be processed from a first service scene, or acquiring text data to be processed from a second service scene; when voice data to be processed is acquired, inputting the voice data to be processed into a preset voice recognition model and performing feature screening and sentence breaking in combination with a natural language processing algorithm to generate target text sentence break data; when text data to be processed is acquired, inputting the text data to be processed into a pre-trained text sentence break model for feature screening and sentence breaking to generate target text sentence break data; and generating target response data according to the target text sentence break data and the scene configuration. In addition, the invention also relates to blockchain technology: the voice data to be processed can be stored in a blockchain.

1. A sentence-breaking method based on natural language is characterized in that the sentence-breaking method based on natural language comprises the following steps:

acquiring voice data to be processed from a first service scene, or acquiring text data to be processed from a second service scene;

when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of text characters;

when text data to be processed is acquired from a second service scene, inputting the text data to be processed into a pre-trained text sentence break model, and performing feature screening and sentence break on the text data to be processed by combining a natural language processing algorithm to generate target text sentence break data;

and generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to a target terminal, wherein the scene configuration is a preset scene configuration.

2. The sentence-breaking method based on natural language according to claim 1, wherein, when the voice data to be processed is acquired from the first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of text characters, comprises:

when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing feature extraction to generate voice signal features;

and processing the voice signal features to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of target text characters.

3. The sentence-breaking method based on natural language according to claim 2, wherein when the voice data to be processed is obtained from the first service scene, the voice data to be processed is input into a preset voice recognition model for feature extraction, and generating the voice signal features comprises:

when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing noise elimination processing to generate voice data with noise eliminated;

performing signal enhancement processing on the voice data after the noise is eliminated to generate enhanced voice data;

and performing feature extraction on the voice data after the enhancement processing to generate voice signal features.

4. The sentence breaking method based on natural language according to claim 2, wherein the processing the speech signal features to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining with a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized includes a plurality of target text characters comprises:

inputting the voice signal characteristics into an acoustic model of the voice recognition model for scoring, and generating a plurality of acoustic model scores;

inputting the voice signal characteristics into a language model of the voice recognition model for scoring, and generating a plurality of language model scores;

searching a target acoustic model score and a target language model score with the highest score in the plurality of acoustic model scores and the plurality of language model scores, and determining a text sequence to be recognized based on the target acoustic model score and the target language model score, wherein the text sequence to be recognized comprises a plurality of target text characters;

and carrying out sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data.

5. The sentence breaking method based on natural language according to claim 1, wherein when the text data to be processed is obtained from the second service scenario, inputting the text data to be processed into a text sentence breaking model trained in advance, and performing feature screening and sentence breaking on the text data to be processed by combining a natural language processing algorithm, and generating the target text sentence breaking data comprises:

when text data to be processed is obtained from a second service scene, inputting the text data to be processed into a pre-trained text sentence breaking model, and performing feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence;

and performing sentence breaking based on the text observation sequence and the observation label sequence to generate target text sentence breaking data.

6. The sentence breaking method based on natural language according to claim 5, wherein when the text data to be processed is obtained from the second service scenario, inputting the text data to be processed into a text sentence breaking model trained in advance, and performing feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation tag sequence comprises:

when text data to be processed is obtained from a second service scene, inputting the text data to be processed into an embedding layer of a pre-trained text sentence-breaking model for vector mapping to generate a vector sequence, wherein the vector sequence does not include a blank space;

inputting the vector sequence into a bidirectional long short-term memory (BiLSTM) recurrent neural network, and performing feature screening to generate a vector sequence after feature screening;

inputting the vector sequence with the screened features into a conditional random field to generate a text observation sequence and an observation label sequence, wherein the text observation sequence comprises a plurality of characters, the observation label sequence comprises a plurality of observation labels, and the characters correspond to the observation labels one by one.

7. The natural language based sentence breaking method according to claim 5, wherein the sentence breaking based on the text observation sequence and the observation tag sequence, and generating target text sentence breaking data comprises:

judging whether each observation label in the observation label sequence is a preset sentence break label or not;

and if the target observation label is a sentence break label, determining that the character corresponding to the target observation label is the target sentence break character, adding a preset sentence break separator behind the target sentence break character to break the sentence, and generating target text sentence break data.

8. A sentence-breaking device based on natural language, characterized in that the sentence-breaking device based on natural language comprises:

the acquisition module is used for acquiring voice data to be processed from a first service scene or acquiring text data to be processed from a second service scene;

the first sentence-breaking module is used for inputting voice data to be processed into a preset voice recognition model when the voice data to be processed is obtained from a first service scene, generating a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence-breaking data, wherein the text sequence to be recognized comprises a plurality of text characters;

the second sentence-breaking module is used for inputting the text data to be processed into a pre-trained text sentence-breaking model when the text data to be processed is obtained from a second service scene, and performing feature screening and sentence breaking on the text data to be processed by combining a natural language processing algorithm to generate target text sentence-breaking data;

and the response data generation module is used for generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to the target terminal, wherein the scene configuration is a preset scene configuration.

9. A natural language based sentence-breaking device, characterized in that the natural language based sentence-breaking device comprises: a memory and at least one processor, the memory having instructions stored therein;

the at least one processor invoking the instructions in the memory to cause the natural language based sentence-breaking device to perform the natural language based sentence-breaking method of any of claims 1-7.

10. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement a natural language based sentence-breaking method according to any of claims 1-7.

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a sentence-breaking method, device, equipment, and storage medium based on natural language.

Background

With the continuous development and application of artificial intelligence technology, more and more robot service applications are appearing, and human-machine interaction has become a common technology. In the customer service industry, whether in a telephone sales center or a customer service center, an intelligent customer service robot can help enterprises reduce labor costs and greatly improve working efficiency, making it an effective assistant for customer service personnel. Breaking the user's utterances into sentences is the first step of text processing for a customer service robot; it affects the accuracy of all subsequent interaction modules and therefore the performance of the robot as a whole.

In telephone sales and customer service center scenes, a user communicates with customer service through voice signals or text, and both the user's voice signals and text must be broken into sentences. At present, most sentence-breaking modules of customer service robots on the market rely on voice endpoint detection: the starting point and end point of each actual voice segment are detected in a continuous audio signal using voice characteristics such as frequency-domain features, spectral entropy and fundamental frequency, and a sentence break is then placed at the end point, or decided according to the length of the pause between that end point and the next starting point.
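The pause-based approach described above can be illustrated with a minimal sketch. It assumes per-frame energies as the only voice characteristic (a simplification of the frequency-domain, spectral-entropy and fundamental-frequency features named above), and the function names and thresholds are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def speech_segments(frame_energies: List[float],
                    energy_threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (end exclusive) of voiced runs.

    A frame counts as voiced when its energy exceeds the threshold.
    """
    segments = []
    start = None
    for i, energy in enumerate(frame_energies):
        if energy > energy_threshold and start is None:
            start = i                      # starting point of a voice segment
        elif energy <= energy_threshold and start is not None:
            segments.append((start, i))    # end point of the voice segment
            start = None
    if start is not None:
        segments.append((start, len(frame_energies)))
    return segments


def break_by_pause(segments: List[Tuple[int, int]],
                   min_pause_frames: int) -> List[List[Tuple[int, int]]]:
    """Group voice segments into sentences.

    A sentence break is placed wherever the pause between one end point
    and the next starting point lasts at least min_pause_frames.
    """
    sentences = []
    current = []
    for seg in segments:
        if current and seg[0] - current[-1][1] >= min_pause_frames:
            sentences.append(current)
            current = []
        current.append(seg)
    if current:
        sentences.append(current)
    return sentences
```

Because the decision depends only on pause length, a speaker who hesitates mid-sentence is split, and one who runs sentences together is not, regardless of what was actually said.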

Disclosure of Invention

The invention provides a sentence-breaking method, a sentence-breaking device, sentence-breaking equipment and a storage medium based on natural language, which are used for breaking sentences by adopting a natural language processing algorithm, so that the flexibility and the accuracy of the sentence-breaking are improved.

The invention provides a sentence-breaking method based on natural language in a first aspect, which comprises the following steps: acquiring voice data to be processed from a first service scene, or acquiring text data to be processed from a second service scene; when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of text characters; when text data to be processed is acquired from a second service scene, inputting the text data to be processed into a pre-trained text sentence break model, and performing feature screening and sentence break on the text data to be processed by combining a natural language processing algorithm to generate target text sentence break data; and generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to a target terminal, wherein the scene configuration is a preset scene configuration.
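The overall flow of the first aspect can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the `VoiceInput` and `TextInput` types, the callables `recognize`, `break_recognized` and `break_text`, and the use of a template string as the "scene configuration" are all hypothetical stand-ins, not the models or configuration format of the invention.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class VoiceInput:
    samples: List[float]   # voice data from the first service scene
    scene: str


@dataclass
class TextInput:
    text: str              # text data from the second service scene
    scene: str


def handle_request(
    request: Union[VoiceInput, TextInput],
    recognize: Callable[[List[float]], str],        # stand-in for the voice recognition model
    break_recognized: Callable[[str], List[str]],   # stand-in for NLP sentence breaking of recognized text
    break_text: Callable[[str], List[str]],         # stand-in for the text sentence break model
    scene_config: Dict[str, str],                   # hypothetical: scene name -> response template
) -> str:
    """Route the input by type, break it into sentences, and build a response."""
    if isinstance(request, VoiceInput):
        sentences = break_recognized(recognize(request.samples))
    elif isinstance(request, TextInput):
        sentences = break_text(request.text)
    else:
        raise TypeError("unsupported request type")
    # The response depends on both the sentence break data and the preset scene configuration.
    return scene_config[request.scene].format(n=len(sentences))
```

Passing the models in as callables keeps the sketch independent of any particular recognizer or sentence break model.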

Optionally, in a first implementation manner of the first aspect of the present invention, when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, where the text sequence to be recognized includes a plurality of text characters, includes: when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing feature extraction to generate voice signal features; and processing the voice signal features to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of target text characters.

Optionally, in a second implementation manner of the first aspect of the present invention, when obtaining the voice data to be processed from the first service scenario, inputting the voice data to be processed into a preset voice recognition model for feature extraction, and generating the voice signal feature includes: when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing noise elimination processing to generate voice data with noise eliminated; performing signal enhancement processing on the voice data after the noise is eliminated to generate enhanced voice data; and performing feature extraction on the voice data after the enhancement processing to generate voice signal features.
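The three stages of this implementation manner (noise elimination, signal enhancement, feature extraction) can be sketched as follows. The source does not name the concrete algorithms, so every choice here is an assumption made for illustration: a simple noise gate for noise elimination, a pre-emphasis filter for enhancement, and per-frame log energy as the voice signal feature.

```python
import math
from typing import List


def remove_noise(samples: List[float], noise_floor: float) -> List[float]:
    """Noise gate (assumed technique): zero out samples below the noise floor."""
    return [s if abs(s) >= noise_floor else 0.0 for s in samples]


def enhance(samples: List[float], coeff: float = 0.97) -> List[float]:
    """Pre-emphasis (assumed technique): y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def extract_features(samples: List[float], frame_len: int, hop: int) -> List[float]:
    """Per-frame log energy (assumed feature) over full frames only."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)
        features.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return features


def voice_signal_features(samples: List[float]) -> List[float]:
    """Run the three stages in the order given in the text."""
    denoised = remove_noise(samples, noise_floor=0.05)
    enhanced = enhance(denoised)
    return extract_features(enhanced, frame_len=4, hop=4)
```

A production recognizer would typically use richer features, but the ordering of the stages is the point of the sketch.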

Optionally, in a third implementation manner of the first aspect of the present invention, the processing the characteristics of the speech signal to generate a text sequence to be recognized, and performing characteristic screening and sentence segmentation on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence segmentation data, where the text sequence to be recognized includes a plurality of target text characters includes: inputting the voice signal characteristics into an acoustic model of the voice recognition model for scoring, and generating a plurality of acoustic model scores; inputting the voice signal characteristics into a language model of the voice recognition model for scoring, and generating a plurality of language model scores; searching a target acoustic model score and a target language model score with the highest score in the plurality of acoustic model scores and the plurality of language model scores, and determining a text sequence to be recognized based on the target acoustic model score and the target language model score, wherein the text sequence to be recognized comprises a plurality of target text characters; and carrying out sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data.
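A minimal sketch of the scoring step follows. The text says only that the highest-scoring acoustic and language model scores are searched and used to determine the text sequence; combining the two log scores as a weighted sum per candidate, as done here, is an assumption borrowed from common speech-recognition decoding practice, and `lm_weight` and the candidate tuple layout are hypothetical.

```python
from typing import List, Tuple

# Each candidate: (text sequence, acoustic model log score, language model log score)
Candidate = Tuple[str, float, float]


def pick_best_candidate(candidates: List[Candidate], lm_weight: float = 1.0) -> str:
    """Return the candidate text whose combined score is highest."""
    if not candidates:
        raise ValueError("no candidates to score")
    best = max(candidates, key=lambda c: c[1] + lm_weight * c[2])
    return best[0]
```

With `lm_weight` above zero, a candidate that is acoustically plausible but linguistically unlikely loses to one that both models agree on.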

Optionally, in a fourth implementation manner of the first aspect of the present invention, when text data to be processed is obtained from the second service scenario, the inputting the text data to be processed into a text sentence break model trained in advance, and performing feature screening and sentence breaking on the text data to be processed by combining a natural language processing algorithm to generate target text sentence break data includes: when text data to be processed is obtained from a second service scene, inputting the text data to be processed into a pre-trained text sentence breaking model, and performing feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence; and performing sentence breaking based on the text observation sequence and the observation label sequence to generate target text sentence breaking data.

Optionally, in a fifth implementation manner of the first aspect of the present invention, when text data to be processed is obtained from the second service scenario, inputting the text data to be processed into a text sentence break model trained in advance, and performing feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence includes: when text data to be processed is obtained from a second service scene, inputting the text data to be processed into an embedding layer of a pre-trained text sentence-breaking model for vector mapping to generate a vector sequence, wherein the vector sequence does not include a blank space; inputting the vector sequence into a bidirectional long short-term memory (BiLSTM) recurrent neural network, and performing feature screening to generate a vector sequence after feature screening; inputting the vector sequence with the screened features into a conditional random field to generate a text observation sequence and an observation label sequence, wherein the text observation sequence comprises a plurality of characters, the observation label sequence comprises a plurality of observation labels, and the characters correspond to the observation labels one by one.
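The final, conditional-random-field stage of this pipeline can be illustrated with a Viterbi decoder, the standard way to read the best label sequence out of a linear-chain CRF. The sketch assumes the earlier layers have already produced a per-character score for each label (the `emissions`), and the label names "N" (no break) and "B" (break), like the score values, are hypothetical.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],           # one label -> score dict per character
    transitions: Dict[Tuple[str, str], float],   # (previous label, current label) -> score
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence, one label per character."""
    if not emissions:
        return []
    # Best score of any path ending in each label at the first character.
    scores = {lab: emissions[0][lab] for lab in labels}
    backpointers = []
    for emit in emissions[1:]:
        new_scores = {}
        pointers = {}
        for cur in labels:
            # Best previous label to arrive at `cur` from.
            prev = max(labels, key=lambda p: scores[p] + transitions.get((p, cur), 0.0))
            new_scores[cur] = scores[prev] + transitions.get((prev, cur), 0.0) + emit[cur]
            pointers[cur] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(labels, key=lambda lab: scores[lab])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The returned list has exactly one label per input character, matching the one-to-one correspondence between characters and observation labels described above.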

Optionally, in a sixth implementation manner of the first aspect of the present invention, the performing sentence break based on the text observation sequence and the observation tag sequence, and generating target text sentence break data includes: judging whether each observation label in the observation label sequence is a preset sentence break label or not; and if the target observation label is a sentence break label, determining that the character corresponding to the target observation label is the target sentence break character, adding a preset sentence break separator behind the target sentence break character to break the sentence, and generating target text sentence break data.
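The label check in this implementation manner amounts to a single pass over the aligned characters and labels. In the sketch below the break label "B" and the separator "|" are hypothetical placeholders for the preset sentence break label and the preset sentence break separator.

```python
from typing import List


def break_sentences(chars: List[str], labels: List[str],
                    break_label: str = "B", separator: str = "|") -> str:
    """Insert the separator after every character whose label is the break label."""
    if len(chars) != len(labels):
        raise ValueError("characters and labels must correspond one to one")
    pieces = []
    for ch, lab in zip(chars, labels):
        pieces.append(ch)
        if lab == break_label:
            pieces.append(separator)   # the sentence ends at this character
    return "".join(pieces)
```

For instance, the characters of "okthanksbye" with break labels on "k", "s" and "e" would come out as "ok|thanks|bye|".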

A second aspect of the present invention provides a sentence-breaking device based on natural language, including: the acquisition module is used for acquiring voice data to be processed from a first service scene or acquiring text data to be processed from a second service scene; the first sentence-breaking module is used for inputting voice data to be processed into a preset voice recognition model when the voice data to be processed is obtained from a first service scene, generating a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence-breaking data, wherein the text sequence to be recognized comprises a plurality of text characters; the second sentence-breaking module is used for inputting the text data to be processed into a pre-trained text sentence-breaking model when the text data to be processed is obtained from a second service scene, and performing feature screening and sentence breaking on the text data to be processed by combining a natural language processing algorithm to generate target text sentence-breaking data; and the response data generation module is used for generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to the target terminal, wherein the scene configuration is a preset scene configuration.

Optionally, in a first implementation manner of the second aspect of the present invention, the first sentence-breaking module includes: the feature extraction unit is used for inputting voice data to be processed into a preset voice recognition model when the voice data to be processed is acquired from a first service scene, and performing feature extraction to generate voice signal features; and the first sentence-breaking unit is used for processing the voice signal features to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of target text characters.

Optionally, in a second implementation manner of the second aspect of the present invention, the feature extraction unit may be further specifically configured to: when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing noise elimination processing to generate voice data with noise eliminated; performing signal enhancement processing on the voice data after the noise is eliminated to generate enhanced voice data; and performing feature extraction on the voice data after the enhancement processing to generate voice signal features.

Optionally, in a third implementation manner of the second aspect of the present invention, the first sentence-breaking unit may be further specifically configured to: inputting the voice signal characteristics into an acoustic model of the voice recognition model for scoring, and generating a plurality of acoustic model scores; inputting the voice signal characteristics into a language model of the voice recognition model for scoring, and generating a plurality of language model scores; searching a target acoustic model score and a target language model score with the highest score in the plurality of acoustic model scores and the plurality of language model scores, and determining a text sequence to be recognized based on the target acoustic model score and the target language model score, wherein the text sequence to be recognized comprises a plurality of target text characters; and carrying out sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data.

Optionally, in a fourth implementation manner of the second aspect of the present invention, the second sentence-breaking module includes: the feature screening module is used for inputting the text data to be processed into a pre-trained text sentence-breaking model when the text data to be processed is obtained from a second service scene, and performing feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence; and the second sentence breaking module is used for carrying out sentence breaking based on the text observation sequence and the observation label sequence to generate target text sentence breaking data.

Optionally, in a fifth implementation manner of the second aspect of the present invention, the feature screening module may be further specifically configured to: when text data to be processed is obtained from a second service scene, inputting the text data to be processed into an embedding layer of a pre-trained text sentence-breaking model for vector mapping to generate a vector sequence, wherein the vector sequence does not include a blank space; inputting the vector sequence into a bidirectional long short-term memory (BiLSTM) recurrent neural network, and performing feature screening to generate a vector sequence after feature screening; inputting the vector sequence with the screened features into a conditional random field to generate a text observation sequence and an observation label sequence, wherein the text observation sequence comprises a plurality of characters, the observation label sequence comprises a plurality of observation labels, and the characters correspond to the observation labels one by one.

Optionally, in a sixth implementation manner of the second aspect of the present invention, the second sentence segmentation module may be further configured to: judging whether each observation label in the observation label sequence is a preset sentence break label or not; and if the target observation label is a sentence break label, determining that the character corresponding to the target observation label is the target sentence break character, adding a preset sentence break separator behind the target sentence break character to break the sentence, and generating target text sentence break data.

A third aspect of the present invention provides a sentence-breaking device based on natural language, including: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the natural language based sentence break apparatus to perform the natural language based sentence break method described above.

A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the natural language based sentence-breaking method described above.

In the technical scheme provided by the invention, the voice data to be processed is obtained from a first service scene, or the text data to be processed is obtained from a second service scene; when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of text characters; when text data to be processed is acquired from a second service scene, inputting the text data to be processed into a pre-trained text sentence break model, and performing feature screening and sentence break on the text data to be processed by combining a natural language processing algorithm to generate target text sentence break data; and generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to a target terminal, wherein the scene configuration is a preset scene configuration.

In the embodiment of the invention, voice data to be processed that is acquired from a first service scene (a telephone sales scene) is input into a voice recognition model to generate a text sequence to be recognized, and the text sequence to be recognized is broken into sentences in combination with a natural language processing algorithm; or text data to be processed that is acquired from a second service scene (a customer service scene) is input into a trained text sentence break model, and the text data to be processed is broken into sentences in combination with a natural language processing algorithm. Because a natural language processing algorithm is used to break both the voice data of the first service scene and the text data of the second service scene into sentences, the flexibility and accuracy of sentence breaking are improved.

Drawings

FIG. 1 is a diagram of an embodiment of a sentence-breaking method based on natural language according to an embodiment of the present invention;

FIG. 2 is a diagram of another embodiment of a sentence-breaking method based on natural language according to an embodiment of the present invention;

FIG. 3 is a diagram of an embodiment of a sentence-breaking device based on natural language according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another embodiment of a sentence-breaking device based on natural language according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an embodiment of a sentence-breaking device based on natural language according to an embodiment of the present invention.

Detailed Description

The embodiment of the invention provides a sentence-breaking method, a sentence-breaking device, sentence-breaking equipment and a storage medium based on natural language, which are used for carrying out sentence-breaking by adopting a natural language processing algorithm, so that the flexibility and the accuracy of the sentence-breaking are improved.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

For ease of understanding, a specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, an embodiment of a sentence-breaking method based on natural language in the embodiment of the present invention includes:

101. acquiring voice data to be processed from a first service scene, or acquiring text data to be processed from a second service scene;

the server acquires voice data from the first service scene to obtain voice data to be processed, or acquires text data from the second service scene to obtain text data to be processed. It is emphasized that, in order to further ensure the privacy and security of the data fields, the to-be-processed voice data and the to-be-processed text data can also be stored in nodes of a blockchain.

In this embodiment, the first service scenario is a telemarketing scenario, the second service scenario is a customer service scenario, the data acquired by the server from the telemarketing scenario is voice type data, that is, to-be-processed voice data, and the data acquired by the server from the second service scenario is text type data, that is, to-be-processed text data, where the to-be-processed voice data may be "what is asked", "is am", "is there", and the like, and the to-be-processed text data may be "you have consulted a question", and "you know thank you" and the like.

It is to be understood that the execution subject of the present invention may be a sentence-breaking device based on natural language, and may also be a terminal or a server, which is not limited herein. The embodiment of the present invention is described by taking a server as an execution subject.

102. When voice data to be processed is obtained from a first service scene, inputting the voice data to be processed into a preset voice recognition model to generate a text sequence to be recognized, and performing feature screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of text characters;

When the voice data to be processed is obtained from the first service scene, the server inputs the voice data to be processed into the voice recognition model for processing, first generating a text sequence to be recognized, and then performs sentence breaking on the text sequence to be recognized in combination with a natural language processing algorithm, thereby generating target text sentence breaking data.

For example, when the server obtains the to-be-processed voice data "what is asked" from the telephone sales scene, the server inputs it into the voice recognition model, where it undergoes noise elimination, channel processing, feature extraction, and the like, to generate the to-be-recognized text sequence [what is asked]. The server then performs sentence breaking on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate the target text sentence breaking data "what is asked".

103. When text data to be processed is obtained from a second service scene, inputting the text data to be processed into a pre-trained text punctuation model, and performing feature screening and punctuation on the text data to be processed by combining a natural language processing algorithm to generate target text punctuation data;

when the text data to be processed is obtained from the second service scene, the server inputs the text data to be processed into the trained text punctuation model for data processing, and punctuation processing is carried out on the text data to be processed by combining a natural language processing algorithm, so that target text punctuation data is generated.

For example, when the to-be-processed text data 'hello consult a question' is acquired from the customer service scene, the server inputs it into the trained text sentence break model for data processing, and performs sentence breaking on it in combination with a natural language processing algorithm to generate the target text sentence break data 'hello, consult a question'.

104. And generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to the target terminal, wherein the scene configuration is the preset scene configuration.

The server generates target response data according to the target text sentence break data and transmits the target response data to the target terminal, for example, the target text sentence break data is 'hello, ask for a question', the server generates the target response data 'please explain the question' based on the target text sentence break data of 'hello, ask for a question' and corresponding scene configuration, and finally transmits the target response data of 'please explain the question' to the target terminal.
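Steps 101 to 104 can be sketched as a small dispatch routine. This is only an illustrative outline under stated assumptions: the scene names, the `SCENE_CONFIG` table, the fallback response, and the `recognize_speech` and `break_sentences` callables are hypothetical stand-ins for the preset scene configuration and the models described above, not the actual implementation.

```python
from typing import Callable, Dict, Union

# Hypothetical preset scene configuration: for each scene, a mapping from
# sentence-broken text to a response. The entries are placeholders.
SCENE_CONFIG: Dict[str, Dict[str, str]] = {
    "customer_service": {
        "hello, consult a question": "please explain the question",
    },
}
DEFAULT_RESPONSE = "sorry, please repeat that"  # assumed fallback, not in the source


def handle_request(
    scene: str,
    payload: Union[bytes, str],
    recognize_speech: Callable[[bytes], str],
    break_sentences: Callable[[str], str],
) -> str:
    """Route by scene, break sentences, then look up the target response."""
    if scene == "telemarketing":
        # First service scene: voice data goes through speech recognition first.
        text = recognize_speech(payload)  # type: ignore[arg-type]
    elif scene == "customer_service":
        # Second service scene: the payload is already text.
        text = str(payload)
    else:
        raise ValueError(f"unknown scene: {scene}")
    broken = break_sentences(text)
    return SCENE_CONFIG.get(scene, {}).get(broken, DEFAULT_RESPONSE)
```

In practice the two callables would wrap the voice recognition model and the sentence-breaking step; here they are injected so that the routing logic can be exercised with simple stubs.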

In the embodiment of the invention, to-be-processed voice data acquired from a first service scene (a telephone sales scene) is input into a voice recognition model to generate a to-be-recognized text sequence, and the to-be-recognized text sequence is sentence-broken in combination with a natural language processing algorithm; or to-be-processed text data acquired from a second service scene (a customer service scene) is input into a trained text sentence breaking model, and the to-be-processed text data is sentence-broken in combination with the natural language processing algorithm. Because the natural language processing algorithm is used to break sentences in both the voice data of the first service scene and the text data of the second service scene, the flexibility and accuracy of sentence breaking are improved.

Referring to fig. 2, another embodiment of the sentence-breaking method based on natural language according to the embodiment of the present invention includes:

201. acquiring voice data to be processed from a first service scene, or acquiring text data to be processed from a second service scene;

the server acquires voice data from the first service scene to obtain voice data to be processed, or acquires text data from the second service scene to obtain text data to be processed. It is emphasized that, in order to further ensure the privacy and security of the data fields, the to-be-processed voice data and the to-be-processed text data can also be stored in nodes of a blockchain.

In this embodiment, the first service scenario is a telemarketing scenario, the second service scenario is a customer service scenario, the data acquired by the server from the telemarketing scenario is voice type data, that is, to-be-processed voice data, and the data acquired by the server from the second service scenario is text type data, that is, to-be-processed text data, where the to-be-processed voice data may be "what is asked", "is am", "is there", and the like, and the to-be-processed text data may be "you have consulted a question", and "you know thank you" and the like.

202. When voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing feature extraction to generate voice signal features;

For example, when the server obtains the to-be-processed voice data "asking what is asked" from the telephone sales scene, the server inputs the to-be-processed voice data "asking what is asked" into the voice recognition model for feature extraction, and the generated voice signal features are as follows:

specifically, when voice data to be processed is acquired from a first service scene, the server inputs the voice data to be processed into a preset voice recognition model, noise elimination processing is carried out, and voice data with noise eliminated is generated; then the server performs signal enhancement processing on the voice data after the noise is eliminated to generate enhanced voice data; and finally, the server extracts the features of the voice data after the enhancement processing to generate voice signal features.

When the server acquires the to-be-processed voice data "asking what is asked" from the telephone sales scene, the server inputs it into the voice recognition model and first performs noise processing on it. Noise is interference data in the data, that is, inaccurate data. In this embodiment, a clustering algorithm is used for noise processing: similar sample points in the to-be-processed voice data are grouped into clusters, the sample points falling outside the clusters are determined to be noise points, and the noise points are filtered out to generate the noise-eliminated voice data. The server then performs signal enhancement processing on the noise-eliminated voice data: pre-emphasis is applied to amplify the high-frequency signal; the pre-emphasized voice data is split into short-time frames; a window function is applied to the split voice data; and the windowed voice data is computed and normalized in combination with a Fourier transform to generate the enhanced voice data. Finally, the server extracts features from the enhanced voice data to generate the voice signal features.
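The signal enhancement chain described above (pre-emphasis, framing, windowing, Fourier transform, normalization) can be sketched in plain Python. The pre-emphasis coefficient 0.97, the Hamming window, the frame sizes, and normalization by frame length are assumptions chosen for illustration; the source only states that a window function and a Fourier transform are used.

```python
import cmath
import math
from typing import List


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """Amplify high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def split_frames(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split into short-time frames; trailing samples that do not fill a frame are dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame: List[float]) -> List[float]:
    """Apply a Hamming window (an assumed choice of window function)."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def power_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT power spectrum, normalized by the frame length."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def enhance(signal: List[float], frame_len: int = 8, hop: int = 4) -> List[List[float]]:
    """Run pre-emphasis, framing, windowing and the normalized spectrum per frame."""
    emphasized = pre_emphasis(signal)
    return [power_spectrum(hamming(f)) for f in split_frames(emphasized, frame_len, hop)]
```

The naive DFT is quadratic in the frame length and is used only to keep the sketch dependency-free; a production system would normally use an FFT routine.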

203. Processing the characteristics of a voice signal to generate a text sequence to be recognized, and performing characteristic screening and sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data, wherein the text sequence to be recognized comprises a plurality of target text characters;

After generating the voice signal features in step 202, the server processes the voice signal features to generate the text sequence to be recognized [what is requested for which position], and finally generates the target text sentence break data "what is requested for which position and what is requested for which position" in combination with a natural language processing algorithm.

Specifically, the server inputs the voice signal characteristics into an acoustic model of the voice recognition model for scoring, and a plurality of acoustic model scores are generated; the server inputs the voice signal characteristics into a language model of the voice recognition model for scoring, and a plurality of language model scores are generated, wherein the language model can be an n-gram model, an RNN model and the like; and then the server searches a target acoustic model score and a target language model score with the highest score in the plurality of acoustic model scores and the plurality of language model scores by adopting a decoder, determines text characters corresponding to the target acoustic model score and the target language model score as target text characters so as to generate a text sequence to be recognized comprising a plurality of target text characters, and finally performs sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm so as to generate target text sentence breaking data.
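The decoder's search for the highest-scoring candidate can be illustrated with a minimal sketch. It assumes each candidate text already carries one acoustic model score and one language model score (as log-scores) and that the two are combined by a weighted sum; the weighting and the candidate format are assumptions, and a real decoder would typically search a much larger hypothesis space (for example with beam search).

```python
from typing import List, Tuple

# Each candidate is (text, acoustic_log_score, language_model_log_score).
Candidate = Tuple[str, float, float]


def decode(candidates: List[Candidate], lm_weight: float = 1.0) -> str:
    """Return the candidate text with the highest combined score."""
    if not candidates:
        raise ValueError("no candidates to decode")
    best = max(candidates, key=lambda c: c[1] + lm_weight * c[2])
    return best[0]
```

For instance, with candidates `("a", -1.0, -5.0)`, `("b", -2.0, -1.0)` and `("c", -3.0, -3.0)`, the combined scores are -6, -3 and -6, so `decode` selects `"b"`.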

204. When text data to be processed is obtained from a second service scene, inputting the text data to be processed into a pre-trained text punctuation model, and performing feature screening and punctuation on the text data to be processed by combining a natural language processing algorithm to generate target text punctuation data;

when the text data to be processed is obtained from the second service scene, the server inputs the text data to be processed into the trained text punctuation model for data processing, and punctuation processing is carried out on the text data to be processed by combining a natural language processing algorithm, so that target text punctuation data is generated.

For example, when the to-be-processed text data 'hello consult a question' is acquired from the customer service scene, the server inputs it into the trained text sentence break model for data processing, and performs sentence breaking on it in combination with a natural language processing algorithm to generate the target text sentence break data 'hello, consult a question'.

Specifically, when text data to be processed is acquired from a second service scene, the server inputs the text data to be processed into a pre-trained text sentence break model, and performs feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence; and the server carries out sentence breaking based on the text observation sequence and the observation label sequence to generate target text sentence breaking data.

When the server obtains the to-be-processed text data 'hello consult a question' from the customer service scene, the server inputs it into the trained text sentence break model and first generates, in combination with a natural language processing algorithm, the text observation sequence [hello consult a question] and the observation tag sequence '0100001'. The server then performs sentence breaking on the text observation sequence based on the observation tag sequence '0100001', and the generated target text sentence break data is 'hello, consult a question'.

When the text data to be processed is obtained from the second service scene, the server inputs the text data to be processed into a pre-trained text sentence-breaking model, and performs feature screening on the text data to be processed by combining a natural language processing algorithm to generate a text observation sequence and an observation label sequence, wherein the step of generating the text observation sequence and the observation label sequence comprises the following steps:

When text data to be processed is obtained from a second service scene, the server inputs the text data to be processed into an embedding layer of a pre-trained text sentence-breaking model for vector mapping to generate a vector sequence, wherein the vector sequence does not include spaces; the server then inputs the vector sequence into a bidirectional long short-term memory (BiLSTM) recurrent neural network and performs feature screening to generate a feature-screened vector sequence; finally, the server inputs the feature-screened vector sequence into a conditional random field to generate a text observation sequence and an observation label sequence, wherein the text observation sequence comprises a plurality of characters, the observation label sequence comprises a plurality of observation labels, and the plurality of characters correspond to the plurality of observation labels one to one.

When the server acquires the to-be-processed text data 'you can consult a plurality of problems' from the customer service scene, the server inputs the text data into the embedding layer for vector mapping. Note that the to-be-processed text data consists of a plurality of characters; through the embedding layer, the server maps these characters to word vectors in a low-dimensional space to generate an initial vector sequence, and filters out the spaces in the initial vector sequence based on a preset rule to generate the vector sequence. The server then inputs the vector sequence into the bidirectional long short-term memory recurrent neural network, that is, the BiLSTM neural network, which is used to remove useless features from the vector sequence. The specific process is as follows: a matrix identification parameter is retrieved and multiplied by the vector sequence, the useless features are obtained by calculation in combination with an activation function, and the useless features are filtered out, which completes the screening and generates the feature-screened vector sequence. The feature-screened vector sequence is then input into the conditional random field, that is, the CRF layer, for label calculation, to generate the text observation sequence [the problem of your consultation] and the observation label sequence '0100001'.
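The source does not name the algorithm used for label calculation in the CRF layer. A common choice for linear-chain CRFs is Viterbi decoding, sketched below as an assumption. Here `emissions[t][y]` stands for the per-character label score coming out of the BiLSTM, and `transitions[a][b]` for the learned score of moving from label `a` to label `b`; both names and all numeric values are hypothetical.

```python
from typing import List


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring label sequence for a linear-chain CRF."""
    if not emissions:
        return []
    n_labels = len(emissions[0])
    score = list(emissions[0])          # best score ending in each label so far
    backptr: List[List[int]] = []       # best previous label for each step

    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for y in range(n_labels):
            # Pick the previous label that maximizes score + transition into y.
            prev = max(range(n_labels), key=lambda p: score[p] + transitions[p][y])
            ptr.append(prev)
            new_score.append(score[prev] + transitions[prev][y] + emissions[t][y])
        score = new_score
        backptr.append(ptr)

    # Trace back from the best final label.
    best = max(range(n_labels), key=lambda y: score[y])
    path = [best]
    for ptr in reversed(backptr):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path
```

With two labels (0 for no break, 1 for break), emission scores that favor label 1 at the second and last of seven positions, and zero transition scores, the decoded path is `[0, 1, 0, 0, 0, 0, 1]`, which corresponds to the observation label sequence '0100001' in the example.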

The server carries out sentence breaking based on the text observation sequence and the observation label sequence, and the step of generating target text sentence breaking data comprises the following steps: the server judges whether each observation label in the observation label sequence is a preset sentence break label; and if the target observation label is a sentence break label, the server determines that the character corresponding to the target observation label is the target sentence break character, and adds a preset sentence break separator behind the target sentence break character to break the sentence, so as to generate target text sentence break data.

The server performs label judgment on the observation label sequence "0100001". Assuming that the sentence break label is "1" and the non-sentence-break label is "0", the server starts from the first observation label in the observation label sequence and judges whether each observation label is the preset sentence break label "1". The server judges that the second observation label and the last observation label are sentence break labels "1", determines the character corresponding to the second observation label as a target sentence break character, and adds the preset sentence break separator after the target sentence break character.
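The label judgment above amounts to a single pass over the characters and their labels. The sketch below assumes "1" is the sentence break label and uses a comma as the preset separator; the separator value and the decision not to append a separator after the final character are assumptions made for illustration.

```python
from typing import Sequence


def break_by_labels(
    chars: Sequence[str],
    labels: str,
    break_label: str = "1",
    separator: str = ",",
) -> str:
    """Insert the separator after every character whose label is the break label."""
    if len(chars) != len(labels):
        raise ValueError("characters and labels must correspond one to one")
    out = []
    last = len(chars) - 1
    for i, (ch, label) in enumerate(zip(chars, labels)):
        out.append(ch)
        # Assumption: a break label on the final character marks the end of
        # the text, so no separator is appended there.
        if label == break_label and i != last:
            out.append(separator)
    return "".join(out)
```

Using seven placeholder characters, `break_by_labels("abcdefg", "0100001")` returns `"ab,cdefg"`: the separator follows the second character, matching the label sequence in the example.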

205. And generating target response data according to the target text sentence break data and the corresponding scene configuration, and transmitting the target response data to the target terminal, wherein the scene configuration is the preset scene configuration.

The server generates target response data according to the target text sentence break data and transmits the target response data to the target terminal, for example, the target text sentence break data is 'hello, ask for a question', the server generates the target response data 'please explain the question' based on the target text sentence break data of 'hello, ask for a question' and corresponding scene configuration, and finally transmits the target response data of 'please explain the question' to the target terminal.

In the embodiment of the invention, to-be-processed voice data acquired from a first service scene (a telephone sales scene) is input into a voice recognition model to generate a to-be-recognized text sequence, and the to-be-recognized text sequence is sentence-broken in combination with a natural language processing algorithm; or to-be-processed text data acquired from a second service scene (a customer service scene) is input into a trained text sentence breaking model, and the to-be-processed text data is sentence-broken in combination with the natural language processing algorithm. Because the natural language processing algorithm is used to break sentences in both the voice data of the first service scene and the text data of the second service scene, the flexibility and accuracy of sentence breaking are improved.

In the above description of the sentence-breaking method based on natural language in the embodiment of the present invention, referring to fig. 3, a sentence-breaking device based on natural language in the embodiment of the present invention is described below, and an embodiment of the sentence-breaking device based on natural language in the embodiment of the present invention includes:

an obtaining module 301, configured to obtain to-be-processed voice data from a first service scenario, or obtain to-be-processed text data from a second service scenario;

the first sentence-breaking module 302 is configured to, when to-be-processed voice data is acquired from a first service scenario, input the to-be-processed voice data into a preset voice recognition model to generate a to-be-recognized text sequence, perform feature screening and sentence-breaking on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text sentence-breaking data, where the to-be-recognized text sequence includes a plurality of text characters;

the second sentence-breaking module 303 is configured to, when text data to be processed is acquired from a second service scenario, input the text data to be processed into a pre-trained text sentence-breaking model, perform feature screening and sentence-breaking on the text data to be processed by combining a natural language processing algorithm, and generate target text sentence-breaking data;

and the response data generating module 304 is configured to generate target response data according to the target text sentence break data and the corresponding scene configuration, and transmit the target response data to the target terminal, where the scene configuration is a preset scene configuration.

In the embodiment of the invention, to-be-processed voice data acquired from a first service scene (a telephone sales scene) is input into a voice recognition model to generate a to-be-recognized text sequence, and the to-be-recognized text sequence is sentence-broken in combination with a natural language processing algorithm; or to-be-processed text data acquired from a second service scene (a customer service scene) is input into a trained text sentence breaking model, and the to-be-processed text data is sentence-broken in combination with the natural language processing algorithm. Because the natural language processing algorithm is used to break sentences in both the voice data of the first service scene and the text data of the second service scene, the flexibility and accuracy of sentence breaking are improved.

Referring to fig. 4, another embodiment of the sentence-breaking device based on natural language according to the embodiment of the present invention includes:

an obtaining module 301, configured to obtain to-be-processed voice data from a first service scenario, or obtain to-be-processed text data from a second service scenario;

the first sentence-breaking module 302 is configured to, when to-be-processed voice data is acquired from a first service scenario, input the to-be-processed voice data into a preset voice recognition model to generate a to-be-recognized text sequence, perform feature screening and sentence-breaking on the to-be-recognized text sequence in combination with a natural language processing algorithm to generate target text sentence-breaking data, where the to-be-recognized text sequence includes a plurality of text characters;

the second sentence-breaking module 303 is configured to, when text data to be processed is acquired from a second service scenario, input the text data to be processed into a pre-trained text sentence-breaking model, perform feature screening and sentence-breaking on the text data to be processed by combining a natural language processing algorithm, and generate target text sentence-breaking data;

and the response data generating module 304 is configured to generate target response data according to the target text sentence break data and the corresponding scene configuration, and transmit the target response data to the target terminal, where the scene configuration is a preset scene configuration.

Optionally, the first sentence-breaking module 302 includes:

the feature extraction unit 3021, when acquiring to-be-processed voice data from a first service scenario, is configured to input the to-be-processed voice data into a preset voice recognition model, perform feature extraction, and generate a voice signal feature;

the first sentence-breaking unit 3022 is configured to process the characteristics of the voice signal to generate a text sequence to be recognized, perform characteristic screening and sentence-breaking on the text sequence to be recognized by combining with a natural language processing algorithm, and generate target text sentence-breaking data, where the text sequence to be recognized includes a plurality of target text characters.

Optionally, the feature extraction unit 3021 may be further specifically configured to:

when voice data to be processed is acquired from a first service scene, inputting the voice data to be processed into a preset voice recognition model, and performing noise elimination processing to generate voice data with noise eliminated;

performing signal enhancement processing on the voice data after the noise is eliminated to generate enhanced voice data;

and performing feature extraction on the voice data after the enhancement processing to generate voice signal features.

Optionally, the first sentence-breaking unit 3022 may be further specifically configured to:

inputting the voice signal characteristics into an acoustic model of the voice recognition model for scoring, and generating a plurality of acoustic model scores;

inputting the voice signal characteristics into a language model of the voice recognition model for scoring, and generating a plurality of language model scores;

searching a target acoustic model score and a target language model score with the highest score in the plurality of acoustic model scores and the plurality of language model scores, and determining a text sequence to be recognized based on the target acoustic model score and the target language model score, wherein the text sequence to be recognized comprises a plurality of target text characters;

and carrying out sentence breaking on the text sequence to be recognized by combining a natural language processing algorithm to generate target text sentence breaking data.

Optionally, the second sentence-breaking module 303 includes:

the feature screening module 3031 is configured to, when text data to be processed is acquired from a second service scenario, input the text data to be processed into a pre-trained text sentence segmentation model, perform feature screening on the text data to be processed by combining a natural language processing algorithm, and generate a text observation sequence and an observation tag sequence;

and a second sentence-breaking module 3032, configured to perform sentence breaking based on the text observation sequence and the observation tag sequence, and generate target text sentence-breaking data.

Optionally, the feature screening module 3031 may be further specifically configured to:

when text data to be processed is obtained from a second service scene, inputting the text data to be processed into an embedding layer of a pre-trained text sentence-breaking model for vector mapping to generate a vector sequence, wherein the vector sequence does not include a blank space;

inputting the vector sequence into a bidirectional long short-term memory (BiLSTM) recurrent neural network, and performing feature screening to generate a vector sequence after feature screening;

inputting the vector sequence with the screened features into a conditional random field to generate a text observation sequence and an observation label sequence, wherein the text observation sequence comprises a plurality of characters, the observation label sequence comprises a plurality of observation labels, and the characters correspond to the observation labels one by one.

Optionally, the second sentence-breaking module 3032 may be further specifically configured to:

judging whether each observation label in the observation label sequence is a preset sentence break label or not;

and if the target observation label is a sentence break label, determining that the character corresponding to the target observation label is the target sentence break character, adding a preset sentence break separator behind the target sentence break character to break the sentence, and generating target text sentence break data.

In the embodiment of the invention, to-be-processed voice data acquired from a first service scene (a telephone sales scene) is input into a voice recognition model to generate a to-be-recognized text sequence, and the to-be-recognized text sequence is sentence-broken in combination with a natural language processing algorithm; or to-be-processed text data acquired from a second service scene (a customer service scene) is input into a trained text sentence breaking model, and the to-be-processed text data is sentence-broken in combination with the natural language processing algorithm. Because the natural language processing algorithm is used to break sentences in both the voice data of the first service scene and the text data of the second service scene, the flexibility and accuracy of sentence breaking are improved.

FIGS. 3 and 4 above describe the sentence-breaking device based on natural language in the embodiment of the present invention in detail from the perspective of modular functional entities; the sentence-breaking device based on natural language in the embodiment of the present invention is described in detail below from the perspective of hardware processing.

FIG. 5 is a schematic structural diagram of a natural language-based sentence-breaking device 500 according to an embodiment of the present invention. The natural language-based sentence-breaking device 500 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPUs) 510, a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) for storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations for the natural language-based sentence-breaking device 500. Still further, the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the natural language-based sentence-breaking device 500, the series of instruction operations in the storage medium 530.

The natural language-based sentence-breaking device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the structure shown in Fig. 5 does not limit the natural language-based sentence-breaking device, which may include more or fewer components than shown, combine some components, or arrange the components differently.

The invention also provides a natural language-based sentence-breaking device, which comprises a memory and a processor, wherein computer-readable instructions are stored in the memory and, when executed by the processor, cause the processor to execute the steps of the natural language-based sentence-breaking method in the foregoing embodiments.

The present invention also provides a computer-readable storage medium, which may be either a non-volatile or a volatile computer-readable storage medium, having stored therein instructions which, when run on a computer, cause the computer to perform the steps of the natural language-based sentence-breaking method.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, in which each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
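The cryptographic linking of data blocks mentioned above can be illustrated by the following minimal Python sketch. It shows only the hash linkage between consecutive blocks; it does not implement distributed storage, point-to-point transmission, or a consensus mechanism, and it is not the blockchain platform of the invention. The names `Block`, `append_block`, and `is_valid`, and the choice of SHA-256, are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


@dataclass(frozen=True)
class Block:
    index: int
    payload: str      # e.g. an identifier of stored to-be-processed voice data
    prev_hash: str    # digest of the previous block

    def digest(self) -> str:
        """SHA-256 over a canonical JSON encoding of the block's fields."""
        body = json.dumps(
            {"index": self.index, "payload": self.payload, "prev_hash": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Block], payload: str) -> List[Block]:
    """Return a new chain with one more block linked to the current tail."""
    prev_hash = chain[-1].digest() if chain else GENESIS_PREV_HASH
    return chain + [Block(len(chain), payload, prev_hash)]


def is_valid(chain: List[Block]) -> bool:
    """Check that every block records the digest of its predecessor.

    Note: a change to the last block is only detectable if its digest
    is also recorded elsewhere; this sketch does not cover that.
    """
    expected = GENESIS_PREV_HASH
    for i, block in enumerate(chain):
        if block.index != i or block.prev_hash != expected:
            return False
        expected = block.digest()
    return True


if __name__ == "__main__":
    chain: List[Block] = []
    chain = append_block(chain, "voice-record-001")
    chain = append_block(chain, "voice-record-002")
    print(is_valid(chain))  # the untouched chain passes the linkage check

    # Altering an earlier block breaks the link stored in its successor.
    tampered = [Block(0, "forged", chain[0].prev_hash)] + chain[1:]
    print(is_valid(tampered))
```

Because each block stores the digest of its predecessor, altering any earlier block changes that digest and invalidates the link held by the next block, which is the anti-counterfeiting property referred to above.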

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
