Sentence and word acquisition method and device

Document No.: 1953430 Publication date: 2021-12-10

Reading note: the invention "Sentence and word acquisition method and device" was created by 尹红霞 on 2021-09-10. The invention discloses a sentence and word acquisition method and device. The method comprises: obtaining a sentence application scenario; classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information; segmenting the sentence information to obtain segmented sentence information; and acquiring word data from the segmented sentence information. The invention addresses the technical problem that prior-art sentence and word acquisition methods only acquire original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

1. A sentence and word acquisition method, characterized by comprising the following steps:

obtaining a sentence application scenario;

classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information;

segmenting the sentence information to obtain segmented sentence information;

and acquiring word data from the segmented sentence information.

2. The method of claim 1, wherein before obtaining the sentence application scenario, the method further comprises:

acquiring original sentence data.

3. The method according to claim 1, wherein before segmenting the sentence information to obtain the segmented sentence information, the method further comprises:

acquiring a segmentation strategy according to the sentence information.

4. The method according to claim 1, wherein after acquiring the word data from the segmented sentence information, the method further comprises:

storing the word data.

5. A sentence and word acquisition device, characterized by comprising:

an obtaining module, configured to obtain a sentence application scenario;

a classification module, configured to classify the sentence application scenario according to a preset application scenario classification rule to obtain sentence information;

a segmentation module, configured to segment the sentence information to obtain segmented sentence information;

and an acquisition module, configured to acquire word data from the segmented sentence information.

6. The apparatus of claim 5, further comprising:

the acquisition module, further configured to acquire original sentence data.

7. The apparatus of claim 5, further comprising:

a strategy module, configured to acquire a segmentation strategy according to the sentence information.

8. The apparatus of claim 5, further comprising:

a storage module, configured to store the word data.

9. A non-volatile storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the non-volatile storage medium is located to perform the method of any one of claims 1 to 4.

10. An electronic device comprising a processor and a memory; the memory has stored therein computer readable instructions for execution by the processor, wherein the computer readable instructions when executed perform the method of any one of claims 1 to 4.

Technical Field

The invention relates to the field of sentence acquisition, in particular to a sentence and word acquisition method and device.

Background

With the continuous development of intelligent technology, people increasingly use smart devices in daily life, work, and study. These technologies have improved people's quality of life and made study and work more efficient.

At present, sentence data is generally acquired by splitting an original sentence, analyzing the split sentence data, and extracting usable information for the relevant acquisition operations. However, traditional sentence and word acquisition methods only acquire the original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

Embodiments of the invention provide a sentence and word acquisition method and device that at least solve the technical problem that prior-art sentence and word acquisition methods only acquire original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

According to an aspect of an embodiment of the present invention, there is provided a sentence and word acquisition method, including: obtaining a sentence application scenario; classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information; segmenting the sentence information to obtain segmented sentence information; and acquiring word data from the segmented sentence information.

Optionally, before obtaining the sentence application scenario, the method further includes: acquiring original sentence data.

Optionally, before segmenting the sentence information to obtain the segmented sentence information, the method further includes: acquiring a segmentation strategy according to the sentence information.

Optionally, after acquiring the word data from the segmented sentence information, the method further includes: storing the word data.

According to another aspect of the embodiments of the present invention, there is also provided a sentence and word acquisition device, including: an obtaining module, configured to obtain a sentence application scenario; a classification module, configured to classify the sentence application scenario according to a preset application scenario classification rule to obtain sentence information; a segmentation module, configured to segment the sentence information to obtain segmented sentence information; and an acquisition module, configured to acquire word data from the segmented sentence information.

Optionally, the acquisition module is further configured to acquire original sentence data.

Optionally, the apparatus further comprises: a strategy module, configured to acquire a segmentation strategy according to the sentence information.

Optionally, the apparatus further comprises: a storage module, configured to store the word data.

According to another aspect of the embodiments of the present invention, a non-volatile storage medium is further provided. The non-volatile storage medium includes a stored program, and the program, when run, controls the device in which the non-volatile storage medium is located to execute the sentence and word acquisition method.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device including a processor and a memory. The memory stores computer-readable instructions, and the processor is configured to execute them; when run, the computer-readable instructions execute the sentence and word acquisition method.

In the embodiment of the invention, a sentence application scenario is obtained; the sentence application scenario is classified according to a preset application scenario classification rule to obtain sentence information; the sentence information is segmented to obtain segmented sentence information; and word data is acquired from the segmented sentence information. This approach solves the technical problem that prior-art sentence acquisition methods only acquire original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flowchart of a sentence and word acquisition method according to an embodiment of the present invention;

FIG. 2 is a block diagram of a sentence and word acquisition device according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to an embodiment of the present invention, a method embodiment of a sentence and word acquisition method is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps may be performed in an order different from the one shown or described here.

Example one

FIG. 1 is a flowchart of a sentence and word acquisition method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:

Step S102: obtaining a sentence application scenario.

Specifically, to acquire sentence words by means of the sentence application scenario, the embodiment of the present invention first obtains the sentence application scenario. The sentence application scenario is a sentence attribute parameter that includes information such as the semantics of the sentence and the scene in which it is used. Obtaining it can reduce the error rate and the amount of computation in sentence classification and sentence acquisition, and improve the efficiency of sentence and word acquisition.

For example, words need to be acquired from a sentence during sentence translation. After the original sentence is captured by a sound capture device or an image capture device, the application scenario of the sentence is determined from the original sentence, for instance as "dining". Based on the "dining" scenario, the sentence splitting rule related to dining can then be invoked, which covers words or phrases such as "chopsticks", "rice", "full", and "not full". Word acquisition for that scenario is then performed according to the application scenario and the splitting rule.
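The dining example can be illustrated with a minimal Python sketch. It assumes that a splitting rule is simply a list of phrases associated with a scenario; the table `SCENE_RULES` and the function `collect_scene_words` are hypothetical names introduced here for illustration and are not defined in the embodiment.

```python
import re
from typing import Dict, List

# Hypothetical rule table: each scenario maps to the phrases its splitting
# rule looks for. The phrases come from the "dining" example in the text.
SCENE_RULES: Dict[str, List[str]] = {
    "dining": ["chopsticks", "rice", "full", "not full"],
}


def collect_scene_words(sentence: str, scene: str) -> List[str]:
    """Return the rule phrases of `scene` that occur in `sentence`, in order."""
    # Try longer phrases first so that "not full" is preferred over "full".
    phrases = sorted(SCENE_RULES.get(scene, []), key=len, reverse=True)
    if not phrases:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    return [match.group(1).lower() for match in pattern.finditer(sentence)]
```

Calling `collect_scene_words("I am not full, more rice please", "dining")` yields `["not full", "rice"]`, while a scenario with no rule yields an empty list.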

Optionally, before obtaining the sentence application scenario, the method further includes: acquiring original sentence data.

Specifically, to obtain the corresponding sentence application scenario, an acquisition device first acquires the original sentence data before the scenario is determined and generated. The sentence application scenario is then obtained from the original sentence data and a sentence application scenario mapping model.
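The embodiment does not specify the form of the mapping model. As one possible reading, the sketch below assumes a simple keyword-count model: each scenario has a keyword set, and the scenario with the most keyword hits across the original sentences is selected. The names `SCENE_MODEL` and `map_to_scene`, and the "travel" scenario, are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Hypothetical mapping model: scenario name -> keywords that suggest it.
SCENE_MODEL: Dict[str, Set[str]] = {
    "dining": {"chopsticks", "rice", "full", "menu"},
    "travel": {"ticket", "train", "airport", "hotel"},
}


def map_to_scene(raw_sentences: List[str], default: str = "unknown") -> str:
    """Pick the scenario whose keywords occur most often in the raw sentences."""
    scores: Counter = Counter()
    for sentence in raw_sentences:
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        for scene, keywords in SCENE_MODEL.items():
            scores[scene] += len(tokens & keywords)
    if not scores:
        return default  # no input sentences at all
    scene, hits = scores.most_common(1)[0]
    return scene if hits > 0 else default
```

A trained classifier could replace the keyword table without changing the interface: original sentence data goes in, one scenario label comes out.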

Step S104: classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information.

Specifically, after the sentence application scenario is obtained, it is classified according to an application scenario classification rule preset by the user to obtain the sentence information. The sentence information is the sentence data acquired for a specific application scenario.

It should be noted that the sentence information may be obtained by first classifying the sentence application scenario to get a classification result and then refining the original sentences according to that result, which yields sentence information suitable for segmentation and acquisition. For example, in the "dining" sentence application scenario, a preset dining-related classification rule classifies the current sentence as "food"; the sentences related to food are then extracted from all the original sentence data and used as the sentence information.
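The classify-then-refine step can be sketched as a filter over the original sentences. The sketch assumes that a classification rule maps each category within a scenario to a keyword set; `CATEGORY_RULES`, `refine_sentences`, and the "tableware" category are hypothetical and serve only to illustrate the "food" example.

```python
import re
from typing import Dict, List, Set

# Hypothetical preset rules: scenario -> category -> keywords of that category.
CATEGORY_RULES: Dict[str, Dict[str, Set[str]]] = {
    "dining": {
        "food": {"rice", "noodles", "soup"},
        "tableware": {"chopsticks", "bowl"},
    },
}


def refine_sentences(raw_sentences: List[str], scene: str, category: str) -> List[str]:
    """Keep only the original sentences that mention a keyword of the category."""
    keywords = CATEGORY_RULES.get(scene, {}).get(category, set())
    kept = []
    for sentence in raw_sentences:
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        if tokens & keywords:
            kept.append(sentence)
    return kept
```

For the input `["The rice is cold.", "Where is the train station?", "I like this soup."]` with scenario "dining" and category "food", the first and third sentences are kept as the sentence information.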

Step S106: segmenting the sentence information to obtain segmented sentence information.

Specifically, after the sentence information is obtained, it is segmented so that the words to be acquired can be extracted smoothly and efficiently, and the segmented sentence information is passed to the subsequent word acquisition step. Working on the segmented short sentences allows each keyword to be extracted, or each word that satisfies the extraction rule, to be acquired more efficiently and accurately.
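A minimal sketch of the segmentation step follows, assuming that short sentences are delimited by common Western and Chinese punctuation marks. The embodiment does not fix the delimiter set, so the pattern and the function name `split_into_clauses` are assumptions.

```python
import re
from typing import List

# Assumed delimiter set: ASCII and full-width punctuation marks.
_CLAUSE_DELIMITERS = re.compile(r"[,;.!?，。；！？]+")


def split_into_clauses(sentence_info: str) -> List[str]:
    """Split long sentence information into short, non-empty clauses."""
    parts = _CLAUSE_DELIMITERS.split(sentence_info)
    return [part.strip() for part in parts if part.strip()]
```

For example, "I am full, but the soup is great; thank you!" is split into three short sentences that can be passed to word acquisition one at a time.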

Optionally, before segmenting the sentence information to obtain the segmented sentence information, the method further includes: acquiring a segmentation strategy according to the sentence information.

Specifically, to perform word acquisition on the sentence information, long sentence information is divided, and the resulting short sentences serve as the segmented sentence information for subsequent word acquisition. The segmentation strategy used for this division is acquired according to the sentence information before the segmentation is performed.
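How a segmentation strategy is derived from the sentence information is left open by the embodiment. The sketch below assumes three hypothetical strategies (keep the sentence whole, split on punctuation, split into fixed-size word windows) and chooses among them from the length and punctuation of the input. The threshold `max_words` and all function names are illustrative assumptions.

```python
import re
from typing import Callable, Dict, List


def keep_whole(text: str) -> List[str]:
    """Strategy for short input: return it as a single segment."""
    return [text.strip()] if text.strip() else []


def split_on_punctuation(text: str) -> List[str]:
    """Strategy for punctuated input: cut at punctuation marks."""
    return [p.strip() for p in re.split(r"[,;.!?]+", text) if p.strip()]


def split_fixed_window(text: str, size: int = 5) -> List[str]:
    """Fallback strategy: cut into windows of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


STRATEGIES: Dict[str, Callable[[str], List[str]]] = {
    "whole": keep_whole,
    "punctuation": split_on_punctuation,
    "window": split_fixed_window,
}


def choose_strategy(text: str, max_words: int = 8) -> str:
    """Select a strategy name from properties of the sentence information."""
    if len(text.split()) <= max_words:
        return "whole"
    if re.search(r"[,;]", text):
        return "punctuation"
    return "window"


def segment(text: str) -> List[str]:
    """Acquire the strategy first, then apply it to the same text."""
    return STRATEGIES[choose_strategy(text)](text)
```

Keeping the strategy choice separate from the segmentation itself mirrors the order described above: the strategy is acquired first, and the division is performed afterwards.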

Step S108: acquiring word data from the segmented sentence information.

Specifically, word data is obtained by performing word acquisition on the segmented sentence information: the word length is determined by a preset length, the words in each segmented sentence are identified by a word recognition model, and the sentence words to be acquired are extracted, stored, and fed back.
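The embodiment does not describe the word recognition model. As a stand-in, the sketch below uses forward maximum matching against a vocabulary, a common dictionary-based technique for Chinese word segmentation, with `max_len` playing the role of the preset word length. The vocabulary contents and the function name are assumptions.

```python
from typing import List, Set


def forward_max_match(text: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Collect vocabulary words from `text`, preferring the longest match.

    `max_len` is the preset maximum word length in characters. Characters
    that do not start any vocabulary word are skipped.
    """
    words: List[str] = []
    i = 0
    while i < len(text):
        matched = None
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                matched = candidate
                break
        if matched is None:
            i += 1
        else:
            words.append(matched)
            i += len(matched)
    return words
```

With the vocabulary `{"筷子", "米饭", "吃饱", "没吃饱"}`, the segmented sentence "我没吃饱还要米饭" yields `["没吃饱", "米饭"]`: the longer entry "没吃饱" (not full) is matched before the shorter "吃饱" (full) can be considered.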

Optionally, after acquiring the word data from the segmented sentence information, the method further includes: storing the word data.
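The storage step can be sketched with Python's standard `sqlite3` module. The embodiment does not specify a storage format; the table layout (scenario, word, number of occurrences) and the function names are assumptions made for illustration.

```python
import sqlite3
from typing import Iterable, List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a word store; the schema is a hypothetical layout."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        " scene TEXT NOT NULL,"
        " word TEXT NOT NULL,"
        " hits INTEGER NOT NULL,"
        " PRIMARY KEY (scene, word))"
    )
    return conn


def store_words(conn: sqlite3.Connection, scene: str, words: Iterable[str]) -> None:
    """Store acquired words, counting repeated words per scenario."""
    with conn:  # commits on success, rolls back on error
        for word in words:
            cur = conn.execute(
                "UPDATE words SET hits = hits + 1 WHERE scene = ? AND word = ?",
                (scene, word),
            )
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO words (scene, word, hits) VALUES (?, ?, 1)",
                    (scene, word),
                )


def load_words(conn: sqlite3.Connection, scene: str) -> List[Tuple[str, int]]:
    """Return (word, hits) pairs for a scenario, sorted by word."""
    rows = conn.execute(
        "SELECT word, hits FROM words WHERE scene = ? ORDER BY word", (scene,)
    )
    return rows.fetchall()
```

Keying the table by scenario and word keeps the stored word data grouped by scenario, consistent with the scenario-based acquisition described above.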

This embodiment solves the technical problem that prior-art sentence and word acquisition methods only acquire original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

Example two

FIG. 2 is a block diagram of a sentence and word acquisition device according to an embodiment of the present invention. As shown in FIG. 2, the device includes:

An obtaining module 20, configured to obtain a sentence application scenario.

Specifically, to acquire sentence words by means of the sentence application scenario, the embodiment of the present invention first obtains the sentence application scenario. The sentence application scenario is a sentence attribute parameter that includes information such as the semantics of the sentence and the scene in which it is used. Obtaining it can reduce the error rate and the amount of computation in sentence classification and sentence acquisition, and improve the efficiency of sentence and word acquisition.

For example, words need to be acquired from a sentence during sentence translation. After the original sentence is captured by a sound capture device or an image capture device, the application scenario of the sentence is determined from the original sentence, for instance as "dining". Based on the "dining" scenario, the sentence splitting rule related to dining can then be invoked, which covers words or phrases such as "chopsticks", "rice", "full", and "not full". Word acquisition for that scenario is then performed according to the application scenario and the splitting rule.

Optionally, the acquisition module is further configured to acquire original sentence data.

Specifically, to obtain the corresponding sentence application scenario, an acquisition device first acquires the original sentence data before the scenario is determined and generated. The sentence application scenario is then obtained from the original sentence data and a sentence application scenario mapping model.

A classification module 22, configured to classify the sentence application scenario according to a preset application scenario classification rule to obtain sentence information.

Specifically, after the sentence application scenario is obtained, it is classified according to an application scenario classification rule preset by the user to obtain the sentence information. The sentence information is the sentence data acquired for a specific application scenario.

It should be noted that the sentence information may be obtained by first classifying the sentence application scenario to get a classification result and then refining the original sentences according to that result, which yields sentence information suitable for segmentation and acquisition. For example, in the "dining" sentence application scenario, a preset dining-related classification rule classifies the current sentence as "food"; the sentences related to food are then extracted from all the original sentence data and used as the sentence information.

A segmentation module 24, configured to segment the sentence information to obtain segmented sentence information.

Specifically, after the sentence information is obtained, it is segmented so that the words to be acquired can be extracted smoothly and efficiently, and the segmented sentence information is passed to the subsequent word acquisition. Working on the segmented short sentences allows each keyword to be extracted, or each word that satisfies the extraction rule, to be acquired more efficiently and accurately.

Optionally, the apparatus further comprises: a strategy module, configured to acquire a segmentation strategy according to the sentence information.

Specifically, to perform word acquisition on the sentence information, long sentence information is divided, and the resulting short sentences serve as the segmented sentence information for subsequent word acquisition. The segmentation strategy used for this division is acquired according to the sentence information before the segmentation is performed.

An acquisition module 26, configured to acquire word data from the segmented sentence information.

Specifically, word data is obtained by performing word acquisition on the segmented sentence information: the word length is determined by a preset length, the words in each segmented sentence are identified by a word recognition model, and the sentence words to be acquired are extracted, stored, and fed back.

Optionally, the apparatus further comprises: a storage module, configured to store the word data.
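The module structure of FIG. 2 can be sketched as a small Python class that wires the modules together. Each module is modelled as a plain callable, which is an assumption made for brevity; the class name `SentenceWordDevice`, the helper `build_demo_device`, and the stand-in lambdas are hypothetical and do not represent the actual module implementations.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SentenceWordDevice:
    """Hypothetical wiring of modules 20, 22, 24 and 26 from FIG. 2."""
    obtain_scene: Callable[[str], str]              # obtaining module 20
    classify: Callable[[str, str], List[str]]       # classification module 22
    segment: Callable[[str], List[str]]             # segmentation module 24
    collect: Callable[[str], List[str]]             # acquisition module 26
    storage: List[str] = field(default_factory=list)  # optional storage module

    def run(self, raw_sentence: str) -> List[str]:
        """Pass one original sentence through the modules in order."""
        scene = self.obtain_scene(raw_sentence)
        words: List[str] = []
        for sentence_info in self.classify(raw_sentence, scene):
            for clause in self.segment(sentence_info):
                words.extend(self.collect(clause))
        self.storage.extend(words)
        return words


def build_demo_device() -> SentenceWordDevice:
    """Build a device with trivial stand-in modules for demonstration."""
    target_words = {"rice", "chopsticks"}
    return SentenceWordDevice(
        obtain_scene=lambda raw: "dining" if "rice" in raw.lower() else "unknown",
        classify=lambda raw, scene: [raw] if scene == "dining" else [],
        segment=lambda info: [p.strip() for p in re.split(r"[,;.!?]+", info) if p.strip()],
        collect=lambda clause: [w for w in clause.lower().split() if w in target_words],
    )
```

Running `build_demo_device().run("More rice, and chopsticks please.")` returns `["rice", "chopsticks"]` and leaves the same words in the device's storage list. Because each module is an injected callable, any one of them can be replaced without touching the others, which matches the note later in the description that the division into units is only a logical one.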

According to another aspect of the embodiments of the present invention, a non-volatile storage medium is further provided. The non-volatile storage medium includes a stored program, and the program, when run, controls the device in which the non-volatile storage medium is located to execute the sentence and word acquisition method.

Specifically, the method comprises: obtaining a sentence application scenario; classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information; segmenting the sentence information to obtain segmented sentence information; and acquiring word data from the segmented sentence information.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device including a processor and a memory. The memory stores computer-readable instructions, and the processor is configured to execute them; when run, the computer-readable instructions execute the sentence and word acquisition method.

Specifically, the method comprises: obtaining a sentence application scenario; classifying the sentence application scenario according to a preset application scenario classification rule to obtain sentence information; segmenting the sentence information to obtain segmented sentence information; and acquiring word data from the segmented sentence information.

This embodiment solves the technical problem that prior-art sentence and word acquisition methods only acquire original sentence data and cannot acquire sentences by category according to scene information, which reduces the overall efficiency of sentence acquisition.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
