Text generation method and device and electronic equipment

Document No.: 136554; Publication date: 2021-10-22

Abstract: This technology, "Text generation method and device and electronic equipment", was designed by 魏梦溪, 李旭瑞, and 贺一帆 on 2020-04-21. The invention discloses a text generation method, a text generation device, and electronic equipment. The method comprises: acquiring data to be processed, wherein the data to be processed comprises characters and numerical values corresponding to at least one index; obtaining a corresponding character vector according to the characters, and obtaining a numerical value vector of the corresponding index according to the numerical value of each index; determining the index to be revealed as a target index according to the numerical value vector corresponding to each index and the character vector; and generating a data description text according to the target index and a preset text template.

1. A text generation method, comprising:

acquiring data to be processed, wherein the data to be processed comprises characters and numerical values corresponding to at least one index;

obtaining a corresponding character vector according to the characters, and obtaining a numerical value vector of a corresponding index according to the numerical value of each index;

determining an index to be revealed as a target index according to the numerical value vector corresponding to each index and the character vector;

and generating a data description text according to the target index and a preset text template.

2. The method of claim 1, further comprising:

determining a degree adverb corresponding to a target index according to a numerical value vector corresponding to the target index and the character vector, wherein the degree adverb is an adverb representing the change degree of the target index;

the degree adverb is used to generate the data description text.

3. The method of claim 1, further comprising:

determining characters to be revealed as target characters according to the character vectors and the numerical vectors of the data to be processed;

determining a numerical value corresponding to the target character and the target index as a target numerical value;

the target characters and the target numerical value are used for generating the data description text.

4. The method of claim 3, wherein the determining the characters to be revealed as the target characters according to the character vector and the numerical value vectors of the data to be processed comprises:

combining the character vector and the numerical value vectors of all indexes to obtain a data vector;

and determining the target characters according to the data vector.

5. The method of claim 4, wherein the determining the target characters according to the data vector comprises:

determining a data vector of the data to be processed under the influence of other data to be processed as an influence data vector according to the data vector of the data to be processed;

and determining whether the characters in the data to be processed are target characters to be revealed or not according to the influence data vector based on a preset character classifier.

6. The method of claim 1, wherein the determining the index to be revealed as the target index according to the numerical value vector corresponding to each index and the character vector comprises:

combining the numerical value vector and the character vector of each index to obtain an index vector of the corresponding index;

and determining the target index according to the index vector of each index.

7. The method of claim 6, wherein the determining the target index according to the index vector of each index comprises:

for each index, obtaining a comprehensive index vector of the corresponding index according to the corresponding index vector of the data to be processed;

and determining whether the corresponding index is the target index to be revealed or not according to the comprehensive index vector of each index based on a preset index classifier.

8. The method of claim 1, wherein the obtaining the corresponding character vector according to the characters comprises:

converting each word of the characters into a corresponding word vector;

based on a preset neural network, obtaining a hidden state of the neural network according to the word vector;

and obtaining a character vector corresponding to the character according to the hidden state of the neural network.

9. The method of claim 1, wherein the obtaining the numerical value vector of the corresponding index according to the numerical value of each index comprises:

normalizing the numerical value of each index;

and mapping the numerical value of each index after the normalization processing to a preset vector space to obtain a numerical value vector of the corresponding index.

10. The method of claim 1, the data to be processed being structured data.

11. A text generation apparatus comprising:

the data acquisition module is used for acquiring data to be processed, wherein the data to be processed comprises characters and numerical values corresponding to at least one index;

the vector generation module is used for obtaining corresponding character vectors according to the characters and obtaining numerical value vectors of corresponding indexes according to the numerical value of each index;

the index determining module is used for determining the index to be revealed as a target index according to the numerical value vector corresponding to each index and the character vector;

and the text generation module is used for generating a data description text according to the target index and a preset text template.

12. An electronic device, comprising:

a processor and a memory for storing instructions for controlling the processor to perform the method of any of claims 1 to 10.

13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 10.

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a text generation method, a text generation device, an electronic device, and a computer-readable storage medium.

Background

Business intelligence (BI) refers to realizing business value through data analysis by using modern data warehouse technology, online analytical processing (OLAP) technology, data mining, and data presentation technology.

In the prior art, a large amount of transaction data is generally analyzed based on a fixed index and a fixed threshold value to generate a corresponding description text for a user to conveniently view.

However, the description text generated in this way can only reflect the description information of the preset fixed index, but cannot reflect other key information.

Disclosure of Invention

An object of the present invention is to provide a new technical solution for generating a description text of data.

According to a first aspect of the present invention, there is provided a text generation method, including:

acquiring data to be processed, wherein the data to be processed comprises characters and numerical values corresponding to at least one index;

obtaining a corresponding character vector according to the characters, and obtaining a numerical value vector of a corresponding index according to the numerical value of each index;

determining an index to be revealed as a target index according to the numerical value vector corresponding to each index and the character vector;

and generating a data description text according to the target index and a preset text template.

Optionally, the method further includes:

determining degree adverbs corresponding to the target indexes according to the numerical vectors and the character vectors corresponding to the target indexes;

the degree adverb is an adverb representing the degree of change of the target index; the degree adverb is used to generate the data description text.

Optionally, the method further includes:

determining characters to be revealed as target characters according to the character vectors and the numerical vectors of the data to be processed;

determining a numerical value corresponding to the target characters and the target index as a target numerical value; the target characters and the target numerical value are used for generating the data description text.

Optionally, the determining the characters to be revealed as the target characters according to the character vector and the numerical value vectors of the data to be processed includes:

combining the character vector and the numerical value vectors of all indexes to obtain a data vector;

and determining the target characters according to the data vector.

Optionally, the determining the target characters according to the data vector includes:

determining a data vector of the data to be processed under the influence of other data to be processed as an influence data vector according to the data vector of the data to be processed;

and determining whether the characters in the data to be processed are the target characters to be revealed or not according to the influence data vector based on a preset character classifier.

Optionally, the determining the index to be revealed as the target index according to the numerical value vector corresponding to each index and the character vector includes:

combining the numerical value vector and the character vector of each index to obtain an index vector of the corresponding index;

and determining the target index according to the index vector of each index.

Optionally, the determining the target index according to the index vector of each index includes:

for each index, obtaining a comprehensive index vector of the corresponding index according to the corresponding index vector of the data to be processed;

and determining whether the corresponding index is the target index to be revealed or not according to the comprehensive index vector of each index based on a preset index classifier.

Optionally, the obtaining the corresponding character vector according to the characters includes:

converting each word of the characters into a corresponding word vector;

based on a preset neural network, obtaining a hidden state of the neural network according to the word vector;

and obtaining a character vector corresponding to the character according to the hidden state of the neural network.

Optionally, the obtaining the numerical value vector of the corresponding index according to the numerical value of each index includes:

normalizing the numerical value of each index;

and mapping the numerical value of each index after the normalization processing to a preset vector space to obtain a numerical value vector of the corresponding index.

Optionally, the data to be processed is structured data.

According to a second aspect of the present invention, there is provided a text generation apparatus comprising:

the data acquisition module is used for acquiring data to be processed, wherein the data to be processed comprises characters and numerical values corresponding to at least one index;

the vector generation module is used for obtaining corresponding character vectors according to the characters and obtaining numerical value vectors of corresponding indexes according to the numerical value of each index;

the index determining module is used for determining the index to be revealed as a target index according to the numerical value vector corresponding to each index and the character vector;

and the text generation module is used for generating a data description text according to the target index and a preset text template.

According to a third aspect of the present invention, there is provided an electronic apparatus comprising:

a processor and a memory for storing instructions for controlling the processor to perform the method according to the first aspect of the invention.

According to a fourth aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method according to the first aspect of the present invention.

In the embodiment of the invention, the index to be revealed is determined from the character vector corresponding to the characters in the data to be processed and the numerical value vector corresponding to the numerical value of each index, and the data description text is generated according to the index to be revealed for display. In this way, key target indexes that the user cares about and that fluctuate widely can be selectively revealed from among multiple indexes according to both the index values and the accompanying characters, and the accuracy of the revealed information can be ensured.

Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.

Fig. 1 is a schematic diagram of a hardware configuration of an electronic device according to a first embodiment of the present invention.

Fig. 2 is a schematic diagram of a hardware structure of an electronic device according to a second embodiment of the present invention.

FIG. 3 shows a flow diagram of a text generation method of an embodiment of the invention.

Fig. 4 is a scene diagram illustrating a text generation method according to an embodiment of the present invention.

FIG. 5 shows a flow diagram of one example of a text generation method of an embodiment of the invention.

Fig. 6 shows a functional block diagram of a text generation apparatus of an embodiment of the present invention.

Fig. 7 shows a functional block diagram of an electronic device according to a third embodiment of the invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

< hardware configuration >

Fig. 1 and 2 are block diagrams of hardware configurations of an electronic device 1000 that may be used to implement the method of any embodiment of the invention.

In one embodiment, as shown in FIG. 1, the electronic device 1000 may be a server 1100.

The server 1100 is a computer that provides processing, database, and communication facilities. The server 1100 can be a unitary server or a distributed server spanning multiple computers or computer data centers. The server may be of various types, such as, but not limited to, a web server, a news server, a mail server, a message server, an advertisement server, a file server, an application server, an interaction server, a database server, or a proxy server. In some embodiments, each server may include hardware, software, or embedded logic components, or a combination of two or more such components, for performing the appropriate functions supported or implemented by the server. For example, the server may be a blade server, a cloud server, or the like, or may be a server group consisting of a plurality of servers, which may include one or more of the above types of servers.

In this embodiment, the server 1100 may include a processor 1110, a memory 1120, an interface device 1130, a communication device 1140, a display device 1150, and an input device 1160, as shown in fig. 1.

In this embodiment, the server 1100 may also include a speaker, a microphone, and the like, which are not limited herein.

The processor 1110 may be a dedicated server processor, or may be a desktop processor, a mobile processor, or the like that meets performance requirements, which is not limited herein. The memory 1120 includes, for example, a ROM (read-only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1130 includes various bus interfaces, such as a serial bus interface (including a USB interface), a parallel bus interface, and the like. The communication device 1140 is capable of, for example, wired or wireless communication. The display device 1150 is, for example, a liquid crystal display panel, an LED display panel, a touch display panel, or the like. The input device 1160 may include, for example, a touch screen, a keyboard, and the like.

In this embodiment, the memory 1120 of the server 1100 is used to store instructions for controlling the processor 1110 to operate at least to perform a method according to any embodiment of the invention. The skilled person can design the instructions according to the disclosed solution. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.

Although shown as multiple devices in fig. 1, the present invention may relate to only some of the devices, e.g., server 1100 may relate to only memory 1120 and processor 1110.

In one embodiment, the electronic device 1000 may be a terminal device 1200 such as a PC, a notebook computer, or the like used by an operator, which is not limited herein.

In this embodiment, referring to fig. 2, the terminal apparatus 1200 may include a processor 1210, a memory 1220, an interface device 1230, a communication device 1240, a display device 1250, an input device 1260, a speaker 1270, a microphone 1280, and the like.

The processor 1210 may be a mobile version processor. The memory 1220 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1230 includes, for example, a USB interface, a headphone interface, and the like. The communication device 1240 may be capable of wired or wireless communication, for example, the communication device 1240 may include a short-range communication device, such as any device that performs short-range wireless communication based on short-range wireless communication protocols, such as the Hilink protocol, WiFi (IEEE 802.11 protocol), Mesh, bluetooth, ZigBee, Thread, Z-Wave, NFC, UWB, LiFi, and the like, and the communication device 1240 may also include a long-range communication device, such as any device that performs WLAN, GPRS, 2G/3G/4G/5G long-range communication. The display device 1250 is, for example, a liquid crystal display, a touch display, or the like. The input device 1260 may include, for example, a touch screen, a keyboard, and the like. A user can input/output voice information through the speaker 1270 and the microphone 1280.

In this embodiment, memory 1220 of terminal device 1200 is used to store instructions for controlling processor 1210 to operate at least to perform a method according to any of the embodiments of the present invention. The skilled person can design the instructions according to the disclosed solution. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.

Although a plurality of devices of the terminal apparatus 1200 are shown in fig. 2, the present invention may relate only to some of the devices, for example, the terminal apparatus 1200 relates only to the memory 1220 and the processor 1210 and the display device 1250.

In the embodiment of the invention, data to be processed is acquired; a corresponding character vector is obtained according to the characters in the data to be processed, and a numerical value vector of the corresponding index is obtained according to the numerical value of each index in the data to be processed; a target index to be revealed is determined according to the numerical value vector corresponding to each index and the character vector; and a data description text corresponding to the data to be processed is generated according to the target index and a preset text template.

The data to be processed in the embodiments of the present specification may be, for example, consumption data, physical examination data, epidemic situation data, product sales data, and the like. The data description text may be text describing key information of the data to be processed. In the case that the data to be processed is consumption data, the generated data description text may contain prompt information or key consumption information of a bulk transaction. In the case that the data to be processed is physical examination data, the generated data description text may contain health indicators most prone to abnormality. Under the condition that the data to be processed is epidemic situation data, the generated data description text can comprise epidemic situation information which needs to be focused by the user. In the case where the data to be processed is product sales data, the generated data description text may contain product information with a large sales volume variation range.

< method embodiment I >

In the present embodiment, a text generation method is provided. The method may be implemented by an electronic device. The electronic device may be the server 1100 as shown in fig. 1 or the terminal device 1200 as shown in fig. 2.

As shown in fig. 3, the text generation method of the present embodiment may include the following steps S3100 to S3400:

step S3100, acquiring data to be processed.

The data to be processed is data containing characters and numbers. In one embodiment of the invention, the data to be processed includes text and a numerical value corresponding to at least one index.

In one embodiment of the invention, the data to be processed may be data generated in a business intelligence scenario.

Further, the data to be processed may be structured data. For example, the data to be processed may be a data table as shown in table 1 below.

TABLE 1

In the data table shown in Table 1, each row other than the header row may be one piece of data to be processed. In the first row, the characters include trading platform 1, consumer card, and store 1; the value corresponding to index 1 is 46.3, the value corresponding to index 2 is 102.7%, and the value corresponding to index 3 is 53.2. In the second row, the characters include trading platform 2, consumer card, and store 2; the value corresponding to index 1 is 26.2, the value corresponding to index 2 is 46.66%, and the value corresponding to index 3 is 32.7. In the third row, the characters include trading platform 3, consumer card, and store 3; the value corresponding to index 1 is 26.2, the value corresponding to index 2 is 46.66%, and the value corresponding to index 3 is 23.6. In the fourth row, the characters include trading platform 4, consumer card, and store 4; the value corresponding to index 1 is 35.9, the value corresponding to index 2 is 46.66%, and the value corresponding to index 3 is 23.4.
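As a non-limiting illustration, the four rows described above could be held in memory as structured records. The `Record` type, its field names, and the `column` helper are hypothetical and are not part of the described method; the percentage values of index 2 are stored as plain numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """One piece of data to be processed (hypothetical layout)."""
    characters: List[str]   # the textual fields of the row
    values: List[float]     # values of index 1, index 2, index 3, in order


# Rows as described for Table 1; index 2 is kept in percent units.
RECORDS = [
    Record(["trading platform 1", "consumer card", "store 1"], [46.3, 102.7, 53.2]),
    Record(["trading platform 2", "consumer card", "store 2"], [26.2, 46.66, 32.7]),
    Record(["trading platform 3", "consumer card", "store 3"], [26.2, 46.66, 23.6]),
    Record(["trading platform 4", "consumer card", "store 4"], [35.9, 46.66, 23.4]),
]


def column(records: List[Record], k: int) -> List[float]:
    """Collect the values of the k-th index (0-based) across all records."""
    return [r.values[k] for r in records]
```

Collecting one index across all rows, as `column` does, is convenient because the later normalization step operates on the values of a single index.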

Step S3200, obtaining a corresponding character vector according to the characters, and obtaining a numerical value vector of the corresponding index according to the numerical value of each index.

In an embodiment of the present invention, the manner of obtaining the corresponding character vector according to the characters may include:

converting each word of the characters into a corresponding word vector; based on a preset neural network, obtaining a hidden state of the neural network according to the word vector; and obtaining a character vector corresponding to the character according to the hidden state of the neural network.

In this embodiment, the word vector of each word of the characters in a piece of data to be processed may be generated by word2vec (word to vector), a group of related models used to produce word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. The network takes a word as input and predicts the words in adjacent positions; under the bag-of-words assumption in word2vec, the order of those words is unimportant. After training is complete, the word2vec model can be used to map each word to a vector, and these vectors can be used to represent word-to-word relationships.

In an embodiment of the present invention, the preset neural network may be any neural network trained in advance, and may be, for example, a CNN network, an RNN network, a unidirectional LSTM network, or a bidirectional LSTM network.

In the case where the neural network is a unidirectional neural network, the word vectors are input into the unidirectional neural network in the forward direction according to the order of the words in the characters of the data to be processed, and the hidden state of the unidirectional neural network can be obtained. The hidden state of the unidirectional neural network may then be taken as the character vector $h_c$ of the corresponding characters.

In the case where the neural network is a bidirectional neural network, the word vectors are input into the bidirectional neural network in both the forward direction and the reverse direction according to the order of the words in the characters of the data to be processed, and two hidden states of the bidirectional neural network can be obtained, one for each direction. The two hidden states of the bidirectional neural network may then be merged, and the merged hidden state is taken as the character vector $h_c$ of the corresponding characters.
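As a non-limiting illustration, the unidirectional and bidirectional cases can be sketched with a toy recurrent network. This sketch assumes a plain tanh recurrence with fixed, hypothetical scalar weights in place of a trained CNN, RNN, or LSTM, and assumes that "merging" the two hidden states means concatenating them; neither assumption is stated in the description.

```python
import math
from typing import List

Vector = List[float]


def rnn_step(h: Vector, x: Vector, w_h: float, w_x: float) -> Vector:
    """Toy element-wise recurrence: h_t = tanh(w_h * h_{t-1} + w_x * x_t)."""
    return [math.tanh(w_h * hi + w_x * xi) for hi, xi in zip(h, x)]


def final_hidden_state(word_vectors: List[Vector],
                       w_h: float = 0.5, w_x: float = 1.0) -> Vector:
    """Feed the word vectors in the given order and return the last hidden state.

    Assumes a non-empty sequence of equal-length word vectors.
    """
    h = [0.0] * len(word_vectors[0])
    for x in word_vectors:
        h = rnn_step(h, x, w_h, w_x)
    return h


def character_vector(word_vectors: List[Vector],
                     bidirectional: bool = True) -> Vector:
    """Return h_c for one piece of data to be processed."""
    forward = final_hidden_state(word_vectors)
    if not bidirectional:
        return forward
    backward = final_hidden_state(list(reversed(word_vectors)))
    # Assumption: the two hidden states are merged by concatenation.
    return forward + backward
```

With concatenation, the bidirectional character vector has twice the dimensionality of the unidirectional one, which the downstream classifiers would need to account for.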

In an embodiment of the present invention, the manner of obtaining the value vector of the corresponding index according to the value of each index may include:

normalizing the numerical value of each index; and mapping the numerical value of each index after the normalization processing to a preset vector space to obtain a numerical value vector of the corresponding index.

In one example, the value of each index may be normalized to a value within [-1, 1], and the interval [-1, 1] may be divided into buckets, where the number of buckets can be preset according to the application scenario or specific requirements. The buckets are then respectively mapped to a preset vector space, so that a single numerical value can be expanded into a numerical value vector of a specified dimensionality. The preset vector space may be initialized randomly. In this way, the relative information of the numerical values is preserved and enriched as far as possible.
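As a non-limiting illustration, the normalization, bucketing, and mapping described above can be sketched as follows. The description does not specify the normalization formula, so min-max scaling over the values of one index is assumed here; the function names, the bucket count, the vector dimensionality, and the initialization range are likewise hypothetical.

```python
import random
from typing import List

Vector = List[float]


def normalize(values: List[float]) -> List[float]:
    """Min-max scale the values of one index into [-1, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def bucket_of(x: float, num_buckets: int) -> int:
    """Map a value in [-1, 1] to one of num_buckets equal-width buckets."""
    idx = int((x + 1.0) / 2.0 * num_buckets)
    return min(max(idx, 0), num_buckets - 1)  # x == 1.0 falls in the last bucket


def make_bucket_table(num_buckets: int, dim: int, seed: int = 0) -> List[Vector]:
    """Randomly initialized vector space: one dim-dimensional vector per bucket."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_buckets)]


def value_vectors(values: List[float], table: List[Vector]) -> List[Vector]:
    """Expand each numerical value of one index into its numerical value vector."""
    return [table[bucket_of(x, len(table))] for x in normalize(values)]
```

For example, `value_vectors([46.3, 26.2, 26.2, 35.9], make_bucket_table(4, 8))` returns four 8-dimensional vectors, and the two equal values share the same vector because they fall into the same bucket.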

Step S3300, determining the index to be revealed as the target index according to the numerical value vector corresponding to each index and the character vector.

In an embodiment of the present invention, determining the index to be revealed as the target index according to the numerical value vector corresponding to each index and the character vector may include the following steps S3310 to S3320:

Step S3310, combining the numerical value vector and the character vector of each index to obtain the index vector of the corresponding index.

When the data to be processed includes a plurality of pieces, the numerical vector and the character vector of each index may be combined for each piece of data to be processed, so as to obtain the index vector of the corresponding index.

For example, if the character vector of the characters in the i-th piece of data to be processed is $h_{c,i}$ and the numerical value vector of index k is $v_{k,i}$, then the numerical value vector $v_{k,i}$ of index k in the i-th piece of data to be processed and the character vector $h_{c,i}$ are combined to obtain the index vector of index k in the i-th piece of data to be processed: $H_{k,i} = [h_{c,i}, v_{k,i}]$.
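As a non-limiting illustration, the combination in step S3310 can be sketched as a concatenation of the two vectors for every index and every piece of data. The function name and the nested-list layout are hypothetical.

```python
from typing import List

Vector = List[float]


def build_index_vectors(char_vecs: List[Vector],
                        value_vecs: List[List[Vector]]) -> List[List[Vector]]:
    """Return H, where H[k][i] = [h_{c,i}, v_{k,i}].

    char_vecs[i]     -- character vector of the i-th piece of data
    value_vecs[i][k] -- numerical value vector of index k in the i-th piece
    """
    num_records = len(char_vecs)
    num_indexes = len(value_vecs[0])
    return [[char_vecs[i] + value_vecs[i][k] for i in range(num_records)]
            for k in range(num_indexes)]
```

Grouping the result by index first (`H[k]`) keeps all index vectors of one index together, which matches how the later per-index aggregation consumes them.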

Step S3320, determining a target index according to the index vector of each index.

In one embodiment of the present invention, determining the target index according to the index vector of each index may include steps S3321 to S3322 as follows:

step S3321, for each index, according to the corresponding index vector of the data to be processed, obtaining a comprehensive index vector of the corresponding index.

In an example, for each index, the corresponding index vector of each piece of data to be processed is subjected to weighted summation according to a preset weight, so as to obtain a comprehensive index vector of the corresponding index.

Specifically, the weight of each piece of data to be processed may be set in advance according to an application scenario or a specific requirement, or may be obtained through training in advance.

For example, if the index vector of index k in the i-th piece of data to be processed is $H_{k,i}$ and the weight corresponding to the i-th piece of data to be processed is $\lambda_i$, the comprehensive index vector of index k can be expressed as $C_k = \sum_i (\lambda_i \cdot H_{k,i})$.

In another example, for each index, the corresponding index vector of each piece of data to be processed is weighted and averaged according to a preset weight to obtain a comprehensive index vector of the corresponding index.
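As a non-limiting illustration, the two aggregation examples above can be sketched as follows. The function names are hypothetical, and the weighted average is assumed to divide the weighted sum by the sum of the weights, which the description does not spell out.

```python
from typing import List

Vector = List[float]


def weighted_sum(index_vectors: List[Vector], weights: List[float]) -> Vector:
    """C_k = sum_i (lambda_i * H_{k,i}) for one index k."""
    dim = len(index_vectors[0])
    return [sum(w * vec[d] for vec, w in zip(index_vectors, weights))
            for d in range(dim)]


def weighted_average(index_vectors: List[Vector], weights: List[float]) -> Vector:
    """Weighted sum divided by the total weight (assumed definition)."""
    total = sum(weights)
    return [x / total for x in weighted_sum(index_vectors, weights)]
```

When the weights already sum to 1, the two functions return the same comprehensive index vector.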

Step S3322, determining, based on a preset index classifier and according to the comprehensive index vector of each index, whether the corresponding index is the target index to be revealed.

The index classifier can be obtained by training a binary classification algorithm on a pre-acquired index sample set. Each sample in the index sample set comprises the comprehensive index vector of an index and a label indicating whether that index is revealed.

Specifically, the comprehensive index vector of each index may be input into the index classifier to obtain an output result, which may be 0 or 1, for example, and whether the corresponding index is the target index to be revealed is determined according to the output result. For example, when the output result corresponding to one of the indexes is 1, the index may be set as the target index, and when the output result corresponding to one of the indexes is 0, the index may not be set as the target index.
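As a non-limiting illustration, the use of the index classifier can be sketched with a logistic-regression style scorer. The description only requires a binary classification algorithm, so the logistic form, the 0.5 threshold, and the weights and bias passed in are all assumptions standing in for a classifier trained on the index sample set.

```python
import math
from typing import Dict, List

Vector = List[float]


def predict(c_k: Vector, weights: Vector, bias: float) -> int:
    """Return 1 if the index should be revealed, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, c_k)) + bias
    prob = 1.0 / (1.0 + math.exp(-score))
    return 1 if prob >= 0.5 else 0


def select_target_indexes(comprehensive: Dict[str, Vector],
                          weights: Vector, bias: float) -> List[str]:
    """Keep the indexes whose classifier output is 1."""
    return [name for name, c_k in comprehensive.items()
            if predict(c_k, weights, bias) == 1]
```

For instance, with hypothetical weights `[1.0, 1.0]` and bias `-1.0`, a comprehensive index vector `[2.0, 1.0]` yields output 1 and the index is set as a target index, while `[-1.0, 0.5]` yields output 0 and the index is not.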

Step S3400, generating a data description text according to the target index and a preset text template.

The text template may be preset according to an application scenario or a specific requirement. For example, the text template may include "XX is greatly increased," where XX is the position to be filled in with the target index.

Specifically, the target index may be filled in a corresponding position in the text template to obtain the data description text.
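As a non-limiting illustration, filling the template can be sketched as a simple placeholder substitution. The template string follows the "XX is greatly increased" example; the function name and the choice to produce one sentence per target index are hypothetical.

```python
from typing import List

TEMPLATE = "XX is greatly increased"  # XX marks the position of the target index


def describe(target_indexes: List[str], template: str = TEMPLATE) -> List[str]:
    """Fill each target index into the template's XX position."""
    return [template.replace("XX", name) for name in target_indexes]
```

For example, `describe(["index 1"])` produces the single sentence "index 1 is greatly increased".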

In an embodiment of the present invention, after obtaining the data description text, the method may further include: and displaying the data description text.

The application scenario of the method may specifically be as shown in fig. 4, where the electronic device obtains data to be processed, obtains a corresponding text vector according to text, and obtains a numerical vector corresponding to each index according to a numerical value of each index; determining indexes to be disclosed as target indexes according to the digital vectors and the character vectors corresponding to each index; and generating a data description text according to the target index and a preset text template.

Specifically, the data description text may be displayed on an interface of the electronic device executing the embodiment of the present invention, so that a user may intuitively know key information, which is concerned by the user and has a large variation range, in the data to be processed.

In the embodiment of the invention, the indexes to be revealed are determined from the character vector corresponding to the characters in the data to be processed and the numerical vector corresponding to the numerical value of each index, and the data description text is generated from the indexes to be revealed and displayed. Therefore, the key target indexes that the user is concerned about and that fluctuate widely can be selectively revealed from multiple indexes according to both the numerical values of the indexes and the accompanying characters, and the accuracy of the revealed information can be ensured.

< method example two >

On the basis of the first embodiment, the method may further include: determining a degree adverb corresponding to the target index according to the numerical vector corresponding to the target index and the character vector, wherein the degree adverb is also used for generating the data description text.

The degree adverb is an adverb representing the degree of change of the target index. For example, the degree adverb may indicate a sudden increase, a large increase, a steady hold, no change, a large drop, or a sudden decrease.

In an embodiment of the present invention, determining the degree adverb corresponding to the target index according to the numerical vector corresponding to the target index and the character vector may include:

acquiring a comprehensive index vector of a target index and a preset adverb classifier; and determining degree adverbs of the target indexes according to the comprehensive index vector of the target indexes on the basis of the adverb classifier.

The adverb classifier may be obtained by training an N-class classification algorithm (N being the number of degree adverbs) on a pre-acquired adverb sample set. Each sample in the adverb sample set includes the comprehensive index vector of an index and a label representing the corresponding degree adverb.
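An N-class decision of this kind can be sketched as picking the label with the highest score. This is a minimal illustration, assuming a linear scorer with one weight row per degree adverb; the embodiment does not name a specific algorithm, and `predict_adverb`, the three labels, and the weights are hypothetical.

```python
def predict_adverb(vector, weight_rows, labels):
    """Return the degree adverb whose linear score is highest.

    Assumption: one weight row per adverb stands in for the trained classifier.
    """
    scores = [sum(w * x for w, x in zip(row, vector)) for row in weight_rows]
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best]


# Hypothetical labels (N = 3) and hand-written weights, for illustration only.
labels = ["greatly increased", "unchanged", "greatly dropped"]
weight_rows = [[1.0, 0.0], [0.0, 0.0], [-1.0, 0.0]]

adverb_up = predict_adverb([0.8, 0.1], weight_rows, labels)
adverb_down = predict_adverb([-0.6, 0.3], weight_rows, labels)
```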

On the basis, a data description text can be generated according to the target index, the degree adverb and a preset text template.

For example, the text template may include "XXYY", where "XX" is the position to be filled with the target index and "YY" is the position to be filled with the degree adverb. In the case where the target index is "index 1" and the degree adverb is "greatly increased", the generated data description text may include "index 1 greatly increased".

In this embodiment, the degree adverb is used to describe the change condition of the target index, so that the generated data description text is more vivid, more flexible and more reasonable.

< method example three >

On the basis of the foregoing first method embodiment and/or second method embodiment, the method may further include steps S4100 to S4200 as follows:

step S4100, determining the characters to be revealed as the target characters according to the character vector and the numerical vectors of the data to be processed.

In an embodiment of the present invention, determining the characters to be revealed as the target characters according to the character vector and the numerical vectors of the data to be processed may include the following steps S4110 to S4120:

step S4110, combining the character vectors and the numerical vectors of all indexes to obtain data vectors.

In an embodiment where there are multiple pieces of data to be processed, the character vector and the numerical vectors of all indexes may be combined for each piece of data to be processed to obtain the data vector corresponding to that piece of data.

For example, suppose the number of indexes is n, the character vector of the characters in the i-th piece of data to be processed is $h_{c,i}$, and the numerical vector of index k is $v_{k,i}$. Then the vectors $v_{k,i}$ ($k \in [1, n]$) of all indexes in the i-th piece of data to be processed are combined with the character vector $h_{c,i}$ to obtain the data vector of the i-th piece of data to be processed: $H_i = [h_{c,i}, v_{1,i}, v_{2,i}, \ldots, v_{k,i}, \ldots, v_{n,i}]$, $k \in [1, n]$.
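The construction of the data vector can be sketched as follows, assuming that "combining" means concatenating the character vector with the vectors of all n indexes in order. The helper name `build_data_vector` and the example values are hypothetical.

```python
def build_data_vector(character_vector, index_value_vectors):
    """Concatenate the character vector with the vector of each index, in order."""
    data_vector = list(character_vector)
    for value_vector in index_value_vectors:
        data_vector.extend(value_vector)
    return data_vector


# Hypothetical vectors for one piece of data with n = 3 indexes.
h_c = [0.1, 0.2]
v = [[1.0], [2.0], [3.0]]
H = build_data_vector(h_c, v)
```

If the vectors were instead kept as a sequence of rows (for example, to feed an attention layer), the same inputs would be stacked rather than flattened; the source does not say which form is used.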

Step S4120, determining a target word according to the data vector of the data to be processed.

In one embodiment of the present invention, determining the target word from the data vector may include steps S4121 to S4122 as follows:

step S4121, determining, according to the data vector of the data to be processed, a data vector of the data to be processed under the influence of other data to be processed, as an influence data vector.

In an embodiment where there are multiple pieces of data to be processed, the data vector of each piece of data to be processed under the influence of the other pieces of data to be processed may be determined, according to the data vectors of the data to be processed, as the influence data vector of that piece.

In an example, for each piece of data to be processed, the data vectors corresponding to all the data to be processed are subjected to weighted summation according to the corresponding weights, so as to obtain an influence data vector of the corresponding data to be processed under the influence of other data to be processed.

Specifically, when the influence data vector of each piece of data to be processed is determined, the weights of the pieces of data to be processed may differ. The weight of each piece of data to be processed may be set in advance according to an application scenario or a specific requirement, or may be obtained by training in advance.

For example, suppose the data vector of the i-th piece of data to be processed is $H_i$ and, for the j-th piece of data to be processed, the weight corresponding to the i-th piece of data to be processed is $\lambda_{i,j}$. Then the influence data vector of the j-th piece of data to be processed under the influence of the other data to be processed can be represented as $C_j = \sum_i (\lambda_{i,j} \cdot H_i)$.
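The weighted summation can be sketched directly from the formula for the influence data vector. The function name `influence_vectors` and the example weights are hypothetical; `weights[i][j]` plays the role of the weight of the i-th piece of data with respect to the j-th piece.

```python
def influence_vectors(data_vectors, weights):
    """Compute, for every j, the weighted sum over i of weights[i][j] * data_vectors[i]."""
    count = len(data_vectors)
    dim = len(data_vectors[0])
    result = []
    for j in range(count):
        c_j = [
            sum(weights[i][j] * data_vectors[i][d] for i in range(count))
            for d in range(dim)
        ]
        result.append(c_j)
    return result


# Hypothetical data vectors and weights for two pieces of data.
H = [[1.0, 0.0], [0.0, 2.0]]
lam = [[0.75, 0.5], [0.25, 0.5]]
C = influence_vectors(H, lam)
```

Here the weights in each column sum to 1, which is one common choice (for instance when they come from a softmax); the source only requires that they be preset or trained.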

In another example, for each piece of data to be processed, the data vectors corresponding to all the data to be processed are weighted according to the corresponding weights to obtain an influence data vector of the corresponding data to be processed under the influence of other data to be processed.

Step S4122, based on the preset character classifier, determining whether the character in the corresponding data to be processed is the target character to be revealed according to the influence data vector of the data to be processed.

The character classifier may be obtained by training a binary classification algorithm on a pre-acquired character sample set. Each sample in the character sample set includes an influence data vector and a label indicating whether the corresponding characters are revealed.

Specifically, the influence data vector of each piece of data to be processed may be input into the character classifier to obtain an output result, which may be, for example, 0 or 1, and whether the character in the corresponding data to be processed is the target character to be revealed is determined according to the output result. For example, when the output result corresponding to the word in one of the data to be processed is 1, the word in the data to be processed is taken as the target word, and when the output result corresponding to the word in one of the data to be processed is 0, the word in the data to be processed is not taken as the target word.

In another embodiment of the present invention, it may be determined whether the word corresponding to the data to be processed is the target word to be revealed according to the data vector of each piece of data to be processed.

Specifically, the data vector of each piece of data to be processed may be input into another pre-trained character classifier, so as to obtain an output result of whether the character corresponding to the data to be processed is the target character to be revealed, and determine whether the character corresponding to the data to be processed is the target character to be revealed according to the output result.

Step S4200, determining a value corresponding to the target character and the target index as a target value.

Specifically, the target value may be a value included in the data to be processed and corresponding to the target character and the target index, and the target value is unique when the target character and the target index are determined.

For example, in table 1 of the first embodiment of the method, when the target character includes the transaction platform 1, the consumer card, and the store 4, and the target index is index 1, the target character and the target index correspond to a target value of 35.9.
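The lookup of the target value can be sketched as selecting the row whose characters match the target characters and reading the value of the target index. Only the value 35.9 comes from the example above; the row layout, the second row, the other values, and the name `lookup_target_value` are hypothetical.

```python
def lookup_target_value(rows, target_characters, target_index):
    """Return the value of target_index in the row matching target_characters."""
    for row in rows:
        if row["characters"] == target_characters:
            return row["values"][target_index]
    return None  # no row matches the target characters


# Hypothetical structured data; only 35.9 is taken from the example above.
rows = [
    {
        "characters": ("transaction platform 1", "consumer card", "store 4"),
        "values": {"index 1": 35.9, "index 2": 1.0},
    },
    {
        "characters": ("transaction platform 1", "consumer card", "store 5"),
        "values": {"index 1": 2.0, "index 2": 3.0},
    },
]

target_value = lookup_target_value(
    rows, ("transaction platform 1", "consumer card", "store 4"), "index 1"
)
```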

In this embodiment, the target words and the target numerical values may be used to generate the data description text. Specifically, the data description text may be generated according to the target words, the target indexes, the target numerical values, and the text template.

For example, the text template may include "influenced by AA, XXYY, XX of AA is ZZ", where "AA" is the position to be filled with the target characters, "XX" is the position to be filled with the target index, "YY" is the position to be filled with the degree adverb, and "ZZ" is the position to be filled with the target numerical value. In the case where the target characters are "shop 4", the target index is "index 1", the degree adverb is "greatly increased", and the target numerical value is 35.9, the generated data description text may include "influenced by shop 4, index 1 greatly increased, index 1 of shop 4 is 35.9".

In this embodiment, the data description text is generated from the target words and the target numerical values, so that the data description text can show information that is more valuable to the user.
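Filling several placeholders at once can be sketched with a single-pass substitution, so that a filled-in value is never mistaken for another placeholder. The helper name `render` is hypothetical, and the template is written here with a space between "XX" and "YY" so that the English output matches the example wording.

```python
import re


def render(template, slots):
    """Replace every placeholder key found in the template in one pass."""
    pattern = re.compile("|".join(re.escape(key) for key in slots))
    return pattern.sub(lambda match: str(slots[match.group(0)]), template)


# Slot values follow the example above; the space in "XX YY" is an assumption.
template = "influenced by AA, XX YY, XX of AA is ZZ"
slots = {
    "AA": "shop 4",
    "XX": "index 1",
    "YY": "greatly increased",
    "ZZ": 35.9,
}
description = render(template, slots)
```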

< example >

FIG. 5 is a flow chart of a text generation method of one example of the present invention.

As shown in fig. 5, the method may include steps S5001 to S5012:

step S5001, a plurality of pieces of data to be processed are acquired.

Wherein each piece of data to be processed comprises characters and a numerical value corresponding to at least one index.

In one example, the pieces of data to be processed may be structured data, and each piece of data to be processed may be one row of the structured data.

Step S5002, for each piece of data to be processed, obtaining a corresponding character vector according to the characters, and obtaining a numerical value vector of a corresponding index according to the numerical value of each index.

Step S5003, for each piece of data to be processed, combining the numerical value vector and the character vector of each index respectively to obtain the index vector of the corresponding index.

Step S5004, for each index, obtaining a comprehensive index vector of the corresponding index according to the corresponding index vector of each piece of data to be processed.

Step S5005, determining, based on a preset index classifier and according to the comprehensive index vector of each index, whether the corresponding index is a target index to be revealed.

Step S5006, based on a preset adverb classifier, determining the degree adverb of the target index according to the comprehensive index vector of the target index.

The degree adverb is an adverb representing the degree of change of the target index. For example, the degree adverb may indicate a sudden increase, a large increase, a steady hold, no change, a large drop, or a sudden decrease.

Step S5007, for each piece of data to be processed, combining the character vectors and the numerical vectors of all indexes to obtain data vectors.

Step S5008, determining a data vector of each piece of data to be processed under the influence of other pieces of data to be processed respectively as an influence data vector according to a corresponding data vector of each piece of data to be processed.

Step S5009, based on a preset character classifier, determining whether a character in corresponding data to be processed is a target character to be revealed according to an influence data vector of each piece of data to be processed.

In step S5010, a value corresponding to the target character and the target index is determined as a target value.

Step S5011, generating a data description text according to the target index, the degree adverb, the target character, the target numerical value and a preset text template.

In step S5012, the data description text is presented.

< text generating apparatus embodiment >

In this embodiment, a text generating apparatus 6000 is provided, as shown in fig. 6, including a data obtaining module 6100, a vector generating module 6200, an index determining module 6300, and a text generating module 6400. The data acquiring module 6100 is configured to acquire data to be processed, where the data to be processed includes characters and a numerical value corresponding to at least one index; the vector generation module 6200 is configured to obtain a corresponding text vector according to the text, and obtain a numerical value vector of a corresponding index according to a numerical value of each index; the index determining module 6300 is configured to determine, according to the numerical vector and the text vector corresponding to each index, an index to be revealed as a target index; the text generation module 6400 is configured to generate a data description text according to the target index and a preset text template.

In an embodiment of the present invention, the text generating apparatus 6000 may further include a module for presenting the data description text.

In an embodiment of the present invention, the text generating apparatus 6000 may further include:

a module for determining the degree adverb corresponding to the target index according to the numerical vector corresponding to the target index and the character vector.

The degree adverb is an adverb representing the degree of change of the target index and is used for generating a data description text.

In an embodiment of the present invention, the text generating apparatus 6000 may further include:

a module for determining the character to be revealed as the target character according to the character vector and the numerical vector of the data to be processed;

and the module is used for determining a numerical value corresponding to the target character and the target index as a target numerical value.

Wherein the target words and the target numerical values may be used to generate a data description text.

In an embodiment of the present invention, determining the word to be revealed as the target word according to the word vector and the numerical vectors of the data to be processed includes:

combining the character vector and the numerical vectors of all indexes to obtain a data vector;

and determining the target characters according to the data vectors.

In one embodiment of the present invention, determining the target word from the data vector comprises:

determining a data vector of the data to be processed under the influence of other data to be processed as an influence data vector according to the data vector of the data to be processed;

and determining whether the characters in the corresponding data to be processed are the target characters to be revealed or not according to the influence data vector based on a preset character classifier.

In an embodiment of the present invention, the index determining module 6300 may further be configured to:

combining the numerical value vector and the character vector of each index to obtain an index vector of the corresponding index;

and determining a target index according to the index vector of each index.

In one embodiment of the present invention, determining the target index from the index vector of each index comprises:

for each index, obtaining a comprehensive index vector of the corresponding index according to the corresponding index vector of the data to be processed;

and based on a preset index classifier, determining whether the corresponding index is a target index to be disclosed according to the comprehensive index vector of each index.

In an embodiment of the present invention, obtaining a corresponding literal vector from a literal includes:

converting each word of the characters into a corresponding word vector;

based on a preset neural network, obtaining a hidden state of the neural network according to the word vector;

and obtaining a character vector corresponding to the character according to the hidden state of the neural network.
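These three steps can be sketched with a very small recurrent network. This is only an illustration under stated assumptions: the embodiment does not fix the network type, so a plain tanh recurrent cell is used, the final hidden state is taken as the character vector, and the word-vector table, the weights, and the name `rnn_encode` are hypothetical.

```python
import math


def rnn_encode(word_vectors, w_in, w_rec):
    """Run a simple tanh recurrent cell and return its final hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in word_vectors:
        # The new state is built from the previous `hidden` before it is rebound.
        hidden = [
            math.tanh(
                sum(w_in[r][c] * x[c] for c in range(len(x)))
                + sum(w_rec[r][c] * hidden[c] for c in range(len(hidden)))
            )
            for r in range(len(w_rec))
        ]
    return hidden


# Hypothetical word-vector table and weights, for illustration only.
word_table = {"store": [1.0, 0.0], "4": [0.0, 1.0]}
w_in = [[0.5, 0.0], [0.0, 0.5]]
w_rec = [[0.1, 0.0], [0.0, 0.1]]

words = "store 4".split()
character_vector = rnn_encode([word_table[w] for w in words], w_in, w_rec)
```

Other choices, such as averaging all hidden states or using a gated cell, would fit the same three-step description.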

In an embodiment of the present invention, obtaining a value vector of a corresponding index according to a value of each index includes:

normalizing the numerical value of each index;

and mapping the numerical value of each index after the normalization processing to a preset vector space to obtain a numerical value vector of the corresponding index.
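The two steps above can be sketched as follows, assuming min-max normalization of one index across all pieces of data and a linear map from the normalized scalar into the vector space. Neither choice is stated in the source, and the names `min_max_normalize` and `to_value_vector`, the projection, and the bias are hypothetical.

```python
def min_max_normalize(values):
    """Scale the values of one index into [0, 1] (assumed normalization)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # avoid division by zero for constant columns
    return [(value - low) / (high - low) for value in values]


def to_value_vector(normalized_value, projection, bias):
    """Map a normalized scalar to a vector with an assumed linear projection."""
    return [normalized_value * p + b for p, b in zip(projection, bias)]


# Hypothetical values of one index across three pieces of data.
normalized = min_max_normalize([10.0, 20.0, 30.0])
value_vector = to_value_vector(normalized[1], [2.0, -4.0], [0.0, 1.0])
```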

In one embodiment of the invention, the data to be processed is structured data.

It will be appreciated by those skilled in the art that the text generation apparatus 6000 can be implemented in various ways. For example, the text generation apparatus 6000 may be implemented by configuring a processor with instructions. For example, the instructions may be stored in a ROM and read from the ROM into a programmable device when the device is started. For example, the text generation apparatus 6000 may be solidified into a dedicated device (e.g., an ASIC). The text generation apparatus 6000 may be divided into mutually independent units, or the units may be combined together. The text generation apparatus 6000 may be implemented by one of the implementations described above, or by a combination of two or more of them.

In this embodiment, the text generation apparatus 6000 may take various forms. For example, the text generation apparatus 6000 may be any functional module running in a software product or an application program that provides a text generation service, or an add-on, a plug-in, a patch, or the like of the software product or the application program, or may be the software product or the application program itself.

< electronic apparatus >

In this embodiment, an electronic device 1000 is also provided. The electronic device 1000 may be the server 1100 shown in fig. 1 or the terminal device 1200 shown in fig. 2.

As shown in fig. 7, the electronic device 1000 may further include a processor 1300 and a memory 1400, the memory 1400 for storing executable instructions; the processor 1300 is configured to operate the electronic device 1000 to perform a text generation method according to any embodiment of the present invention according to the control of the instructions.

In this embodiment, the electronic device 1000 may be a mobile phone, a tablet computer, a palm computer, a desktop computer, a notebook computer, a workstation, a game console, or the like. For example, the electronic device 1000 may be a smartphone in which an application providing a display service is installed.

< computer-readable storage Medium >

In this embodiment, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a text generation method according to any of the embodiments of the present invention.

The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.

The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.
