Word vector compression method and device based on frequency domain transformation

Document No.: 1831767 Publication date: 2021-11-12

Reading note: this technology, "Word vector compression method and device based on frequency domain transformation" (一种基于频域变换的词向量压缩方法及装置), was designed by 冯旻伟, 尹竞成, 杨晓倩, 杨萌 and 阮良 on 2021-08-05. Main content: The embodiments of the disclosure provide a word vector compression method based on frequency domain transformation. The method comprises: performing a Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector; calculating the modulus of each element in the frequency domain vector, and sorting the elements of the frequency domain vector by the magnitude of their moduli; and selecting several elements from the sorted frequency domain vector and constructing a compressed word vector from the selected elements. With this technical scheme, the original expressive capacity of the word vectors is unchanged, and the elements of a word vector can be ranked by importance, so that vocabulary in languages with a highly nonlinear distribution can be handled, unimportant elements can be removed, and only the elements that represent the key information in the word vector are retained, thereby compressing the word vector and reducing its maintenance cost.

1. A word vector compression method based on frequency domain transformation, comprising:

performing a Fourier transform on a word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

calculating the modulus of each element in the frequency domain vector, and sorting the elements in the frequency domain vector based on the magnitude of the moduli;

and selecting several elements from the sorted frequency domain vector, and constructing a compressed word vector based on the selected elements.

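2. The method of claim 1, wherein the Fourier transform algorithm used to Fourier transform the word vector to be compressed comprises a discrete Fourier transform;

the Fourier transform of the word vector to be compressed comprises:

performing a discrete Fourier transform on the word vector to be compressed based on the following formula:

$$[Y_0, Y_1, \ldots, Y_{N-1}]^T = W_N \, [X_0, X_1, \ldots, X_{N-1}]^T$$

wherein $[X_0, X_1, \ldots, X_{N-1}]^T$ represents the word vector to be compressed, $[Y_0, Y_1, \ldots, Y_{N-1}]^T$ represents the frequency domain vector, $Y_k = a_k + b_k i$, $i^2 = -1$, $0 \le k \le N-1$, and $W_N$ represents a factor matrix of size $N \times N$,

$$W_N = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_N & \omega_N^{2} & \cdots & \omega_N^{N-1} \\ 1 & \omega_N^{2} & \omega_N^{4} & \cdots & \omega_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \cdots & \omega_N^{(N-1)^2} \end{bmatrix}$$

wherein $\omega_N$ represents the factor, $\omega_N = e^{-2\pi i/N}$.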

3. The method of claim 1, wherein the Fourier transform algorithm used to Fourier transform the word vector to be compressed comprises a fast Fourier transform;

the Fourier transform of the word vector to be compressed comprises:

performing a fast Fourier transform on the word vector to be compressed based on a formula expressed in terms of an identity matrix, a diagonal matrix and a factor matrix, together with a factor; the formula itself, the symbols and the matrix sizes are not recoverable from this text.

4. The method of claim 1, wherein selecting several elements from the sorted frequency domain vector comprises:

when the sorting strategy sorts the moduli from largest to smallest, selecting several leading elements of the sorting result.

5. The method of claim 1, further comprising:

acquiring the position information that the selected elements have in the frequency domain vector before sorting.

6. The method of claim 5, wherein constructing a compressed word vector based on the selected elements comprises:

taking the position information corresponding to the selected elements and the moduli of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements;

or,

taking the position information corresponding to the selected elements and the squares of the moduli of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements;

or,

taking the position information corresponding to the selected elements, the real parts of the elements and the imaginary parts of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements.

7. The method of claim 5, wherein the position information comprises the inverse of the position information that the selected elements have in the frequency domain vector before sorting.

8. A word vector compression apparatus based on frequency domain transform, comprising:

the conversion module is used for carrying out Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

the sorting module is used for calculating the modulus value of each element in the frequency domain vector and sorting the elements in the frequency domain vector based on the numerical value of the modulus value;

and the construction module is used for selecting a plurality of elements from the sequenced frequency domain vectors and constructing compressed word vectors based on the selected elements.

9. A medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 7.

10. A computing device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor implements the method of any one of claims 1-7 by executing the executable instructions.

Technical Field

The embodiment of the disclosure relates to the technical field of computers, and more particularly, to a word vector compression method and device based on frequency domain transformation, a storage medium and an electronic device.

Background

This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.

A word vector is a way of representing the words of a language as mathematical vectors; that is, the words are mapped into a vector space and represented by vectors.

For example, "apple" may be represented as [0.343434, -0.88749, 0.992112, …, 0.232432].

The elements of a word vector can indicate, for example, which sentences "apple" appears in, which words it often appears before and after, at what position in a sentence it usually appears, and how long the sentences containing it are.

Generally, word vectors are obtained by training a neural network model on a massive corpus on the order of billions, using methods such as word2vec and GloVe, and they represent the information about a given word in that corpus.

To improve the representational power of word vectors, their dimension can be increased so that they carry more information. A corpus on the order of billions generally contains a great deal of information, so the word vectors need to be as long as possible to capture it.

In practical applications, the total number of words is often large. For example, Chinese generally has about 500,000 words; assuming the word vector for each word has length 200, about 100 million vector elements are required to represent all Chinese words. This inevitably means that the word vectors occupy too much storage space and that processing them requires too much computation, which ultimately increases the maintenance cost of the word vectors.
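The arithmetic behind this estimate can be checked with a short sketch. The vocabulary size and vector length are the figures given above; the storage size of 4 bytes per element (32-bit floats) is an assumption made only for illustration and does not come from the disclosure.

```python
# Figures from the text: about 500,000 words, 200 elements per word vector.
vocab_size = 500_000
vector_length = 200

# Total number of vector elements needed for the whole vocabulary.
total_elements = vocab_size * vector_length

# Assumption (not from the source): each element is stored as a 32-bit float.
bytes_per_element = 4
total_bytes = total_elements * bytes_per_element

print(total_elements)        # 100000000, i.e. about 100 million elements
print(total_bytes / 10**6)   # 400.0 megabytes under the 4-byte assumption
```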

Disclosure of Invention

Therefore, an improved word vector compression scheme is needed that can accurately compress the highly nonlinear vocabulary of human language without reducing the number of words, and thereby reduce the maintenance cost of the word vectors.

In this context, embodiments of the present disclosure are intended to provide a word vector compression method and apparatus based on frequency domain transformation.

In a first aspect of embodiments of the present disclosure, a word vector compression method based on frequency domain transformation is provided, including:

carrying out Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

calculating the modulus value of each element in the frequency domain vector, and sequencing the elements in the frequency domain vector based on the numerical value of the modulus value;

and selecting a plurality of elements from the sorted frequency domain vectors, and constructing a compressed word vector based on the selected elements.

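In one embodiment of the present disclosure, the Fourier transform algorithm used to Fourier transform the word vector to be compressed includes a discrete Fourier transform;

the Fourier transform of the word vector to be compressed includes:

performing a discrete Fourier transform on the word vector to be compressed based on the following formula:

$$[Y_0, Y_1, \ldots, Y_{N-1}]^T = W_N \, [X_0, X_1, \ldots, X_{N-1}]^T$$

wherein $[X_0, X_1, \ldots, X_{N-1}]^T$ represents the word vector to be compressed, $[Y_0, Y_1, \ldots, Y_{N-1}]^T$ represents the frequency domain vector, $Y_k = a_k + b_k i$, $i^2 = -1$, $0 \le k \le N-1$, and $W_N$ represents a factor matrix of size $N \times N$,

$$W_N = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_N & \omega_N^{2} & \cdots & \omega_N^{N-1} \\ 1 & \omega_N^{2} & \omega_N^{4} & \cdots & \omega_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \cdots & \omega_N^{(N-1)^2} \end{bmatrix}$$

wherein $\omega_N$ represents the factor, $\omega_N = e^{-2\pi i/N}$.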

In one embodiment of the present disclosure, the Fourier transform algorithm used to Fourier transform the word vector to be compressed includes a fast Fourier transform;

the Fourier transform of the word vector to be compressed includes:

performing a fast Fourier transform on the word vector to be compressed based on a formula expressed in terms of an identity matrix, a diagonal matrix and a factor matrix, together with a factor; the formula itself, the symbols and the matrix sizes are not recoverable from this text.
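Because the fast Fourier transform formula of this embodiment is not legible here, the following is only a sketch of one standard fast algorithm, the recursive radix-2 decimation-in-time FFT, and is not necessarily the factorization used in the disclosure. It assumes that the vector length is a power of two and uses the factor convention $e^{-2\pi i/N}$; the function name `fft_radix2` and the 8-element sample vector are hypothetical.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 FFT (hypothetical helper); len(x) must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    if n == 1:
        return x
    if n % 2 != 0:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])   # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of the odd-indexed samples
    # Twiddle factors exp(-2*pi*i*k/n) for k = 0 .. n/2 - 1.
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    # Butterfly step: combine the two half-size transforms.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

# Hypothetical word vector of length 8 (values chosen only for illustration).
x = np.array([0.343434, -0.88749, 0.992112, 0.232432, 0.5, -0.1, 0.25, 0.75])
print(np.allclose(fft_radix2(x), np.fft.fft(x)))  # True
```

The result agrees with the direct discrete Fourier transform; only the number of arithmetic operations differs.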

In an embodiment of the present disclosure, selecting several elements from the sorted frequency domain vector includes:

when the sorting strategy sorts the moduli from largest to smallest, selecting several leading elements of the sorting result.

In one embodiment of the present disclosure, the method further comprises:

acquiring the position information that the selected elements have in the frequency domain vector before sorting.

In an embodiment of the present disclosure, constructing a compressed word vector based on the selected elements includes:

taking the position information corresponding to the selected elements and the moduli of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements;

or,

taking the position information corresponding to the selected elements and the squares of the moduli of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements;

or,

taking the position information corresponding to the selected elements, the real parts of the elements and the imaginary parts of the elements as new elements, and constructing the compressed word vector based on the sorted order of the elements.
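The three construction options above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the helper names (`top_k`, `build_modulus`, `build_squared_modulus`, `build_real_imag`) are hypothetical, and the flat layout of the output (position followed by its value or values, in sorted order) is one possible arrangement that the text does not prescribe.

```python
import numpy as np

def top_k(word_vector, k):
    """Return (positions, elements) of the k frequency elements with largest modulus."""
    freq = np.fft.fft(np.asarray(word_vector, dtype=float))
    # Indices sorted by modulus, largest first; positions refer to the unsorted vector.
    order = np.argsort(-np.abs(freq), kind="stable")[:k]
    return order, freq[order]

def build_modulus(positions, elements):
    # Option 1: position and modulus of each selected element.
    return np.column_stack([positions, np.abs(elements)]).ravel()

def build_squared_modulus(positions, elements):
    # Option 2: position and squared modulus of each selected element.
    return np.column_stack([positions, np.abs(elements) ** 2]).ravel()

def build_real_imag(positions, elements):
    # Option 3: position, real part and imaginary part of each selected element.
    return np.column_stack([positions, elements.real, elements.imag]).ravel()

# Hypothetical word vector, chosen only for illustration.
positions, elements = top_k([1.0, 2.0, 3.0, 4.0], k=2)
print(build_modulus(positions, elements))
print(build_squared_modulus(positions, elements))
print(build_real_imag(positions, elements))
```

Options 1 and 2 store two numbers per selected element and option 3 stores three, so the compressed length depends on both the number of selected elements and the chosen option. In this sketch the positions are stored as floats because they share one array with the values.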

In one embodiment of the present disclosure, the position information includes the inverse of the position information that the selected elements have in the frequency domain vector before sorting.

In a second aspect of embodiments of the present disclosure, there is provided an apparatus comprising:

the conversion module is used for carrying out Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

the sorting module is used for calculating the modulus value of each element in the frequency domain vector and sorting the elements in the frequency domain vector based on the numerical value of the modulus value;

and the construction module is used for selecting a plurality of elements from the sequenced frequency domain vectors and constructing compressed word vectors based on the selected elements.

In a third aspect of embodiments of the present disclosure, there is provided a medium having stored thereon computer instructions which, when executed by a processor, implement the following method steps:

carrying out Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

calculating the modulus value of each element in the frequency domain vector, and sequencing the elements in the frequency domain vector based on the numerical value of the modulus value;

and selecting a plurality of elements from the sorted frequency domain vectors, and constructing a compressed word vector based on the selected elements.

In a fourth aspect of embodiments of the present disclosure, there is provided a computing device comprising:

a processor; and a memory for storing processor-executable instructions;

wherein the processor, by executing the executable instructions, implements the following method steps:

carrying out Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector;

calculating the modulus value of each element in the frequency domain vector, and sequencing the elements in the frequency domain vector based on the numerical value of the modulus value;

and selecting a plurality of elements from the sorted frequency domain vectors, and constructing a compressed word vector based on the selected elements.

The above embodiments of the present disclosure have at least the following advantages:

The word vector, regarded as a signal wave, is transformed into the frequency domain by a Fourier transform to obtain a frequency domain vector of the same length; the modulus of each element of the frequency domain vector is then calculated, the elements are sorted by modulus, and several elements are selected to construct the compressed word vector. With this technical scheme, on the one hand, the vocabulary is not reduced when the word vectors are compressed, so the original expressive capacity of the word vectors is unchanged; on the other hand, once a word vector has been transformed into the frequency domain, its elements can be sorted by the magnitude of their moduli, which ranks them by importance. As a result, vocabulary in languages with a highly nonlinear distribution can be handled, unimportant elements can be removed, and only the elements that represent the key information in the word vector are retained, thereby compressing the word vector and reducing its maintenance cost.

Drawings

The foregoing and other objects, features and advantages of exemplary embodiments of the present disclosure will be readily understood by reading the following detailed description with reference to the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:

FIG. 1 schematically illustrates a flow chart of a word vector compression method based on frequency domain transformation according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a block diagram of a word vector compression apparatus based on frequency domain transformation according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates a word vector compression medium based on frequency domain transformation according to an embodiment of the present disclosure;

FIG. 4 schematically illustrates an electronic device capable of implementing the above method according to an embodiment of the present disclosure.

In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.

Detailed Description

The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.

According to the embodiment of the disclosure, a word vector compression method, a medium, a device and a computing device based on frequency domain transformation are provided.

In this document, it is to be understood that the number of any element in the figures is intended to be illustrative rather than restrictive, and that any nomenclature is used for differentiation only and not in any limiting sense.

At present, a word vector matrix compression method based on clustering and principal component analysis is generally adopted to respectively reduce the dimension of the row number and the column number of the word vector matrix, so that the compression of the word vector is realized, and the maintenance cost of the word vector is reduced.

For example, for a word vector to be compressed, the word vector may be clustered to remove words with similar semantics, thereby reducing the number of rows of a word vector matrix, and then, principal component analysis may be performed on features of the word vector, only main features in the word vector may be retained, thereby reducing the number of columns of the word vector matrix and implementing compression of the word vector.

On the one hand, however, when row compression of the word vectors is performed, the clustering results are not always accurate because of the richness and diversity of the vocabulary in a language, and the expressive capacity of the compressed vocabulary is limited because clustering reduces the total number of words; on the other hand, when column compression of the word vectors is performed, compression based on principal component analysis is essentially a linear dimensionality reduction technique that is generally suited to Gaussian spatial distributions, whereas the distribution of words in actual language is generally highly nonlinear, so the result obtained by principal component analysis is not necessarily accurate.

The compression process is described below using a simple example. Referring to Table 1 below, assume that the vocabulary contains 4 words and that each word vector has length 10; the word vectors shown in Table 1 now need to be compressed.

Word | Word vector
Happy | [0.8966, 0.8534, …, 0.9001]
Joyful | [0.7902, 0.9432, …, 0.8349]
Sad | [0.1731, 0.2419, …, 0.3453]
Worried | [0.2285, 0.2943, …, 0.3717]

Table 1

A 4 × 10 matrix M1 to be compressed can be obtained from Table 1:

$$\mathrm{M1} = \begin{bmatrix} 0.8966 & 0.8534 & \cdots & 0.9001 \\ 0.7902 & 0.9432 & \cdots & 0.8349 \\ 0.1731 & 0.2419 & \cdots & 0.3453 \\ 0.2285 & 0.2943 & \cdots & 0.3717 \end{bmatrix}$$

the specific process of the word vector matrix compression method based on clustering and principal component analysis is as follows:

First, the vocabulary in the matrix M1 to be compressed is clustered, for example by k-means clustering or hierarchical clustering, and the words are classified using the Euclidean distance between word vectors as the similarity measure.

Continuing with the example, for Table 1 the hyperparameter of the clustering algorithm, i.e. the number of categories k, may be set to 2. Because the Euclidean distance between the word vectors for "happy" and "joyful" is much smaller than the Euclidean distance between the word vectors for "happy" and "sad", "happy" and "joyful" may be grouped into one category and "sad" and "worried" into another category based on the Euclidean distance. The clustering result may be as shown in Table 2 below.
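The distance comparison used in this step can be illustrated with a short sketch. It uses only the three components of each word vector that are printed in Table 1 (the first, second and last); the elided middle components are unknown and are simply left out, so the distances are illustrative rather than the true distances between the 10-dimensional vectors. The variable names `happy`, `joyful`, `sad` and `worried` stand for the four rows of Table 1, and `euclidean` is a hypothetical helper.

```python
import numpy as np

# Only the components printed in Table 1 are used; the elided ones are omitted.
happy = np.array([0.8966, 0.8534, 0.9001])
joyful = np.array([0.7902, 0.9432, 0.8349])
sad = np.array([0.1731, 0.2419, 0.3453])
worried = np.array([0.2285, 0.2943, 0.3717])

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return float(np.linalg.norm(u - v))

d_happy_joyful = euclidean(happy, joyful)
d_happy_sad = euclidean(happy, sad)

# Words in the same category are much closer than words in different categories.
print(d_happy_joyful < d_happy_sad)  # True
```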

TABLE 2

Then, representative words and category centers of each category are selected from the clustering results, and words with word frequency reaching a preset frequency can be selected as the representative words, namely common words, while the category centers are word vectors corresponding to the representative words.

Continuing with the example, the selected representative words and the category centers may be as shown in table 3 below:

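Category | Representative word | Word vector
1 | Happy | [0.8966, 0.8534, …, 0.9001]
2 | Sad | [0.1731, 0.2419, …, 0.3453]

Table 3

At this point, the word vectors in Table 3 may be represented by the matrix M2:

$$\mathrm{M2} = \begin{bmatrix} 0.8966 & 0.8534 & \cdots & 0.9001 \\ 0.1731 & 0.2419 & \cdots & 0.3453 \end{bmatrix}$$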

the matrix M2 is a new word vector matrix obtained by compressing rows of word vectors in the matrix M1 by using a clustering algorithm, and in the matrix M2, the length of each word vector is still 10, and the number of columns of word vectors needs to be further compressed.

And then, performing principal component analysis on the features contained in the preliminarily compressed word vectors, and selecting the first n principal components to obtain a final compressed matrix M3.

For example, assume that after principal component analysis is performed on the word vectors of M2, the matrix M3 obtained by selecting the first 5 principal components is:

$$\mathrm{M3} = \begin{bmatrix} 0.7845 & 0.8435 & 0.8642 & 0.6983 & 0.9300 \\ 0.2312 & 0.3892 & 0.4319 & 0.3875 & 0.2488 \end{bmatrix}$$

Table 3 can then be updated as shown in Table 4 below:

Category | Representative word | Word vector
1 | Happy | [0.7845, 0.8435, 0.8642, 0.6983, 0.9300]
2 | Sad | [0.2312, 0.3892, 0.4319, 0.3875, 0.2488]

Table 4

That is, the matrix M3 is the final compressed matrix obtained based on clustering and principal component analysis, and at this time, the word vector matrix to be stored is compressed from M1 of 4 × 10 dimensions to M3 of 2 × 5 dimensions.

Given the huge vocabulary of a language, the above process shows how important compression is. When word vectors are compressed by clustering and principal component analysis, on the one hand, the richness and diversity of the vocabulary make the hyperparameter k of the clustering algorithm difficult to choose, so the clustering result is not always accurate when row compression of the word vectors is performed, and the expressive capacity of the compressed vocabulary is limited because clustering reduces the total number of words;

on the other hand, when column compression of the word vectors is performed, compression based on principal component analysis is essentially a linear dimensionality reduction technique that is generally suited to Gaussian spatial distributions, whereas the distribution of words in actual language is generally highly nonlinear, so the result obtained by principal component analysis is not necessarily accurate and is not necessarily the optimal selection of word vector elements.

The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.

Summary of The Invention

As described above, the present disclosure finds that when a clustering algorithm is used to perform row compression of word vectors, the expressive capacity of the compressed word vectors is limited by the reduced vocabulary, and that compression based on principal component analysis is a linear dimensionality reduction technique that is not suited to the highly nonlinear distribution of vocabulary in language, so the principal components obtained are not accurate.

In view of this, the present specification provides a technical solution for transforming word vectors into frequency domain vectors, sorting based on magnitude of a modulus of the transformed frequency domain vectors, obtaining key elements in the word vectors based on a sorting result, and constructing new word vectors based on the key elements, thereby implementing word vector compression on highly non-linearly distributed words without reducing the amount of words, and further reducing maintenance cost of the word vectors.

The core technical concept of the specification is as follows:

the word vector can be regarded as a signal wave with a limited length, each element can be regarded as a signal sampling value on discrete time obtained after equally-spaced sampling of the signal on a continuous time domain, and after Fourier transform is used, the discrete signal on the time domain can be converted into a discrete signal on a frequency domain.

The word vector, regarded as a signal wave, is transformed into the frequency domain by a Fourier transform to obtain a frequency domain vector of the same length; the modulus of each element of the frequency domain vector is then calculated, the elements are sorted by modulus, and several elements are selected to construct the compressed word vector.

Through the technical scheme, on one hand, the original expression capacity of the word vector is not changed because the word vocabulary quantity is not reduced when the word vector is compressed; on the other hand, after the word vectors are converted into the frequency domain space, the word vectors can be sorted based on the magnitude of the modulus of each element in the frequency domain vector, and sorting according to the importance of each element of the word vectors is realized, so that words in a language with highly nonlinear distribution can be processed, unimportant elements can be removed, and only elements representing key information in the word vectors are reserved, so that compression of the word vectors is realized, and the maintenance cost of the word vectors is reduced.

Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.

Application scene overview

The core of a natural language processing task is how a language model represents words, e.g., the meaning of a word, its semantics in context, and so on. A word vector can be used to characterize the feature information of a word; in general, a word vector takes a vector form as shown in Table 1, and multiple word vectors form a word vector matrix such as M1.

Therefore, the word vector matrix constructed based on the massive corpus can be widely applied to natural language processing scenes, for example, other songs with similar lyric semantics with the song are recommended to the user based on the song collected by the user; for another example, based on the user's comments on a piece of information in the information client, the user's emotions are determined and collected, and so on.

However, due to the characteristics of huge vocabulary in languages and complex grammar, a large number of characteristics are often required for description, so that the generated word vector matrix is large in size and occupies a considerable storage space, and therefore, an effective word vector matrix compression method needs to be provided.

It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present disclosure, and the embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.

Exemplary method

The technical idea of the present specification will be described in detail by specific examples.

The invention aims to provide a technical scheme for transforming word vectors into frequency domain vectors, sequencing the frequency domain vectors based on the magnitude of modulus values of the transformed frequency domain vectors, acquiring key elements in the word vectors based on a sequencing result, and constructing new word vectors based on the key elements, so that word vector compression of words in highly nonlinear distribution is realized on the premise of not reducing the words, and the maintenance cost of the word vectors is reduced.

During implementation, the word vector to be compressed can be subjected to Fourier transform to obtain a frequency domain vector corresponding to the word vector;

for example, the word vector to be compressed may be regarded as a signal wave, and the word vector to be compressed is transformed into a frequency domain space by discrete fourier transform or fast fourier transform, so as to obtain a frequency domain vector corresponding to the word vector.

After obtaining the frequency domain vector, calculating a modulus of each element in the frequency domain vector, and sorting the elements in the frequency domain vector based on the magnitude of the modulus;

for example, the modulus values may be calculated for each element in the frequency domain vector, and the elements in the frequency domain vector may be sorted from large to small according to the numerical size of the calculated modulus values.

After the sorting is completed, a plurality of elements can be selected from the sorted frequency domain vectors, and a compressed word vector is constructed based on the selected elements;

for example, when the sorting strategy is to sort the numerical values of the modulus values from large to small, a plurality of elements at the top in the sorting result can be selected; meanwhile, the corresponding position information of the plurality of elements in the frequency domain vector before sequencing can be obtained; furthermore, the position information corresponding to the elements and the modulus values of the elements can be respectively used as new elements, and the compressed word vector is constructed based on the sorting of the numerical values of the modulus values of the elements.
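The steps described above (transform, compute moduli, sort from largest to smallest, keep the leading elements together with their original positions) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation of the disclosure: the function name `compress_word_vector`, the parameter `k` (the number of elements to keep) and the sample vector are hypothetical, and NumPy's `np.fft.fft` is used for the transform.

```python
import numpy as np

def compress_word_vector(word_vector, k):
    """Keep the k frequency elements with the largest modulus as (position, modulus) pairs."""
    x = np.asarray(word_vector, dtype=float)
    freq = np.fft.fft(x)            # frequency domain vector, same length as x
    modulus = np.abs(freq)          # modulus of every complex element
    # Positions in the unsorted frequency vector, ordered by modulus, largest first.
    order = np.argsort(-modulus, kind="stable")
    keep = order[:k]                # the k leading elements of the sorted result
    return [(int(p), float(modulus[p])) for p in keep]

# Hypothetical word vector of length 4, chosen only for illustration.
print(compress_word_vector([1.0, 2.0, 3.0, 4.0], k=2))
```

For this sample vector the first pair is position 0, whose modulus is 10, the absolute value of the sum of the elements. How many elements to keep is a trade-off between compression and retained information that the sketch leaves to the caller.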

Through the technical scheme, on one hand, the original expression capacity of the word vector is not changed because the word vocabulary quantity is not reduced when the word vector is compressed; on the other hand, after the word vectors are converted into the frequency domain space, the word vectors can be sorted based on the magnitude of the modulus of each element in the frequency domain vector, and sorting according to the importance of each element of the word vectors is realized, so that words in a language with highly nonlinear distribution can be processed, unimportant elements can be removed, and only elements representing key information in the word vectors are reserved, so that compression of the word vectors is realized, and the maintenance cost of the word vectors is reduced.

Referring to fig. 1, fig. 1 is a flowchart of a word vector compression method based on frequency domain transformation according to an exemplary embodiment, where the method includes the following steps:

Step 101: perform a Fourier transform on the word vector to be compressed to obtain a frequency domain vector corresponding to the word vector.

The word vector to be compressed may be represented as a vector of floating-point numbers; for example, "apple" may be represented as [0.343434, -0.88749, 0.992112, …, 0.232432].

Assuming that the length of the word vector to be compressed is N, the word vector may be regarded as a segment of signal wave, and each element may be regarded as a signal sampling value at a discrete time obtained after sampling the signal at equal intervals in a continuous time domain.

For example, the elements of the word vector $[X_0, X_1, \dots, X_{N-1}]^T$ represent the signal values at times $0, 1, \dots, N-1$, respectively.

Further, the word vectors which are regarded as signal waves are transformed into a frequency domain space through Fourier transform, and frequency domain vectors with the same length are obtained.

Continuing with the example, the frequency domain vector corresponding to the word vector may be $[Y_0, Y_1, \dots, Y_{N-1}]^T$.

It is worth noting that each element in the frequency domain vector is a complex number, i.e., $Y_k = a_k + b_k i$, $i^2 = -1$, $0 \le k \le N-1$.

Specifically,

$$Y_k = \sum_{n=0}^{N-1} X_n \, e^{-2\pi i k n / N}, \quad 0 \le k \le N-1$$

where $i$ represents the imaginary unit, i.e., $i^2 = -1$.

In one embodiment shown, the fourier transform algorithm used to fourier transform the word vector to be compressed may include a discrete fourier transform;

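further, the word vector to be compressed may be subjected to a discrete Fourier transform based on the following formula:

$$[Y_0, Y_1, \dots, Y_{N-1}]^T = W_N \, [X_0, X_1, \dots, X_{N-1}]^T$$

where $[X_0, X_1, \dots, X_{N-1}]^T$ represents the word vector to be compressed, $[Y_0, Y_1, \dots, Y_{N-1}]^T$ represents the frequency domain vector, $Y_k = a_k + b_k i$, $i^2 = -1$, $0 \le k \le N-1$, and $W_N$ represents a factor matrix of size $N \times N$,

$$W_N = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_N & \omega_N^{2} & \cdots & \omega_N^{N-1} \\ 1 & \omega_N^{2} & \omega_N^{4} & \cdots & \omega_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \cdots & \omega_N^{(N-1)^2} \end{bmatrix}$$

where $\omega_N$ represents the factor, $\omega_N = e^{-2\pi i/N}$.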
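As an illustration only, the discrete Fourier transform described above can be sketched in a few lines of Python. The function name `dft` and the four-element toy vector are assumptions made for this sketch and do not appear in the disclosure; the sketch evaluates each $Y_k$ as the sum of $X_n \omega_N^{kn}$ with $\omega_N = e^{-2\pi i/N}$, which takes on the order of $N^2$ operations.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform: Y_k = sum_n X_n * omega_N^(k*n)."""
    n_len = len(x)
    omega = cmath.exp(-2j * cmath.pi / n_len)  # omega_N = e^(-2*pi*i/N)
    return [
        sum(x[n] * omega ** (k * n) for n in range(n_len))
        for k in range(n_len)
    ]


# Toy 4-dimensional "word vector" (hypothetical values for illustration).
word_vector = [0.343434, -0.88749, 0.992112, 0.232432]
freq_vector = dft(word_vector)  # list of complex numbers, same length as the input
```

The result has the same length as the input, and each entry is a complex number of the form $a_k + b_k i$.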

In computer processing, reducing the time complexity increases the running speed of an algorithm and improves its efficiency. As can be seen from the formula of the discrete Fourier transform, its time complexity is $O(N^2)$. To reduce the time complexity further, the fast Fourier transform can be adopted: the amount of calculation is reduced by repeatedly dividing the word vector into two new word vectors, each half the length of the original. With this method, the time complexity of the fast Fourier transform is $O(N \log N)$.

For example, assuming that N is 8, the discrete Fourier transform that converts the word vector to be compressed, i.e., the time domain vector $[X_0, X_1, \dots, X_7]^T$, into the frequency domain vector $[Y_0, Y_1, \dots, Y_7]^T$ is calculated as:

$$Y_k = \sum_{n=0}^{7} X_n \, \omega_8^{kn}, \quad 0 \le k \le 7$$

where $\omega_N = e^{-2\pi i/N}$.

Because this expression involves the exponential function and the imaginary unit, Euler's formula $e^{ix} = \cos x + i \sin x$ yields the following properties:

Periodicity: $\omega_N^{k+N} = \omega_N^{k}$

Symmetry: $\omega_N^{k+N/2} = -\omega_N^{k}$

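Based on the above properties, and on splitting the sum into even-indexed and odd-indexed terms, the first half and the second half of the elements of the frequency domain vector can be computed separately:

$$Y_k = \sum_{m=0}^{3} X_{2m} \, \omega_4^{mk} + \omega_8^{k} \sum_{m=0}^{3} X_{2m+1} \, \omega_4^{mk}$$

$$Y_{k+4} = \sum_{m=0}^{3} X_{2m} \, \omega_4^{mk} - \omega_8^{k} \sum_{m=0}^{3} X_{2m+1} \, \omega_4^{mk}$$

where $0 \le k \le 3$ and $\omega_4 = \omega_8^{2} = e^{-2\pi i/4}$.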

According to these formulas, the discrete Fourier transform of a time domain vector of length 8 can be converted into discrete Fourier transforms of two vectors of length 4; if the odd and even terms are divided again, this is equivalent to performing discrete Fourier transforms on four vectors of length 2; with one further division, the length of the vectors to be transformed is reduced to 1.

It can be seen that the length of the word vector is gradually reduced to 1 by repeatedly dividing it into odd and even terms. This process requires only $\log N$ divisions, and each time the Fourier transforms of the two vectors of length $N/2$ have been obtained, the Fourier transform of the vector of length $N$ can be obtained in $O(N)$ time, so the total time complexity is only $O(N \log N)$.
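The odd/even division described above can be sketched as a recursive function. This is a minimal illustration under stated assumptions rather than the implementation of the disclosure: the name `fft` is hypothetical, and the input length is assumed to be a power of two. Each level combines two half-length transforms in $O(N)$ time, and there are $\log_2 N$ levels.

```python
import cmath


def fft(x):
    """Recursive radix-2 fast Fourier transform (length must be a power of two)."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed terms, length n/2
    odd = fft(x[1::2])   # transform of the odd-indexed terms, length n/2
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # omega_N^k times odd part
        out[k] = even[k] + twiddle           # first half of the frequency domain vector
        out[k + n // 2] = even[k] - twiddle  # second half, using the symmetry property
    return out


spectrum = fft([0.5, -1.0, 2.0, 0.25, 0.0, 3.0, -2.5, 1.0])
```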

For example, if the word vector length N is 200, the discrete Fourier transform requires on the order of $N^2 = 40000$ operations, whereas the fast Fourier transform requires on the order of $N \log N \approx 460$, a reduction of almost two orders of magnitude.
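The figures in this example can be checked with a few lines of arithmetic. The sketch below is only an illustration with hypothetical variable names. It assumes that the value of about 460 corresponds to $N \log N$ with a base-10 logarithm; it also computes the base-2 value, which is the base that arises from repeated halving.

```python
import math

n = 200
dft_cost = n ** 2                   # N^2
fft_cost_log10 = n * math.log10(n)  # N log N with a base-10 logarithm
fft_cost_log2 = n * math.log2(n)    # N log N with a base-2 logarithm

print(dft_cost)               # 40000
print(round(fft_cost_log10))  # 460
print(round(fft_cost_log2))   # 1529
```

Under either base, the operation count is much lower than $N^2$; the size of the reduction depends on which logarithm is used.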

In another embodiment shown, the fourier transform algorithm used to fourier transform the word vector to be compressed may include a fast fourier transform;

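further, the word vector to be compressed is subjected to a fast Fourier transform based on the following formula:

$$[Y_0, Y_1, \dots, Y_{N-1}]^T = \begin{bmatrix} I_{N/2} & D_{N/2} \\ I_{N/2} & -D_{N/2} \end{bmatrix} \begin{bmatrix} W_{N/2} & 0 \\ 0 & W_{N/2} \end{bmatrix} [X_0, X_2, \dots, X_{N-2}, X_1, X_3, \dots, X_{N-1}]^T$$

where $I_{N/2}$ represents a unit matrix of size $\frac{N}{2} \times \frac{N}{2}$, and $D_{N/2}$ represents a diagonal matrix of size $\frac{N}{2} \times \frac{N}{2}$,

$$D_{N/2} = \mathrm{diag}\left(1, \omega_N, \omega_N^{2}, \dots, \omega_N^{N/2-1}\right)$$

$W_{N/2}$ represents a factor matrix of size $\frac{N}{2} \times \frac{N}{2}$,

$$W_{N/2} = \left(\omega_{N/2}^{jk}\right)_{0 \le j,k \le N/2-1}$$

where $\omega_{N/2}$ represents the factor, $\omega_{N/2} = e^{-2\pi i/(N/2)}$.

It should be noted that this matrix transformation is an equivalent form of the fast Fourier transform.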
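The matrix form can be checked numerically for a small case. The sketch below assumes the standard radix-2 factorization, in which the factor matrix of size N is expressed through an identity block, a diagonal block holding the powers $1, \omega_N, \dots, \omega_N^{N/2-1}$, and two copies of the half-size factor matrix applied to the even-indexed and odd-indexed elements. The helper names `factor_matrix`, `matvec`, and `one_stage_fft` are hypothetical.

```python
import cmath


def factor_matrix(n):
    """Factor matrix W_n with entries omega_n^(j*k), where omega_n = e^(-2*pi*i/n)."""
    omega = cmath.exp(-2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def one_stage_fft(x):
    """One division stage: half-size transforms combined through the I and D blocks."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("length must be even and at least 2")
    half = n // 2
    w_half = factor_matrix(half)
    even = matvec(w_half, x[0::2])  # W_{N/2} applied to X_0, X_2, ...
    odd = matvec(w_half, x[1::2])   # W_{N/2} applied to X_1, X_3, ...
    omega = cmath.exp(-2j * cmath.pi / n)
    d = [omega ** k for k in range(half)]  # diagonal entries of D_{N/2}
    top = [even[k] + d[k] * odd[k] for k in range(half)]     # block row [I,  D]
    bottom = [even[k] - d[k] * odd[k] for k in range(half)]  # block row [I, -D]
    return top + bottom


x = [0.5, -1.0, 2.0, 0.25, 0.0, 3.0, -2.5, 1.0]
direct = matvec(factor_matrix(len(x)), x)  # full N x N product
staged = one_stage_fft(x)                  # factored form
max_error = max(abs(a - b) for a, b in zip(direct, staged))
```

In this sketch `max_error` is expected to be at the level of floating-point rounding, which is consistent with the factored form being equivalent to the full matrix product.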

In addition, since the fast Fourier transform repeatedly divides the vector into odd and even terms, N needs to satisfy $N = 2^m$, where m is a positive integer.

Therefore, in one example, before the fast Fourier transform is performed, it is necessary to determine whether the length N of the word vector to be compressed satisfies $N = 2^m$.

If so, the fast Fourier transform is performed.

If not, it is necessary to determine the minimum value of m that satisfies $N < 2^m$, append $(2^m - N)$ zeros as new elements to the word vector to be compressed, and perform the fast Fourier transform on the padded word vector.
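The length check and zero padding can be sketched as follows. The function name `pad_to_power_of_two` is an assumption of this sketch; following the text, m is taken to be a positive integer, so a vector of length 1 is also padded (to length 2).

```python
def pad_to_power_of_two(x):
    """Return x unchanged if len(x) == 2**m for some m >= 1, else zero-padded."""
    n = len(x)
    if n == 0:
        raise ValueError("word vector must not be empty")
    if n >= 2 and (n & (n - 1)) == 0:  # already of the form 2**m with m >= 1
        return list(x)
    m = 1
    while 2 ** m <= n:  # smallest m with n < 2**m
        m += 1
    return list(x) + [0.0] * (2 ** m - n)  # append (2**m - n) zeros


padded = pad_to_power_of_two([1.0] * 200)
print(len(padded))  # 256
```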

Either the discrete Fourier transform or the fast Fourier transform can convert the word vector into the frequency domain vector, and those skilled in the art can choose between them according to actual needs.

Step 102: calculate the modulus of each element in the frequency domain vector, and sort the elements in the frequency domain vector by the magnitude of their moduli.

As can be seen from the foregoing description, any signal wave can be decomposed into a superposition and/or integral of a series of sine waves, and the Fourier transform finds the sine wave component at each frequency contained in a finite discrete-time signal. Each element of the frequency domain vector obtained by Fourier transforming a time domain vector is a complex number whose modulus represents the amplitude of the corresponding sine wave component in the frequency domain space. The larger the amplitude, the greater the influence of the corresponding sine wave.

Therefore, the amplitude of each sine wave can be determined by calculating the modulus of each element in the frequency domain vector, and the elements in the frequency domain vector can be sorted by the magnitude of their moduli.

It is to be noted that numbers of the form $z = a + bi$ are referred to as complex numbers, where $i$ is the imaginary unit with $i^2 = -1$, $a$ and $b$ are arbitrary real numbers, $a$ is the real part of the complex number, $b$ is the imaginary part of the complex number, and the modulus of the complex number $z$ is $|z| = \sqrt{a^2 + b^2}$.
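A minimal sketch of Step 102 is given below. The names `modulus` and `sort_by_modulus` and the sample values are assumptions for illustration; each element is paired with its original index so that the position before sorting is not lost.

```python
import math


def modulus(z):
    """Modulus of a complex number z = a + bi, i.e. sqrt(a^2 + b^2)."""
    return math.sqrt(z.real ** 2 + z.imag ** 2)


def sort_by_modulus(freq_vector):
    """Return (position, element) pairs ordered from largest to smallest modulus."""
    indexed = list(enumerate(freq_vector))
    return sorted(indexed, key=lambda pair: modulus(pair[1]), reverse=True)


freq = [3 + 4j, 1 + 0j, 0 - 2j, -6 + 8j]  # moduli: 5, 1, 2, 10
ranked = sort_by_modulus(freq)
print([position for position, _ in ranked])  # [3, 0, 2, 1]
```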

Step 103: select several elements from the sorted frequency domain vector, and construct a compressed word vector based on the selected elements.

In this specification, after the elements in the frequency domain vector are sorted based on the magnitude of the modulus, a plurality of elements may be selected from the sorted frequency domain vector according to a preset selection rule, and used as elements for constructing a new word vector.

For example, several elements with the largest modulus values may be selected as new elements to construct compressed word vectors, and the specific number of the elements may be determined by those skilled in the art based on actual needs, which is not limited by the present disclosure.

In an embodiment shown, when the sorting strategy is that the numerical values of the modulus values are sorted from large to small, a plurality of elements at the top of the sorting result are selected.

For example, in one example, the calculation results of the modulus values may be arranged in a numerical order from large to small, and the first Q elements may be selected from the one-dimensional array;

of course, in another example, the calculation results of the modulus values may be arranged from small to large, and the last Q elements may be selected from the one-dimensional array.

In an embodiment shown, corresponding position information of the several elements in the frequency domain vector before sorting may also be obtained.

For example, suppose that after being sorted from large to small according to the numerical value of the modulus, the top 5 elements are selected, and the corresponding position information of the 5 elements in the frequency domain vector before being sorted is obtained.

In one example, since each element in the frequency domain vector may be written as $Y_k = a_k + b_k i$, the index $k$ may serve as the position information of that element in the frequency domain vector before sorting.

For example, assuming that the 5 selected elements are Y12, Y33, Y1, Y4, and Y91 respectively in descending order of the magnitude of the modulus value, the corresponding position information of each acquired element in the frequency domain vector before sorting is 12, 33, 1, 4, and 91 respectively.
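Selecting the top Q elements together with their positions can be sketched with the standard-library `heapq.nlargest`, which avoids sorting the whole vector. The function name `top_q_positions` and the sample data are hypothetical; the built-in `abs` returns the modulus of a complex number.

```python
import heapq


def top_q_positions(freq_vector, q):
    """Positions k of the q elements with the largest modulus, largest first."""
    return heapq.nlargest(
        q, range(len(freq_vector)), key=lambda k: abs(freq_vector[k])
    )


freq = [3 + 4j, 1 + 0j, 0 - 2j, -6 + 8j, 0.5j]  # moduli: 5, 1, 2, 10, 0.5
positions = top_q_positions(freq, 3)
print(positions)  # [3, 0, 2]
```

Sorting in ascending order and taking the last Q elements selects the same set of elements.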

In one illustrated embodiment, the compressed word vector may be constructed based on any of the following:

in an example, the position information corresponding to the elements and the modulus values of the elements may be respectively used as new elements, and the compressed word vector may be constructed based on the ordering of the elements;

for example, for the elements $Y_{12}$, $Y_{33}$, $Y_1$, $Y_4$, and $Y_{91}$ determined in the frequency domain vector, the position information 12, 33, 1, 4, and 91 corresponding to the five elements and the moduli $|Y_{12}|$, $|Y_{33}|$, $|Y_1|$, $|Y_4|$, and $|Y_{91}|$ of the five elements may each be used as new elements, and, based on the result of sorting the five elements by modulus, a compressed word vector may be constructed as follows:

in another example, the position information corresponding to the elements and the squares of the modulus values of the elements may be respectively used as new elements, and the compressed word vector may be constructed based on the ordering of the elements;

for example, for the elements $Y_{12}$, $Y_{33}$, $Y_1$, $Y_4$, and $Y_{91}$ determined in the frequency domain vector, the position information 12, 33, 1, 4, and 91 corresponding to the five elements and the squared moduli $|Y_{12}|^2$, $|Y_{33}|^2$, $|Y_1|^2$, $|Y_4|^2$, and $|Y_{91}|^2$ of the five elements may each be used as new elements, and, based on the result of sorting the five elements by modulus, a compressed word vector may be constructed as follows:

in another example, the position information corresponding to the elements, the real parts of the elements, and the imaginary parts of the elements may be respectively used as new elements, and the compressed word vector may be constructed based on the ordering of the elements.

For example, for the elements Y12, Y33, Y1, Y4, and Y91 in the determined frequency domain vector, the compressed word vector as shown below may be constructed based on the position information 12, 33, 1, 4, and 91 corresponding to the five elements, and the real parts and imaginary parts of the five elements as new elements, respectively, and based on the ordered results of the five elements by magnitude of the modulus values:

Since the word vector to be compressed is generally long, the elements representing position information in the reconstructed word vector may take excessively large values; the position information may therefore be processed before being used to construct the compressed word vector.

In one embodiment shown, the position information includes the reciprocal of the position corresponding to each of the several elements in the frequency domain vector before sorting.

For example, for elements Y12, Y33, Y1, Y4, and Y91 in the determined frequency domain vector, the position information may be taken as 1/12, 1/33, 1, 1/4, and 1/91 and used to construct a compressed word vector. The construction method may refer to the aforementioned three methods, and is not described herein again.
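The three construction options and the reciprocal form of the position information can be sketched together. The layout of the output is an assumption of this sketch, not something confirmed by the text: each selected element contributes its position followed by its value or values, in sorted order. The function name `build_compressed`, its parameters, and the sample values are hypothetical.

```python
def build_compressed(selected, mode="modulus", reciprocal_position=False):
    """Build a compressed vector from (position, complex element) pairs.

    `selected` is assumed to be ordered by modulus already. Assumed layout:
    position first, then the value(s) derived from the element.
    """
    out = []
    for k, y in selected:
        if reciprocal_position:
            if k == 0:
                raise ValueError("position 0 has no reciprocal")
            pos = 1.0 / k
        else:
            pos = float(k)
        if mode == "modulus":
            values = [abs(y)]
        elif mode == "squared_modulus":
            values = [y.real ** 2 + y.imag ** 2]
        elif mode == "real_imag":
            values = [y.real, y.imag]
        else:
            raise ValueError(f"unknown mode: {mode}")
        out.append(pos)
        out.extend(values)
    return out


selected = [(12, 3 + 4j), (1, 0 - 2j)]  # hypothetical selected elements
print(build_compressed(selected))                          # [12.0, 5.0, 1.0, 2.0]
print(build_compressed(selected, mode="squared_modulus"))  # [12.0, 25.0, 1.0, 4.0]
print(build_compressed(selected, mode="real_imag"))        # [12.0, 3.0, 4.0, 1.0, 0.0, -2.0]
```

The first two options store two numbers per selected element and the third stores three, so the choice trades compressed length against how much of each complex element is kept.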

In the above embodiment, the word vector, regarded as a signal wave, is transformed into the frequency domain space by a Fourier transform to obtain a frequency domain vector of the same length; the modulus of each element of the frequency domain vector is then calculated, the elements are sorted by modulus, and several elements are selected to construct the compressed word vector. With this technical scheme, on the one hand, the vocabulary size is not reduced when the word vectors are compressed, so the original expressive capacity of the word vectors is unchanged; on the other hand, after a word vector is converted into the frequency domain space, its elements can be sorted by modulus, which amounts to sorting them by importance. As a result, not only can words in a language with a highly nonlinear distribution be processed, but unimportant elements can also be removed and only the elements representing the key information in the word vector retained, so that the word vector is compressed and its maintenance cost is reduced.

Exemplary devices

Having described the method of the exemplary embodiment of the present disclosure, referring next to fig. 2, fig. 2 is a block diagram of a word vector compression apparatus based on frequency domain transformation according to an exemplary embodiment.

The implementation of the functions and roles of each module in the following apparatus is detailed in the implementation of the corresponding steps of the above method and is not repeated here. Since the apparatus embodiments substantially correspond to the method embodiments, reference may be made to the relevant parts of the description of the method embodiments.

As shown in fig. 2, the word vector compression apparatus 200 based on frequency domain transformation may include: a transformation module 201, an ordering module 202 and a construction module 203. Wherein:

the transformation module 201 is configured to perform fourier transformation on a word vector to be compressed, so as to obtain a frequency domain vector corresponding to the word vector;

the sorting module 202 is configured to calculate a modulus value of each element in the frequency domain vector and sort the elements in the frequency domain vector based on a numerical size of the modulus value;

the construction module 203 is configured to select elements from the sorted frequency domain vectors and construct a compressed word vector based on the selected elements.
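The division into three modules can be sketched as three small classes chained by a wrapper. All class and method names are hypothetical, the transform uses the direct discrete Fourier transform for brevity, and the output layout (position followed by modulus) is an assumption of this sketch.

```python
import cmath


class TransformModule:
    """Counterpart of module 201: word vector -> frequency domain vector."""

    def run(self, word_vector):
        n = len(word_vector)
        return [
            sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(word_vector))
            for k in range(n)
        ]


class SortingModule:
    """Counterpart of module 202: sort (position, element) pairs by modulus."""

    def run(self, freq_vector):
        return sorted(enumerate(freq_vector), key=lambda p: abs(p[1]), reverse=True)


class ConstructionModule:
    """Counterpart of module 203: keep the top q elements and build the output."""

    def __init__(self, q):
        self.q = q

    def run(self, ranked):
        out = []
        for k, y in ranked[: self.q]:
            out.extend([float(k), abs(y)])  # assumed layout: position, modulus
        return out


class WordVectorCompressor:
    """Chains the three modules in the order described for apparatus 200."""

    def __init__(self, q):
        self.transform = TransformModule()
        self.sorting = SortingModule()
        self.construction = ConstructionModule(q)

    def compress(self, word_vector):
        freq = self.transform.run(word_vector)
        return self.construction.run(self.sorting.run(freq))


compressed = WordVectorCompressor(q=1).compress([2.0, 2.0, 2.0, 2.0])
```

For a constant input vector, all of the signal sits in the zero-frequency element, so the single retained pair is position 0 with modulus 8.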

In one embodiment, the fourier transform algorithm used to fourier transform the word vector to be compressed comprises a discrete fourier transform;

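the transformation module 201 is further configured to:

perform a discrete Fourier transform on the word vector to be compressed based on the following formula:

$$[Y_0, Y_1, \dots, Y_{N-1}]^T = W_N \, [X_0, X_1, \dots, X_{N-1}]^T$$

where $[X_0, X_1, \dots, X_{N-1}]^T$ represents the word vector to be compressed, $[Y_0, Y_1, \dots, Y_{N-1}]^T$ represents the frequency domain vector, $Y_k = a_k + b_k i$, $i^2 = -1$, $0 \le k \le N-1$, and $W_N$ represents a factor matrix of size $N \times N$,

$$W_N = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega_N & \omega_N^{2} & \cdots & \omega_N^{N-1} \\ 1 & \omega_N^{2} & \omega_N^{4} & \cdots & \omega_N^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \cdots & \omega_N^{(N-1)^2} \end{bmatrix}$$

where $\omega_N$ represents the factor, $\omega_N = e^{-2\pi i/N}$.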

In one embodiment, the fourier transform algorithm used to fourier transform the word vector to be compressed comprises a fast fourier transform;

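the transformation module 201 is further configured to:

perform a fast Fourier transform on the word vector to be compressed based on the following formula:

$$[Y_0, Y_1, \dots, Y_{N-1}]^T = \begin{bmatrix} I_{N/2} & D_{N/2} \\ I_{N/2} & -D_{N/2} \end{bmatrix} \begin{bmatrix} W_{N/2} & 0 \\ 0 & W_{N/2} \end{bmatrix} [X_0, X_2, \dots, X_{N-2}, X_1, X_3, \dots, X_{N-1}]^T$$

where $I_{N/2}$ represents a unit matrix of size $\frac{N}{2} \times \frac{N}{2}$, and $D_{N/2}$ represents a diagonal matrix of size $\frac{N}{2} \times \frac{N}{2}$,

$$D_{N/2} = \mathrm{diag}\left(1, \omega_N, \omega_N^{2}, \dots, \omega_N^{N/2-1}\right)$$

$W_{N/2}$ represents a factor matrix of size $\frac{N}{2} \times \frac{N}{2}$,

$$W_{N/2} = \left(\omega_{N/2}^{jk}\right)_{0 \le j,k \le N/2-1}$$

where $\omega_{N/2}$ represents the factor, $\omega_{N/2} = e^{-2\pi i/(N/2)}$.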

In an embodiment, the construction module 203 is further configured to:

when the elements are sorted in descending order of modulus, select the several elements at the top of the sorting result.

In one embodiment, the apparatus 200 further comprises:

a position module 204 configured to obtain the position information of the several elements in the frequency domain vector before sorting.

In an embodiment, the construction module 203 is further configured to:

use the position information corresponding to the several elements and the moduli of the several elements as new elements, and construct the compressed word vector based on the ordering of the several elements;

or,

use the position information corresponding to the several elements and the squares of the moduli of the several elements as new elements, and construct the compressed word vector based on the ordering of the several elements;

or,

use the position information corresponding to the several elements, the real parts of the several elements, and the imaginary parts of the several elements as new elements, and construct the compressed word vector based on the ordering of the several elements.

In an embodiment, the position information includes the reciprocal of the position corresponding to each of the several elements in the frequency domain vector before sorting.

The details of each module of the word vector compression apparatus 200 based on frequency domain transformation have been described in detail in the foregoing description of the word vector compression method based on frequency domain transformation, and therefore, the details are not repeated herein.

It should be noted that although several modules or units of the word vector compression apparatus 200 based on frequency domain transformation are mentioned in the above detailed description, such division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.

Exemplary Medium

Having described the apparatus of the exemplary embodiments of the present disclosure, reference is next made to fig. 3, where fig. 3 is a schematic diagram of a word vector compression medium based on frequency domain transformation according to an exemplary embodiment.

In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the present disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present disclosure as described in the "exemplary methods" section above of this specification, when the program product is run on the terminal device.

Referring to fig. 3, a program product 30 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).

Exemplary computing device

Having described the method, medium, and apparatus of the exemplary embodiments of the present disclosure, reference is next made to fig. 4, where fig. 4 is a schematic diagram of an electronic device capable of implementing the method according to an exemplary embodiment.

An electronic device 400 according to such an embodiment of the disclosure is described below with reference to fig. 4. The electronic device 400 shown in fig. 4 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.

As shown in fig. 4, electronic device 400 is embodied in the form of a general purpose computing device. The components of electronic device 400 may include, but are not limited to: the at least one processing unit 401, the at least one memory unit 402, and a bus 403 that couples various system components including the memory unit 402 and the processing unit 401.

The storage unit stores program code that can be executed by the processing unit 401, so that the processing unit 401 performs the steps of the various embodiments described above in this specification.

The storage unit 402 may include readable media in the form of volatile storage units, such as a random access storage unit (RAM) 4021 and/or a cache storage unit 4022, and may further include a read-only storage unit (ROM) 4023.

The storage unit 402 may also include a program/utility 4024 having a set (at least one) of program modules 4025, such program modules 4025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination of which, may include an implementation of a network environment.

Bus 403 may be any type of bus structure representing one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 400 may also communicate with one or more external devices 404 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 400 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 405. Also, electronic device 400 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through network adapter 406. As shown, the network adapter 406 communicates with the other modules of the electronic device 400 over a bus 403. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in connection with electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions that enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.

It should be noted that although several units/modules or sub-units/modules of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, in accordance with the embodiments of the present disclosure, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.

Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.

While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, nor does the division into aspects mean that features in those aspects cannot be combined to advantage; that division is for convenience of description only. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
