Optimization method of FFT chip

Document No.: 857505 Published: 2021-04-02

Reading note: this technology, "Optimization method of FFT chip" (一种fft芯片的优化方法), was designed by 刘亚鹏乐立鹏安印龙谷艳方新嘉楚晓梅 on 2020-12-16. Main content: the invention relates to an optimization method of an FFT chip and belongs to the field of FFT chip design. Step one: decompose the 2048-point FFT chip into 6 stages of operation, each of which uses the twiddle factor $W_N^n$. Step two: express the twiddle factor as a trigonometric function, $W_N^n = \cos\frac{2\pi n}{N} - j\sin\frac{2\pi n}{N}$. Step three: denote the angle element $\frac{2\pi n}{N}$ in the trigonometric function by x, i.e. $x = \frac{2\pi n}{N}$. Step four: calculate all values of cos x and -sin x for x in the range 0° to 45°, obtain from them all values of the twiddle factor $W_N^n = \cos x - j\sin x$ in that range, and store them in the ROM of the FFT chip, which completes the optimization of the FFT chip. By changing how one constant coefficient of the FFT calculation, the twiddle factor, is generated and stored, the invention reduces the data stored in the ROM, thereby improving the efficiency of the device and reducing the resources it occupies.

1. An optimization method of an FFT chip, characterized in that the method comprises the following steps:

step one, decomposing the 2048-point FFT chip into 6 stages of operation, wherein stages 1 to 5 use radix-4 FFT operations and stage 6 uses a radix-2 FFT operation; each stage of operation of the FFT chip uses the twiddle factor $W_N^n$;

step two, expressing the twiddle factor $W_N^n$ as a trigonometric function, $W_N^n = \cos\frac{2\pi n}{N} - j\sin\frac{2\pi n}{N}$, wherein N is the number of operation points: when $W_N^n$ is the twiddle factor of stages 1 to 5, N is 1024; when $W_N^n$ is the twiddle factor of stage 6, N is 2048; n is the independent variable, n = 2, 4, 6, ..., 254;

step three, denoting the angle element $\frac{2\pi n}{N}$ in the trigonometric function of the twiddle factor $W_N^n$ by x, i.e. $x = \frac{2\pi n}{N}$; the value range of x is 0° to 360°;

step four, calculating, through $x = \frac{2\pi n}{N}$ of step three, all values of cos x and -sin x for x in the range 0° to 45°, i.e. obtaining all values of the twiddle factor $W_N^n = \cos x - j\sin x$ for x in the range 0° to 45°, and storing them in a ROM in the FFT chip, which completes the storage optimization of the FFT chip.

2. The method for optimizing an FFT chip according to claim 1, wherein: when the twiddle factor $W_N^n$ corresponding to an angle outside the range 0° to 45° needs to be calculated, only the corresponding value of the twiddle factor $W_N^n$ for x in the range 0° to 45° needs to be read from the ROM; converting it with a trigonometric identity gives the value of the twiddle factor $W_N^n$ for the angle outside the range 0° to 45°.

3. The method for optimizing an FFT chip according to claim 2, wherein the twiddle factor $W_N^n$ corresponding to an angle y outside the range 0° to 45° is calculated as follows:

S1, calculating cos y:

when 45° < y ≤ 90°, $\cos y = \cos(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $\cos y = -\sin x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $\cos y = \cos(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $\cos y = -\cos x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $\cos y = \cos(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $\cos y = \sin x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $\cos y = \cos(2\pi - x)$, wherein x takes values from 45° to 0°;

S2, calculating -sin y:

when 45° < y ≤ 90°, $-\sin y = -\sin(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $-\sin y = -\cos x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $-\sin y = -\sin(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $-\sin y = \sin x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $-\sin y = -\sin(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $-\sin y = \cos x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $-\sin y = -\sin(2\pi - x)$, wherein x takes values from 45° to 0°;

S3, from cos y and -sin y, the value of the twiddle factor corresponding to the angle outside the range 0° to 45° is obtained through $W_N^n = \cos y - j\sin y$.

Technical Field

The invention belongs to the field of FFT chip design, and relates to an optimization method of an FFT chip.

Background

The Fourier transform is a well-established integral transform that converts between the time domain and the frequency domain. It is named after the French scholar Fourier, who first set out its basic idea systematically. There are 4 variants of the Fourier transform: the continuous Fourier transform, the Fourier series, the discrete-time Fourier transform, and the discrete Fourier transform. With the rapid development of digital electronics and of integrated circuit design and manufacturing technology, digital signal processing has been widely applied in radar, communication, image processing, multimedia, and other fields. The discrete Fourier transform (DFT) plays an important role as a basic operation in digital signal processing.

Because direct calculation of the DFT is too computationally intensive, it is impractical to perform spectral analysis and real-time processing of signals with the DFT algorithm directly. This did not change fundamentally until 1965, when a fast algorithm for the DFT appeared, called the fast Fourier transform (FFT). The FFT algorithm reduced the computation of the discrete Fourier transform by several orders of magnitude, which made digital signal processing much easier to implement and apply.

The ROM in current FFT chips stores a large amount of data, so the device occupies a large amount of resources and its operating efficiency is low.

Disclosure of Invention

The technical problem solved by the invention is as follows: the invention overcomes the defects of the prior art and provides an optimization method of an FFT chip that reduces the data stored in the ROM by changing how one constant coefficient of the FFT calculation, the twiddle factor, is generated and stored, thereby improving the efficiency of the device and reducing the resources it occupies.

The technical scheme of the invention is as follows:

An optimization method of an FFT chip comprises the following steps:

Step one, decomposing the 2048-point FFT chip into 6 stages of operation, wherein stages 1 to 5 use radix-4 FFT operations and stage 6 uses a radix-2 FFT operation; each stage of operation of the FFT chip uses the twiddle factor $W_N^n$;

Step two, expressing the twiddle factor $W_N^n$ as a trigonometric function, $W_N^n = \cos\frac{2\pi n}{N} - j\sin\frac{2\pi n}{N}$, wherein N is the number of operation points: when $W_N^n$ is the twiddle factor of stages 1 to 5, N is 1024; when $W_N^n$ is the twiddle factor of stage 6, N is 2048; n is the independent variable, n = 2, 4, 6, ..., 254;

Step three, denoting the angle element $\frac{2\pi n}{N}$ in the trigonometric function of the twiddle factor $W_N^n$ by x, i.e. $x = \frac{2\pi n}{N}$; the value range of x is 0° to 360°;

Step four, calculating, through $x = \frac{2\pi n}{N}$ of step three, all values of cos x and -sin x for x in the range 0° to 45°, i.e. obtaining all values of the twiddle factor $W_N^n = \cos x - j\sin x$ for x in the range 0° to 45°, and storing them in a ROM in the FFT chip, which completes the storage optimization of the FFT chip.

In the above optimization method of the FFT chip, when the twiddle factor $W_N^n$ corresponding to an angle outside the range 0° to 45° needs to be calculated, only the corresponding value of the twiddle factor $W_N^n$ for x in the range 0° to 45° needs to be read from the ROM; converting it with a trigonometric identity gives the value of the twiddle factor $W_N^n$ for the angle outside the range 0° to 45°.

In the above optimization method of the FFT chip, the twiddle factor $W_N^n$ corresponding to an angle y outside the range 0° to 45° is calculated as follows:

S1, calculating cos y:

when 45° < y ≤ 90°, $\cos y = \cos(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $\cos y = -\sin x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $\cos y = \cos(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $\cos y = -\cos x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $\cos y = \cos(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $\cos y = \sin x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $\cos y = \cos(2\pi - x)$, wherein x takes values from 45° to 0°;

S2, calculating -sin y:

when 45° < y ≤ 90°, $-\sin y = -\sin(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $-\sin y = -\cos x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $-\sin y = -\sin(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $-\sin y = \sin x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $-\sin y = -\sin(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $-\sin y = \cos x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $-\sin y = -\sin(2\pi - x)$, wherein x takes values from 45° to 0°;

S3, from cos y and -sin y, the value of the twiddle factor corresponding to the angle outside the range 0° to 45° is obtained through $W_N^n = \cos y - j\sin y$.

Compared with the prior art, the invention has the beneficial effects that:

(1) the invention optimizes the 2048-point mixed-radix FFT chip, effectively reducing its scale and power consumption and improving its efficiency;

(2) the invention stores only the values of cos x and -sin x for x in the range 0° to 45°, i.e. the twiddle factors $W_N^n$ for x in that range; twiddle factor values for x in the range 45° to 360° are obtained by lookup and conversion alone. This greatly reduces the required storage, since only one eighth of a period is kept in ROM, and makes digital signal processing easier to implement and apply.

Drawings

FIG. 1 is a flow chart of the FFT chip of the present invention.

Detailed Description

The invention is further illustrated by the following examples.

The invention provides an optimization method of an FFT chip, which reduces data stored in a ROM by changing a constant coefficient in FFT calculation, namely the generation and storage of twiddle factors, thereby improving the efficiency of a device and reducing occupied resources.

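The method for optimizing the FFT chip, as shown in FIG. 1, specifically includes the following steps:

Step one, decomposing the 2048-point FFT chip into 6 stages of operation, wherein stages 1 to 5 use radix-4 FFT operations and stage 6 uses a radix-2 FFT operation; each stage of operation of the FFT chip uses the twiddle factor $W_N^n$;

Step two, expressing the twiddle factor $W_N^n$ as a trigonometric function, $W_N^n = \cos\frac{2\pi n}{N} - j\sin\frac{2\pi n}{N}$, wherein N is the number of operation points: when $W_N^n$ is the twiddle factor of stages 1 to 5, N is 1024; when $W_N^n$ is the twiddle factor of stage 6, N is 2048; n is the independent variable, n = 2, 4, 6, ..., 254;

Step three, denoting the angle element $\frac{2\pi n}{N}$ in the trigonometric function of the twiddle factor $W_N^n$ by x, i.e. $x = \frac{2\pi n}{N}$; the value range of x is 0° to 360°;

Step four, calculating, through $x = \frac{2\pi n}{N}$ of step three, all values of cos x and -sin x for x in the range 0° to 45°, i.e. obtaining all values of the twiddle factor $W_N^n = \cos x - j\sin x$ for x in the range 0° to 45°, and storing them in a ROM in the FFT chip, which completes the storage optimization of the FFT chip.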
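As a rough illustration of step four, the Python sketch below builds the kind of table that would be placed in ROM. The function name `build_octant_table`, the use of floating-point values, and the choice to index every integer n from 0 to N/8 are assumptions made for illustration only; a real chip would store fixed-point words and would use the n values given in step two.

```python
import math

def build_octant_table(N):
    """Hypothetical helper: list of (cos x, -sin x) for x = 2*pi*n/N, 0 <= x <= 45 degrees."""
    table = []
    for n in range(N // 8 + 1):          # x reaches 45 degrees at n = N/8
        x = 2.0 * math.pi * n / N        # step three: x = 2*pi*n/N
        table.append((math.cos(x), -math.sin(x)))
    return table

# Table for the radix-4 stages (N = 1024 in the text above).
table_1024 = build_octant_table(1024)
print(len(table_1024), table_1024[0], table_1024[-1])
```

The first entry corresponds to x = 0° and the last to x = 45°, where cos x and sin x are equal in magnitude.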

When the twiddle factor $W_N^n$ corresponding to an angle outside the range 0° to 45° needs to be calculated, only the corresponding value of the twiddle factor $W_N^n$ for x in the range 0° to 45° needs to be read from the ROM; converting it with a trigonometric identity gives the value of the twiddle factor $W_N^n$ for the angle outside the range 0° to 45°.

The twiddle factor $W_N^n$ corresponding to an angle y outside the range 0° to 45° is calculated as follows:

S1, calculating cos y:

when 45° < y ≤ 90°, $\cos y = \cos(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $\cos y = -\sin x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $\cos y = \cos(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $\cos y = -\cos x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $\cos y = \cos(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $\cos y = \sin x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $\cos y = \cos(2\pi - x)$, wherein x takes values from 45° to 0°;

S2, calculating -sin y:

when 45° < y ≤ 90°, $-\sin y = -\sin(\frac{\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 90° < y ≤ 135°, $-\sin y = -\cos x$, wherein x takes values from 0° to 45°;

when 135° < y ≤ 180°, $-\sin y = -\sin(\pi - x)$, wherein x takes values from 45° to 0°;

when 180° < y ≤ 225°, $-\sin y = \sin x$, wherein x takes values from 0° to 45°;

when 225° < y ≤ 270°, $-\sin y = -\sin(\frac{3\pi}{2} - x)$, wherein x takes values from 45° to 0°;

when 270° < y ≤ 315°, $-\sin y = \cos x$, wherein x takes values from 0° to 45°;

when 315° < y ≤ 360°, $-\sin y = -\sin(2\pi - x)$, wherein x takes values from 45° to 0°;

S3, from cos y and -sin y, the value of the twiddle factor corresponding to the angle outside the range 0° to 45° is obtained through $W_N^n = \cos y - j\sin y$.
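The conversion in S1 to S3 can be sketched in Python as a lookup into the 0° to 45° table followed by a sign change or a swap. This is only an illustrative sketch: the names `build_octant_table` and `twiddle_from_octant` are hypothetical, the table holds floating-point pairs (cos x, -sin x), N is assumed to be a multiple of 8, and the sign pattern is derived from the standard trigonometric identities for each 45° sector.

```python
import math

def build_octant_table(N):
    """(cos x, -sin x) for x = 2*pi*m/N, m = 0 .. N/8, i.e. 0 to 45 degrees."""
    return [(math.cos(2 * math.pi * m / N), -math.sin(2 * math.pi * m / N))
            for m in range(N // 8 + 1)]

def twiddle_from_octant(table, n, N):
    """Return (cos y, -sin y) for y = 2*pi*n/N using only the 0-45 degree table."""
    q = N // 8                       # table steps per 45-degree sector
    sector, r = divmod(n % N, q)     # sector 0..7 and offset inside it
    # Even sectors count x upward from the sector start ("x from 0 to 45"),
    # odd sectors count x downward from the sector end ("x from 45 to 0").
    # A sector boundary may land in either neighbour; both give the same value.
    c, s = table[r] if sector % 2 == 0 else table[q - r]   # c = cos x, s = -sin x
    return [
        ( c,  s),   #   0..45:  cos y =  cos x, -sin y = -sin x
        (-s, -c),   #  45..90:  cos y =  sin x, -sin y = -cos x
        ( s, -c),   #  90..135: cos y = -sin x, -sin y = -cos x
        (-c,  s),   # 135..180: cos y = -cos x, -sin y = -sin x
        (-c, -s),   # 180..225: cos y = -cos x, -sin y =  sin x
        ( s,  c),   # 225..270: cos y = -sin x, -sin y =  cos x
        (-s,  c),   # 270..315: cos y =  sin x, -sin y =  cos x
        ( c, -s),   # 315..360: cos y =  cos x, -sin y =  sin x
    ][sector]

N = 1024
rom = build_octant_table(N)
re, im = twiddle_from_octant(rom, 300, N)   # real and imaginary part of W_N^300
```

Only negation and swapping of the two stored words are needed, which is why the remaining seven sectors do not have to be stored.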

The continuous Fourier transform represents a square-integrable function f(t) as an integral, or a series, of complex exponential functions. In its inverse form, the time-domain function f(t) is expressed as an integral over the frequency-domain function F(ω).

Discrete Fourier transform:

The discrete Fourier transform (DFT) is the discrete form of the continuous Fourier transform in both the time domain and the frequency domain; it transforms samples of a time-domain signal into samples in the frequency domain.

Let x(n) be a finite sequence of length M; the N-point discrete Fourier transform of x(n) is then defined as

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1$$

The inverse transform of the sequence X(k) is:

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn}, \quad n = 0, 1, \ldots, N-1$$

where X(k) is the DFT-transformed data and x(n) is the sampled analog signal. In the formula x(n) may be a complex signal; in practice x(n) is a real signal. $W_N = e^{-j 2\pi / N}$ is the N-point twiddle factor.

Usually x(n) and $W_N^{kn}$ are complex numbers, so each calculation of one X(k) requires N complex multiplications and N-1 complex additions. Calculating all N values of X(k) therefore requires $N^2$ complex multiplications and N(N-1) complex additions. Because of this large amount of computation, using the DFT algorithm directly for spectral analysis and real-time processing of signals was not cost-effective before the fast Fourier transform appeared. The situation changed fundamentally with the arrival of fast Fourier transform (FFT) algorithms.
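The operation counts quoted above can be checked with a small Python sketch of the direct DFT. The function name `dft_direct` and the counters are illustrative assumptions; the code simply evaluates the defining sum and tallies complex multiplications and additions.

```python
import cmath

def dft_direct(x):
    """Direct N-point DFT that also counts complex multiplications and additions."""
    N = len(x)
    mults = adds = 0
    X = []
    for k in range(N):
        # N products x(n) * W_N^(k*n) for this output bin
        terms = [x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)]
        mults += N
        acc = terms[0]
        for t in terms[1:]:          # N - 1 additions to sum the N products
            acc += t
            adds += 1
        X.append(acc)
    return X, mults, adds

X, mults, adds = dft_direct([1, 0, 0, 0, 0, 0, 0, 0])
print(mults, adds)   # N*N and N*(N-1) for N = 8
```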

Examining the coefficient $W_N^{kn}$ in the DFT operation shows that it is both symmetric and periodic, and these two properties can be used to improve the DFT computation. The symmetry of $W_N^{kn}$ allows some terms of the DFT to be combined, which cuts the number of multiplications roughly in half. The periodicity of $W_N^{kn}$ allows the DFT of a long sequence to be decomposed into DFTs with fewer points. Mature FFT algorithms today include radix-2, radix-4, split-radix, and others.

FFT operation of radix-2:

The radix-2 FFT can be divided into decimation in time and decimation in frequency, according to the ordering of the input data. The decimation-in-time (DIT) radix-2 FFT algorithm is discussed here. The algorithm divides a complex sequence x(n) into two groups according to the parity of n:

$$X(k) = \sum_{r=0}^{N/2-1} x(2r)\, W_N^{2rk} + \sum_{r=0}^{N/2-1} x(2r+1)\, W_N^{(2r+1)k} = X_1(k) + W_N^{k} X_2(k)$$

where

$$X_1(k) = \sum_{r=0}^{N/2-1} x(2r)\, W_{N/2}^{rk}, \qquad X_2(k) = \sum_{r=0}^{N/2-1} x(2r+1)\, W_{N/2}^{rk}$$

$X_1(k)$ and $X_2(k)$ are the DFTs of two N/2-point sequences: $X_1(k)$ is the DFT of the even-indexed points of the original sequence x(n), $X_2(k)$ is the DFT of the odd-indexed points, and both have period N/2. Since $W_N^{k+N/2} = -W_N^{k}$, and by this periodicity:

$$X(k) = X_1(k) + W_N^{k} X_2(k), \qquad X(k + N/2) = X_1(k) - W_N^{k} X_2(k), \qquad k = 0, 1, \ldots, N/2-1$$

It can be seen that the DFT of an N-point complex sequence x(n) can be obtained from the DFTs of two N/2-point complex sequences. The signal-flow graph of these two formulas looks like a butterfly, so the structure is called a butterfly operation.

The two N/2-point sequences are then decomposed further in the same way: $X_1(k)$ and $X_2(k)$ are each split by parity into two sequences, giving 4 complex sequences of N/4 points whose results are combined to give the DFTs of the two N/2-point sequences. The 4 sequences of N/4 points are again split by parity, and so on, until N/2 two-point sequences remain. Each two-point sequence is split by parity into a pair of single-point sequences, and one butterfly operation on each pair gives the DFT of that two-point sequence. The DFTs of each pair of corresponding two-point sequences are then combined into the DFT of a four-point sequence, and so on, stage by stage, until the DFT of the N-point complex sequence is obtained. The DFT of the N-point sequence is thus built up by reversing the parity decomposition described above.
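The parity decomposition and butterfly recombination described above can be sketched recursively in Python. This is a minimal textbook-style sketch, not the chip's implementation: the function name `fft_radix2_dit` is an assumption, the input length is assumed to be a power of two, and twiddle factors are computed on the fly instead of being read from ROM.

```python
import cmath

def fft_radix2_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)               # the DFT of a single point is the point itself
    X1 = fft_radix2_dit(x[0::2])     # DFT of the even-indexed points
    X2 = fft_radix2_dit(x[1::2])     # DFT of the odd-indexed points
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * X2[k]   # W_N^k * X2(k)
        X[k] = X1[k] + t             # butterfly, upper output
        X[k + N // 2] = X1[k] - t    # butterfly, lower output
    return X

spectrum = fft_radix2_dit([1, 2, 3, 4])
```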

FFT operation of radix-4:

The algorithm decimates a complex sequence x(n) in the time domain according to the indices 4r, 4r+1, 4r+2, 4r+3. For a complex sequence of length $N = 4^M$ points, the DFT can be computed with one butterfly operation for every 4 points, and the whole operation is completed in M stages.

Here x(n) is an input sequence of length N, with n = 0, 1, 2, ..., N-1. It is divided into the following four subsequences according to 4r, 4r+1, 4r+2, 4r+3:

x_(4r)＝x_1(r)

x_(4r+1)＝x_2(r)

x_(4r+2)＝x_3(r)

x_(4r+3)＝x_4(r)

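where r = 0, 1, 2, ..., N/4-1. Using the periodicity and symmetry of the coefficient $W_N^{kn}$, it can be deduced that:

$$X(k) = X_1(k) + W_N^{k} X_2(k) + W_N^{2k} X_3(k) + W_N^{3k} X_4(k)$$

where $X_1(k)$ to $X_4(k)$ are the N/4-point DFTs of $x_1(r)$ to $x_4(r)$.

Let:

$$A = X_1(k), \quad B = X_2(k), \quad C = X_3(k), \quad D = X_4(k), \quad W^{p} = W_N^{k}$$

and let:

$$A' = X(k), \quad B' = X(k + N/4), \quad C' = X(k + 2N/4), \quad D' = X(k + 3N/4)$$

Then:

$$X(k) = A + W^{p}B + W^{2p}C + W^{3p}D$$

$$X(k + N/4) = A - jW^{p}B - W^{2p}C + jW^{3p}D$$

$$X(k + 2N/4) = A - W^{p}B + W^{2p}C - W^{3p}D$$

$$X(k + 3N/4) = A + jW^{p}B - W^{2p}C - jW^{3p}D$$

A radix-4 butterfly requires 3 complex multiplications and 8 complex additions.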
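The count of 3 complex multiplications and 8 complex additions can be made concrete with a short Python sketch of one radix-4 butterfly. The function name `radix4_butterfly` and its argument layout are assumptions for illustration; the three twiddle factors are passed in as they would be after a ROM lookup, and multiplication by ±j is treated as a swap of real and imaginary parts, not as a multiplication.

```python
def radix4_butterfly(A, B, C, D, w1, w2, w3):
    """One radix-4 butterfly; w1, w2, w3 stand for W^p, W^2p, W^3p."""
    b = w1 * B                  # complex multiplication 1
    c = w2 * C                  # complex multiplication 2
    d = w3 * D                  # complex multiplication 3
    s0 = A + c                  # additions 1 to 4: shared partial sums
    s1 = A - c
    s2 = b + d
    s3 = b - d
    return (s0 + s2,            # X(k)        = A +  b + c +  d
            s1 - 1j * s3,       # X(k + N/4)  = A - jb - c + jd
            s0 - s2,            # X(k + 2N/4) = A -  b + c -  d
            s1 + 1j * s3)       # X(k + 3N/4) = A + jb - c - jd

# With all twiddle factors equal to 1 this is a plain 4-point DFT.
outputs = radix4_butterfly(1, 2, 3, 4, 1, 1, 1)
```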

Following the same method, the four N/4-point subsequences obtained from the first decomposition are decomposed further, each into four N/16-point subsequences. Likewise, the DFT of each N/4-point sequence can be obtained from the DFTs of its four N/16-point subsequences. By analogy, after a sequence of $N = 4^s$ points has been decomposed s-1 times, N/4 four-point subsequences are obtained. Each 4-point subsequence is one butterfly unit, and the DFTs of the 4-point subsequences are then combined stage by stage to give the DFT of the N-point sequence.

When a radix-4 FFT is applied to a sequence of length N, each stage has N/4 butterfly units, and the inputs of each butterfly are multiplied by a group of twiddle factors $W_N^{0}, W_N^{k}, W_N^{2k}, W_N^{3k}$ respectively, so only the value of k needs to be determined to obtain one group of twiddle factors. In the first stage all butterfly operations need only complex additions; in the second stage k takes 4 values, in the third stage k takes 16 values, and so on. The m-th stage has $4^{m-1}$ twiddle factors, until the last stage has N/4 twiddle factors.

Existing ASIC implementations of the 2048-point FFT use a mixed algorithm that combines the radix-2 and radix-4 FFT algorithms, that is, a mixed radix with one radix-2 stage and five radix-4 stages ($2^{11} = 2^{1} \times 4^{5}$). The radix-4 algorithm transforms the 1024-point sequences, and the radix-2 algorithm then completes the FFT from two 1024-point results to 2048 points.
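The factorization $2^{11} = 2^{1} \times 4^{5}$ can be illustrated with a small Python helper that lists the radix of each stage for a given transform length. The function name `plan_stages` is hypothetical, and placing the radix-2 stage last is an assumption that follows the stage numbering used in this document (stages 1 to 5 radix-4, stage 6 radix-2).

```python
def plan_stages(n_points):
    """List the radix of each stage: as many radix-4 stages as fit, then one radix-2."""
    stages = []
    n = n_points
    while n % 4 == 0:
        stages.append(4)         # one radix-4 stage per factor of 4
        n //= 4
    if n == 2:
        stages.append(2)         # a single remaining factor of 2
    elif n != 1:
        raise ValueError("length must be a power of two")
    return stages

print(plan_stages(2048))   # five radix-4 stages followed by one radix-2 stage
print(plan_stages(1024))   # five radix-4 stages
```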

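Taking decimation in frequency as an example, the radix-2 algorithm can be used when the sequence length is N = 2048 = 2 × 1024, and the algorithm gives the following formulas:

When k is an even number, k = 2r:

$$X(2r) = \sum_{n=0}^{N/2-1} \left[ x(n) + x(n + N/2) \right] W_{N/2}^{rn}, \quad r = 0, 1, \ldots, N/2-1$$

When k is an odd number, k = 2r+1:

$$X(2r+1) = \sum_{n=0}^{N/2-1} \left[ x(n) - x(n + N/2) \right] W_N^{n}\, W_{N/2}^{rn}, \quad r = 0, 1, \ldots, N/2-1$$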
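One decimation-in-frequency radix-2 stage, as given by the even and odd formulas, can be sketched in Python as follows. The names `dif_radix2_stage` and `dft` are illustrative assumptions; the half-length transforms are evaluated with a plain DFT here only to keep the sketch short, whereas the chip described in this document would use its radix-4 stages for them.

```python
import cmath

def dft(x):
    """Plain DFT, used here only to evaluate the two half-length transforms."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dif_radix2_stage(x):
    """Split an N-point input into the two N/2-point inputs of a DIF stage."""
    N = len(x)
    h = N // 2
    even_in = [x[n] + x[n + h] for n in range(h)]             # feeds X(2r)
    odd_in = [(x[n] - x[n + h]) * cmath.exp(-2j * cmath.pi * n / N)
              for n in range(h)]                              # feeds X(2r+1)
    return even_in, odd_in

x = [float(v) for v in range(8)]
even_in, odd_in = dif_radix2_stage(x)
X_even, X_odd = dft(even_in), dft(odd_in)   # X(0), X(2), ... and X(1), X(3), ...
```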

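The radix-4 algorithm can be used when the sequence length is N = 1024 = 4 × 256, and the algorithm gives the following formula:

$$X(k) = \sum_{n=0}^{N/4-1} \left[ x(n) + W_N^{kN/4}\, x(n + N/4) + W_N^{kN/2}\, x(n + N/2) + W_N^{3kN/4}\, x(n + 3N/4) \right] W_N^{nk}$$

The following expressions follow from the definition of the twiddle factor:

$$W_N^{kN/4} = (-j)^{k}, \qquad W_N^{kN/2} = (-1)^{k}, \qquad W_N^{3kN/4} = j^{k}$$

Substituting these three expressions into the radix-4 FFT algorithm gives

$$X(k) = \sum_{n=0}^{N/4-1} \left[ x(n) + (-j)^{k} x(n + N/4) + (-1)^{k} x(n + N/2) + j^{k} x(n + 3N/4) \right] W_N^{nk}$$

The N-point DFT sequence in the above formula is divided into 4 subsequences of N/4 points, namely X(4k), X(4k+1), X(4k+2) and X(4k+3), so the radix-4 decimation-in-frequency FFT simplifies to:

$$X(4k) = \sum_{n=0}^{N/4-1} \left[ x(n) + x(n + N/4) + x(n + N/2) + x(n + 3N/4) \right] W_N^{0}\, W_{N/4}^{kn}$$

$$X(4k+1) = \sum_{n=0}^{N/4-1} \left[ x(n) - j\,x(n + N/4) - x(n + N/2) + j\,x(n + 3N/4) \right] W_N^{n}\, W_{N/4}^{kn}$$

$$X(4k+2) = \sum_{n=0}^{N/4-1} \left[ x(n) - x(n + N/4) + x(n + N/2) - x(n + 3N/4) \right] W_N^{2n}\, W_{N/4}^{kn}$$

$$X(4k+3) = \sum_{n=0}^{N/4-1} \left[ x(n) + j\,x(n + N/4) - x(n + N/2) - j\,x(n + 3N/4) \right] W_N^{3n}\, W_{N/4}^{kn}$$

which uses the property $W_N^{4kn} = W_{N/4}^{kn}$.

The input to each N/4-point DFT is a linear combination of 4 signal samples scaled by a phase factor. This process is repeated v times, where $v = \log_4 N$. For example, a 1024-point DFT needs 5 repetitions, that is, there are 5 stages of operation.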

In 1024-point FFT implementations, most circuits use three ROMs to initialize the twiddle factors for the FFT operation, and each of the three ROMs stores 255 twiddle factors. A 1024-point radix-4 FFT needs 5 stages of butterfly operations, and each stage has 256 butterfly units. The ROMs therefore have to hold a large amount of data, which occupies resources and lowers efficiency, so the storage of the twiddle factors needs to be optimized.

The twiddle factor is essentially a constant composed of the periodic functions cos and sin:

$$W_N^{n} = \cos\frac{2\pi n}{N} - j\sin\frac{2\pi n}{N}$$

As long as the values of these trigonometric functions over 1/8 of a period are known, all other values can be calculated by flipping and phase shifting, so only that 1/8 of a period needs to be stored in ROM.
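To give a feel for the saving, the sketch below compares the number of (cos, -sin) entries in a full-period table with the number in a 0° to 45° table. The function name `table_sizes` and the assumption of one entry per integer n, with both end points of the 45° range stored, are illustrative only and do not describe the actual ROM layout of the chip.

```python
def table_sizes(N):
    """Entry counts for a full-period table and for a 0-45 degree table."""
    full_period = N              # one entry for each n = 0 .. N-1
    one_eighth = N // 8 + 1      # n = 0 .. N/8, both end points included
    return full_period, one_eighth

for N in (1024, 2048):
    full, eighth = table_sizes(N)
    print(N, full, eighth, round(eighth / full, 3))
```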

Although the present invention has been described with reference to the preferred embodiments, it is not intended to limit the present invention, and those skilled in the art can make variations and modifications of the present invention without departing from the spirit and scope of the present invention by using the methods and technical contents disclosed above.
