Time-frequency difference parameter joint estimation GPU implementation method of communication signals

Document No.: 167708. Publication date: 2021-10-29.

Reading note: This technology, "Time-frequency difference parameter joint estimation GPU implementation method of communication signals" (一种通信信号的时频差参数联合估计GPU实现方法), was designed and created by 侯素霞, 夏畅雄 and 董剑峰 on 2021-05-26. Main content: The invention discloses a GPU implementation method for joint estimation of time-frequency difference parameters of communication signals. The method comprises: setting the number of segments for the joint estimation and copying the communication signal data from CPU memory to GPU memory; performing segmented joint estimation of the time-frequency difference parameters on the communication signal data using CUDA kernel functions according to the set number of segments; and transferring the joint estimation result back to CPU memory and releasing the GPU memory resources. By making full use of the parallel computing capability of the GPU, the invention greatly improves the efficiency of time-frequency difference parameter estimation. With unchanged precision it is far faster than the original CPU system and meets the requirements of current real-time positioning systems.

1. A GPU implementation method for joint estimation of time-frequency difference parameters of communication signals, characterized by comprising the following steps:

S1, setting the number of segments for the joint estimation, and copying the communication signal data from CPU memory to GPU memory;

S2, performing segmented joint estimation of the time-frequency difference parameters on the communication signal data using CUDA kernel functions, according to the number of segments set in step S1;

S3, transferring the joint estimation result obtained in step S2 back to CPU memory, and releasing the GPU memory resources.

2. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 1, wherein step S1 specifically comprises the following sub-steps:

S11, estimating the amount of memory that the current joint estimation needs to allocate;

S12, acquiring GPU performance data;

S13, determining the number of segments for the joint estimation according to the memory size estimated in step S11 and the GPU performance data acquired in step S12;

S14, allocating GPU memory;

S15, copying the communication signal data from CPU memory to the GPU memory allocated in step S14.

3. The method according to claim 2, wherein the number of segments is calculated by the following formula:

where m is the number of segments, N is the data length, t_r is the time search range, f_s is the data sampling rate, and Y is the memory space of the GPU.

4. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 1, wherein step S2 specifically comprises the following sub-steps:

S21, performing shift conjugate dot multiplication on the main satellite signal and the adjacent satellite signal in parallel on the GPU, according to the set number of segments;

S22, performing a CUDA-based fast Fourier transform on the shift conjugate dot multiplication result obtained in step S21;

S23, calculating the power spectrum values within the frequency-domain search range from the fast Fourier transform result obtained in step S22;

S24, calculating the maximum value and the mean value of the power spectrum values obtained in step S23 using a parallel reduction algorithm and a shared memory strategy, to obtain the time-frequency difference parameter estimates and a reference SNR value.

5. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 4, wherein step S21 specifically comprises the following sub-steps:

S211, shifting the adjacent satellite signal y a total of R - M + 1 times according to the time difference search range, wherein M is the start index and R is the end index of the time difference search;

S212, conjugating the shifted adjacent satellite signals in sequence to obtain the shift conjugate matrix of the adjacent satellite signal, expressed as

$$B = \left[\, b_{i,j}^{*} \,\right], \quad i = 1, \dots, R-M+1, \quad j = 1, \dots, N$$

wherein B is the shift conjugate matrix of the adjacent satellite signal, b_{i,j} is the adjacent satellite signal data after the i-th time shift, and N is the data length;

S213, multiplying the shift conjugate matrix of the adjacent satellite signal obtained in step S212 point by point with the main satellite signal to obtain the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal, expressed as

$$C = \left[\, c_{i,j} \,\right], \quad c_{i,j} = x_j \, b_{i,j}^{*}$$

wherein C is the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal, A is the main satellite signal, and A = [x_1 x_2 x_3 … x_N].

6. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 5, wherein step S21 further comprises: mapping each GPU thread to one element of the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal; representing that matrix with a two-dimensional grid of two-dimensional blocks; and establishing the correspondence between the row and column indices of the matrix and its storage index positions.

7. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 4, wherein step S22 specifically comprises the following sub-steps:

S221, calling the cuFFT plan management function by means of a CUDA kernel function to generate an FFT plan;

S222, calling cufftExecZ2Z by means of a CUDA kernel function to compute the fast Fourier transform result, expressed as:

$$D = \mathrm{FFT}(C)$$

wherein D is the fast Fourier transform matrix and FFT is the fast Fourier transform operation.

8. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 4, wherein step S23 specifically comprises the following sub-steps:

S231, applying an FFTShift to the fast Fourier transform result obtained in step S22 so that the zero-frequency component is moved to the centre of the spectrum, expressed as

$$\mathrm{Shift\_D} = \mathrm{FFTShift}(D)$$

wherein Shift_D is the FFTShift-shifted matrix;

S232, calculating the generalized power spectrum amplitude values from the FFTShift-shifted matrix obtained in step S231, expressed as

$$E_{i,k} = \left| \mathrm{Shift\_D}_{i,k} \right|^{2}, \quad k = P, \dots, Q$$

wherein E is the generalized power spectrum amplitude value, P is the start index and Q is the end index of the frequency-domain search.

9. The GPU implementation method for joint estimation of time-frequency difference parameters of communication signals according to claim 4, wherein step S24 specifically comprises the following sub-steps:

S241, copying data from global memory into shared memory, and computing the maximum value or the accumulated sum of the power spectrum values obtained in step S23 during the first load;

S242, setting the number of blocks according to the data length and the maximum number of threads supported by the GPU; performing the parallel reduction within each thread block in single-instruction multiple-data (SIMD) mode; storing the final reduced value at the block-index position of a result array, each pass merging a span of data whose length is a multiple of the maximum thread count; and iterating until the data is reduced to a single value, thereby obtaining the index corresponding to the maximum value;

S243, calculating the time-frequency difference parameter estimates from the index corresponding to the maximum value obtained in step S242, and calculating the reference SNR value from the obtained maximum value and the mean of the generalized power spectrum amplitude values, the estimates being expressed as

dto = i·1/f_s + t_0 = i·1/f_s + M·1/f_s = (M + i)/f_s

dfo = j·f_s/N + f_0 = j·f_s/N + (-f_s/2 + P·f_s/N) = (j + P)·f_s/N - f_s/2

wherein dto is the time difference parameter estimate, dfo is the frequency difference parameter estimate, (i, j) is the index corresponding to the maximum value, f_s is the sampling rate, f_0 is the start frequency of the frequency-domain search range, t_0 is the start time of the time difference search, M is the start index of the time difference search, P is the start index of the frequency difference search, e_max is the maximum of the generalized power spectrum amplitude values, and E_mean is the mean of the generalized power spectrum amplitude values.

Technical Field

The invention relates to the technical field of communication positioning, and in particular to a GPU implementation method for joint estimation of time-frequency difference parameters of communication signals.

Background

Time difference of arrival (TDOA) and frequency difference of arrival (FDOA) measurements can be used in joint positioning systems. The estimation accuracy of TDOA and FDOA directly affects the positioning accuracy, and the processing speed directly affects the real-time performance of the positioning system. Traditional positioning systems are implemented on a CPU, and their processing time grows linearly with the data volume. When no prior time-frequency difference values are available for the usage scenario, time-frequency difference parameter estimation must be computed over more than 2G of data, and a CPU-based positioning system can hardly run in real time. With the rapid development of high-performance computing (HPC), and especially the advent of GPU-CPU heterogeneous architectures, it has become possible to combine sequential control logic with large-scale data-parallel computation.

The traditional parameter estimation process uses a serial CPU computing architecture. When the data volume is large, serial computation is time-consuming, and the computation time grows linearly with the amount of computation, so this time cost urgently needs to be reduced.

Disclosure of Invention

To address the above shortcomings of the prior art, the invention provides a GPU implementation method for joint estimation of time-frequency difference parameters of communication signals.

To achieve this purpose, the invention adopts the following technical scheme:

A GPU implementation method for joint estimation of time-frequency difference parameters of communication signals comprises the following steps:

S1, setting the number of segments for the joint estimation, and copying the communication signal data from CPU memory to GPU memory;

S2, performing segmented joint estimation of the time-frequency difference parameters on the communication signal data using CUDA kernel functions, according to the number of segments set in step S1;

S3, transferring the joint estimation result obtained in step S2 back to CPU memory, and releasing the GPU memory resources.

Further, step S1 specifically comprises the following sub-steps:

S11, estimating the amount of memory that the current joint estimation needs to allocate;

S12, acquiring GPU performance data;

S13, determining the number of segments for the joint estimation according to the memory size estimated in step S11 and the GPU performance data acquired in step S12;

S14, allocating GPU memory;

S15, copying the communication signal data from CPU memory to the GPU memory allocated in step S14.

Further, the number of segments is calculated by the following formula:

where m is the number of segments, N is the data length, t_r is the time search range, f_s is the data sampling rate, and Y is the memory space of the GPU.

Further, step S2 specifically comprises the following sub-steps:

S21, performing shift conjugate dot multiplication on the main satellite signal and the adjacent satellite signal in parallel on the GPU, according to the set number of segments;

S22, performing a CUDA-based fast Fourier transform on the shift conjugate dot multiplication result obtained in step S21;

S23, calculating the power spectrum values within the frequency-domain search range from the fast Fourier transform result obtained in step S22;

S24, calculating the maximum value and the mean value of the power spectrum values obtained in step S23 using a parallel reduction algorithm and a shared memory strategy, to obtain the time-frequency difference parameter estimates and a reference SNR value.

Further, step S21 specifically comprises the following sub-steps:

S211, shifting the adjacent satellite signal y a total of R - M + 1 times according to the time difference search range, wherein M is the start index and R is the end index of the time difference search;

S212, conjugating the shifted adjacent satellite signals in sequence to obtain the shift conjugate matrix of the adjacent satellite signal, expressed as

$$B = \left[\, b_{i,j}^{*} \,\right], \quad i = 1, \dots, R-M+1, \quad j = 1, \dots, N$$

wherein B is the shift conjugate matrix of the adjacent satellite signal, b_{i,j} is the adjacent satellite signal data after the i-th time shift, and N is the data length;

S213, multiplying the shift conjugate matrix of the adjacent satellite signal obtained in step S212 point by point with the main satellite signal to obtain the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal, expressed as

$$C = \left[\, c_{i,j} \,\right], \quad c_{i,j} = x_j \, b_{i,j}^{*}$$

wherein C is the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal, A is the main satellite signal, and A = [x_1 x_2 x_3 … x_N].

Further, step S21 also comprises: mapping each GPU thread to one element of the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal; representing that matrix with a two-dimensional grid of two-dimensional blocks; and establishing the correspondence between the row and column indices of the matrix and its storage index positions.

Further, step S22 specifically comprises the following sub-steps:

S221, calling the cuFFT plan management function by means of a CUDA kernel function to generate an FFT plan;

S222, calling cufftExecZ2Z by means of a CUDA kernel function to compute the fast Fourier transform result, expressed as:

$$D = \mathrm{FFT}(C)$$

wherein D is the fast Fourier transform matrix and FFT is the fast Fourier transform operation.

Further, step S23 specifically comprises the following sub-steps:

S231, applying an FFTShift to the fast Fourier transform result obtained in step S22 so that the zero-frequency component is moved to the centre of the spectrum, expressed as

$$\mathrm{Shift\_D} = \mathrm{FFTShift}(D)$$

wherein Shift_D is the FFTShift-shifted matrix;

S232, calculating the generalized power spectrum amplitude values from the FFTShift-shifted matrix obtained in step S231, expressed as

$$E_{i,k} = \left| \mathrm{Shift\_D}_{i,k} \right|^{2}, \quad k = P, \dots, Q$$

wherein E is the generalized power spectrum amplitude value, P is the start index and Q is the end index of the frequency-domain search.

Further, step S24 specifically comprises the following sub-steps:

S241, copying data from global memory into shared memory, and computing the maximum value or the accumulated sum of the power spectrum values obtained in step S23 during the first load;

S242, setting the number of blocks according to the data length and the maximum number of threads supported by the GPU; performing the parallel reduction within each thread block in single-instruction multiple-data (SIMD) mode; storing the final reduced value at the block-index position of a result array, each pass merging a span of data whose length is a multiple of the maximum thread count; and iterating until the data is reduced to a single value, thereby obtaining the index corresponding to the maximum value;

S243, calculating the time-frequency difference parameter estimates from the index corresponding to the maximum value obtained in step S242, and calculating the reference SNR value from the obtained maximum value and the mean of the generalized power spectrum amplitude values, the estimates being expressed as

dto = i·1/f_s + t_0 = i·1/f_s + M·1/f_s = (M + i)/f_s

dfo = j·f_s/N + f_0 = j·f_s/N + (-f_s/2 + P·f_s/N) = (j + P)·f_s/N - f_s/2

wherein dto is the time difference parameter estimate, dfo is the frequency difference parameter estimate, (i, j) is the index corresponding to the maximum value, f_s is the sampling rate, f_0 is the start frequency of the frequency-domain search range, t_0 is the start time of the time difference search, M is the start index of the time difference search, P is the start index of the frequency difference search, e_max is the maximum of the generalized power spectrum amplitude values, and E_mean is the mean of the generalized power spectrum amplitude values.

The invention has the following beneficial effects:

By making full use of the parallel computing capability of the GPU, the invention greatly improves the efficiency of time-frequency difference parameter estimation. With unchanged precision it is far faster than the original CPU system and meets the requirements of current real-time positioning systems.

Drawings

FIG. 1 is a schematic flow diagram of the method of the present invention;

FIG. 2 is a schematic diagram of a cross-ambiguity function in an embodiment of the present invention;

FIG. 3 is a schematic diagram of the segmented GPU parameter estimation process in an embodiment of the present invention;

FIG. 4 is a schematic diagram of the reduction flow in an embodiment of the present invention;

FIG. 5 is the parallel processing speed-up ratio curve in an embodiment of the present invention.

Detailed Description

The following description of specific embodiments is provided to help those skilled in the art understand the invention. It should be understood, however, that the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.

Referring to fig. 1, an embodiment of the present invention provides a GPU implementation method for joint estimation of time-frequency difference parameters of communication signals, comprising the following steps S1 to S3:

S1, setting the number of segments for the joint estimation, and copying the communication signal data from CPU memory to GPU memory;

In this embodiment, step S1 specifically comprises the following sub-steps:

S11, estimating the amount of memory that the current joint estimation needs to allocate;

S12, acquiring GPU performance data;

S13, determining the number of segments for the joint estimation according to the memory size estimated in step S11 and the GPU performance data acquired in step S12, wherein the calculation formula is as follows:

where m is the number of segments, N is the data length, t_r is the time search range, f_s is the data sampling rate, and Y is the memory space of the GPU;

S14, allocating GPU memory;

S15, copying the communication signal data from CPU memory to the GPU memory allocated in step S14.
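The segment-count formula itself is not reproduced in this text, so the following Python sketch only illustrates the idea of steps S11 to S13 under an assumed rule: the working set is taken to be N · t_r · f_s complex double values (16 bytes each, an assumption), and m is the smallest number of segments for which one segment fits into the GPU memory Y. The function name `segment_count` and the parameter `bytes_per_value` are hypothetical.

```python
import math

def segment_count(n_samples, t_r, f_s, gpu_bytes, bytes_per_value=16):
    """Assumed rule (not the patent's formula): m = ceil(required / available)."""
    n_shifts = max(1, round(t_r * f_s))              # number of time-shift rows
    required = n_samples * n_shifts * bytes_per_value  # bytes for the full matrix
    return max(1, math.ceil(required / gpu_bytes))
```

With such a rule, a search that fits entirely in GPU memory runs as a single segment, and larger searches are processed in m passes over subsets of the time shifts.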

S2, performing segmented joint estimation of the time-frequency difference parameters on the communication signal data using CUDA kernel functions, according to the number of segments set in step S1;

In this embodiment, when performing parameter estimation on a communication signal, it is assumed that the signal transmitted by a target is intercepted by two receiving stations as x(t) and y(t), respectively. Besides the signals s_1(t) and s_2(t), the receiving stations also receive noise n_1(t) and n_2(t), which are statistically independent of each other and uncorrelated with the signals, so that

x(t) = s_1(t) + n_1(t)

y(t) = s_2(t) + n_2(t)

where s_2(t) is a copy of s_1(t) scaled by the amplitude variation factor A caused by the different propagation paths and, because the target radiation source moves relative to the two receiving stations, offset from it by a time offset τ_0 and a frequency offset f_d0.

The cross-ambiguity function of the two signals is defined as:

where X and Y are the Fourier transforms of x(t) and y(t), respectively, τ is the time difference between the two signals, and f_d is their frequency offset.

By this definition, the cross-ambiguity function of the two signals at the receiving stations can be obtained, expressed as

The definition shows that the cross-ambiguity function is a time-frequency representation based on generalized cross-correlation, and each of its dimensions can be regarded as a correlation operation, so the correlation of the signals directly determines the result of the cross-ambiguity function. In general the signals and the noise are independent of each other, so the accumulated cross-ambiguity result of signals from the same source is much greater than the other results. Therefore, when τ - τ_0 = 0 and f_d - f_d0 = 0, the cross-ambiguity function reaches its maximum; the values τ = τ_0 and f_d = f_d0 at the maximum are the time difference and frequency difference of the two signals, which realizes the time-frequency difference parameter estimation.

A typical cross-ambiguity function is shown in fig. 2. Because the two satellite uplink signals are correlated, whereas the correlation between the signal and noise or other signals is weak, the cross-ambiguity peak of the signal is much larger than the peaks of clutter and noise as long as T is long enough.

The cross-ambiguity function calculation flow used in the invention first computes the number of time search points n = t_r · f_s from the time-domain search range t_r and the sampling rate f_s, and then executes steps 1) to 3) below in a loop to obtain the cross-ambiguity function result. As shown in fig. 3, the flow comprises: 1) time-shifting the reference signal, taking its conjugate, and multiplying it point by point with the target signal to obtain the shift conjugate dot product; 2) computing the FFT of the result of step 1); 3) computing the amplitude values from the FFT result, i.e. the power spectrum calculation; 4) searching for the peak of the cross-ambiguity function and estimating the time difference and frequency difference of the two signals from the index position of the peak; 5) computing the mean of the cross-ambiguity function matrix, and computing the SNR from the peak and the mean.

Step S2 specifically comprises the following sub-steps:

S21, performing shift conjugate dot multiplication on the main satellite signal and the adjacent satellite signal in parallel on the GPU, according to the set number of segments;

Specifically, step S21 comprises the following sub-steps:

S211, assume the main satellite signal is A = [x_1 x_2 x_3 … x_N] and the adjacent satellite signal is Y = [y_1 y_2 y_3 … y_N], both of length N, and that the index range of the time difference search is [M, R], where M and R are integers with M ≤ N and R ≤ N, M is the start index and R is the end index of the time difference search; the adjacent satellite signal is shifted R - M + 1 times according to the time difference search range;

S212, conjugating the shifted adjacent satellite signals in sequence to obtain the shift conjugate matrix B of the adjacent satellite signal, expressed as

$$B = \left[\, b_{i,j}^{*} \,\right], \quad i = 1, \dots, R-M+1, \quad j = 1, \dots, N$$

where B is the shift conjugate matrix of the adjacent satellite signal, b_{i,j} is the adjacent satellite signal data after the i-th time shift, and N is the data length. The relationship between b_{i,j} and the original signal is given by the following formula, in which y[n] is a sample of the original reference signal with index in the range 1 to N:

b_{i,j} = y[j - (M + i - 1)]

where y[n] is the signal sample at the corresponding position after shifting and n is the shifted index. For example, if n = 1 - M - R is less than 1, the index is increased by N and the original sample used after shifting is y_{1-M-R+N}. In general, y_{n+N} denotes the (n + N)-th value of the adjacent satellite signal Y, used when n is less than 1, and y_{n-N} denotes the (n - N)-th value of Y, used when n is greater than N.

S213, multiplying each row of the shift conjugate matrix of the adjacent satellite signal obtained in step S212 point by point with the main satellite signal to obtain the shift conjugate dot multiplication matrix C of the main satellite signal and the adjacent satellite signal, expressed as

$$C = \left[\, c_{i,j} \,\right], \quad c_{i,j} = x_j \, b_{i,j}^{*}$$

where C is the shift conjugate dot multiplication matrix of the main satellite signal and the adjacent satellite signal.
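Steps S211 to S213 can be sketched sequentially in Python as follows. This is only an illustration of the indexing, not the GPU kernel: the function name `shift_conj_multiply` is hypothetical, the lists are 0-based whereas the text uses 1-based indices, and the modulo operation is assumed to implement the wrap-around (adding or subtracting N) described in step S212.

```python
def shift_conj_multiply(x, y, M, R):
    """Build C row by row: row i uses shift M + i - 1 (i = 1..R-M+1).

    Each element is x[j] * conj(y[j - shift]); the modulo wraps indices that
    fall outside 0..N-1 back into range.
    """
    N = len(x)
    C = []
    for shift in range(M, R + 1):
        row = [x[j] * y[(j - shift) % N].conjugate() for j in range(N)]
        C.append(row)
    return C

# Small usage example: two shifts (0 and 1) of a 4-sample signal.
C = shift_conj_multiply([1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j],
                        [1j, 2j, 3j, 4j], 0, 1)
```

On a GPU every element of C is independent of the others, which is why the method can assign one thread per matrix element.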

After obtaining the shift conjugate dot multiplication matrix C of the main satellite signal and the adjacent satellite signal, the invention maps each GPU thread to one element of C, represents C with a two-dimensional grid of two-dimensional blocks, and establishes the correspondence between the row and column indices of C and its storage index positions, expressed as

tidx = iy * N + ix

where tidx is the storage index position in the shift conjugate dot multiplication matrix C, ix is the column index (the position within a row of length N), and iy is the row index of C.

Each thread computes the conjugate dot multiplication of one pair of data values, expressed as

where c_{m,n} denotes the (m, n)-th value of the shift conjugate dot multiplication matrix C, Re(·) denotes the real part of a complex number, Im(·) denotes the imaginary part of a complex number, x_m denotes the m-th value of the main satellite signal, and the other factor is the conjugated value of the adjacent satellite signal, whose position index is converted as described in step S212.
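The per-thread work can be emulated in Python to make the index mapping concrete. This is a sequential stand-in, not CUDA code: `conj_multiply_thread` and `launch` are hypothetical names, indices are 0-based, and row-major storage with iy as the row and ix as the position within the row is assumed from tidx = iy * N + ix. The real and imaginary parts are written out explicitly, as a kernel working on separate Re/Im components would do.

```python
def conj_multiply_thread(tidx, x, y, M, N):
    """Work of the (hypothetical) thread with linear index tidx."""
    iy, ix = divmod(tidx, N)                  # inverse of tidx = iy * N + ix
    a = x[ix]                                 # main satellite sample
    b = y[(ix - (M + iy)) % N]                # shifted adjacent satellite sample
    re = a.real * b.real + a.imag * b.imag    # Re(a * conj(b))
    im = a.imag * b.real - a.real * b.imag    # Im(a * conj(b))
    return complex(re, im)

def launch(x, y, M, R):
    """Sequential stand-in for a 2D grid launch covering every element of C."""
    N = len(x)
    out = [0j] * ((R - M + 1) * N)            # row-major storage of C
    for tidx in range(len(out)):              # on a GPU these run in parallel
        out[tidx] = conj_multiply_thread(tidx, x, y, M, N)
    return out
```

Because each thread reads two input samples and writes one output element, no synchronization between threads is needed in this step.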

S22, performing a CUDA-based fast Fourier transform on the shift conjugate dot multiplication result obtained in step S21;

Specifically, the invention uses a CUDA-based fast Fourier transform (FFT) to compute the FFT of each row of the shift conjugate dot multiplication matrix C.

Step S22 specifically comprises the following sub-steps:

S221, calling the cuFFT plan management function by means of a CUDA kernel function to generate an FFT plan;

S222, calling cufftExecZ2Z by means of a CUDA kernel function to compute the fast Fourier transform result, expressed as:

$$D = \mathrm{FFT}(C)$$

where D is the fast Fourier transform matrix and FFT is the fast Fourier transform operation.
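The row-wise transform can be illustrated without a GPU. The sketch below substitutes a plain recursive radix-2 FFT written in Python for the cuFFT call, so it shows what D contains rather than how cuFFT computes it; `fft` and `fft_rows` are hypothetical names, and row lengths are assumed to be powers of two.

```python
import cmath

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_rows(C):
    """D = FFT(C), applied independently to every row of C."""
    return [fft(row) for row in C]
```

Since the rows are transformed independently, a batched FFT over all time shifts is a natural fit for this step.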

S23, calculating the power spectrum values within the frequency-domain search range from the fast Fourier transform result obtained in step S22;

Specifically, step S23 comprises the following sub-steps:

S231, applying an FFTShift to the fast Fourier transform result obtained in step S22 so that the zero-frequency component is moved to the centre of the spectrum, expressed as

$$\mathrm{Shift\_D} = \mathrm{FFTShift}(D)$$

where Shift_D is the FFTShift-shifted matrix;

S232, setting the frequency-domain search range to [P, Q], and calculating the squared absolute value of columns P to Q of the FFTShift-shifted matrix obtained in step S231 to obtain the generalized power spectrum amplitude values, expressed as

$$E_{i,k} = \left| \mathrm{Shift\_D}_{i,k} \right|^{2}, \quad k = P, \dots, Q$$

where E is the generalized power spectrum amplitude value, P is the start index and Q is the end index of the frequency difference search.
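A minimal Python sketch of steps S231 and S232 follows. The helper names `fftshift` and `power_in_range` are hypothetical, and P and Q are treated as 0-based inclusive column indices, which is an assumption about the indexing convention.

```python
def fftshift(row):
    """Move the zero-frequency bin to the centre (index len(row) // 2)."""
    h = (len(row) + 1) // 2
    return row[h:] + row[:h]

def power_in_range(D, P, Q):
    """E[i][k] = |Shift_D[i][P + k]|^2 for columns P..Q (0-based, inclusive)."""
    return [[abs(v) ** 2 for v in fftshift(row)[P:Q + 1]] for row in D]
```

Restricting the computation to columns P to Q means only the frequency offsets inside the search range are examined in the subsequent peak search.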

S24, calculating the maximum value of the power spectrum values obtained in step S23 using a parallel reduction algorithm and a shared memory strategy, to obtain the time-frequency difference parameter estimates and the reference SNR value.

Specifically, step S24 comprises the following sub-steps:

S241, copying data from global memory into shared memory, and computing the maximum value of the power spectrum values obtained in step S23 during the first load;

S242, performing the parallel reduction within each thread block in single-instruction multiple-data (SIMD) mode, as shown in fig. 4, iterating repeatedly and storing the final reduced value at index 0, to obtain the maximum value and its corresponding index (i, j);

S243, calculating the time-frequency difference parameter estimates from the index (i, j) corresponding to the maximum value obtained in step S242; the reference SNR value can be calculated from the obtained maximum value e_max and the mean value E_mean = E_sum / ((Q - P + 1) · (R - M + 1)). The estimates are expressed as
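The halving pattern of the reduction in step S242 can be emulated sequentially. In the Python sketch below, the inner loop over `tid` stands in for the threads of one block, the list `shared` for shared memory, and the comment marks where a block-level barrier would sit in a CUDA kernel; `block_argmax` and `argmax_2d` are hypothetical names, and padding to a power of two is an assumption made to keep the example short.

```python
def block_argmax(values):
    """Tree reduction returning (max_value, index), as one thread block would."""
    n = 1
    while n < len(values):
        n *= 2                                    # pad to a power of two
    shared = [(v, i) for i, v in enumerate(values)]
    shared += [(float("-inf"), -1)] * (n - len(values))
    stride = n // 2
    while stride > 0:
        for tid in range(stride):                 # parallel threads on a GPU
            if shared[tid + stride][0] > shared[tid][0]:
                shared[tid] = shared[tid + stride]
        stride //= 2                              # barrier between passes
    return shared[0]                              # result ends up at index 0

def argmax_2d(E):
    """Reduce a matrix to its peak value and 0-based (i, j) index."""
    cols = len(E[0])
    flat = [v for row in E for v in row]
    peak, k = block_argmax(flat)
    return peak, divmod(k, cols)
```

Carrying the index along with the value through every pass is what lets the reduction return the position of the peak and not only its height.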

dto=i·1/fs+t0=i·1/fs+M·1/fs=(M+i)/fs

dfo=j·fs/N+f0=j·fs/N+(-fs/2+P·fs/N)=(j+P)·fs/N-fs/2
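As a worked example of the two formulas above, the following Python sketch converts a peak index into the estimates and computes the mean used for the reference SNR. The function names are hypothetical, and (i, j) is taken as a 0-based offset from the search start, which matches t_0 = M / f_s and f_0 = -f_s/2 + P · f_s / N in the formulas.

```python
def tdoa_fdoa_from_peak(i, j, M, P, N, f_s):
    """Map the peak index (i, j) to (dto, dfo) using the formulas above."""
    dto = (M + i) / f_s                      # time difference, seconds
    dfo = (j + P) * f_s / N - f_s / 2        # frequency difference, hertz
    return dto, dfo

def mean_power(E_sum, P, Q, M, R):
    """E_mean = E_sum / ((Q - P + 1) * (R - M + 1))."""
    return E_sum / ((Q - P + 1) * (R - M + 1))

# Example: N = 1024 samples at f_s = 1024 Hz, search starting at M = -5, P = 500.
dto, dfo = tdoa_fdoa_from_peak(3, 10, -5, 500, 1024, 1024.0)
```

In this example the frequency resolution f_s / N is 1 Hz, so each step in j moves the frequency estimate by 1 Hz and each step in i moves the time estimate by one sample period.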

The SNR is an index for evaluating the quality of the correlation result: the larger the SNR, the better the correlation and the more reliable the time-frequency difference estimate; conversely, the smaller the SNR, the worse the correlation. The final SNR of the time-frequency difference parameter estimation is calculated from e_max, the maximum value of the cross-ambiguity matrix E, and E_mean, the mean of the cross-ambiguity matrix.

S3, transmitting the time-frequency difference parameter joint estimation result obtained in step S2 back to CPU memory, and releasing the video memory resources.

The beneficial effects of the time-frequency difference parameter joint estimation GPU implementation method of the present invention are analyzed below with reference to a specific example.

The experimental conditions are as follows: data sampling rate $f_s$ = 1048576, N; frequency difference search range $f_r$ = 10 kHz; time difference range $t_r$ = 2 μs to 10 ms. Processing is performed with a serial algorithm and with a parallel algorithm to obtain the time-frequency difference parameters, and the time consumed is shown in Table 1.

Table 1. Time consumed by serial and parallel processing

Here, data amount = data length (time interval / (1 / sampling rate)) = $N \cdot t_r \cdot f_s$, and the units M and G are binary multiples based on 1024. The GPU used in the experiment is an NVIDIA Tesla P100-PCIE-16GB, whose main performance parameters are listed in Table 2.

Table 2. Main performance parameters of the Tesla P100

Fig. 5 shows the speedup curve of the parallel processing; the speedup increases linearly as the data amount increases. Compared with the serial algorithm, the parallel time-frequency difference parameter estimation greatly improves the processing speed.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and as described above, the content of the present specification should not be construed as a limitation to the present invention.

It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific modifications and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these modifications and combinations are within the scope of the invention.
