Multi-user pairing method, device and system

Document No.: 1956525. Publication date: 2021-12-10.

Reading note: this technology, "Multi-user pairing method, device and system" (一种多用户配对方法、装置及系统), was designed by 刘国臣, 赵铭明, 寸文璟, 陈志堂, 黄崇德 and 赵景 on 2020-06-10. Main content: The invention discloses a method, a device and a system for multi-user pairing. The method comprises: acquiring user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate; acquiring related information among users according to the user characteristic information; obtaining the selected probability distribution of user pairing (the distribution of the probability of users being selected for pairing) according to the related information among the users; and sampling the users according to the selected probability distribution of user pairing, and pairing the users according to the sampling result. The method requires no iterative operation, so its calculation amount is smaller than that of the prior art, which reduces operation complexity and improves real-time performance.

1. A method of multi-user pairing, the method comprising:

acquiring user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate;

acquiring related information among users according to the user characteristic information;

obtaining the selected probability distribution of user pairing according to the related information among the users;

and sampling the users according to the selected probability distribution of the user pairing, and pairing the users according to the sampling result.

2. The method according to claim 1, wherein the acquiring related information among users according to the user characteristic information comprises:

performing attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users.

3. The method of claim 1, wherein the obtaining the selected probability distribution of user pairing according to the related information among the users comprises:

performing multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing.

4. The method of claim 1, wherein the sampling the users according to the selected probability distribution of user pairing and pairing the users according to the sampling result comprises:

sampling the selected probability distribution of user pairing by a bisection method, and pairing the users whose sampling result is 1.

5. The method of any one of claims 1 to 4, further comprising:

performing reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users.

6. An apparatus for multi-user pairing, the apparatus comprising:

a first processing module, configured to acquire user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate;

a second processing module, configured to acquire related information among users according to the user characteristic information;

a third processing module, configured to obtain the selected probability distribution of user pairing according to the related information among the users;

and a fourth processing module, configured to sample the users according to the selected probability distribution of user pairing and to pair the users according to the sampling result.

7. The apparatus of claim 6, wherein

the second processing module is specifically configured to perform attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users.

8. The apparatus of claim 6, wherein

the third processing module is specifically configured to perform multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing.

9. The apparatus of claim 6, wherein

the fourth processing module is specifically configured to sample the selected probability distribution of user pairing by a bisection method, and to pair the users whose sampling result is 1.

10. The apparatus of any one of claims 6 to 9, further comprising:

a fifth processing module, configured to perform reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users.

11. A neural network, comprising an attention mechanism network based on natural language processing and a multilayer perceptron network, wherein the multilayer perceptron network comprises a probability distribution acquisition module and a sampling module;

the attention mechanism network based on natural language processing is configured to acquire user characteristic information and to acquire related information among users according to the user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate;

the probability distribution acquisition module is configured to obtain the selected probability distribution of user pairing according to the related information among the users;

and the sampling module is configured to sample the users according to the selected probability distribution of user pairing and to pair the users according to the sampling result.

12. The neural network of claim 11, wherein

the attention mechanism network based on natural language processing is specifically configured to perform attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users.

13. The neural network of claim 11, wherein

the probability distribution acquisition module is specifically configured to perform multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing.

14. The neural network of claim 11, wherein

the sampling module is specifically configured to sample the selected probability distribution of user pairing by a bisection method, and to pair the users whose sampling result is 1.

15. The neural network of any one of claims 11 to 14, further comprising:

a reinforcement learning network, configured to perform reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users.

16. A base station, characterized in that the base station is configured to perform the method for multi-user pairing according to any one of claims 1 to 5.

17. A computer program product, which, when run on a computer apparatus, causes the computer apparatus to perform the method of multi-user pairing of any one of claims 1 to 5.

18. A computer readable storage medium comprising a computer program or instructions for causing a computing device to perform the method of multi-user pairing of any one of claims 1 to 5 when the computer program or instructions is run on the computing device.

19. A chip, comprising: a processor coupled to a memory for storing a computer program or instructions, the processor for executing the computer program or instructions in the memory to implement the method of multi-user pairing of any one of claims 1 to 5.

Technical Field

The invention relates to the field of communication, in particular to a multi-user pairing method, device and system.

Background

In a wireless network scenario, a user (terminal device) connects to a nearby base station through an air interface, and a communication pipeline is established to carry the user's data communication service. With the rapid development of internet services and the popularization of terminal devices such as mobile phones and tablet computers, wireless services are becoming more and more complex. A wireless environment often contains a large number of users. As a result, the limited frequency, computing and storage resources of the base station equipment are increasingly strained.

In a Multiple Input Multiple Output (MIMO) or massive MIMO communication system, different users may be paired so that the same resource is multiplexed for data transmission, thereby improving the spectrum efficiency of a base station. When selecting users for pairing, the key is to reduce the interference between users and improve the signal to interference plus noise ratio (SINR) of the users. The users may be paired using a multi-user pairing method based on a greedy criterion. This method selects the paired users one by one according to the users' channel states, and can improve system throughput to the maximum extent while reducing search complexity. A proportional fair algorithm can also be introduced to change the objective function being maximized, so as to address the low throughput of edge users.

The greedy-criterion-based method needs to traverse all users and evaluate the maximization objective function for each user combined with the already selected user set, until the result of the objective function no longer increases, at which point user pairing is complete. For example, with 100 users, after one user is selected from the 100, the remaining 99 users are traversed and the user with the largest gain is selected as the second paired user; the remaining 98 users are then traversed and the user with the largest gain is selected as the third paired user; and so on until there is no gain. Because the method is iterative and must traverse all unselected users each time a user is selected, the calculation amount is large and the time complexity is high.

Disclosure of Invention

A first aspect of an embodiment of the present application provides a method for multi-user pairing. The method includes: acquiring user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate; acquiring related information among users according to the user characteristic information; obtaining the selected probability distribution of user pairing according to the related information among the users; and sampling the users according to the selected probability distribution of user pairing, and pairing the users according to the sampling result. The method acquires related information among users from the user characteristic information, and then obtains the selected probability distribution of user pairing from the high-dimensional matrix expression among the users. Because the users are sampled according to this distribution and then paired according to the sampling result, no iterative operation is needed; compared with the prior art the calculation amount is small, which reduces operation complexity and time complexity and improves real-time performance. Sampling the users according to the selected probability distribution also allows the pairing result of each user to be output according to the sampling result.

Optionally, with reference to the first aspect, in a first possible implementation manner of the first aspect, the acquiring related information among users according to the user characteristic information includes: performing attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users. The related information among the users may indicate the correlation of each user with the overall status. The attention mechanism processing requires no iterative operation, so the operation amount can be reduced.

Optionally, with reference to the first aspect, in a second possible implementation manner of the first aspect, the obtaining the selected probability distribution of user pairing according to the related information among the users includes: performing multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing. The users can then be paired according to this distribution, so that the same resources can be multiplexed for data transmission and spectrum efficiency is improved.

Optionally, with reference to the first aspect, in a third possible implementation manner of the first aspect, the sampling the users according to the selected probability distribution of user pairing and pairing the users according to the sampling result includes: sampling the selected probability distribution of user pairing by a bisection (dichotomy) method, and pairing the users whose sampling result is 1. In this way the same resources can be multiplexed for data transmission and spectrum efficiency is improved.
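The multilayer perceptron and sampling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the single hidden layer, the weight values and the names `mlp_probabilities` and `sample_pairing` are hypothetical, and the bisection (dichotomy) sampling is read here as an independent 0/1 draw per user, which is an assumption.

```python
import math
import random


def mlp_probabilities(relation_vectors, w_hidden, b_hidden, w_out, b_out):
    """Map each user's relation vector to a selection probability.

    Assumed architecture: one ReLU hidden layer followed by a sigmoid output.
    """
    probs = []
    for x in relation_vectors:
        hidden = [
            max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(w_hidden, b_hidden)
        ]
        logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


def sample_pairing(probs, rng):
    """Draw 0 or 1 for every user; users whose draw is 1 are paired."""
    draws = [1 if rng.random() < p else 0 for p in probs]
    paired = [user for user, draw in enumerate(draws) if draw == 1]
    return draws, paired


# Hypothetical relation vectors for three users and hypothetical weights.
relations = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
probs = mlp_probabilities(
    relations,
    w_hidden=[[1.0, -1.0], [-1.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[2.0, -2.0],
    b_out=0.0,
)
draws, paired = sample_pairing(probs, random.Random(0))
print(probs, draws, paired)
```

All users are scored and sampled in a single pass, which is the property the text contrasts with the iterative greedy search.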

Optionally, with reference to the first aspect or any one of the first to third possible implementation manners of the first aspect, in a fourth possible implementation manner of the first aspect, the method further includes: performing reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users. Reinforcement learning allows the cost function to adapt continuously to a specific environment.

A second aspect of the present application provides an apparatus for multi-user pairing, comprising: a first processing module, configured to acquire user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate; a second processing module, configured to acquire related information among users according to the user characteristic information; a third processing module, configured to obtain the selected probability distribution of user pairing according to the related information among the users; and a fourth processing module, configured to sample the users according to the selected probability distribution of user pairing and to pair the users according to the sampling result. The apparatus acquires related information among users from the user characteristic information and obtains the selected probability distribution of user pairing from that related information. Because the users are sampled according to this distribution and then paired according to the sampling result, no iterative operation is needed; compared with the prior art the calculation amount is small, which reduces operation complexity and time complexity and improves real-time performance. Sampling the users according to the selected probability distribution also allows the pairing result of each user to be output according to the sampling result.

Optionally, with reference to the second aspect, in a first possible implementation manner of the second aspect, the second processing module is specifically configured to perform attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users. The related information among the users may indicate the correlation of each user with the overall status. The attention mechanism processing requires no iterative operation, so the operation amount can be reduced.

Optionally, with reference to the second aspect, in a second possible implementation manner of the second aspect, the third processing module is specifically configured to perform multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing. The users can then be paired according to this distribution, so that the same resources can be multiplexed for data transmission and spectrum efficiency is improved.

Optionally, with reference to the second aspect, in a third possible implementation manner of the second aspect, the fourth processing module is specifically configured to sample the selected probability distribution of user pairing by a bisection (dichotomy) method, and to pair the users whose sampling result is 1. In this way the same resources can be multiplexed for data transmission and spectrum efficiency is improved.

Optionally, with reference to the second aspect or any one of the first to third possible implementation manners of the second aspect, in a fourth possible implementation manner of the second aspect, the apparatus further includes: a fifth processing module, configured to perform reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users, so that the cost function can adapt continuously to a specific environment.

A third aspect of the present application provides a neural network, comprising an attention mechanism network based on natural language processing and a multilayer perceptron network, wherein the multilayer perceptron network comprises a probability distribution acquisition module and a sampling module. The attention mechanism network based on natural language processing is configured to acquire user characteristic information and to acquire related information among users according to the user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate. The probability distribution acquisition module is configured to obtain the selected probability distribution of user pairing according to the related information among the users. The sampling module is configured to sample the users according to the selected probability distribution of user pairing and to pair the users according to the sampling result. The neural network requires no iterative operation; compared with the prior art the calculation amount is small, which reduces time complexity and improves real-time performance. Sampling the users according to the selected probability distribution also allows the pairing result of each user to be output according to the sampling result.

Optionally, with reference to the third aspect, in a first possible implementation manner of the third aspect, the attention mechanism network based on natural language processing is specifically configured to perform attention mechanism processing based on natural language processing on the user characteristic information to acquire the related information among the users. The attention mechanism processing requires no iterative operation, so the operation amount can be reduced.

Optionally, with reference to the third aspect, in a second possible implementation manner of the third aspect, the probability distribution acquisition module is specifically configured to perform multilayer perceptron processing on the related information among the users to obtain the selected probability distribution of user pairing. The users can then be paired according to this distribution, so that the same resources can be multiplexed for data transmission and spectrum efficiency is improved.

Optionally, with reference to the third aspect, in a third possible implementation manner of the third aspect, the sampling module is specifically configured to sample the selected probability distribution of user pairing by a bisection (dichotomy) method, and to pair the users whose sampling result is 1. In this way the same resources can be multiplexed for data transmission and spectrum efficiency is improved.

Optionally, with reference to the third aspect or any one of the first to third possible implementation manners of the third aspect, in a fourth possible implementation manner of the third aspect, the neural network further includes: a reinforcement learning network, configured to perform reinforcement learning on the cost function according to the selected probability distribution of user pairing and the number of users, so that the cost function can adapt continuously to a specific environment.

A fourth aspect of the present application provides a base station configured to perform the method for multi-user pairing according to the first aspect or any one of the possible implementations of the first aspect of the present application.

A fifth aspect of the present application provides a computer program product for causing a computer device to perform the method according to any one of the possible implementations of the first aspect of the present application when the computer program product runs on the computer device.

A sixth aspect of the present application provides a computer-readable storage medium comprising a computer program or instructions which, when run on a computer device, causes the computer device to perform the method as in any one of the possible implementations of the first aspect of the present application.

A seventh aspect of the present application provides a chip, including: a processor coupled to a memory for storing programs or instructions, the processor being configured to execute computer programs or instructions in the memory to implement the method according to any one of the possible implementations of the first aspect of the present application.

The application provides a method, a device and a system for multi-user pairing, wherein the method comprises the following steps: acquiring user characteristic information, wherein the user characteristic information comprises a user channel, a channel quality indicator and a historical rate; acquiring related information among users according to the user characteristic information; obtaining the selected probability distribution of user pairing according to the related information among the users; and sampling the users according to the selected probability distribution of the user pairing, and pairing the users according to the sampling result. The multi-user pairing method can acquire the related information among the users according to the user characteristic information, and then acquire the selected probability distribution of the user pairing according to the related information among the users. The users are sampled according to the selected probability distribution of the user pairing, and then the users are paired according to the sampling result, iterative operation is not needed, the calculation amount is small compared with the prior art, the time complexity can be reduced, and the real-time performance is improved.

Drawings

FIG. 1 is a block diagram of a greedy criterion based multi-user pairing method in the prior art;

FIG. 2 is a schematic diagram of a grouping-based multi-user pairing method in the prior art;

FIG. 3 is a schematic diagram of a method for multi-user pairing provided herein;

FIG. 4 is a process diagram of natural language processing provided herein;

FIG. 5 is a schematic view of an attention-based process provided herein;

FIG. 6 is a schematic diagram illustrating an implementation process of an algorithm deployment provided herein;

FIG. 7 is a schematic diagram of an apparatus for multi-user pairing according to the present application;

FIG. 8 is a schematic diagram of a multi-user pairing apparatus according to the present application;

FIG. 9 is a schematic diagram of a neural network for multi-user pairing provided in the present application.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus.

In a wireless network scenario, a user connects to a nearby base station through an air interface, and a communication pipeline is established to carry the user's data communication service. With the rapid development of internet services and the popularization of terminal devices such as mobile phones and tablet computers, wireless services are becoming more and more complex. A wireless environment often contains a large number of users (terminal devices). As a result, the limited frequency, computing and storage resources of the base station equipment are increasingly strained.

In a MIMO or massive MIMO communication system, different users can be paired so that the same resource is multiplexed for data transmission, thereby improving the spectrum efficiency of a base station. When selecting users for pairing, the key is to reduce the interference between users and improve the SINR of the users.

Multi-user pairing is a resource management technique under MIMO or massive MIMO spatial multiplexing. According to the channel states of the set of users to be served, the base station selects the largest subset that maximizes spectral efficiency. This is a typical combinatorial optimization problem, and current multi-user pairing techniques also take into account factors such as computing resources and storage overhead.

Taking the zero-forcing beam weight as an example, the mathematical model of the downlink user received signal is as follows:

where $y_i$ is the signal that user i receives from the base station (that is, the resources allocated to the user by the base station), $H_i$ is the channel response from the base station to user i, $w_i$ is the zero-forcing beam weight for user i, $x_i$ is the data traffic of the user, and $n$ is noise. Beamforming (BF) weights the transmitted signal to form a narrow beam pointing at a terminal or in a specific direction. The zero-forcing beam weight is the weighting value applied to the transmitted signal.
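The displayed formula is not reproduced in this text. A form consistent with the symbol definitions above, offered as an assumption rather than as the original equation, is:

```latex
y_i = H_i w_i x_i + n
```

When other users are paired on the same resource, their transmissions add an interference term $\sum_{k \neq i} H_i w_k x_k$ to the right-hand side.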

Regarding the beam weights used in multi-user pairing: when there is no paired user (the single-user case), the beam weight vector $w_i$ is the transpose of the channel matrix $H_i$:

For multiple paired users, taking the zero-forcing beam weight as an example, the weight is the pseudo-inverse of $H^T$:

so that $H_i w_k = 0$ for $i \neq k$, where $\sigma^2$ is the ambient noise.

The SINR is calculated as follows:

$H_i$ is the channel response from the base station to user i, and $w_i$ is the zero-forcing beam weight for user i. The numerator $\|H_i w_i\|^2$ is the signal power of user i, and the sum $\sum_{k \neq i} \|H_i w_k\|^2$ in the denominator accumulates the interference power from the users other than user i.
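Assembling the numerator and denominator terms described above with the ambient noise $\sigma^2$ gives the standard SINR expression below. The placement of $\sigma^2$ as an additive term in the denominator follows the usual convention and is assumed here.

```latex
\mathrm{SINR}_i = \frac{\lVert H_i w_i \rVert^2}{\sum_{k \neq i} \lVert H_i w_k \rVert^2 + \sigma^2}
```

Under zero forcing, $H_i w_k = 0$ for $i \neq k$, so the sum vanishes and only the noise term remains in the denominator.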

The channel capacity is calculated by the Shannon formula as follows:

From the SINR formula it can be seen that, if the ambient noise is unchanged, the key to improving the SINR is to reduce the interference between users.
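The effect of inter-user interference on SINR and capacity can be checked numerically. The sketch below assumes single-antenna users (each $H_i$ is a row vector), the standard SINR form with noise added in the denominator, and the Shannon formula $C = B \log_2(1 + \mathrm{SINR})$. The channel values, the function names and the unnormalized weights are hypothetical and chosen only to keep the arithmetic simple.

```python
import math


def inner(h, w):
    """Product of a channel row vector and a weight column vector."""
    return sum(a * b for a, b in zip(h, w))


def sinr_per_user(channels, weights, noise_power):
    """SINR of each user: own signal power over (interference power + noise)."""
    sinrs = []
    for i, h_i in enumerate(channels):
        signal = abs(inner(h_i, weights[i])) ** 2
        interference = sum(
            abs(inner(h_i, w_k)) ** 2 for k, w_k in enumerate(weights) if k != i
        )
        sinrs.append(signal / (interference + noise_power))
    return sinrs


def shannon_capacity(sinr, bandwidth_hz=1.0):
    """Shannon formula: C = B * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + sinr)


# Hypothetical example: two single-antenna users, two base-station antennas.
channels = [[1 + 0j, 0j], [1 + 0j, 1 + 0j]]
# Single-user style weights: the transpose of each user's own channel.
matched = [list(h) for h in channels]
# Zero-forcing weights: columns of the inverse of the channel matrix,
# so that H_i w_k = 0 for i != k.
zero_forcing = [[1 + 0j, -1 + 0j], [0j, 1 + 0j]]

noise = 0.1
for name, weights in (("matched", matched), ("zero-forcing", zero_forcing)):
    sinrs = sinr_per_user(channels, weights, noise)
    rates = [shannon_capacity(s) for s in sinrs]
    print(name, [round(s, 3) for s in sinrs], [round(r, 3) for r in rates])
```

The weights here are not power-normalized, so the two cases are not a fair power comparison; the point is only that removing the interference terms leaves noise as the sole term in the denominator.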

In the prior art, there is a multi-user pairing method based on a greedy criterion. This method selects the paired users one by one according to the users' channel states, and can improve system throughput to the maximum extent while reducing search complexity. A proportional fair algorithm can also be introduced to change the objective function being maximized, so as to address the low throughput of edge users.

In this method, the historical rate and the spectral efficiency of the current user pairing are combined to give the following maximization objective function under fair scheduling:

where SE is the spectral efficiency of the current user pairing and $Thp_{average}$ is the historical average throughput of the user.
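The displayed objective is not reproduced in this text. A common proportional-fair form that combines the two quantities defined above is shown below as an assumption; the candidate set $S$ and the per-user index $i$ are notation introduced here for illustration.

```latex
\max_{S} \sum_{i \in S} \frac{SE_i}{Thp_{average,i}}
```

Dividing by the historical average throughput raises the priority of users that have been served less, which is how a proportional-fair metric helps edge users.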

Referring to FIG. 1, the greedy-criterion-based multi-user pairing method includes: 101. Traverse all users and evaluate the maximization objective function for each user combined with the selected user set. 102. When the result of the objective function no longer increases, user pairing is complete. For example, with 100 users, after one user is selected from the 100, the remaining 99 users are traversed and the user with the largest gain is selected as the second paired user; the remaining 98 users are then traversed, and so on until there is no gain. When there is no gain, all selected users are paired. Because the method is iterative and must traverse all unselected users each time a user is selected, the calculation amount is large and the time complexity is high.
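Steps 101 and 102 can be sketched as a loop that rescans every unselected user on each round. This is a minimal illustration: the function name `greedy_pairing`, the toy objective and the gain values are hypothetical, and a real objective would be computed from channel states as described above.

```python
def greedy_pairing(users, objective):
    """Add, one at a time, the user that most increases objective(selected)."""
    selected = []
    remaining = list(users)
    best_value = float("-inf")
    while remaining:
        # Step 101: traverse every unselected user together with the selected set.
        candidate, candidate_value = None, best_value
        for user in remaining:
            value = objective(selected + [user])
            if value > candidate_value:
                candidate, candidate_value = user, value
        # Step 102: stop when no user increases the objective any further.
        if candidate is None:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_value = candidate_value
    return selected


# Toy objective: per-user gain minus a fixed penalty for every pair of
# co-scheduled users, standing in for inter-user interference.
gains = {0: 3.0, 1: 2.0, 2: 1.0, 3: 0.5}


def toy_objective(selected):
    pairs = len(selected) * (len(selected) - 1) // 2
    return sum(gains[u] for u in selected) - 0.8 * pairs


print(greedy_pairing(gains.keys(), toy_objective))
```

Each round evaluates the objective once per unselected user, so the number of evaluations grows roughly with the product of the user count and the number of paired users, which is the cost the text refers to.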

Multi-user pairing is a combinatorial optimization problem, so traditional solutions also exist, such as heuristic and metaheuristic algorithms. Metaheuristic algorithms include tabu search, simulated annealing (SA), genetic algorithms (GA), ant colony optimization (ACO), particle swarm optimization (PSO) and so on, but the complexity of these methods is too high for the highly real-time scenario of a base station.

A grouping-based multi-user pairing method can also be used for user pairing. Specifically, referring to FIG. 2, the method includes: 201. Calculate the channel correlation among users according to the users' channel states, and place users whose correlation is higher than a preset value in the same group, so that the correlation between users in different groups is low. 202. Select one user from each group according to a preset rule, and pair the selected users. This method considers only correlation: users with high correlation are placed in one group, the correlation threshold for grouping must be set manually, and the number of groups cannot be determined in advance. Because only correlation is considered, the solution space that can be explored is small.
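Steps 201 and 202 can be sketched as below. The normalized inner product used as the correlation measure, the rule of comparing each user with the first member of a group, and the rule of picking the first user of each group are all assumptions made for illustration; the text only states that a preset value and a preset rule are used.

```python
import math


def correlation(h_a, h_b):
    """Normalized channel correlation in [0, 1] between two channel vectors."""
    dot = sum(a.conjugate() * b for a, b in zip(h_a, h_b))
    norm_a = math.sqrt(sum(abs(a) ** 2 for a in h_a))
    norm_b = math.sqrt(sum(abs(b) ** 2 for b in h_b))
    return abs(dot) / (norm_a * norm_b)


def group_users(channels, threshold):
    """Step 201: put a user in the first group whose lead user it correlates with."""
    groups = []
    for user, h in enumerate(channels):
        for group in groups:
            if correlation(channels[group[0]], h) > threshold:
                group.append(user)
                break
        else:
            groups.append([user])
    return groups


def pair_from_groups(groups):
    """Step 202: pick one user per group (here simply the first one)."""
    return [group[0] for group in groups]


# Hypothetical channels for four users and two base-station antennas.
channels = [
    [1 + 0j, 0j],
    [0.9 + 0j, 0.1 + 0j],
    [0j, 1 + 0j],
    [0.1 + 0j, 1 + 0j],
]
groups = group_users(channels, threshold=0.9)
print(groups, pair_from_groups(groups))
```

Changing the manually chosen `threshold` changes both the number of groups and their membership, which illustrates why the number of groups cannot be fixed in advance.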

Therefore, with the spread of terminal devices and the expansion of network scale, and especially the growth of mobile applications, the conventional resource allocation methods used by existing cell base station equipment may result in inadequate resource allocation. Achieving optimal resource allocation when multiple users are online simultaneously is a non-deterministic polynomial (NP) combinatorial optimization problem.

The present application provides a method for multi-user pairing. Referring to fig. 3, the method includes:

301. Acquire user characteristic information.

The user characteristic information includes a user channel, a channel quality indicator (CQI), and a historical rate.

302. Acquire related information among users according to the user characteristic information.

The related information among users is obtained from the user characteristic information by attention-mechanism processing based on natural language processing (NLP).

Referring to fig. 4, the NLP processing process may include:

3021. Preprocess the user characteristic information.

Specifically, preprocessing can be divided into data cleaning, data segmentation, and part-of-speech tagging.

Data cleaning: the content of interest is extracted from the user characteristic information, and content that is not of interest is treated as noise and deleted. Data cleaning operations include deduplication, alignment, deletion, and the like. Content can be extracted by rules, regular expression matching, and so on.

Data segmentation: the text in the user characteristic information is cut into phrases of a preset minimum unit granularity. Segmentation methods include those based on string matching, on statistics, on rules, and the like.

Part-of-speech tagging: each phrase is given a part-of-speech tag, such as adjective, verb, or noun. This allows the text to carry more useful linguistic information into subsequent processing.

3022. Vectorize the result of the preprocessing.

The preprocessed user characteristic information is converted into a form that a computer can operate on. Specifically, the obtained phrase strings are vectorized to obtain feature vectors. Illustratively, vectorization may be performed according to a bag-of-words (BOW) model and a word vector model.

3023. Obtain a feature subset through feature selection.

After the feature vectors are obtained, an appropriate feature subset with high expressive power needs to be selected from them. Specifically, a suitable feature subset may be selected from the feature vectors by a feature extraction algorithm.

After the feature subset is obtained, a neural network applies the activation functions of N neurons to the feature subset, raising the dimension of the input feature subset to N. For example, if the neural network applies the activation functions of three neurons to the feature subset, the input feature subset is raised to three dimensions.
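As a rough illustration of raising a feature subset to N dimensions, the sketch below passes a two-element feature subset through N = 3 neurons. The text does not name the activation function, so `tanh` is an assumption, as are the function name `raise_dimension`, the randomly initialized weights, and the input values.

```python
import math
import random
from typing import List


def raise_dimension(features: List[float],
                    weights: List[List[float]],
                    biases: List[float]) -> List[float]:
    """Apply one activation per neuron; the output has one entry per neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        pre_activation = sum(w * x for w, x in zip(neuron_weights, features)) + bias
        outputs.append(math.tanh(pre_activation))  # assumed activation function
    return outputs


rng = random.Random(0)       # fixed seed so the example is repeatable
feature_subset = [0.4, -1.2]  # hypothetical two-element feature subset
n_neurons = 3                 # N = 3, as in the example in the text
weights = [[rng.uniform(-1, 1) for _ in feature_subset] for _ in range(n_neurons)]
biases = [0.0] * n_neurons

high_dim = raise_dimension(feature_subset, weights, biases)
print(high_dim)
```

The output length equals the number of neurons regardless of the input length, which is what allows the input to be mapped to a chosen dimension N.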

Attention-mechanism-based processing is then performed on the N-dimensional features. Note that the attention mechanism enables the neural network to focus on a subset of its inputs (or features), that is, to select particular inputs. Where computing power is limited, the attention mechanism serves as a resource allocation scheme and is the primary means of addressing information overload, because it allocates computing resources to the more important tasks.

Specifically, the attention-based related information among users is calculated as shown in fig. 5, where Q, K, and V are high-dimensional features obtained by raising the dimension of a user's feature subset. First, an attention-based correlation matrix multiplication is performed on the high-dimensional features Q and K; specifically, Q is multiplied (matmul) by the transpose of K, K^T. The result is then normalized (scale): the product Q*K^T in the numerator is divided by a scaling factor. Softmax is then applied to the normalized result. Finally, the softmax result is multiplied by the user's high-dimensional feature V, and the output is the related information among users.
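The matmul, scale, softmax, matmul sequence described above can be sketched as follows. The scaling factor is taken to be the square root of the feature dimension d_k, as in standard scaled dot-product attention; that choice is an assumption, as are the helper names and the toy values of Q, K, and V.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(column) for column in zip(*m)]


def softmax(row: List[float]) -> List[float]:
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    d_k = len(k[0])
    scores = matmul(q, transpose(k))                      # matmul: Q * K^T
    scaled = [[s / math.sqrt(d_k) for s in row]           # scale (assumed sqrt(d_k))
              for row in scores]
    weights = [softmax(row) for row in scaled]            # softmax, row by row
    return matmul(weights, v)                             # matmul with V


# One row per user; the values are hypothetical.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0]]

related = attention(q, k, v)
print(related)
```

Each output row is a weighted mix of the rows of V, with weights that reflect how strongly one user's Q matches every user's K.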

The output results of the individual users are concatenated to form a high-dimensional matrix expression of the related information among users. For example, if the output of user 1 is M and the output of user 2 is N, the high-dimensional matrix expression of the related information between user 1 and user 2 is [M, N].

Illustratively, assume that there are three users: A, B, and C.

1) Acquire the feature information F of each user, which includes [channel, CQI, historical rate]. For example, the feature information of user A is FA = [channel of user A, CQI of user A, historical rate of user A]. Similarly, the feature information of user B and user C is FB and FC, respectively.

2) Perform NLP-based processing on the feature information F of each user, followed by neural network processing, to obtain the dimension-raised high-dimensional features. Specifically, the high-dimensional features obtained for each user are FNQ, FNK, and FNV. For example, the high-dimensional features of user A are FNQA, FNKA, and FNVA; those of user B are FNQB, FNKB, and FNVB; and those of user C are FNQC, FNKC, and FNVC.

3) Perform an attention-based correlation matrix multiplication on the high-dimensional features of each user. Illustratively, for user A this yields:

FLA = [FNQA*FNKA^T, FNQA*FNKB^T, FNQA*FNKC^T]

where FNKA^T is the transpose of the high-dimensional feature matrix FNKA of user A, FNKB^T is the transpose of the high-dimensional feature matrix FNKB of user B, and FNKC^T is the transpose of the high-dimensional feature matrix FNKC of user C. FLB and FLC are calculated in the same way. The calculations can be run in parallel: with several attention networks operating in parallel, FLB and FLC can be computed simultaneously, which greatly reduces computation time.

5) Divide FLA by the scaling factor to normalize it, and then apply softmax to obtain the softmax result FLNA.

6) After the softmax result is obtained, multiply it by the user's high-dimensional feature information to obtain the related information between that user and the other users. Specifically, the related information of user A with respect to user B and user C is FOA = FLNA*FNVA + FA. The related information FOB of user B with respect to user A and user C, and FOC of user C with respect to user A and user B, are obtained in the same way.

7) Concatenate the outputs of users A, B, and C to obtain the high-dimensional matrix expression of the related information among them. Specifically, the high-dimensional matrix is expressed as [FOA, FOB, FOC].

It should be noted that the high-dimensional matrix expression of the relevant information between the users is independent of the number of users, so the network weight and complexity of the method provided by the application do not change with the increase of the number of users.

303. Obtain the selected probability distribution of the user pairing according to the related information among users.

Multi-layer perceptron (MLP) processing is performed on the high-dimensional matrix expression of the related information among users to obtain the selected probability distribution of the user pairing.

The output of the multi-layer perceptron model is calculated as:

y = σ(w*x + b)

where x is the input high-dimensional matrix expression of the related information among users, w is the neuron weights, b is the bias, σ is the activation function, and y is the output selected probability distribution of the user pairing. The selected probability distribution of the user pairing includes the probability that each user is selected.
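The formula y = σ(w*x + b) can be illustrated with a single layer applied to each user's row of x. The sketch below assumes a sigmoid for σ so that every output lies between 0 and 1 and can be read as a selection probability; the text does not specify the activation, and a full multi-layer perceptron would stack several such layers. The weights, bias, and inputs are made-up values.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def selection_probabilities(x_rows: List[List[float]],
                            w: List[float],
                            b: float) -> List[float]:
    """Compute y = sigma(w*x + b) for each user, sharing w and b across users."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)
            for row in x_rows]


# One row of related information per user (hypothetical values).
x = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
w = [2.0, -2.0]
b = 0.0

y = selection_probabilities(x, w, b)
print(y)
```

Because the same `w` and `b` are applied to every row, the number of parameters in this sketch does not grow with the number of users.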

304. Sample the users according to the selected probability distribution of the user pairing, and pair the users according to the sampling result.

The users are sampled according to the selected probability distribution of the user pairing to obtain a classification output for each user. Specifically, the distribution may be processed by binary classification, yielding a result of 0 or 1 for each user according to that user's probability of being selected. All users whose output is 1 are then paired.
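One way to turn per-user selection probabilities into 0/1 outputs is an independent Bernoulli draw per user, sketched below. Treating the binary classification as a random draw (rather than, say, a fixed threshold) is an assumption, and the function name `sample_pairing` and the probabilities are illustrative only.

```python
import random
from typing import Dict, List


def sample_pairing(probabilities: Dict[str, float],
                   rng: random.Random) -> List[str]:
    """Draw 0 or 1 for every user and return the users whose result is 1."""
    decisions = {user: 1 if rng.random() < p else 0
                 for user, p in probabilities.items()}
    return [user for user, decision in decisions.items() if decision == 1]


rng = random.Random(42)
# Extreme probabilities make this particular example deterministic.
certain = sample_pairing({"A": 1.0, "B": 0.0, "C": 1.0}, rng)
print(certain)

# With intermediate probabilities, the paired set varies from draw to draw.
mixed = sample_pairing({"A": 0.9, "B": 0.2, "C": 0.6}, rng)
print(mixed)
```

A single pass over the users produces the whole paired set, with no repeated traversal of the unselected users.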

305. Apply reinforcement learning to the cost function.

After users are paired, the algorithm model may further be trained by reinforcement learning, for example with the actor-critic (AC) algorithm. Each base station operates in a different scenario; for each scenario, the reinforcement learning method acquires real-time information as the scenario changes and updates the network weights so that the user pairing output is optimal. Given the current state features of the scenario as input, the user selection result is output end to end. The reinforcement learning cost function J(θ) is defined in terms of the following quantities.

B is the number of users, and L is the reward value obtained by interacting with the environment. b is a baseline function that does not depend on π and estimates the expected result in order to reduce the variance of the gradient. p_θ(π_i) is the probability that the user is selected. The reward value evaluates the quality of the user pairing produced by the neural network. The output of the neural network obtains a reward value from environment feedback, and each reward value serves as the update direction for the neural network. The neural network thus learns iteratively, is continuously optimized, and keeps adapting to the environment, so that the user pairing result it outputs becomes better suited to the environment.
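A common way to combine these quantities is the policy-gradient surrogate with a baseline. The sketch below assumes that form, J(θ) ≈ (1/B) Σ_i (L − b) · log p_θ(π_i), which the text does not confirm; the function name `reinforce_cost` and all numeric values are hypothetical.

```python
import math
from typing import List


def reinforce_cost(decision_probs: List[float],
                   reward: float,
                   baseline: float) -> float:
    """Assumed surrogate: mean over users of (L - b) * log p_theta(pi_i)."""
    user_count = len(decision_probs)   # B in the text
    advantage = reward - baseline      # L - b
    return sum(advantage * math.log(p) for p in decision_probs) / user_count


# Probability that the network assigned to the decision taken for each user.
decision_probs = [0.5, 0.5]

cost = reinforce_cost(decision_probs, reward=2.0, baseline=1.0)
neutral = reinforce_cost(decision_probs, reward=1.0, baseline=1.0)
print(cost, neutral)
```

When the reward equals the baseline, the surrogate is zero and contributes no update, which shows how the baseline centres the learning signal.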

Note that some scenarios require special handling. Under the constraints of the communication system protocol, an erroneous packet must be retransmitted in a specified transmission time interval (TTI); or, because of scenario requirements (VIP users, high-priority traffic), the recommended set must contain a specified subset while maximizing utility. In such cases, the probability of selecting the special users is raised, and the special users are combined with the other users to obtain the optimal pairing combination.

The process of deploying the algorithm model provided by the application to the live network is shown in fig. 6. Referring to fig. 6, the process is as follows:

401. Collect live network data.

Real data is collected from the live network to obtain the users' channel data and CQI.

402. Train the model offline.

The network undergoes supervised training on an offline platform using the live network data. When the model converges to a preset value, initialization of the model is complete. Illustratively, the preset value may be 90%: when the prediction success rate converges to 90% or more, stability and robustness can be guaranteed.

403. Update to the live network and enable the model.

Once the model has converged to the preset value and its initialization is complete, the model is updated to the live network and enabled.

404. Reinforcement learning periodically updates the model.

The period of the online model update is set according to the scenario. For example, the reinforcement learning update period may be set to 50 TTIs for a crowd-dense scenario and 10 TTIs for a crowd-sparse scenario. Adapting better to each scenario in this way improves spectral efficiency.

The model update is done locally at the base station or transmitted to a central server, allowing some delay. Preferably, this can be done in real time on the baseband board of the base station.

The network model provided by the application takes the offline-trained model as its initial model, which avoids excessively poor initial performance, keeps network performance smooth, and favors commercial deployment. In addition, the network model is updated periodically, which improves its adaptability to the environment.

The application provides a method for multi-user pairing, which includes: acquiring user characteristic information, where the user characteristic information includes a user channel, a channel quality indicator, and a historical rate; acquiring related information among users according to the user characteristic information; obtaining the selected probability distribution of the user pairing according to the related information among users; and sampling the users according to the selected probability distribution of the user pairing, and pairing the users according to the sampling result. Because the users are sampled from the selected probability distribution and paired according to the sampling result, no iterative operation is needed. Compared with the prior art, the amount of computation is smaller, which reduces operational and time complexity and improves real-time performance. The method also performs better than the greedy algorithm.

Referring to fig. 7, the present application further provides an apparatus 50 for multi-user pairing, where the apparatus 50 is used to perform the method for multi-user pairing. The apparatus 50 comprises:

a first processing module 501, configured to obtain user characteristic information, where the user characteristic information includes a user channel, a channel quality indicator, and a historical rate. For details, refer to fig. 3 and step 301; they are not repeated here.

A second processing module 502, configured to obtain related information among users according to the user characteristic information. Specifically, the second processing module 502 is configured to perform attention-mechanism processing based on natural language processing on the user characteristic information and thereby obtain the related information among users. For details, refer to fig. 3 and step 302; they are not repeated here.

A third processing module 503, configured to obtain the selected probability distribution of the user pairing according to the related information among users. Specifically, the third processing module 503 is configured to perform multi-layer perceptron processing on the related information among users to obtain the selected probability distribution of the user pairing. For details, refer to fig. 3 and step 303; they are not repeated here.

A fourth processing module 504, configured to sample users according to the selected probability distribution of the user pairing, and to pair users according to the sampling result. Specifically, the fourth processing module 504 is configured to sample the selected probability distribution of the user pairing by binary classification, and to pair the users whose sampling result is 1. For details, refer to fig. 3 and step 304; they are not repeated here.

A fifth processing module 505, configured to perform reinforcement learning on the cost function according to the selected probability distribution of the user pairing and the number of users. For details, refer to fig. 3 and step 305; they are not repeated here.

The application provides an apparatus 50 for multi-user pairing. The apparatus 50 acquires related information among users according to user characteristic information, and then obtains the selected probability distribution of the user pairing according to that related information. The users are sampled according to the selected probability distribution and paired according to the sampling result, so no iterative operation is needed. Compared with the prior art, the amount of computation is smaller, which reduces operational and time complexity and improves real-time performance.

Fig. 8 is a schematic structural diagram of a multi-user pairing apparatus provided in the present application. As shown in fig. 8, the apparatus 60 includes a processor 601, a memory 602, and a transceiver 603, and the processor 601, the memory 602, and the transceiver 603 may be connected by a bus 604.

The apparatus 60 is a hardware apparatus and can be used to implement the functions of the functional modules in the apparatus 50 shown in fig. 7. For example, those skilled in the art will appreciate that the first processing module 501 in the apparatus 50 shown in fig. 7, which obtains the user characteristic information, can be implemented by the transceiver 603. The second processing module 502 in the apparatus 50 shown in fig. 7, which obtains a high-dimensional matrix expression of the related information among users according to the user characteristic information, may be implemented by the processor 601 calling the code in the memory 602.

Optionally, the processor 601 may be one or more central processing units (CPUs), microprocessors, application-specific integrated circuits (ASICs), or one or more integrated circuits for controlling the execution of programs according to the present disclosure.

The processor 601 is configured to execute the instructions in the memory 602 to perform the processing steps of the multi-user pairing method shown in fig. 3.

The memory 602, processor 601 and transceiver 603 may be interconnected via a bus 604, but are not limited to being connected only via the bus 604; the bus 604 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc.

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product.

The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by program instructions instructing associated hardware, and the program may be stored in a computer-readable storage medium, which may include: ROM, RAM, magnetic or optical disks, and the like.

The present application also provides a neural network. Referring to fig. 9, the neural network 70 is used to implement the method for multi-user pairing provided by the present application. The neural network 70 comprises an attention mechanism network 701 based on natural language processing, a multi-layer perceptron network 702, and a reinforcement learning network 703. The multi-layer perceptron network 702 includes a probability distribution obtaining module 7021 and a sampling module 7022.

The attention mechanism network 701 based on natural language processing is configured to obtain user characteristic information, where the user characteristic information includes a user channel, a channel quality indicator, and a historical rate, and to obtain related information among users according to the user characteristic information. Specifically, the attention mechanism network 701 is configured to perform attention-mechanism processing based on natural language processing on the user characteristic information and thereby obtain the related information among users. For details, refer to fig. 3 and step 302; they are not repeated here.

In the multi-layer perceptron network 702, the probability distribution obtaining module 7021 is configured to obtain the selected probability distribution of the user pairing according to the related information among users. Specifically, the probability distribution obtaining module 7021 is configured to perform multi-layer perceptron processing on the related information among users to obtain the selected probability distribution of the user pairing. For details, refer to fig. 3 and step 303; they are not repeated here.

The sampling module 7022 is configured to sample users according to the selected probability distribution of the user pairing, and to pair the users according to the sampling result. Specifically, the sampling module 7022 is configured to sample the selected probability distribution of the user pairing by binary classification, and to pair the users whose sampling result is 1. For details, refer to fig. 3 and step 304; they are not repeated here.

The reinforcement learning network 703 is configured to perform reinforcement learning on the cost function according to the selected probability distribution of the user pairing and the number of users. For details, refer to fig. 3 and step 305; they are not repeated here.

The application provides a neural network 70 for multi-user pairing. The neural network 70 acquires related information among users according to user characteristic information, and then obtains the selected probability distribution of the user pairing according to that related information. The users are sampled according to the selected probability distribution and paired according to the sampling result, so no iterative operation is needed. Compared with the prior art, the amount of computation is smaller, which reduces operational and time complexity and improves real-time performance.

The embodiment of the application also provides a base station, and the base station is used for executing the multi-user pairing method provided by the application.

The embodiment of the application also provides a computer storage medium, wherein a computer program is stored in the storage medium, and the computer program is used for executing the multi-user pairing method provided by the application.

Embodiments of the present application further provide a computer program product including instructions, which when run on a computer device, cause the computer device to perform the method for multi-user pairing provided in embodiments of the present application.

An embodiment of the present application further provides a chip, including: a processor coupled to a memory for storing computer programs or instructions, the processor for executing the computer programs or instructions in the memory to implement the method for multi-user pairing provided herein.

The method, the apparatus, and the system for multi-user pairing provided in the embodiments of the present application are described in detail above, and a specific example is applied in the present application to explain the principle and the implementation of the present invention, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
