Terminal scheduling method and device in wireless federated learning

Document No.: 1905631  Publication date: 2021-11-30

Note: This invention, "Terminal scheduling method and device in wireless federated learning" (无线联邦学习中的终端调度方法和装置), was created by 施文琦, 姜淼, 耿璐, 马元琛, 周盛 and 牛志升 on 2020-05-25. Its main content is as follows: embodiments of the present invention provide a terminal scheduling method and device in wireless federated learning. Through the gradient information estimation performed by each terminal in each round of federated learning and the per-round time-consumption estimation performed by the wireless access point, the method dynamically adjusts the scheduled terminals in each round, and can maximize the model accuracy obtainable by federated learning within a limited training delay. This solves the problem that existing terminal scheduling algorithms can only use certain preset fixed parameters, which makes it difficult to guarantee the convergence speed of federated learning in a dynamic wireless environment and under variable training data distributions.

1. A terminal scheduling method in wireless federated learning, characterized by comprising the following steps:

the wireless access point receives the local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value sent by each terminal scheduled in the current round after that terminal completes the current round of federated learning;

updating to obtain the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all the terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is superior to the value of the global loss function corresponding to the optimal global model;

calculating a gradient diversity estimation value of a local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

2. The method of claim 1, wherein the step of calculating the gradient diversity estimation value of the local loss function of each terminal in the current round according to the global model of the previous round and the local model transmitted by each terminal comprises:

respectively estimating the gradient of the local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

calculating the gradient of the global loss function of the current round according to the gradients of the local loss functions of all the terminals scheduled to the current round;

and calculating the gradient diversity estimated value of the local loss function of each terminal in the current round according to the gradient of the local loss function of each terminal in the current round and the gradient of the global loss function of the current round.

3. The method of claim 1, wherein the step of generating the scheduled terminals for the next round of federated learning based on the gradient estimation information of each terminal in the current round comprises:

taking the ratio of the local training data set of each terminal to the global training data set as the weight of that terminal, carrying out a weighted summation of the gradient estimation information of all terminals participating in federated learning in the current round to obtain the global value of the gradient estimation information of all terminals in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round and the global value of the gradient estimation information in the current round.

4. The method of claim 3, wherein the step of performing terminal scheduling for the next round of federated learning based on the gradient estimation information of each terminal in the current round and the global value of the gradient estimation information in the current round comprises:

initializing a first set as an empty set;

under the condition that the number of terminals in the candidate terminal set is greater than 0, repeatedly executing the following steps until the calculated value of the training cost C no longer decreases, and taking the terminals in the first set as the scheduled terminals for the next round of federated learning:

traversing the terminals in the candidate terminal set, estimating, for each terminal, the per-round time consumption of federated learning when that terminal together with the first set is taken as the scheduled terminals, and determining the target terminal with the shortest time consumption;

estimating the total number of rounds of federated learning according to a preset total training time budget and the shortest time consumption; and calculating the value of the training cost C of federated learning with the target terminal and the first set as the scheduled terminals, according to the total number of rounds, the global value of the convexity estimation value in the current round, the global value of the smoothness estimation value in the current round and the gradient diversity of the local loss functions of all terminals in the current round;

and when the currently calculated value of the training cost C is lower than the locally maintained value of C, adding the target terminal to the first set, deleting the target terminal from the candidate terminal set, and updating the locally maintained value of C to the currently calculated value of the training cost C.

5. The method of claim 4, wherein the value of the training cost C is calculated according to the following formula:

wherein,

η represents the learning rate; [·] is a preset system parameter; [·] represents an estimate of the number of rounds of federated learning that can be executed; τ represents the number of local model updates in each round of federated learning; ρ represents the global value of the convexity estimation values of the local loss functions of all terminals participating in federated learning; h(τ) represents [·]; M represents the number of elements in the set of all terminals participating in federated learning; [·] represents the number of terminals scheduled in the current round; β represents the global value of the smoothness estimation values of the local loss functions; D_i represents the size of the local training data set of terminal i; δ_i represents the gradient diversity estimation value of the local loss function of terminal i; δ represents the global value of the gradient diversity of the local loss functions of all terminals participating in federated learning; and D represents the size of the global training data set.

6. The method according to any of claims 1 to 5, characterized in that before the step of calculating the gradient diversity of the local loss function of each terminal in the current round based on the global model of the previous round and the local model transmitted by each terminal, the method further comprises:

judging, during federated learning, whether the time consumed up to the current round exceeds the preset total training time budget;

under the condition that the preset total training time budget is exceeded, outputting the optimal global model as the training result;

and under the condition that the preset total training time budget is not exceeded, executing the step of calculating the gradient diversity of the local loss function of each terminal in the current round according to the global model of the previous round and the local model transmitted by each terminal.

7. A terminal scheduling method in wireless federated learning, characterized by comprising the following steps:

in the current round of federated learning, a terminal updates its local model to obtain the value of its local loss function, and estimates the convexity estimation value and the smoothness estimation value of the local loss function;

and the terminal sends the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

8. The method of claim 7, wherein the step of estimating the convexity estimation value and the smoothness estimation value of the local loss function comprises:

calculating, using the local loss function, a first loss value of the global model received in the current round of federated learning and a second loss value of the local model obtained by the update in the current round of federated learning; calculating a first norm of the difference between the first loss value and the second loss value, and a second norm of the difference between the global model received in the current round of federated learning and the local model obtained by the update in the current round of federated learning; and calculating the ratio of the first norm to the second norm to obtain the convexity estimation value of the local loss function;

calculating a third norm of the difference between the gradient of the first loss value and the gradient of the second loss value; and calculating the ratio of the third norm to the second norm to obtain the smoothness estimation value of the local loss function.

9. A wireless access point, comprising:

the data receiving module is used for receiving the local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value sent by each terminal scheduled in the current round after that terminal completes the current round of federated learning;

the model updating module is used for updating to obtain the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all the terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is superior to the value of the global loss function corresponding to the optimal global model;

the diversity calculation module is used for calculating a gradient diversity estimation value of a local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

and the scheduling terminal generation module is used for generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

10. The wireless access point of claim 9,

the diversity calculation module is further configured to:

respectively estimating the gradient of the local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

calculating the gradient of the global loss function of the current round according to the gradients of the local loss functions of all the terminals scheduled to the current round;

and calculating the gradient diversity estimated value of the local loss function of each terminal in the current round according to the gradient of the local loss function of each terminal in the current round and the gradient of the global loss function of the current round.

11. The wireless access point of claim 9,

the scheduling terminal generation module is further used for:

taking the ratio of the local training data set of each terminal to the global training data set as the weight of that terminal, carrying out a weighted summation of the gradient estimation information of all terminals participating in federated learning in the current round to obtain the global value of the gradient estimation information of all terminals in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round and the global value of the gradient estimation information in the current round.

12. The wireless access point of claim 11,

the scheduling terminal generating module is further used for

Initializing a first set whose contents are empty;

under the condition that the number of the terminals in the candidate terminal set is greater than 0, repeatedly executing the following steps until the training cost C value obtained by calculation is not reduced any more, and taking the terminals in the first set as the scheduled terminals for the next round of federal learning:

traversing the terminals in the candidate terminal set, respectively estimating the time consumption of the federal learning with the terminal and the first set as scheduled terminals in the current round of learning, and determining a target terminal with the shortest time consumption;

estimating the total number of rounds of federal learning according to a preset total budget of training time and the shortest time consumption; calculating values of the target terminal, the first set and a training cost C value of federal learning as a scheduled terminal according to the total number of rounds, the global value of the convexity estimation value in the current round, the global value of the smoothness estimation value in the current round and the gradient diversity of local loss functions of all terminals in the current round;

and when the value of the training cost C value obtained by current calculation is reduced relative to the C value maintained locally, adding the target terminal into the first set, deleting the target terminal from the candidate terminal set, and updating the C value maintained locally into the value of the training cost C value obtained by current calculation.

13. The wireless access point of claim 12, wherein the scheduling terminal generation module is further configured to calculate the value of the training cost C according to the following formula:

wherein,

η represents the learning rate; [·] is a preset system parameter; [·] represents an estimate of the number of rounds of federated learning that can be executed; τ represents the number of local model updates in each round of federated learning; ρ represents the global value of the convexity estimation values of the local loss functions of all terminals participating in federated learning; h(τ) represents [·]; M represents the number of elements in the set of all terminals participating in federated learning; [·] represents the number of terminals scheduled in the current round; β represents the global value of the smoothness estimation values of the local loss functions; D_i represents the size of the local training data set of terminal i; δ_i represents the gradient diversity estimation value of the local loss function of terminal i; δ represents the global value of the gradient diversity of the local loss functions of all terminals participating in federated learning; and D represents the size of the global training data set.

14. A terminal, comprising:

the model updating module is used for updating the local model in the current round of federated learning, obtaining the value of the local loss function, and estimating the convexity estimation value and the smoothness estimation value of the local loss function;

and the data sending module is used for sending the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

15. The terminal of claim 14,

the model update module is further configured to:

calculating, using the local loss function, a first loss value of the global model received in the current round of federated learning and a second loss value of the local model obtained by the update in the current round of federated learning; calculating a first norm of the difference between the first loss value and the second loss value, and a second norm of the difference between the global model received in the current round of federated learning and the local model obtained by the update in the current round of federated learning; and calculating the ratio of the first norm to the second norm to obtain the convexity estimation value of the local loss function;

calculating a third norm of the difference between the gradient of the first loss value and the gradient of the second loss value; and calculating the ratio of the third norm to the second norm to obtain the smoothness estimation value of the local loss function.

Technical Field

The invention relates to the technical field of machine learning, and in particular to a terminal scheduling method and device in wireless federated learning.

Background

According to Cisco estimates, in 2021 approximately 850 zettabytes of data will be generated at the network edge each year. These valuable data can bring various Artificial Intelligence (AI) services to end users by leveraging the deep learning techniques that have developed rapidly in recent years. However, training an AI model (typically a deep neural network) with conventional centralized training methods requires aggregating all raw data at a central server. Using traditional centralized training in wireless networks is impractical: uploading raw data over a wireless channel consumes large amounts of wireless bandwidth and introduces significant transmission delay, and uploading raw data to a central server raises privacy concerns.

To address the above problems, the prior art has proposed a new distributed model training framework called Federated Learning (FL): a framework for analyzing large amounts of distributed data and training learning models at the network edge, which can help protect data privacy.

A typical wireless FL system harnesses the computational power of multiple terminal devices, coordinated by a central controller, usually a Base Station (BS), to train the model iteratively. In each iteration (also referred to as a round) of FL, the participating devices update their local models using their local data and then send the local models to the BS for global model aggregation. By updating model parameters locally, FL exploits the data and computational power distributed across the devices, and can thus reduce model training latency and preserve data privacy. FL has therefore become a promising technology for distributed data analysis and model training in wireless networks and has been used in many applications, such as resource allocation optimization in vehicle-to-vehicle (V2V) communications and content recommendation for smartphones.
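The local-update-then-aggregate round described above can be sketched in a few lines. This is a minimal illustration only: the linear least-squares loss, the device data, and all function names are hypothetical stand-ins, not the patent's method.

```python
import numpy as np

def local_update(model, data, labels, lr=0.1, tau=5):
    # Run tau local gradient steps on a linear least-squares loss
    # (an illustrative stand-in for the deep networks used in practice).
    w = model.copy()
    for _ in range(tau):
        grad = data.T @ (data @ w - labels) / len(labels)
        w = w - lr * grad
    return w

def aggregate(local_models, data_sizes):
    # BS-side global aggregation: weighted average of the local models,
    # with weights proportional to each device's local data size.
    total = sum(data_sizes)
    return sum((n / total) * w for w, n in zip(local_models, data_sizes))

# One FL round: each scheduled device updates locally, then the BS aggregates.
rng = np.random.default_rng(0)
global_model = np.zeros(3)
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
local_models = [local_update(global_model, X, y) for X, y in devices]
global_model = aggregate(local_models, [len(y) for _, y in devices])
```

Note that only model parameters cross the wireless channel; the raw data (X, y) never leaves each device, which is the privacy property motivating FL.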

However, implementing FL in practical wireless networks raises several key challenges that have not yet been fully solved. Owing to scarce wireless spectrum resources and a limited training delay budget, only a limited number of devices are allowed to upload their local models in each round, and the device scheduling policy affects the FL convergence rate in two ways. On the one hand, in each round the BS cannot perform global model aggregation until all scheduled devices have completed their updates and uploaded their local models. Straggler devices with limited computational power or poor channel conditions can therefore significantly slow down model aggregation, and scheduling more devices leads to longer latency per round, because less bandwidth is allocated to each scheduled device and the probability of including a straggler rises. On the other hand, scheduling more devices increases the per-round convergence rate and can reduce the number of rounds required to achieve the same accuracy. Thus, if the total training time, i.e. the number of rounds multiplied by the average latency per round, is considered, device scheduling is necessary and should be carefully optimized to balance the latency per round against the number of rounds needed. Furthermore, the scheduling policy should adapt itself to the dynamic wireless environment.

Recently, there have been many studies considering the implementation of FL in wireless networks. To reduce the upload latency introduced by global model aggregation, the prior art proposes a new analog aggregation technique. For analog aggregation, scheduled devices transmit their local models simultaneously in a wireless multiple access channel via analog modulation, and the BS can receive the aggregated models due to the waveform superposition characteristics of the wireless channel. Although analog aggregation techniques can greatly reduce upload latency, strict time synchronization is required between devices. For FL based digital transmission, however, scheduled devices need to share limited radio resources, and there have been a series of efforts to study the problem of resource allocation. For example, there is a related study to employ Time Division Multiple Access (TDMA) technology at the Media Access Control (MAC) layer and jointly optimize device CPU frequency, transmission latency, and local model accuracy to minimize the weighted sum of training latency and total device energy consumption. There is also a related study considering a similar FL system with Frequency Division Multiple Access (FDMA), in which bandwidth allocation, CPU Frequency, transmission latency and local model accuracy are jointly optimized. There is also related work to optimize the frequency of global aggregation of FL systems under heterogeneous resource constraints. In all of the above studies, each round of FL involves all devices, but this is generally not feasible in practical wireless FL applications because the wireless bandwidth is limited. In addition, another series of work suggests using device scheduling to optimize the convergence speed of the FL. For example, one prior art technique proposes a heuristic scheduling strategy that jointly considers the importance of the channel state and the local update model. 
However, the proposed scheduling strategy is only evaluated experimentally and the convergence performance of the FL cannot be guaranteed theoretically.

It can be seen that terminal scheduling methods in wireless federated learning are often based on preset fixed parameters, such as scheduling a fixed number of terminals in each round of training, or scheduling as many terminals as possible within a fixed duration. Such preset fixed parameters are difficult to adjust dynamically once wireless federated learning has been deployed, and may slow the training convergence of wireless federated learning in a dynamically changing wireless environment, thereby degrading its performance in delay-limited scenarios.

Disclosure of Invention

The technical problem to be solved by the embodiments of the present invention is to provide a terminal scheduling method and device in wireless federated learning that can guarantee the convergence speed of federated learning in a dynamic wireless environment and under variable training data distributions.

In order to solve the above technical problem, according to one aspect of the present invention, there is provided a terminal scheduling method in wireless federated learning, comprising:

the wireless access point receives the local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value sent by each terminal scheduled in the current round after that terminal completes the current round of federated learning;

updating to obtain the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all the terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is superior to the value of the global loss function corresponding to the optimal global model;

calculating a gradient diversity estimation value of a local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.
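The scheduled-terminal generation step above follows, per the claims, a greedy loop: repeatedly add the candidate with the shortest estimated per-round time while the training cost C keeps decreasing. The sketch below captures that control flow only; round_time and cost_c are hypothetical placeholders for the patent's per-round time model and its cost function C (which depends on the convexity, smoothness and gradient-diversity estimates).

```python
def schedule_next_round(candidates, round_time, cost_c, time_budget):
    """Greedily grow the scheduled set while the training cost C decreases."""
    scheduled, best_c = set(), float("inf")
    candidates = set(candidates)
    while candidates:
        # Among the remaining candidates, pick the one that keeps the
        # estimated per-round time shortest when added to the current set.
        target = min(candidates, key=lambda t: round_time(scheduled | {t}))
        t_round = round_time(scheduled | {target})
        # Estimate how many rounds fit in the total training time budget.
        total_rounds = max(1, int(time_budget / t_round))
        c = cost_c(scheduled | {target}, total_rounds)
        if c >= best_c:
            break  # the training cost C no longer decreases: stop.
        best_c = c
        scheduled.add(target)
        candidates.remove(target)
    return scheduled
```

In use, round_time would typically be dominated by the slowest scheduled terminal (the straggler effect), so the greedy rule naturally defers terminals with poor channels or weak processors.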

According to another aspect of the present invention, a terminal scheduling method in wireless federated learning is provided, which includes:

in the current round of federated learning, a terminal updates its local model to obtain the value of its local loss function, and estimates the convexity estimation value and the smoothness estimation value of the local loss function;

and the terminal sends the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.
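The terminal-side convexity and smoothness estimates are, per claim 8, ratios of norms: |F(w_local) − F(w_global)| / ‖w_local − w_global‖ and ‖∇F(w_local) − ∇F(w_global)‖ / ‖w_local − w_global‖. A minimal sketch, using a quadratic least-squares loss as an illustrative (assumed) local loss function:

```python
import numpy as np

def loss_and_grad(w, X, y):
    # Quadratic least-squares loss, an illustrative local loss function.
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def estimate_convexity_smoothness(w_global, w_local, X, y):
    f_g, g_g = loss_and_grad(w_global, X, y)  # first loss value and its gradient
    f_l, g_l = loss_and_grad(w_local, X, y)   # second loss value and its gradient
    model_gap = np.linalg.norm(w_local - w_global)      # "second norm"
    convexity = abs(f_l - f_g) / model_gap              # first norm / second norm
    smoothness = np.linalg.norm(g_l - g_g) / model_gap  # third norm / second norm
    return convexity, smoothness
```

Both estimates reuse quantities the terminal already computes during its local update, so reporting them to the access point adds little overhead beyond two extra scalars per round.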

According to another aspect of the present invention, there is also provided a wireless access point, including:

the data receiving module is used for receiving the local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value sent by each terminal scheduled in the current round after that terminal completes the current round of federated learning;

the model updating module is used for updating to obtain the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all the terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is superior to the value of the global loss function corresponding to the optimal global model;

the diversity calculation module is used for calculating a gradient diversity estimation value of a local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

and the scheduling terminal generation module is used for generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, wherein the gradient estimation information comprises the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

According to another aspect of the present invention, there is also provided a wireless access point, including: a processor, a memory, and a program stored on the memory and executable on the processor, the program, when executed by the processor, implementing the steps of the terminal scheduling method in wireless federated learning described above.

According to another aspect of the present invention, there is also provided a terminal, including:

the model updating module is used for updating the local model in the current round of federated learning, obtaining the value of the local loss function, and estimating the convexity estimation value and the smoothness estimation value of the local loss function;

and the data sending module is used for sending the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

According to another aspect of the present invention, there is also provided a terminal, including: a processor, a memory, and a program stored on the memory and executable on the processor, the program, when executed by the processor, implementing the steps of the terminal scheduling method in wireless federated learning described above.

An embodiment of the present invention further provides a computer-readable storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the steps of the terminal scheduling method in wireless federated learning described above.

Compared with the prior art, the terminal scheduling method and device in wireless federated learning provided by the embodiments of the present invention have at least the following beneficial effects: through the gradient information estimation performed by each terminal in each round of federated learning and the per-round time-consumption estimation performed by the wireless access point, the scheduled terminals in each round can be adjusted dynamically, so that the model accuracy obtainable by federated learning within a limited training delay is maximized. This solves the problem that existing terminal scheduling algorithms can only use certain preset fixed parameters, which makes it difficult to guarantee the convergence speed of federated learning in a dynamic wireless environment and under variable training data distributions.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without inventive labor.

Fig. 1 is a schematic view of an application scenario of a terminal scheduling method according to an embodiment of the present invention;

fig. 2 is a schematic flow chart illustrating a terminal scheduling method applied to a wireless access point side according to an embodiment of the present invention;

fig. 3 is a schematic flowchart of scheduling performed by traversing terminals according to an embodiment of the present invention;

fig. 4 is a schematic flowchart of a terminal scheduling method applied to a terminal side according to an embodiment of the present invention;

FIG. 5 is a schematic flow chart illustrating terminal model update in each round of federal learning according to an embodiment of the present invention;

fig. 6 is an interaction flow diagram of a terminal scheduling method according to an embodiment of the present invention;

fig. 7 is a schematic structural diagram of a wireless access point according to an embodiment of the present invention;

fig. 8 is another schematic structural diagram of a wireless access point according to an embodiment of the present invention;

fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present invention;

fig. 10 is another schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments. In the following description, specific details such as specific configurations and components are provided only to help the full understanding of the embodiments of the present invention. Thus, it will be apparent to those skilled in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.

It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

The embodiment of the invention provides a terminal scheduling method in wireless federal learning, which can maximize the accuracy of the model obtained by training within a limited training delay under a dynamic wireless environment and random training data distributions, thereby solving the problem that existing terminal scheduling algorithms can only use certain preset fixed parameters, which makes it difficult to guarantee the federal learning convergence speed under a dynamic wireless environment and variable training data distributions.

Referring to fig. 1, fig. 1 is a block diagram of a wireless communication system to which an embodiment of the present invention is applicable. The wireless communication system includes a plurality of terminals 101 and a wireless access point 102. The terminal 101 may also be referred to as a user terminal or User Equipment (UE), and may specifically be a mobile phone, a Tablet Personal Computer, a Laptop Computer, a Personal Digital Assistant (PDA), a Mobile Internet Device (MID), a Wearable Device, or a vehicle-mounted device; the specific type of the terminal 101 is not limited in the embodiments of the present invention. The wireless access point 102 may be a wireless transmission-reception point (TRP), a base station, or a core network element, where the base station may be a base station of 5G and later releases (e.g., gNB, 5G NR NB) or a base station in other communication systems (e.g., eNB, WLAN access point, or other access points). A base station may also be referred to as a Node B, an evolved Node B (eNB), an access point, a Base Transceiver Station (BTS), a radio base station, a radio transceiver, a Basic Service Set (BSS), an Extended Service Set (ESS), a home Node B, a home evolved Node B, a WLAN access point, a WiFi node, or some other suitable terminology in the field; as long as the same technical effect is achieved, the base station is not limited to a specific technical vocabulary.

The terminal scheduling method provided by the embodiment of the present invention is applicable to the wireless federal learning process shown in fig. 1, which involves a wireless access point and a plurality of terminals within its coverage area. The method mainly comprises the following steps: (1) on the basis of wireless federal learning, each terminal estimates its gradient and sends gradient estimation information to the wireless access point; (2) the wireless access point performs terminal scheduling according to the gradient estimation information. Specifically:

The flow of a federal learning application generally includes: (1) the core network receives a model training task; (2) the core network distributes the training task to each wireless access point, where the task includes the model to be trained, the training learning rate, the number of local model updates, the limitation on total training delay, and the like; (3) after receiving the training task distributed by the core network, a wireless access point (such as the wireless access point 102 in fig. 1) schedules mobile terminals (such as the terminal 101 in fig. 1) in its coverage area to participate in federal learning; (4) after the federal learning is completed, the wireless access point transmits the trained model parameters back to the core network, or retains them locally, according to the requirements of the training task.

Fig. 1 provides a wireless federal learning system within the coverage area of a wireless access point. The system includes several mobile terminals 101 and a wireless access point 102. Wireless federal learning is an iterative, distributed machine learning model training framework in which, at each iteration (called a round), each participating terminal first downloads the current global model parameters from the wireless access point (typically broadcast by the wireless access point, as in step 103 of fig. 1). After the download is completed, each terminal updates its local model using its local training data set (step 104 in fig. 1) and then uploads the local model update to the wireless access point through the wireless network (step 105 in fig. 1). After receiving the model updates uploaded by all the terminals scheduled in the current round, the wireless access point performs a global model update (step 106 in fig. 1) and may then enter the next round of federal learning.
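As a concrete illustration of one such round, the sketch below implements the download-update-upload-aggregate loop in Python. The quadratic local loss, the learning rate, the number of local steps, and the synthetic data are all assumptions for demonstration, not part of the embodiment.

```python
import numpy as np

def local_update(w_global, X, y, lr=0.1, tau=5):
    """Terminal side: tau gradient-descent steps on the local
    least-squares loss F_i(w) = ||X w - y||^2 / (2 |D_i|)."""
    w = w_global.copy()
    for _ in range(tau):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def aggregate(local_models, data_sizes):
    """Access-point side: average the local models weighted by
    the size of each terminal's local training data set."""
    total = sum(data_sizes)
    return sum(d / total * w for w, d in zip(local_models, data_sizes))

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])          # ground-truth model for the toy data
w_global = np.zeros(2)
datasets = [(X, X @ w_true) for X in
            (rng.normal(size=(50, 2)) for _ in range(3))]

for _ in range(20):                     # 20 rounds of federal learning
    locals_ = [local_update(w_global, X, y) for X, y in datasets]
    w_global = aggregate(locals_, [len(y) for _, y in datasets])
```

With noiseless labels every local loss shares the same minimizer, so the global model converges toward `w_true`.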

The method of the embodiment of the present invention will be described below separately from the wireless access point side and the terminal side.

Referring to fig. 2, a terminal scheduling method in wireless federation learning according to an embodiment of the present invention, when applied to the wireless access point 102 shown in fig. 1, includes:

Step 21, the wireless access point receives the local model and the value of the local loss function, the convexity estimation value, and the smoothness estimation value sent by each terminal scheduled in the current round after completing the current round of federal learning.

Here, in each round of federal learning, each scheduled terminal updates a local model of the terminal in the current round of federal learning to obtain a value of a local loss function, and estimates a convexity estimation value and a smoothness estimation value of the local loss function, and then the terminal transmits the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point. The wireless access point may receive the above data transmitted by each scheduled terminal.

Step 22, updating to obtain a global model of the current round according to the local models sent by each terminal; calculating the value of the global loss function of the current round according to the values of the local loss functions of all the terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is better than the value of the global loss function corresponding to the optimal global model.

Here, after receiving the updated local models sent by all terminals scheduled in the current round of learning, the wireless access point according to the embodiment of the present invention may aggregate the local models sent by the terminals to obtain the global model of the current round.

In addition, in the embodiment of the present invention, the wireless access point may further calculate the value of the global loss function of the current round according to the values of the local loss functions of all terminals scheduled in the current round, and determine whether to update the optimal global model according to whether this value is better than the value of the global loss function corresponding to the optimal global model. If the value of the global loss function of the current round is better than that corresponding to the optimal global model, the optimal global model is updated to the global model obtained in the current round; otherwise, the optimal global model remains unchanged. Here, the optimal global model is the best global model obtained up to the current round of federal learning.

Step 23, calculating the gradient diversity estimation value of the local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal.

Here, the wireless access point also calculates the gradient diversity of the local loss function of each terminal in the current round based on the local model transmitted by each terminal. Specifically, the wireless access point may respectively estimate the gradient of the local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal; calculating the gradient of the global loss function of the current round according to the gradients of the local loss functions of all the terminals scheduled to the current round; and then, calculating the gradient diversity of the local loss function of each terminal in the current round according to the gradient of the local loss function of each terminal in the current round and the gradient of the global loss function of the current round.
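As a sketch of this computation (the exact norm and normalization are not fixed by the embodiment, so the unnormalized deviation used here is an assumption), the gradient diversity of each terminal can be taken as the distance between its local gradient and the data-size-weighted global gradient:

```python
import numpy as np

def gradient_diversity(local_grads, data_sizes):
    """Hypothetical sketch: the global gradient is the data-size-weighted
    average of the local gradients; each terminal's diversity is the norm
    of its local gradient's deviation from that global gradient."""
    total = sum(data_sizes)
    g_global = sum(d / total * g for g, d in zip(local_grads, data_sizes))
    return [float(np.linalg.norm(g - g_global)) for g in local_grads]

# two terminals with equal data and orthogonal local gradients
grads = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
deltas = gradient_diversity(grads, [10, 10])
```

Terminals whose local gradient points away from the global descent direction receive a large diversity value.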

And 24, generating scheduled terminals for next round of federal learning according to gradient estimation information of each terminal in the current round, wherein the gradient estimation information comprises a convexity estimation value, a smoothness estimation value and a gradient diversity estimation value of a local loss function.

Here, the wireless access point may perform a weighted summation over each piece of gradient estimation information of all terminals participating in federal learning in the current round, using the ratio of each terminal's local training data set to the global training data set as that terminal's weight, so as to obtain the global values of the gradient estimation information of all terminals in the current round, where the gradient estimation information includes the convexity estimate, the smoothness estimate, and the gradient diversity of the local loss function. Then, the scheduled terminals for the next round of federal learning are generated according to the gradient estimation information of each terminal in the current round and the global values of the gradient estimation information in the current round.

For example, a global value of the gradient estimation information of all terminals in the current round is calculated according to the following formula:

$$ \rho = \sum_{i \in \mathcal{M}} \frac{D_i}{D}\,\rho_i, \qquad \beta = \sum_{i \in \mathcal{M}} \frac{D_i}{D}\,\beta_i, \qquad \delta = \sum_{i \in \mathcal{M}} \frac{D_i}{D}\,\delta_i $$

where $\mathcal{M}$ represents the set of all terminals participating in federal learning; $D_i$ represents the size of the local training data set of terminal i; $D$ represents the size of the global training data set; $\rho$, $\beta$, and $\delta$ represent the global values of the convexity estimates, the smoothness estimates, and the gradient diversity estimates, respectively, of the local loss functions of all terminals participating in federal learning; and $\rho_i$, $\beta_i$, and $\delta_i$ represent the convexity estimate, the smoothness estimate, and the gradient diversity estimate of the local loss function of terminal i.
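The weighted summation described above reduces to a single data-size-weighted average; the numbers below are purely illustrative.

```python
def global_estimate(local_values, data_sizes):
    """Weight each terminal's local estimate by D_i / D, its share of
    the global training data set, and sum. The same form applies to the
    convexity, smoothness, and gradient diversity estimates."""
    D = sum(data_sizes)
    return sum(Di / D * v for v, Di in zip(local_values, data_sizes))

# e.g. convexity estimates rho_i reported by three terminals
rho = global_estimate([2.0, 4.0, 6.0], [10, 20, 30])
```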

More specifically, the next round of federal learning terminal scheduling is performed according to the gradient estimation information of each terminal in the current round and the global value of the gradient estimation information in the current round, which may be as shown in fig. 3.

Referring to fig. 3, when the number of terminals in the candidate terminal set is greater than 0, the following steps are repeatedly performed until the calculated training cost C value does not decrease any more, and the terminals in the first set are used as the scheduled terminals in the next round of federal learning:

step 301, traversing the terminals in the candidate terminal set, respectively estimating the time consumed by the federal learning with the terminal and the first set as the scheduled terminal in the current round of learning, and determining the target terminal with the shortest time consumption.

Here, the time consumed by a round of federal learning with the terminal plus the first set as the scheduled terminals is estimated; this may be done according to factors such as the wireless network standard, the available network bandwidth, and the channel conditions between the terminals and the wireless access point. For more detailed estimation methods, reference may be made to the related art, which is not described again here.

Step 302, estimating the total number of rounds of federal learning according to a preset total training time budget and the shortest time consumption; and calculating the training cost C of federal learning with the target terminal plus the first set as the scheduled terminals, according to the total number of rounds, the global value of the convexity estimates in the current round, the global value of the smoothness estimates in the current round, and the gradient diversity of the local loss functions of all terminals in the current round.

Here, the estimate of the total number of rounds of federal learning is obtained by dividing the preset total training time budget by the shortest time consumption and rounding down.

Step 303, determining whether the currently calculated training cost C has decreased relative to the previously calculated value; if so, proceed to step 304, otherwise end the process.

Step 304, when the training cost C obtained by the current calculation is lower than the locally maintained C value, add the target terminal to the first set, delete it from the candidate terminal set, and update the locally maintained C value to the currently calculated value. When C is calculated for the first time, no C value is maintained locally yet; in this case, the target terminal is directly added to the first set, and the calculated C value is used as the initial value of the locally maintained C value.
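Steps 301-304 form a greedy loop over the candidate set. The sketch below assumes the caller supplies the per-round time estimator and the training cost C as black boxes, since the embodiment derives them separately; the toy device timings and cost function in the usage are hypothetical.

```python
def greedy_schedule(candidates, round_time, cost, time_budget):
    """Greedy device scheduling (steps 301-304): repeatedly add the
    candidate that keeps the round shortest, and keep it only while the
    training cost C keeps decreasing.

    round_time(selected) -> estimated duration of one round
    cost(selected, total_rounds) -> training cost C
    """
    selected, best_cost = [], None
    candidates = set(candidates)
    while candidates:
        # step 301: candidate whose addition keeps the round shortest
        target = min(candidates, key=lambda i: round_time(selected + [i]))
        t = round_time(selected + [target])
        # step 302: total rounds the time budget allows at this pace
        K = max(int(time_budget // t), 1)
        c = cost(selected + [target], K)
        # steps 303-304: keep the target only if C decreased
        if best_cost is not None and c >= best_cost:
            break
        selected.append(target)
        best_cost = c
        candidates.remove(target)
    return selected

# toy example: device 3 is far slower, so adding it raises the cost
times = {1: 1.0, 2: 2.0, 3: 100.0}
sched = greedy_schedule(
    candidates=[1, 2, 3],
    round_time=lambda sel: max(times[i] for i in sel),
    cost=lambda sel, K: 100 / len(sel) + 100 / K,  # toy cost function
    time_budget=100.0,
)
```

In this toy run, devices 1 and 2 lower the cost while device 3's slow round time would raise it, so the loop stops with `sched == [1, 2]`.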

A way of calculating the training cost C is provided below. Specifically, the embodiment of the present invention may calculate the training cost C according to the following formula:

where η represents a preset learning rate; a preset system parameter; an estimate of the number of rounds that federal learning can execute; τ represents the number of local model updates in each round of federal learning; ρ represents the global value of the convexity estimates of the local loss functions of all terminals participating in federal learning; h(τ) represents a function of the number of local updates τ; M represents the number of elements in the set of all terminals participating in federal learning; |Π| represents the number of terminals scheduled in the current round; β represents the global value of the smoothness estimates of the local loss functions; D_i represents the size of the local training data set of terminal i; D represents the size of the global training data set; δ_i represents the gradient diversity estimate of the local loss function of terminal i; and δ represents the global value of the gradient diversity of the local loss functions of all terminals participating in federal learning. The derivation process associated with the above formula is further described below.

As can be seen, the estimated number of rounds and |Π| are determined by the terminals scheduled in the current round, so a first set that makes federal learning converge faster can be obtained by calculating the magnitude of C under different first sets (i.e., different scheduled terminals) and comparing the influence of the various first sets on the federal learning convergence speed.

Through the above steps, when determining the terminals scheduled in each round, the gradient estimation information of the terminals in the current round is introduced: besides the basic federal learning operations of updating the local model and uploading the model update, each scheduled terminal also estimates and uploads gradient information. The wireless access point estimates each terminal's contribution to the federal learning convergence rate from the uploaded gradient estimation information, and thereby keeps selecting the devices that consume the least time in model updating, maximizing the federal learning convergence rate through terminal scheduling.

Before the step 23, the wireless access point may determine whether the elapsed time of the federal learning until the current round exceeds a preset total training time budget; under the condition that the total budget of the preset training time is exceeded, outputting the optimal global model as a training result, and then ending the process; in case the preset total training time budget is not exceeded, the above step 23 is performed, so that an iterative training process can be performed until the final global model is obtained.

Referring to fig. 4, a terminal scheduling method in wireless federation learning according to an embodiment of the present invention, when applied to the terminal 101 shown in fig. 1, includes:

Step 41, the terminal updates its local model in the current round of federal learning to obtain the value of the local loss function, and estimates the convexity estimation value and the smoothness estimation value of the local loss function.

Here, the terminal may calculate, using the local loss function, a first loss value of the global model received in the current round of federal learning and a second loss value of the local model obtained by the current round of updating; calculate a first norm of the difference between the first loss value and the second loss value, and a second norm of the difference between the global model received in the current round and the local model obtained by the current round of updating; and take the ratio of the first norm to the second norm as the convexity estimation value of the local loss function. The terminal then calculates a third norm of the difference between the gradient of the local loss function at the received global model and its gradient at the updated local model, and takes the ratio of the third norm to the second norm as the smoothness estimation value of the local loss function.
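The two ratios can be sketched as follows; the toy quadratic local loss and the specific parameter vectors are assumptions for illustration only.

```python
import numpy as np

def estimate_convexity_smoothness(w_global, w_local, loss, grad):
    """Per-terminal estimates as described in step 41: the ratio of the
    loss-difference norm (resp. gradient-difference norm) to the
    model-difference norm. `loss` and `grad` evaluate the terminal's
    local loss function and its gradient."""
    dw = np.linalg.norm(w_global - w_local)            # second norm
    rho = abs(loss(w_global) - loss(w_local)) / dw     # convexity estimate
    beta = np.linalg.norm(grad(w_global) - grad(w_local)) / dw  # smoothness
    return rho, beta

# toy local loss F_i(w) = ||w||^2 / 2, whose gradient is w
loss = lambda w: 0.5 * float(w @ w)
grad = lambda w: w
rho, beta = estimate_convexity_smoothness(np.array([2.0, 0.0]),
                                          np.array([1.0, 0.0]), loss, grad)
```

For this quadratic loss the smoothness estimate recovers the true smoothness constant 1.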

And step 41, the terminal sends the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

Through the steps, the terminal sends the gradient estimation information obtained in the current round of learning to the wireless access point, so that the wireless access point can generate the scheduled terminal of the next round of federal learning by using the gradient estimation information, the wireless access point can be helped to select the equipment consuming the least time in model updating, and the convergence speed of the federal learning is maximized through terminal scheduling.

Fig. 5 further shows a schematic flow chart of the terminal in each round of federal learning, which includes:

step 501, in each round of federal learning, the terminal determines whether local model updating is needed according to whether the current round is scheduled or not. In this step, the terminal determines whether the information required for updating the local model includes: the set of scheduled terminals determined by the previous round of federal learning, i.e., whether each terminal participates in the current round of training.

Step 502, if so, the terminal first updates its local machine learning model parameters on its local training data set using the Stochastic Gradient Descent (SGD) or Gradient Descent (GD) algorithm. The information required for updating the model includes: the local model parameters and the local training data set.

Step 503, the terminal then estimates the smoothness estimation value and the convexity estimation value (collectively referred to as gradient estimation information herein) of the local loss function according to the model parameter changes before and after updating, and uploads the gradient estimation information and the local model update information to the wireless access point, where the information required for calculating the gradient estimation information includes: the local model parameters before updating, the local model parameters after updating and the local training data set.

Fig. 6 is an interaction flowchart between a wireless access point and a terminal according to the method provided in the embodiment of the present invention, which specifically includes:

step 601, considering a wireless federal study with total training delay limit, firstly connecting all terminals participating in the federal study by a wireless access point, and initializing gradient estimation information of each terminal.

Step 602, when each round of federal learning starts, judging whether the training time consumed by the federal learning at the current moment exceeds a preset total budget of the training time, if so, ending the federal learning, otherwise, starting the federal learning of the current round, wherein the required relevant information in the step is as follows: the total budget of training time, the training time that has been consumed.

Step 603, in each round, firstly executing the terminal scheduling algorithm shown in fig. 3 to determine the terminal scheduled in the current round, wherein the relevant information required in this step is as follows: all information required for steps 301-304 in FIG. 3.

Step 604, all terminals then execute, in parallel, the terminal federal learning flow shown in fig. 5. The information required in this step: all information required for steps 501-503 in fig. 5.

Step 605, finally, after receiving the gradient estimation information and local model update information uploaded by each scheduled terminal, the wireless access point updates the global model and records the gradient estimation information. The information required in this step: the model parameters updated by the scheduled terminals, and the estimated smoothness, convexity, and gradient diversity values of the local loss functions of the scheduled terminals. After the update is completed, the next round of federal learning begins.

Taking as an example a wireless federal learning task involving two terminals A and B and a wireless access point C, applying the above method of the embodiment of the present invention includes the following steps:

1) the terminal A, B is respectively connected with the wireless access point C, and the wireless access point C initializes the gradient estimation information of the terminal A, B stored by the wireless access point C;

2) the wireless access point C judges whether the total training time is exhausted, if not, the step 3 is continuously executed, otherwise, the current federal learning process is ended;

3) the wireless access point C executes the terminal scheduling algorithm shown in fig. 3;

4) the terminals A, B execute the terminal federal learning procedures shown in fig. 5, respectively;

5) the wireless access point C receives model update and gradient estimation information uploaded by a scheduled terminal;

6) wireless access point C performs a global model update, updates its stored gradient estimation information for terminal A, B, and returns to step 2.

The terminal scheduling method according to the embodiment of the present invention has been introduced above. The embodiment of the present invention performs terminal scheduling for a wireless federal learning algorithm deployed on a wireless access point and the terminals within its coverage area, and has the following advantages:

According to the method and the device, the scheduled terminals of each round can be dynamically adjusted through the gradient information estimated by each terminal in each round of federal learning and the per-round time consumption estimated by the wireless access point, so that the model accuracy obtainable by federal learning within a limited training delay is maximized. This solves the problem that existing terminal scheduling algorithms can only use certain preset fixed parameters, which makes it difficult to guarantee the convergence speed of federal learning under a dynamic wireless environment and variable training data distributions.

The derivation process associated with the above formula for calculating the training cost C is described here. Table 1 gives definitions of the relevant parameters and variables that may be involved in embodiments of the present invention. In addition, the symbol "←" denotes assigning the quantity on its right to the variable on its left. It should be noted that the following derivation takes the application scenario of an FDMA system only as an example; this scenario does not limit the embodiments of the present invention, which may also be applied to other scenarios.

TABLE 1

Firstly, a system model:

consider a FL system consisting of one BS and M terminals, and these devices are composed of And (4) indexing. Each terminal i has a local data setComprises thatA training data sample. Where x isi,dIs the d-th s-dimensional input data vector at terminal i, yi,dIs xi,dThe tag output of (1). The whole data set is composed ofIs shown, wherein the total number of samplesIt is assumed here that all local data sets do not overlap each other.

The goal of the federated learning training process is to find model parameters w that minimize a specific loss function over the entire data set. The optimization objective can be expressed as

$$ \min_{w} \; F(w) = \sum_{i \in \mathcal{M}} \frac{D_i}{D} F_i(w), \tag{1} $$

where the local loss function $F_i(w)$ of data set $\mathcal{D}_i$ is defined as

$$ F_i(w) = \frac{1}{D_i} \sum_{d=1}^{D_i} f(w, x_{i,d}, y_{i,d}). $$

The loss function $f(w, x_{i,d}, y_{i,d})$ captures the error of the model parameters w on the input-output data pair $(x_{i,d}, y_{i,d})$. Table 2 gives examples of some commonly used loss functions in machine learning models.

TABLE 2

A. Federated learning on wireless networks

FL uses an iterative approach to solve the problem of equation (1), and each round, indexed by k, contains the following 3 steps.

1) The BS first decides which devices to schedule to participate in the current round; the set of terminal devices scheduled in round k (i.e., the k-th round) is denoted by $\Pi_k$. The BS then broadcasts the current global model $w_{k-1}$ to all scheduled devices, where $\Pi_{[k-1]}$ denotes the historical scheduling decisions up to the (k-1)-th round.

2) Each scheduled device $i \in \Pi_k$ receives the global model and updates its local model by applying a gradient descent algorithm to its local data set:

$$ w_{i,k}^{(t)} = w_{i,k}^{(t-1)} - \eta\, \nabla F_i\big(w_{i,k}^{(t-1)}\big), \quad t = 1, \ldots, \tau, \qquad w_{i,k}^{(0)} = w_{k-1}, $$

where η is the learning rate. The local model update is repeated τ times, with τ considered a fixed system parameter. Then, the updated local model $w_{i,k}^{(\tau)}$ is uploaded to the BS. In the following sections of the text, unless otherwise stated, $w_{i,k}$ is used to represent $w_{i,k}^{(\tau)}$.

3) After receiving all uploaded models, the BS aggregates them (i.e., takes a weighted average of the uploaded local models according to the sizes of the local data sets) to obtain a new global model:

$$ w_k = \sum_{i \in \Pi_k} \frac{D_i}{\sum_{j \in \Pi_k} D_j}\, w_{i,k}. $$

B. Delay model

Considering an arbitrary round k, the total latency of the kth round consists of:

1) Computation delay: to characterize the randomness of the computation delay of the local model update, a shifted exponential distribution may be used:

$$ \Pr\big[t_{i,k}^{\mathrm{cp}} < t\big] = \begin{cases} 1 - e^{-\mu_i (t - a_i \tau)}, & t \ge a_i \tau, \\ 0, & \text{otherwise}, \end{cases} $$

where $a_i > 0$ and $\mu_i > 0$ are parameters indicating the maximum value and the fluctuation of the computing capability, respectively. It is assumed that $a_i$ and $\mu_i$ remain constant throughout the training process. Furthermore, due to the relatively strong computational power of the BS and the low complexity of model aggregation, the computation delay of model aggregation at the BS is ignored here.

2) Communication delay: regarding the local model upload phase of the scheduled terminal devices, consider an FDMA system with total bandwidth B, where the bandwidth allocated to terminal device i is denoted $\gamma_{i,k} B$, with allocation ratios satisfying $\sum_{i \in \Pi_k} \gamma_{i,k} \le 1$ and $0 \le \gamma_{i,k} \le 1$. The achievable transmission rate (bits/sec) can then be written as

$$ r_{i,k} = \gamma_{i,k} B \log_2\!\left(1 + \frac{P_i h_{i,k}}{N_0 \gamma_{i,k} B}\right), $$

where $P_i$ denotes the transmission power of terminal device i, which remains constant between rounds, $h_{i,k}$ denotes the corresponding channel gain, and $N_0$ is the noise power density. The communication delay of terminal device i is therefore

$$ t_{i,k}^{\mathrm{cm}} = \frac{S}{r_{i,k}}, $$

where S represents the size of $w_{i,k}$ in bits. Since the transmit power of the BS is much higher than that of the terminal devices and the BS broadcasts the model using the entire downlink bandwidth, the latency of broadcasting the global model is ignored here.
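The rate and delay expressions of the FDMA model can be evaluated directly; all numeric parameter values below (bandwidth, power, channel gain, noise density, model size, computation delays) are illustrative assumptions.

```python
import math

def comm_delay(S, gamma, B, P, h, N0):
    """Upload delay of one terminal in the FDMA model: achievable rate
    r = gamma*B*log2(1 + P*h / (N0*gamma*B)), then delay t = S / r."""
    r = gamma * B * math.log2(1 + P * h / (N0 * gamma * B))
    return S / r

def round_delay(comp_delays, comm_delays):
    """Synchronous aggregation: one round lasts as long as the slowest
    scheduled terminal's computation plus upload."""
    return max(cp + cm for cp, cm in zip(comp_delays, comm_delays))

# two scheduled terminals sharing bandwidth B = 1 MHz equally
t1 = comm_delay(S=1e6, gamma=0.5, B=1e6, P=0.1, h=1e-6, N0=1e-13)
t2 = comm_delay(S=1e6, gamma=0.5, B=1e6, P=0.2, h=1e-6, N0=1e-13)
t_round = round_delay([1.0, 2.0], [t1, t2])
```

The terminal with higher transmit power uploads faster, but the round still waits for whichever terminal finishes computing and uploading last.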

Due to the synchronous model aggregation of FL, the total delay of each round is determined by the slowest of all scheduled devices, i.e.,

$$ t_k = \max_{i \in \Pi_k} \big( t_{i,k}^{\mathrm{cp}} + t_{i,k}^{\mathrm{cm}} \big). $$

C. Problem formulation

A joint bandwidth allocation and scheduling problem is formulated to optimize the FL convergence rate with respect to time. Specifically, K is used to represent the total number of rounds within the training time budget T, and the objective is to minimize the global loss function of $w^{\mathrm{f}}$, where $w^{\mathrm{f}}$ is the model parameter with the smallest global loss function value in the whole training process, defined as:

$$ w^{\mathrm{f}} = \arg\min_{w \in \{w_1, \ldots, w_K\}} F(w). $$

For simplicity, [K] and [M] are used to represent {1, 2, ..., K} and {1, 2, ..., M}, respectively. The optimization problem can be expressed as:

(P1) min_{K, Π_[K], γ_[K]} E[F(ŵ)]

s.t. Σ_{k∈[K]} t_k ≤ T; Σ_{i∈[M]} γ_{i,k} ≤ 1 and 0 ≤ γ_{i,k} ≤ 1 for all k ∈ [K],

where Π_[K] = {Π_1, ..., Π_K} and γ_[K] = {γ_1, ..., γ_K} collect the per-round scheduling decisions and bandwidth allocation ratios.

To solve P1, it is necessary to know how K and Π_[K] influence the final global model, i.e., E[F(ŵ)]. Since it is almost impossible to find a precise analytical expression of E[F(ŵ)] in terms of K and Π_[K], the problem is converted into finding an upper bound of E[F(ŵ)]. Moreover, the local computation delay t^cp_{i,k} and the wireless channel state h_{i,k} can vary with k, so the optimal scheduling policy may be non-stationary. Furthermore, due to the iterative nature of FL, the global model is related to the scheduling policies of all past rounds. It is therefore difficult to obtain an upper bound of E[F(ŵ)] under a non-stationary scheduling policy. Another difficulty is that the problem has a high-dimensional solution space, because the optimization variables Π_[K] and γ_[K] depend on K, which is itself an optimization variable.

Hereinafter, P1 is solved as follows. First, P1 is split into two sub-problems: device scheduling and bandwidth allocation. The bandwidth allocation sub-problem is then solved. Finally, based on the optimal bandwidth allocation and the derived convergence bound of FL under a fixed random scheduling policy, the device scheduling sub-problem is approximately solved using a joint device scheduling and bandwidth allocation algorithm.

Here, a solution to joint device scheduling and bandwidth allocation is provided.

P1 is decomposed as follows. First, given the scheduling policy of the k-th round (i.e., Π_k), the bandwidth allocation sub-problem of the k-th round can be expressed as:

(P2) min_{γ_k} max_{i∈Π_k} (t^cp_{i,k} + t^cm_{i,k})

s.t. Σ_{i∈Π_k} γ_{i,k} ≤ 1, 0 ≤ γ_{i,k} ≤ 1.

Then, expressing the optimal value of P2 as t*(Π_k), the device scheduling sub-problem can be expressed as:

(P3) min_{K, Π_[K]} E[F(ŵ)]

s.t. Σ_{k∈[K]} t*(Π_k) ≤ T. (C3.1)

A. Bandwidth allocation

The P2 optimal solution may be obtained using the following theorem:

Theorem 1: the optimal bandwidth allocation γ*_{i,k} for P2 is given in closed form by expression (9), where W(·) is the Lambert-W function and t*(Π_k) is the optimal value of P2, determined by the condition that the allocated bandwidth fractions sum to one, as in equation (10).

Because of the Lambert-W function term in (10), whose argument depends on t*(Π_k) through Γ_{i,k}, t*(Π_k) cannot be obtained in closed form. A binary search algorithm is therefore proposed to obtain the optimal value of P2 numerically. Starting from an initial search region [t_low, t_up], the bandwidth required for the current target value t is iteratively calculated according to (9), and the search region is halved depending on whether that bandwidth satisfies the bandwidth constraint. Given an accuracy requirement ε on the search result, the algorithm requires on the order of log(1/ε) iterations.
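The bisection idea can be sketched as follows. Instead of the closed-form Lambert-W expression (9), this illustrative version inverts the rate formula numerically with an inner bisection on each bandwidth fraction; all names and the search bounds are assumptions, not from the original:

```python
import math

def required_fraction(t_target, cp, S, B, P, h, N0, tol=1e-9):
    """Smallest bandwidth fraction gamma that lets a device finish its
    S-bit upload within (t_target - cp) seconds, found by bisection on
    gamma (the achievable rate grows monotonically with gamma)."""
    deadline = t_target - cp
    if deadline <= 0:
        return float('inf')

    def delay(g):
        rate = g * B * math.log2(1.0 + P * h / (N0 * g * B))
        return S / rate

    if delay(1.0) > deadline:
        return float('inf')  # infeasible even with the whole band
    lo, hi = tol, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if delay(mid) <= deadline:
            hi = mid
        else:
            lo = mid
    return hi

def optimal_round_time(devices, B, N0, eps=1e-6):
    """Outer bisection on the per-round target time t: shrink t while
    the total bandwidth demand of the scheduled devices still fits in
    the band (sum of fractions <= 1)."""
    t_lo = max(d['cp'] for d in devices)
    t_hi = t_lo + 10.0  # assumed generous upper bound for the search
    while t_hi - t_lo > eps:
        t = 0.5 * (t_lo + t_hi)
        total = sum(required_fraction(t, d['cp'], d['S'], B, d['P'], d['h'], N0)
                    for d in devices)
        if total <= 1.0:
            t_hi = t
        else:
            t_lo = t
    return t_hi
```

For a single device the search converges to the time at which that device needs exactly the whole band, mirroring the boundary condition of (10).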

B. Convergence analysis

To solve P3, the convergence of FL is analyzed under a fixed random scheduling policy Π, which randomly schedules a fixed number |Π| of devices among all devices in each round. An upper bound on E[F(ŵ)] − F(w*) is derived, which describes the gap between the accuracy of ŵ and that of w*, where ŵ is the best model parameter with the smallest global loss value during the entire training process and w* is the true optimal model parameter that minimizes F(w).

Before the convergence analysis, some notation is introduced, as shown in Table 1. For a fixed random scheduling policy Π, E_Π[·] is used to denote the expectation over the randomness of Π. Two auxiliary model parameter vectors are introduced: w_k (k ≥ 1) denotes the model parameter vector that is synchronized with the global model at the start of the k-th round and then updated by scheduling all devices in the k-th round (i.e., Π_k = [M]); v_k (k ≥ 1) denotes the model parameter vector that is synchronized with the global model at the start of the k-th round and then updated by centralized gradient descent. During the k-th round of centralized gradient descent, v_k is updated τ times according to v_k ← v_k − η∇F(v_k).

For ease of analysis, the following assumptions are made for the loss function F (·).

Assumption 1: the following is assumed for the loss functions of all terminal devices:

F_i(w) is convex;

F_i(w) is ρ-Lipschitz, i.e., for any w, w', ||F_i(w) − F_i(w')|| ≤ ρ||w − w'||;

F_i(w) is β-smooth, i.e., for any w, w', ||∇F_i(w) − ∇F_i(w')|| ≤ β||w − w'||;

for any i and w, the difference between the local gradient and the global gradient is bounded by δ_i, i.e., ||∇F_i(w) − ∇F(w)|| ≤ δ_i, and the global gradient diversity is defined as δ ≜ Σ_i D_i δ_i / D.

These assumptions are widely used in the FL convergence analysis literature, although the loss functions of some machine learning models (e.g., neural networks) do not fully satisfy them, especially the convexity assumption. However, the inventors have found experimentally that the proposed scheduling strategy works well even for neural networks.

First, an upper bound is obtained on the difference between the global model obtained by aggregation under a fixed random scheduling policy Π (i.e., Π_k = Π for all k) and w_k.

Definition 1: a policy Π is defined as a fixed random scheduling policy if and only if Π is a subset of size |Π| sampled uniformly at random from all terminal devices [M], and |Π| remains constant during the entire training process.

Theorem 2: for any k and any fixed random scheduling policy Π (|Π| ≥ 1), the expected difference between the global model aggregated under Π and w_k is upper-bounded by B(Π), as given in equation (11), where the expectation accounts for the randomness of Π.

Note that the learning rate η > 0, since otherwise the gradient descent process becomes trivial. Likewise β > 0 and δ_i > 0, since otherwise the loss function and its gradient become trivial. Thus, for x = 1, 2, ..., τ, g_i(x) > 0, and hence A > 0, where A is defined in equation (11). Clearly, A is independent of Π, and B(Π) decreases with |Π|. Thus, scheduling fewer devices yields a larger upper bound B(Π), which means that scheduling fewer devices results in slower convergence in terms of the number of rounds. In addition, when Π = [M] (i.e., all devices are scheduled), B(Π) reaches its lower bound of zero, which is consistent with the definition of w_k.

Applying Theorem 2 to w_k yields the following theorem, which bounds the difference between the loss function of the optimal model parameter ŵ over the entire training process and that of the true optimal model parameter w*.

Theorem 3: when the learning rate η is sufficiently small and Π is a fixed random scheduling policy, the difference between E_Π[F(ŵ)] and F(w*) satisfies the bound in equation (12), where the expectation accounts for the randomness of Π.

Theorem 3 quantifies the trade-off between latency per round and the number of rounds required. Scheduling more devices increases the latency per round and thus reduces the number of possible rounds (i.e., K) within a given training time budget T, while a smaller K may reduceThe lower limit of (3). At the same time, scheduling more devices decreases the value of B (Π) shown in theorem 2, while smaller B (Π) may increaseThe lower limit of (3). The scheduling strategy should therefore be carefully optimized to balance the tradeoff between latency per round and the number of rounds required, in order to minimize the penalty function of the optimal global model (i.e.,)。

C. Device scheduling algorithm

In practical wireless networks, due to fluctuations of the wireless channel and of the device computing power, the local computation delay t^cp_{i,k} and the wireless channel state h_{i,k} vary across rounds k. Thus, in the k-th round, t^cp_{i,k'} and h_{i,k'} are unknown for k' > k, which makes the constraint (C3.1) in P3 difficult to handle, since t*(Π_{k'}) is unknown for k' > k. To address this, P3 is solved approximately here. Consider any round k and any scheduling policy Π_k, and approximate Π_k as being used during the whole training process; the total number of communication rounds can then be approximated as ⌊T / t*(Π_k)⌋, where ⌊·⌋ denotes the floor function. Thus, P3 can be approximated by the following myopic problem in each round:

s.t.

For a given global loss function, F(w*) is constant, so minimizing E[F(ŵ)] is equivalent to minimizing E[F(ŵ)] − F(w*). Further, the learning rate η can be chosen small enough that the condition of Theorem 3 is satisfied. Thus, by Theorem 3, the objective of P4 can be approximated by its upper bound, and minimizing that bound is equivalent to minimizing the denominator on the right-hand side of (12):

s.t.

The objective of the approximate scheduling sub-problem P5 is the training cost C described in the foregoing.

This completes the derivation of the formula for the training cost C.

A specific implementation of the terminal scheduling scheme to which the embodiments of the present invention are applied will be further described below. Since the constraint (C5.2) is still combinatorial, the approximate scheduling sub-problem P5 is difficult to solve. Therefore, the following greedy scheduling algorithm is proposed for device scheduling.

Greedy scheduling algorithm

S1, initialize Π as an empty set;

S2, perform greedy scheduling x = arg min_{i∈[M]} t*({i}), where t*(·) is given by a preset algorithm, which can be any of various existing algorithms and is not limited here;

S3, estimate the achievable number of rounds ⌊T / t*({x})⌋, and update t* and Π ← Π ∪ {x};

S4, calculate the training cost C of the current Π;

S5, while [M]\Π is not empty, execute the following loop:

S6, perform greedy scheduling x = arg min_{i∈[M]\Π} t*(Π ∪ {i}), where t*(·) is given by the preset algorithm;

S7, estimate the achievable number of rounds ⌊T / t*(Π ∪ {x})⌋;

S8, calculate the candidate training cost C' of Π ∪ {x};

S9, if C' > C, execute the following branch:

S10, since C' > C, end the process and go to S15;

S11, otherwise, go to S12;

S12, update t* ← t*(Π ∪ {x}), Π ← Π ∪ {x}, C ← C';

S13, end the branch of S9;

S14, end the loop of S5;

S15, return Π.

In the greedy scheduling algorithm described above, the device with the smallest time consumption for model update and upload is iteratively added to the set of scheduled devices (S6), until the objective function of P5 starts to increase (S9-S10). The complexity of the greedy scheduling algorithm grows only polynomially in M, which is far more efficient than a naive brute-force search over all 2^M device subsets.
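Steps S1-S15 can be sketched as follows, with `round_time` and `training_cost` as assumed callbacks standing in for t*(·) and the training cost C formula derived earlier:

```python
def greedy_schedule(devices, round_time, training_cost, budget_T):
    """Greedy device scheduling: repeatedly add the device that keeps
    the per-round time smallest (S6), and stop as soon as the training
    cost C starts to increase (S9-S10)."""
    candidates = set(devices)
    # S1-S4: seed the schedule with the single fastest device
    x = min(candidates, key=lambda d: round_time({d}))
    schedule = {x}
    candidates.remove(x)
    C = training_cost(schedule, int(budget_T // round_time(schedule)))
    # S5-S14: grow the schedule greedily while the cost keeps dropping
    while candidates:
        x = min(candidates, key=lambda d: round_time(schedule | {d}))
        t_new = round_time(schedule | {x})
        C_new = training_cost(schedule | {x}, int(budget_T // t_new))
        if C_new > C:      # S9-S10: objective started increasing
            break
        schedule.add(x)    # S12: accept the device
        candidates.remove(x)
        C = C_new
    return schedule        # S15
```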

However, since the true optimal model w* is unknown, analytically estimating the quantity that depends on it is non-trivial; that quantity is therefore treated as a system parameter that remains fixed throughout the training process. Experiments show that a fixed choice of this parameter performs well under different system settings, such as data distribution and cell radius, and that searching for a suitable value is not difficult.

Various methods of embodiments of the present invention are described above, and further apparatus for performing the methods are provided below.

Referring to fig. 7, an embodiment of a wireless access point 70 includes:

the data receiving module 71 is configured to receive the local model and the values of the local loss function, the convexity estimation value and the smoothness estimation value sent by each scheduled terminal of the current round after it completes the current round of federated learning;

the model updating module 72 is configured to update the global model of the current round according to the local models sent by the terminals; to calculate the value of the global loss function of the current round according to the values of the local loss functions of all terminals scheduled in the current round; and to determine whether to update the optimal global model according to whether the value of the global loss function of the current round is better than the value of the global loss function corresponding to the optimal global model;

the diversity calculation module 73 is configured to calculate a gradient diversity estimation value of the local loss function of each terminal in the current round according to the global model of the previous round and the local models sent by the terminals;

and the scheduling terminal generating module 74 is configured to generate the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, where the gradient estimation information includes the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

Through the above modules, the wireless access point of the embodiment of the present invention can maximize the model accuracy obtainable by federated learning within a limited training delay.

Optionally, the diversity calculation module is further configured to:

respectively estimating the gradient of the local loss function of each terminal in the current round according to the global model of the previous round and the local model sent by each terminal;

calculating the gradient of the global loss function of the current round according to the gradients of the local loss functions of all the terminals scheduled to the current round;

and calculating the gradient diversity estimated value of the local loss function of each terminal in the current round according to the gradient of the local loss function of each terminal in the current round and the gradient of the global loss function of the current round.

Optionally, the scheduling terminal generating module is further configured to

according to the gradient estimation information of each terminal in the current round, with the proportion of the terminal's local training data set in the global training data set as the terminal's weight, performing a weighted summation of the gradient estimation information of all terminals participating in federated learning in the current round to obtain the global values of the gradient estimation information of all terminals in the current round, where the gradient estimation information includes the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round and the global values of the gradient estimation information in the current round.
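The weighted summation described above can be sketched as follows (function and variable names are illustrative):

```python
def global_estimate(local_values, data_sizes):
    """Data-size-weighted average turning per-terminal estimates
    (convexity, smoothness, or gradient diversity) into a global
    value, weighting terminal i by D_i / D."""
    D = sum(data_sizes.values())
    return sum(data_sizes[i] * v for i, v in local_values.items()) / D
```

The same helper applies to each of the three estimation values, since all use the D_i / D weighting.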

Optionally, the scheduling terminal generating module is further configured to

Initializing a first set whose contents are empty;

when the number of terminals in the candidate terminal set is greater than 0, repeatedly executing the following steps until the calculated training cost C no longer decreases, and taking the terminals in the first set as the scheduled terminals for the next round of federated learning:

traversing the terminals in the candidate terminal set, estimating, for each of them, the time consumed by a round of federated learning with that terminal and the first set as the scheduled terminals, and determining the target terminal with the shortest time consumption;

estimating the total number of rounds of federated learning according to the preset total training time budget and the shortest time consumption; calculating the training cost C of federated learning with the target terminal and the first set as the scheduled terminals, according to the total number of rounds, the global value of the convexity estimation value in the current round, the global value of the smoothness estimation value in the current round and the gradient diversity of the local loss functions of all terminals in the current round;

and when the currently calculated training cost C is lower than the locally maintained C value, adding the target terminal to the first set, deleting the target terminal from the candidate terminal set, and updating the locally maintained C value to the currently calculated training cost C.

Optionally, the scheduling terminal generating module is further configured to calculate a value of the training cost C according to the following formula:

wherein η denotes the learning rate; a preset system parameter; an estimate of the number of federated learning rounds that can be executed; τ denotes the number of local model updates in each round of federated learning; ρ denotes the global value of the convexity estimation values of the local loss functions of all terminals participating in federated learning; h(τ) denotes an auxiliary function of τ defined in the convergence analysis; M denotes the number of elements in the set of all terminals participating in federated learning; |Π| denotes the number of scheduled terminals of the current round; β denotes the global value of the smoothness estimation values of the local loss functions; D_i denotes the size of the local training data set of terminal i; δ_i denotes the gradient diversity estimation value of the local loss function of terminal i; δ denotes the global value of the gradient diversity of the local loss functions of all terminals participating in federated learning; D denotes the size of the global training data set.

Optionally, the wireless access point further includes:

the loop judgment module is used for judging whether the consumed time of the federal learning till the current round exceeds the preset total training time budget or not before calculating the gradient diversity of the local loss function of the current round of each terminal according to the global model of the previous round and the local model sent by each terminal; under the condition that the total budget of the preset training time is exceeded, outputting the optimal global model as a training result; and under the condition that the total budget of the preset training time is not exceeded, triggering the diversity calculation module to calculate the gradient diversity of the local loss function of each terminal in the current round.

As shown in fig. 8, the embodiment of the present invention further provides another wireless access point 80, and the wireless access point 80 specifically includes a processor 81, a memory 82, a bus system 83, a receiver 84, and a transmitter 85. Wherein, the processor 81, the memory 82, the receiver 84 and the transmitter 85 are connected by a bus system 83, the memory 82 is used for storing instructions, the processor 81 is used for executing the instructions stored in the memory 82 to control the receiver 84 to receive signals and control the transmitter 85 to transmit signals;

the processor 81 is configured to read a program in a memory, and execute the following processes:

receiving the local model and the values of the local loss function, the convexity estimation value and the smoothness estimation value sent by each scheduled terminal of the current round after it completes the current round of federated learning;

updating the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is better than the value of the global loss function corresponding to the optimal global model;

calculating a gradient diversity estimation value of the local loss function of each terminal in the current round according to the global model of the previous round and the local models sent by the terminals;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, where the gradient estimation information includes the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

It should be understood that, in the embodiment of the present invention, the processor 81 may be a Central Processing Unit (CPU), and the processor 81 may also be other general processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 82 may include a read-only memory and a random access memory, and provides instructions and data to the processor 81. A portion of the memory 82 may also include non-volatile random access memory. For example, the memory 82 may also store device type information.

The bus system 83 may include a power bus, a control bus, a status signal bus, and the like, in addition to the data bus. For clarity of illustration, however, the various buses are labeled as bus system 83 in the figures.

In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 81. The steps of a method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM or registers. The storage medium is located in the memory 82, and the processor 81 reads the information in the memory 82 and performs the steps of the above method in combination with its hardware. To avoid repetition, details are not described here.

When executed by the processor, the program can implement all implementations of the terminal scheduling method in wireless federated learning shown in fig. 2 and achieve the same technical effects; details are not repeated here to avoid repetition.

In some embodiments of the invention, there is also provided a computer readable storage medium having a program stored thereon, which when executed by a processor, performs the steps of:

receiving the local model and the values of the local loss function, the convexity estimation value and the smoothness estimation value sent by each scheduled terminal of the current round after it completes the current round of federated learning;

updating the global model of the current round according to the local models sent by the terminals; calculating the value of the global loss function of the current round according to the values of the local loss functions of all terminals scheduled in the current round, and determining whether to update the optimal global model according to whether the value of the global loss function of the current round is better than the value of the global loss function corresponding to the optimal global model;

calculating a gradient diversity estimation value of the local loss function of each terminal in the current round according to the global model of the previous round and the local models sent by the terminals;

and generating the scheduled terminals for the next round of federated learning according to the gradient estimation information of each terminal in the current round, where the gradient estimation information includes the convexity estimation value, the smoothness estimation value and the gradient diversity estimation value of the local loss function.

When executed by the processor, the program can implement all implementation manners in the terminal scheduling method applied to the wireless access point, and can achieve the same technical effect, and is not described herein again to avoid repetition.

As shown in fig. 9, an embodiment of the present invention further provides a terminal 90, including:

the model updating module 91 is configured to update the local model in the current round of federated learning, obtain the value of the local loss function, and estimate the convexity estimation value and the smoothness estimation value of the local loss function;

and a data sending module 92, configured to send the updated local model, and the value of the local loss function, the convexity estimation value, and the smoothness estimation value to the wireless access point.

Optionally, the model updating module is further configured to:

respectively calculating, with the local loss function, a first loss value of the global model received in the current round of federated learning and a second loss value of the local model obtained by the update in the current round of federated learning; calculating a first norm of the difference between the first loss value and the second loss value, and a second norm of the difference between the global model received in the current round of federated learning and the local model obtained by the update in the current round of federated learning; and calculating the ratio of the first norm to the second norm to obtain the convexity estimation value of the local loss function;

calculating a third norm of the difference between the gradients of the local loss function at the global model and at the updated local model; and calculating the ratio of the third norm to the second norm to obtain the smoothness estimation value of the local loss function.
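These two ratios can be sketched as follows; `loss` and `grad` are assumed callables for the terminal's local loss function and its gradient, and the function names are illustrative:

```python
import math

def l2(v):
    """Euclidean norm of a plain-list vector."""
    return math.sqrt(sum(x * x for x in v))

def estimate_rho_beta(w_global, w_local, loss, grad):
    """Convexity estimate (first norm / second norm) and smoothness
    estimate (third norm / second norm), as described above."""
    dw = l2([a - b for a, b in zip(w_global, w_local)])        # second norm
    rho_hat = abs(loss(w_global) - loss(w_local)) / dw         # first norm / second
    beta_hat = l2([a - b for a, b in
                   zip(grad(w_global), grad(w_local))]) / dw   # third norm / second
    return rho_hat, beta_hat
```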

As shown in fig. 10, another terminal 100 is provided in the embodiment of the present invention, where the terminal 100 specifically includes a processor 101, a memory 102, a bus system 103, a receiver 104, and a transmitter 105. Wherein, the processor 101, the memory 102, the receiver 104 and the transmitter 105 are connected through the bus system 103, the memory 102 is used for storing instructions, the processor 101 is used for executing the instructions stored in the memory 102 to control the receiver 104 to receive signals and control the transmitter 105 to transmit signals;

the processor 101 is configured to read a program in a memory, and execute the following processes:

in the current round of federated learning, updating the local model, obtaining the value of the local loss function, and estimating the convexity estimation value and the smoothness estimation value of the local loss function;

and the terminal sends the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

It should be understood that, in the embodiment of the present invention, the processor 101 may be a Central Processing Unit (CPU), and the processor 101 may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 102 may include both read-only memory and random access memory and provides instructions and data to the processor 101. A portion of the memory 102 may also include non-volatile random access memory. For example, the memory 102 may also store device type information.

The bus system 103 may include a power bus, a control bus, a status signal bus, and the like, in addition to the data bus. For clarity of illustration, however, the various buses are labeled as bus system 103 in the figures.

In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 101. The steps of a method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM or registers. The storage medium is located in the memory 102, and the processor 101 reads the information in the memory 102 and completes the steps of the method in combination with its hardware. To avoid repetition, details are not described here.

When executed by the processor, the program can implement all implementations of the terminal scheduling method in wireless federated learning shown in fig. 4 and achieve the same technical effects; details are not repeated here to avoid repetition.

In some embodiments of the invention, there is also provided a computer readable storage medium having a program stored thereon, which when executed by a processor, performs the steps of:

in the current round of federated learning, updating the local model, obtaining the value of the local loss function, and estimating the convexity estimation value and the smoothness estimation value of the local loss function;

and the terminal sends the updated local model, the value of the local loss function, the convexity estimation value and the smoothness estimation value to the wireless access point.

When executed by the processor, the program can implement all implementation manners in the terminal scheduling method applied to the terminal side, and can achieve the same technical effect, and is not described herein again to avoid repetition.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
