Federated Learning Using Particle Swarm Optimization

Federated learning is a method that keeps only the learned models on a server in order to protect data privacy. The server does not collect raw data; instead, it collects the models trained on scattered clients. Because federated learning clients frequently have limited transmission bandwidth, communication between the server and clients should be streamlined to maximize performance. To that end, researchers have created the FedPSO algorithm, which combines particle swarm optimization (PSO) with federated learning to improve network communication performance. In this post, we will cover the main aspects of this system and try to understand how it works. The points covered are outlined below.

Understanding the Current Scenario
Federated Learning
Particle Swarm Optimization
FedPSO
Evaluation

Let’s start by looking at how the need for PSO arises in a federated learning system. Most of the discussion below draws on the research paper by Sunghwan Park and his team.

Understanding the Current Scenario

The use of mobile devices such as smartphones and tablets has recently increased. On mobile devices, several types of data are generated and accumulated, including data generated by users and sensors such as cameras, microphones, and the global positioning system.

The gathered data on mobile devices is excellent for deep learning, which performs well when there is a large volume of data. Mobile device data can be used for machine learning (ML) in a variety of ways.

In traditional artificial neural network (ANN) models, calculation time far exceeds communication time, so various approaches, such as leveraging graphics processing unit (GPU) accelerators and linking multiple GPUs, are used to reduce it. In federated learning, however, communication takes longer than calculation.

There have been numerous studies on client communication to increase federated learning performance. Federated learning suffers from a number of issues caused by mobile devices’ unstable network environment, such as frequent node crashes, regularly shifting node groups, high central server overhead, and increased latency as the number of nodes grows.

Furthermore, multi-layer models have been employed to improve learning accuracy, but the number of weights rises as the layers deepen. Because this increases the amount of data transmitted between the server and the client, model size is a constraint for federated learning.
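To make the growth in transmission size concrete, here is a rough sketch that counts the parameters of a fully connected network and converts the count to bytes. The layer sizes and the 32-bit float assumption are illustrative choices of mine, not figures from the paper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as 32-bit floats


def dense_param_count(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical architectures for MNIST-sized inputs (784 pixels, 10 classes).
shallow = [784, 128, 10]
deep = [784, 512, 512, 512, 10]

for name, sizes in (("shallow", shallow), ("deep", deep)):
    params = dense_param_count(sizes)
    megabytes = params * BYTES_PER_FLOAT32 / 1e6
    print(f"{name}: {params} parameters, about {megabytes:.2f} MB per upload")
```

Every client would have to upload this payload in every round, which is why deeper models put pressure on a limited-bandwidth link.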

Many recent studies have explored this issue, and particle swarm optimization (PSO) is one of the approaches considered. PSO has been used to improve performance in machine learning research; examples include a PSO convolutional neural network (CNN), which uses PSO to classify images, and a linearly decreasing weight PSO for CNN hyperparameter optimization.

In addition, one study used PSO to determine optimal hyperparameters and so improve the learning performance of federated learning clients. In short, PSO has been applied to a variety of tasks, including updating ML model weights and tuning hyperparameters.

Federated Learning

Federated learning is a learning strategy proposed for distributed datasets. It uses datasets dispersed across several devices to train a model while limiting data leakage. Federated learning has the advantage of improving privacy and lowering communication costs.

Thanks to federated learning, ANN models can learn without exposing data or personal information. Transferring data from multiple devices to a single server, by contrast, increases network traffic and storage costs. By exchanging only the weights generated from training the models, federated learning greatly cuts communication costs.

The above figure outlines the process involved in a federated learning system, which can be described as follows:

1. The server sends the learning model to each client.
2. Each client trains the received model on its own data.
3. Each client sends its trained model to the server.
4. The server aggregates all of the collected models into a single updated model.
5. The server sends the updated model back to each client, and steps 1 through 5 are repeated.

Several federated learning research papers implement the fourth step above with algorithms such as federated stochastic gradient descent (FedSGD) and federated averaging (FedAvg). Federated learning also assumes a distributed mobile device environment.
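As an illustration of the aggregation step, the following is a minimal sketch of federated averaging on flat weight vectors, where each client's contribution is weighted by its number of local samples. The function name `fed_avg` and the toy numbers are my own; real implementations average per-layer tensors rather than plain lists.

```python
def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Three hypothetical clients, each holding a two-parameter model.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]  # number of local training samples per client

global_weights = fed_avg(client_weights, client_sizes)
print(global_weights)
```

Note that this requires every client to upload its full weight vector in every round, which is the cost FedPSO sets out to reduce.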

Mobile devices have the drawback of learning over a wireless network rather than a stable wired connection. If the network is unstable while the trained model is being sent, the learning client’s connection may be lost, or the client may be unable to transmit the complete model.

To overcome this problem, as stated in the introduction of this post, researchers have used PSO to aggregate the distributed models in an optimized way. Before proceeding to the proposed system, we will take a brief look at PSO in the next part.

Particle Swarm Optimization

Kennedy and Eberhart created PSO, one of the best-known metaheuristic global optimization algorithms, in 1995. The algorithm uses techniques inspired by natural swarms of birds and fish to optimize multiple variables at once.

Because of its ease of implementation, scalability, resilience, quick convergence, and simple mathematical operations, the PSO method provides advantages in memory requirements and speed. The algorithm takes a probabilistic approach to optimization, which necessitates a huge number of iterations.

PSO has two kinds of components: the swarm and its particles. A swarm is a collection of particles, and each particle represents one candidate solution to the problem. Each particle has a position and a velocity V that determine where it moves in the next step. Step by step, the particles communicate with each other to find the global optimum, sharing their individual pbest (particle best) values.

The gbest (global best) variable is set to the best of the pbest values shared by the particles: gbest = max_i(pbest_i). Each particle then computes its next velocity from its inertia (V^(t-1), the velocity of the preceding step), pbest, and gbest.
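The update described above can be sketched in a few lines of Python. This is the common textbook form of PSO, not code from the FedPSO paper: the inertia weight, the acceleration constants `c1` and `c2`, the search range, and the sphere test function are all assumptions chosen for illustration. Because the score here is a loss, the best value is the minimum rather than the maximum.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_steps=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with a basic particle swarm (textbook variant)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position seen per particle
    pbest_score = [objective(p) for p in pos]
    best_idx = min(range(n_particles), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[best_idx][:], pbest_score[best_idx]

    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + pull toward pbest + pull toward gbest
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            score = objective(pos[i])
            if score < pbest_score[i]:            # particle improved on its own best
                pbest[i], pbest_score[i] = pos[i][:], score
                if score < gbest_score:           # and on the swarm's best
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_pos, best_score = pso_minimize(sphere, dim=3)
print(best_pos, best_score)
```

The three terms in the velocity update correspond to the inertia, pbest, and gbest values mentioned in the text.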

For hands-on exposure to PSO, I recommend going through this article, which discusses PSO and shows how to use it in your day-to-day ML practice.

Now let’s move on to the proposed system, FedPSO.

FedPSO

Deepening the layers of the model is a common way to improve the accuracy of ANN models; the result is called a deep neural network. The number of weights that must be trained grows as the layers go deeper.

In conventional federated learning, the network communication cost increases significantly when the model trained on each client is transferred to the server. To overcome this, the FedPSO algorithm borrows from PSO: instead of sending its trained model, each client sends only its best score (such as accuracy or loss) to the server, so the size of this upload does not depend on the size of the model.

FedPSO, the proposed model, receives model weights only from the client that reported the top score, eliminating the need for every client to transmit its model weights. The figure below depicts the procedure.

The best score is the lowest loss value obtained after training on the clients. This loss value is only 4 bytes long. FedPSO determines the best model using the pbest and gbest variables, then updates each element of the best model’s weight arrays using the velocity V.

The weight update method in FedPSO, shown in the figure above, works as follows: the server receives the clients’ scores and requests the trained model from the client with the best score; that client then sends its model, which becomes the global model.
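The exchange described above can be sketched as a single server-side round. This is a simplified illustration under my own assumptions: clients are represented as a plain dictionary, the score is a loss stored as a 4-byte float, and the client-side velocity update of the weights is left out. The byte counts cover client uploads only.

```python
BYTES_PER_FLOAT32 = 4  # assumption: scores and weights are 32-bit floats


def fedpso_round(clients):
    """Run one simplified selection round.

    `clients` maps a client id to a (loss, weights) pair produced by
    local training (hypothetical structure, for illustration only).
    """
    # 1. Every client uploads only its scalar loss.
    scores = {cid: loss for cid, (loss, _) in clients.items()}
    # 2. The server picks the client with the lowest loss.
    best_id = min(scores, key=scores.get)
    # 3. Only that client uploads its weights, which become the global model.
    global_weights = list(clients[best_id][1])

    n_params = len(global_weights)
    score_bytes = len(clients) * BYTES_PER_FLOAT32
    uploaded = score_bytes + n_params * BYTES_PER_FLOAT32
    # For comparison: every client uploading its full model.
    all_clients_upload = len(clients) * n_params * BYTES_PER_FLOAT32
    return best_id, global_weights, uploaded, all_clients_upload


# Three hypothetical clients with 1,000-parameter models.
clients = {
    "a": (0.42, [0.1] * 1000),
    "b": (0.31, [0.2] * 1000),
    "c": (0.55, [0.3] * 1000),
}
best_id, global_weights, uploaded, all_clients_upload = fedpso_round(clients)
print(best_id, uploaded, all_clients_upload)
```

In this toy setup, the score-based round uploads roughly one model's worth of data, while uploading from every client scales with the number of clients.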

In the next section, we will look at the evaluation results of the FedPSO system.

Evaluation

To evaluate FedPSO’s performance, the researchers ran experiments measuring accuracy and convergence speed, with all parameters observed in an unstable network environment. Below are two graphs: the first depicts the accuracy and convergence speed of both methods, and the second depicts the communication cost incurred by both algorithms.

The accuracy benchmarks of the two methods, as well as the cost analysis of data communication between clients and servers, were obtained using the Canadian Institute for Advanced Research (CIFAR-10) and Modified National Institute of Standards and Technology (MNIST) datasets.

Conclusion

In this post, we’ve seen how FedPSO, a technique based on particle swarm optimization, can improve the network communication performance of federated learning and reduce the amount of data sent from clients to the server. By sharing only score values, the proposed technique lets the server build the global model from the models trained on the clients: only the client with the best score sends its trained model to the server. As the two graphs above show, FedPSO does well in terms of both accuracy and communication cost.