flgo.algorithm
flgo.benchmark
This module is designed for quickly creating federated tasks. For example, a commonly used benchmark in FL is federated MNIST, which splits MNIST into 100 shards, each containing data from two label classes.
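A split of this kind can be sketched as follows. This is only an illustration on synthetic labels, not FLGo's own partitioner: the function name `partition_by_label_shards`, its parameters, and the synthetic label list are assumptions made for the example. Sample indices are sorted by label, cut into small single-label pieces, and each of the 100 clients receives two pieces, so each client's data covers at most two labels.

```python
import random

def partition_by_label_shards(labels, num_clients=100, pieces_per_client=2, seed=0):
    """Sort indices by label, cut them into equal pieces, deal the pieces to clients."""
    rng = random.Random(seed)
    # After sorting by label, contiguous pieces are (almost) single-label.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_pieces = num_clients * pieces_per_client
    piece_size = len(order) // num_pieces
    pieces = [order[k * piece_size:(k + 1) * piece_size] for k in range(num_pieces)]
    rng.shuffle(pieces)
    # Client c receives `pieces_per_client` consecutive pieces of the shuffled list.
    return [
        [i for piece in pieces[c * pieces_per_client:(c + 1) * pieces_per_client] for i in piece]
        for c in range(num_clients)
    ]

# Synthetic stand-in for MNIST labels: 10 classes with 600 samples each (assumption).
labels = [c for c in range(10) for _ in range(600)]
client_indices = partition_by_label_shards(labels)
labels_per_client = [len({labels[i] for i in idxs}) for idxs in client_indices]
```

Because every class here fills a whole number of pieces, no client ends up with more than two distinct labels.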
In FLGo, three basic components are created to describe a general procedure that can easily convert various ML tasks into federated ones.
Components
- TaskGenerator
    - load the original dataset
    - partition the original dataset into local data
- TaskPipe
    - store the partition information of TaskGenerator on disk when generating federated tasks
    - load the original dataset and the partition information to create the federated scenario when optimizing models
- TaskCalculator
    - support task-specific computation when optimizing models, such as moving data to the device, computing the loss, evaluating models, and creating the data loader
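The division of labor among the three components can be sketched with toy stand-ins. The classes below do not inherit from FLGo's base classes; every class name, method name, and the JSON file name is hypothetical and chosen only to mirror the responsibilities listed above.

```python
import json
import os
import tempfile

class ToyTaskGenerator:
    """Loads a dataset and partitions it into local data (hypothetical stand-in)."""
    def __init__(self, num_clients):
        self.num_clients = num_clients

    def load_data(self):
        # Stand-in for reading a real dataset: (feature, label) pairs.
        self.data = [(float(i), i % 2) for i in range(20)]

    def partition(self):
        # Round-robin split: client c gets indices c, c + num_clients, ...
        self.local_indices = [
            list(range(c, len(self.data), self.num_clients))
            for c in range(self.num_clients)
        ]

class ToyTaskPipe:
    """Stores partition information on disk and rebuilds the federated scenario from it."""
    def __init__(self, task_path):
        self.file = os.path.join(task_path, "data.json")

    def save_task(self, generator):
        with open(self.file, "w") as f:
            json.dump({"local_indices": generator.local_indices}, f)

    def load_task(self, generator):
        # Reload the original dataset, then apply the stored partition.
        generator.load_data()
        with open(self.file) as f:
            info = json.load(f)
        return [[generator.data[i] for i in idxs] for idxs in info["local_indices"]]

class ToyTaskCalculator:
    """Task-specific computation; here the mean squared error of a constant prediction."""
    def compute_loss(self, prediction, batch):
        return sum((prediction - y) ** 2 for _, y in batch) / len(batch)

with tempfile.TemporaryDirectory() as task_path:
    gen = ToyTaskGenerator(num_clients=4)
    gen.load_data()
    gen.partition()
    pipe = ToyTaskPipe(task_path)
    pipe.save_task(gen)                      # task generation time
    local_data = pipe.load_task(ToyTaskGenerator(num_clients=4))  # training time

loss = ToyTaskCalculator().compute_loss(0.5, local_data[0])
```

The point of the split is that only the partition information has to be stored: the dataset itself is reloaded from its original source when the task is used.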
The architecture of a complete federated benchmark is shown as follows:
benchmark_name                  # benchmark folder
├─ core.py                      # core file
│   ├─ TaskGenerator            # class TaskGenerator(...)
│   ├─ TaskPipe                 # class TaskPipe(...)
│   └─ TaskCalculator           # class TaskCalculator(...)
│
├─ model                        # model folder (i.e. contains various types of models)
│   ├─ model1_name.py           # model 1 (e.g. CNN)
│   ├─ ...
│   └─ modelN_name.py           # model N (e.g. ResNet)
│       ├─ init_local_module    # the function initializes personal models for parties
│       └─ init_global_module   # the function initializes the global models for parties
│
└─ __init__.py                  # containing the variable default_model
Example: The architecture of MNIST is
├─ core.py
│   ├─ TaskGenerator
│   ├─ TaskPipe
│   └─ TaskCalculator
├─ model
│   ├─ cnn.py
│   └─ mlp.py
│       ├─ init_local_module
│       └─ init_global_module
└─ __init__.py
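As a rough aid, the layout above can be scaffolded with a few lines of standard-library Python. The helper `scaffold_benchmark` is hypothetical and not part of FLGo, and the stub file contents (including the single-argument function signatures and the placeholder value of `default_model`) are assumptions; a real benchmark would fill these files in as described in the tutorial.

```python
import tempfile
from pathlib import Path

def scaffold_benchmark(root, name, model_names):
    """Create an empty benchmark folder that follows the layout shown above."""
    bench = Path(root) / name
    (bench / "model").mkdir(parents=True)
    # core.py holds the three components (placeholder class bodies only).
    (bench / "core.py").write_text(
        "class TaskGenerator: ...\nclass TaskPipe: ...\nclass TaskCalculator: ...\n"
    )
    # Each model file exposes the two initialization functions (assumed signatures).
    for model_name in model_names:
        (bench / "model" / f"{model_name}.py").write_text(
            "def init_local_module(party): ...\n\ndef init_global_module(party): ...\n"
        )
    # __init__.py contains the variable default_model (placeholder value).
    (bench / "__init__.py").write_text("default_model = None\n")
    return bench

with tempfile.TemporaryDirectory() as root:
    bench = scaffold_benchmark(root, "mnist", ["cnn", "mlp"])
    created = sorted(p.relative_to(bench).as_posix() for p in bench.rglob("*.py"))
```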
Details on implementing a customized benchmark are given in Tutorial.3.
flgo.experiment
This module is created for various experimental purposes.
flgo.simulator
This module simulates arbitrary system heterogeneity that may occur in practice. We summarize four types of system heterogeneity from existing work.
System Heterogeneity Description
- Availability: devices are either available or unavailable at each moment, and only available devices can be selected to participate in training.
- Responsiveness: the length of the period from the server broadcasting the global model to the server receiving the locally trained model from a particular client.
- Completeness: since the server cannot fully control the behavior of devices, devices may upload incomplete model updates (e.g. after training for only a few steps).
- Connectivity: clients that promise to complete training may suffer accidents, so the server may lose its connection with these clients, which will never return the currently trained local model.
We build a client state machine to simulate the four types of system heterogeneity, and provide high-level APIs for customizing the simulation.
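To make the idea concrete, here is a toy state machine covering the four types. It is a simplified sketch, not FLGo's simulator: the class `ToyClient`, the functions `step_availability` and `run_round`, and the reading of `prob_available` / `prob_unavailable` as per-step transition probabilities are all assumptions for illustration.

```python
import random

class ToyClient:
    """Holds the per-client state variables used by the toy simulation."""
    def __init__(self, cid):
        self.cid = cid
        self.available = True
        self.prob_available = 0.9    # chance an unavailable client becomes available
        self.prob_unavailable = 0.1  # chance an available client becomes unavailable
        self.prob_drop = 0.1         # chance a selected client loses its connection
        self.latency = 0
        self.working_amount = 0

def step_availability(clients, rng):
    # Availability: flip each client's state with its transition probability.
    for c in clients:
        if c.available:
            if rng.random() < c.prob_unavailable:
                c.available = False
        elif rng.random() < c.prob_available:
            c.available = True

def run_round(clients, num_selected, rng, num_steps=5):
    step_availability(clients, rng)
    candidates = [c for c in clients if c.available]
    selected = rng.sample(candidates, min(num_selected, len(candidates)))
    received = []
    for c in selected:
        # Connectivity: a dropped client never returns its model.
        if rng.random() < c.prob_drop:
            continue
        # Responsiveness: time until the server receives the local model.
        c.latency = rng.randint(5, 99)
        # Completeness: number of local steps actually finished (at least 1).
        c.working_amount = max(int(num_steps * rng.random()), 1)
        received.append((c.cid, c.latency, c.working_amount))
    return selected, received

rng = random.Random(0)
clients = [ToyClient(i) for i in range(20)]
selected, received = run_round(clients, num_selected=5, rng=rng)
```

Only available clients can be selected, and the server receives updates only from selected clients that did not drop.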
Example: How to customize the system heterogeneity:
>>> import numpy as np
>>> import flgo
>>> import flgo.algorithm.fedavg as fedavg
>>> import flgo.simulator.base
>>> class MySimulator(flgo.simulator.base.BasicSimulator):
...     def update_client_availability(self):
...         # update the variables 'prob_available' and 'prob_unavailable' for all the clients
...         self.set_variable(self.all_clients, 'prob_available', [0.9 for _ in self.all_clients])
...         self.set_variable(self.all_clients, 'prob_unavailable', [0.1 for _ in self.all_clients])
...
...     def update_client_connectivity(self, client_ids):
...         # update the variable 'prob_drop' for clients in client_ids
...         self.set_variable(client_ids, 'prob_drop', [0.1 for _ in client_ids])
...
...     def update_client_responsiveness(self, client_ids, *args, **kwargs):
...         # update the variable 'latency' for clients in client_ids
...         self.set_variable(client_ids, 'latency', [np.random.randint(5, 100) for _ in client_ids])
...
...     def update_client_completeness(self, client_ids, *args, **kwargs):
...         # update the variable 'working_amount' for clients in client_ids (at least one step)
...         self.set_variable(client_ids, 'working_amount', [max(int(self.clients[cid].num_steps * np.random.rand()), 1) for cid in client_ids])
>>> # `task` is assumed to refer to an already generated federated task
>>> r = flgo.init(task, algorithm=fedavg, Simulator=MySimulator)
>>> # The runner r will run under the customized system heterogeneity, where the clients' states are refreshed by
>>> # MySimulator.update_client_xxx at each moment of the virtual clock or when particular events happen (e.g. a client is selected)
We also provide some preset simulators, such as flgo.simulator.DefaultSimulator and flgo.simulator.