flgo.experiment.device_scheduler
This module schedules GPU devices across different runners. It provides three predefined schedulers: BasicScheduler, AutoScheduler, and RandomScheduler.
When the number of runners is large and GPU memory is limited, we recommend using AutoScheduler. Otherwise, BasicScheduler and RandomScheduler are both good choices.
AbstractScheduler
Abstract Scheduler
Source code in flgo\experiment\device_scheduler.py
get_available_device(*args, **kwargs)
abstractmethod
Search for a currently available device and return it
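As a rough illustration of this contract, the sketch below defines a stand-in abstract base class with the same method name, plus a toy subclass. It does not import flgo: `SchedulerInterface` and `FirstDeviceScheduler` are hypothetical names used only for this example, and the real class may differ in detail.

```python
from abc import ABC, abstractmethod


class SchedulerInterface(ABC):
    """Hypothetical stand-in that mirrors the documented abstract method."""

    @abstractmethod
    def get_available_device(self, *args, **kwargs):
        """Search for a currently available device and return it."""


class FirstDeviceScheduler(SchedulerInterface):
    """Toy subclass (assumption): always hands out the first device."""

    def __init__(self, devices):
        self.devices = list(devices)

    def get_available_device(self, *args, **kwargs):
        return self.devices[0]


scheduler = FirstDeviceScheduler([0, 1])
device = scheduler.get_available_device()
```

Any concrete scheduler only needs to implement `get_available_device`; the stand-in base class cannot be instantiated directly.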
Source code in flgo\experiment\device_scheduler.py
AutoScheduler
Bases: BasicScheduler
Automatically schedule GPUs by dynamically estimating the GPU memory occupation of all the runners and checking availability against real-time memory information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
devices | list | a list of the index numbers of GPUs | required |
put_interval | int | the minimal time interval (in seconds) between two allocations of the same device | 5 |
mean_memory_occupated | int | the initial mean memory occupation (in MB) for all the runners | 1000 |
available_interval | int | a GPU is returned only if it has stayed available for longer than this interval | 5 |
dynamic_memory_occupated | bool | whether to dynamically estimate the memory occupation | True |
dynamic_condition | str | 'mean' or 'max' | 'mean' |
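Example:
>>> import flgo
>>> import flgo.experiment.device_scheduler
>>> # schedule the runners across GPUs 0 and 1
>>> sc = flgo.experiment.device_scheduler.AutoScheduler([0, 1])
>>> # runner_args must be prepared by the caller beforehand
>>> flgo.multi_init_and_run(runner_args, scheduler=sc)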
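To make the roles of the parameters above more concrete, here is a simplified, standalone model of memory-aware scheduling. It is a sketch under stated assumptions, not the AutoScheduler implementation: the class `MemoryAwareSketch`, its `record_usage` method, and the injected `free_mb` and `now` arguments are all hypothetical, and the interpretation of `dynamic_condition` as "use the mean or the maximum of observed usage" is an assumption. The real scheduler reads GPU memory in real time, whereas here the readings are passed in so the logic stays deterministic.

```python
class MemoryAwareSketch:
    """Hypothetical, simplified model of memory-aware scheduling (not flgo code)."""

    def __init__(self, devices, put_interval=5, mean_memory_occupated=1000,
                 available_interval=5, dynamic_condition='mean'):
        self.devices = list(devices)
        self.put_interval = put_interval
        self.available_interval = available_interval
        self.estimate_mb = mean_memory_occupated  # initial per-runner estimate
        self.dynamic_condition = dynamic_condition
        self.observed_mb = []
        self.available_since = {d: None for d in self.devices}
        self.last_put = {d: None for d in self.devices}

    def record_usage(self, used_mb):
        """Update the per-runner estimate from an observed usage (assumed rule)."""
        self.observed_mb.append(used_mb)
        if self.dynamic_condition == 'max':
            self.estimate_mb = max(self.observed_mb)
        else:
            self.estimate_mb = sum(self.observed_mb) / len(self.observed_mb)

    def get_available_device(self, free_mb, now):
        """free_mb maps device -> free memory in MB; now is a time in seconds."""
        # Track since when each device has had enough free memory.
        for d in self.devices:
            if free_mb[d] >= self.estimate_mb:
                if self.available_since[d] is None:
                    self.available_since[d] = now
            else:
                self.available_since[d] = None
        # Return the first device that is both stable and not recently allocated.
        for d in self.devices:
            since = self.available_since[d]
            if since is None:
                continue
            stable = now - since > self.available_interval
            last = self.last_put[d]
            rested = last is None or now - last >= self.put_interval
            if stable and rested:
                self.last_put[d] = now
                return d
        return None


sketch = MemoryAwareSketch([0, 1])
readings = {0: 500, 1: 4000}  # made-up free-memory readings in MB
first = sketch.get_available_device(readings, now=0)
second = sketch.get_available_device(readings, now=6)
third = sketch.get_available_device(readings, now=8)
sketch.record_usage(3000)
sketch.record_usage(5000)
```

In this run, device 0 never has enough free memory, device 1 is handed out only once it has stayed available for longer than `available_interval`, and the request at `now=8` gets nothing because `put_interval` has not yet elapsed since device 1 was last allocated.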
Source code in flgo\experiment\device_scheduler.py
BasicScheduler
Bases: AbstractScheduler
Basic GPU scheduler. Every device is always considered available, and devices are returned in turn.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
devices | list | a list of the index numbers of GPUs | required |
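The "returned in turn" behaviour can be pictured as a round-robin over the device list. The sketch below is a standalone illustration, not the library's code: `RoundRobinScheduler` is a hypothetical name, and which device comes first, as well as rejecting an empty list, are assumptions made for this example.

```python
class RoundRobinScheduler:
    """Hypothetical stand-in that cycles through a fixed list of GPU indices."""

    def __init__(self, devices):
        if not devices:
            # Assumption for this sketch: an empty device list is rejected.
            raise ValueError("devices must not be empty")
        self.devices = list(devices)
        self._next = 0  # position of the device to hand out next

    def get_available_device(self, *args, **kwargs):
        device = self.devices[self._next]
        # Advance and wrap around so devices are returned in turn.
        self._next = (self._next + 1) % len(self.devices)
        return device


scheduler = RoundRobinScheduler([0, 1, 2])
order = [scheduler.get_available_device() for _ in range(5)]
```

No availability check is performed here, which matches the description that every device is always considered available.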
Source code in flgo\experiment\device_scheduler.py
add_process(pid=None)
Record a running process that uses a GPU handed out by the scheduler
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pid | int | the process id | None |
Source code in flgo\experiment\device_scheduler.py
get_available_device(*args, **kwargs)
Return the next device
Source code in flgo\experiment\device_scheduler.py
remove_process(pid=None)
Remove a running process that uses a GPU from the scheduler's record
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pid | int | the process id | None |
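One simple way to picture add_process and remove_process together is a set of process ids. The sketch below is a standalone illustration under assumptions: `ProcessRegistry` and its `process_set` attribute are hypothetical names, and silently ignoring `pid=None` is an assumed behaviour, not something the documentation states.

```python
import os


class ProcessRegistry:
    """Hypothetical stand-in that tracks the ids of processes using a GPU."""

    def __init__(self):
        self.process_set = set()

    def add_process(self, pid=None):
        # Assumption: a missing pid is ignored rather than raising an error.
        if pid is not None:
            self.process_set.add(pid)

    def remove_process(self, pid=None):
        # discard() does nothing if the pid was never recorded.
        if pid is not None:
            self.process_set.discard(pid)


registry = ProcessRegistry()
pid = os.getpid()
registry.add_process(pid)
tracked = pid in registry.process_set
registry.remove_process(pid)
remaining = len(registry.process_set)
```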
Source code in flgo\experiment\device_scheduler.py
set_devices(devices)
Reset all the devices
Parameters:
Name | Type | Description | Default |
---|---|---|---|
devices | list | a list of the index numbers of GPUs | required |
Source code in flgo\experiment\device_scheduler.py
RandomScheduler
Bases: BasicScheduler
Random GPU Scheduler
Source code in flgo\experiment\device_scheduler.py
get_available_device(*args, **kwargs)
Return a random device
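Random selection can be sketched with the standard library alone. The example below is a standalone illustration, not the library's code: `RandomPickScheduler` is a hypothetical name, and the `seed` argument is an addition for reproducibility in this example only.

```python
import random


class RandomPickScheduler:
    """Hypothetical stand-in that returns a uniformly random GPU index."""

    def __init__(self, devices, seed=None):
        self.devices = list(devices)
        # A private generator keeps the example independent of global state.
        self._rng = random.Random(seed)

    def get_available_device(self, *args, **kwargs):
        return self._rng.choice(self.devices)


scheduler = RandomPickScheduler([0, 1, 2], seed=0)
picks = [scheduler.get_available_device() for _ in range(10)]
```

Unlike a round-robin scheme, nothing prevents the same device from being picked several times in a row.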
Source code in flgo\experiment\device_scheduler.py