
Commit c6d4daf

Merge pull request #2229 from FedML-AI/dimitris/grpc_with_docker
Enabling gRPC to work with Docker containers.

2 parents f6b8c44 + 292bfb3

File tree

22 files changed

+277
-62
lines changed


python/examples/federate/cross_silo/grpc_fedavg_mnist_lr_example/README.md

Lines changed: 10 additions & 7 deletions

````diff
@@ -8,22 +8,25 @@ comm_args:
   grpc_ipconfig_path: config/grpc_ipconfig.csv
 ```
 
-`grpc_ipconfig_path` specifies the path of the config for gRPC communication. Config file specifies an ip address for each process through with they can communicate with each other. The config file should have the folliwng format:
+`grpc_ipconfig_path` specifies the path of the config file for gRPC communication. The config file specifies an IP address for each process, through which the processes can communicate with each other. The config file should have the following format:
 
 ```csv
-receiver_id,ip
-0,127.0.0.1
-1,127.0.0.1
-2,127.0.0.1
+eid,rank,grpc_server_ip,grpc_server_port
+0,0,0.0.0.0,8890
+1,1,0.0.0.0,8899
+2,2,0.0.0.0,8898
 ```
 
-Here the `receiver_id` is the rank of the process.
+Here, `eid`, `rank`, `grpc_server_ip` and `grpc_server_port` are the ID, rank, IP address and port of the server or client process. For the server process the rank is always 0, while for clients it is always 1 or above.
 
 ## One Line API Example
 
-Example is provided at:
+Examples are provided at:
 
 `python/examples/cross_silo/grpc_fedavg_mnist_lr_example/one_line`
+`python/examples/cross_silo/grpc_fedavg_mnist_lr_example/step_by_step`
+`python/examples/cross_silo/grpc_fedavg_mnist_lr_example/custom_data_and_model`
 
 ### Training Script
 
 At the client side, the client ID (a.k.a. rank) starts from 1.
````
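The rank-to-address layout of the new CSV format can be exercised with a short, self-contained sketch. This is a hypothetical helper for illustration, not FedML's actual config loader:

```python
import csv
import io

# Same content as the grpc_ipconfig.csv example above.
CONFIG = """eid,rank,grpc_server_ip,grpc_server_port
0,0,0.0.0.0,8890
1,1,0.0.0.0,8899
2,2,0.0.0.0,8898
"""

def parse_ipconfig(text):
    """Build a rank -> (ip, port) lookup from the CSV text."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table[int(row["rank"])] = (row["grpc_server_ip"], int(row["grpc_server_port"]))
    return table

table = parse_ipconfig(CONFIG)
print(table[0])  # ('0.0.0.0', 8890) -> the aggregator, rank 0
```

Rank 0 always resolves to the aggregator's gRPC endpoint; ranks 1 and above resolve to the clients.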

python/examples/federate/cross_silo/grpc_fedavg_mnist_lr_example/custom_data_and_model/README.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -9,10 +9,10 @@ The default ip of every grpc server is set to `0.0.0.0`, and all grpc ports star
 > the aggregator (rank: 0). This record is mandatory. However, you can change the values of the `ip` and `port`
 > attributes as you see fit, and add more records for the grpc servers of the rest of the clients. For instance:
 ```
-eid,rank,ip,port
+eid,rank,grpc_server_ip,grpc_server_port
 0,0,0.0.0.0,8890
-1,1,0.0.0.0,8899
-2,2,0.0.0.0,8898
+1,1,0.0.0.0,8891
+2,2,0.0.0.0,8892
 ```
 
 ## Start Script
````
Lines changed: 1 addition & 1 deletion

```diff
@@ -1,2 +1,2 @@
-eid,rank,ip,port
+eid,rank,grpc_server_ip,grpc_server_port
 0,0,0.0.0.0,8890
```
Lines changed: 51 additions & 0 deletions (new file)

# Introduction
In this working example, we run 1 aggregation server and 2 clients on the same machine using Docker + gRPC, and we use the FEDML.ai platform to run the FL job.

# gRPC Configuration File
The content of the gRPC configuration file is as follows:
```csv
eid,rank,grpc_server_ip,grpc_server_port,ingress_ip
0,0,0.0.0.0,8890,fedml_server
1,1,0.0.0.0,8899,fedml_client_1
2,2,0.0.0.0,8898,fedml_client_2
```
The `ingress_ip` attribute refers to the name of the container that we assign to either the server or the client, as we discuss in detail below.

# Docker Configuration
Before creating any Docker container on our machine, we need to pull the latest fedml image (e.g., `fedml:v090`) and ensure that all spawned containers can communicate with each other through a network bridge (e.g., `fedml_grpc_network`). Specifically, what you need to do is:
```bash
docker pull fedml:v090
docker network create fedml_grpc_network
```

Once these two steps are completed, we can start 1 aggregation server and 2 clients (without using a GPU) and register them with the FEDML platform using our <FEDML_API_KEY>, as follows:

```bash
# Server
docker run -it -p 8890:8890 --entrypoint /bin/bash --name fedml_server --network fedml_grpc_network fedml:v090
redis-server --daemonize yes
source /fedml/bin/activate
fedml login -s <FEDML_API_KEY>
```

```bash
# Client 1
docker run -it -p 8899:8899 --entrypoint /bin/bash --name fedml_client_1 --network fedml_grpc_network fedml:v090
redis-server --daemonize yes
source /fedml/bin/activate
fedml login -c <FEDML_API_KEY>
```

```bash
# Client 2
docker run -it -p 8898:8898 --entrypoint /bin/bash --name fedml_client_2 --network fedml_grpc_network fedml:v090
redis-server --daemonize yes
source /fedml/bin/activate
fedml login -c <FEDML_API_KEY>
```

Then we only need to compile our job and submit it to our Docker-based cluster, as is also discussed in detail in the official FEDML documentation: https://fedml.ai/octopus/userGuides
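With this setup, peers on the Docker bridge dial each other by container name rather than by the `0.0.0.0` bind address. A minimal sketch of that resolution rule follows; `dial_target` is a hypothetical helper for illustration, not FedML's dispatcher:

```python
import csv
import io

# Same content as the gRPC configuration file above. grpc_server_ip
# (0.0.0.0) is only the bind address inside each container; ingress_ip
# is the hostname other containers use to reach it on the bridge.
CONFIG = """eid,rank,grpc_server_ip,grpc_server_port,ingress_ip
0,0,0.0.0.0,8890,fedml_server
1,1,0.0.0.0,8899,fedml_client_1
2,2,0.0.0.0,8898,fedml_client_2
"""

def dial_target(text, rank):
    """Return the host:port a peer would dial to reach the given rank."""
    for row in csv.DictReader(io.StringIO(text)):
        if int(row["rank"]) == rank:
            # Prefer the ingress hostname (container name) when present.
            host = row.get("ingress_ip") or row["grpc_server_ip"]
            return "{}:{}".format(host, row["grpc_server_port"])
    raise KeyError(rank)

print(dial_target(CONFIG, 0))  # fedml_server:8890
```

Docker's embedded DNS resolves `fedml_server`, `fedml_client_1`, and `fedml_client_2` to container IPs on the `fedml_grpc_network` bridge, which is why the containers must share that network.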

python/examples/federate/cross_silo/grpc_fedavg_mnist_lr_example/grpc_docker_fedmlai/__init__.py

Whitespace-only changes.
Lines changed: 12 additions & 0 deletions (new file)

```bat
:: ### don't modify this part ###
:: ##############################


:: ### please customize your script in this region ####
set DATA_PATH=%userprofile%\fedml_data
if exist %DATA_PATH% (echo Exist %DATA_PATH%) else mkdir %DATA_PATH%


:: ### don't modify this part ###
echo [FedML]Bootstrap Finished
:: ##############################
```
Lines changed: 7 additions & 0 deletions (new file)

```bash
# pip install fedml==0.7.15
# pip install --upgrade fedml

### don't modify this part ###
echo "[FedML]Bootstrap Finished"
##############################
```
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
common_args:
2+
training_type: "cross_silo"
3+
scenario: "horizontal"
4+
using_mlops: false
5+
random_seed: 0
6+
7+
environment_args:
8+
bootstrap: config/bootstrap.sh
9+
10+
data_args:
11+
dataset: "mnist"
12+
data_cache_dir: "../../../../data/mnist"
13+
partition_method: "hetero"
14+
partition_alpha: 0.5
15+
16+
model_args:
17+
model: "lr"
18+
model_file_cache_folder: "./model_file_cache" # will be filled by the server automatically
19+
global_model_file_path: "./model_file_cache/global_model.pt"
20+
21+
train_args:
22+
federated_optimizer: "FedAvg"
23+
client_id_list:
24+
client_num_in_total: 1000
25+
client_num_per_round: 2
26+
comm_round: 50
27+
epochs: 1
28+
batch_size: 10
29+
client_optimizer: sgd
30+
learning_rate: 0.03
31+
weight_decay: 0.001
32+
33+
validation_args:
34+
frequency_of_the_test: 5
35+
36+
device_args:
37+
worker_num: 2
38+
using_gpu: false
39+
gpu_mapping_file: config/gpu_mapping.yaml
40+
gpu_mapping_key: mapping_default
41+
42+
comm_args:
43+
backend: "GRPC"
44+
grpc_ipconfig_path: config/grpc_ipconfig.csv
45+
46+
tracking_args:
47+
# When running on MLOps platform(open.fedml.ai), the default log path is at ~/.fedml/fedml-client/fedml/logs/ and ~/.fedml/fedml-server/fedml/logs/
48+
local_log_output_path: ./log
49+
enable_wandb: false
50+
wandb_key: ee0b5f53d949c84cee7decbe7a619e63fb1f8408
51+
wandb_project: fedml
52+
wandb_name: fedml_torch_fedavg_mnist_lr
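The `train_args` block implies sampling `client_num_per_round` of the `client_num_in_total` clients in each of the `comm_round` rounds. A toy sketch of such a sampling schedule (an illustration only; FedML's actual sampler may differ):

```python
import random

# Values taken from the train_args / common_args sections above.
client_num_in_total = 1000
client_num_per_round = 2
comm_round = 50

rng = random.Random(0)  # random_seed: 0

# One sorted pair of client IDs per communication round.
schedule = [
    sorted(rng.sample(range(1, client_num_in_total + 1), client_num_per_round))
    for _ in range(comm_round)
]
print(len(schedule))  # 50
```

Each round thus trains on 2 sampled clients for `epochs: 1` local epoch before FedAvg aggregation.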
Lines changed: 3 additions & 0 deletions (new file)

```csv
eid,rank,grpc_server_ip,grpc_server_port,ingress_ip
0,0,0.0.0.0,8890,fedml_server
1,1,0.0.0.0,8891,fedml_client_1
```
Lines changed: 3 additions & 0 deletions (new file)

```bash
#!/usr/bin/env bash
RANK=$1
python3 torch_client.py --cf config/fedml_config.yaml --rank $RANK --role client
```
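The flags this start script forwards can be mimicked with `argparse`. This is a rough sketch of the CLI surface only; the real `torch_client.py` delegates argument handling to the FedML framework and may accept more flags:

```python
import argparse

# Hypothetical sketch of the flags run_client.sh passes.
parser = argparse.ArgumentParser()
parser.add_argument("--cf", type=str, help="path to fedml_config.yaml")
parser.add_argument("--rank", type=int, help="process rank; clients start from 1")
parser.add_argument("--role", type=str, choices=["client", "server"])

# Simulate: bash run_client.sh 1
args = parser.parse_args(
    ["--cf", "config/fedml_config.yaml", "--rank", "1", "--role", "client"]
)
print(args.role, args.rank)  # client 1
```

Launching `bash run_client.sh 1` and `bash run_client.sh 2` therefore starts two client processes whose ranks match rows 1 and 2 of `grpc_ipconfig.csv`.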
