Note: This is a component of the Starfish federated learning platform. For the complete system overview and setup instructions, see the main README.
A federated learning (FL) system designed to be approachable for users from diverse backgrounds, for instance in healthcare. This is the Controller component.
A Controller is installed on every site participating in federated learning. With the Controller running, a Site can act as either a Coordinator or a Participant.
The Controller includes optional LLM agent hooks that fire at key points in the task lifecycle: post-training summaries, pre-aggregation outlier detection, post-aggregation convergence checks, and failure triage. Enable them per task via the agent config block. This feature requires `ANTHROPIC_API_KEY` and `poetry install --extras agent`.
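As a purely illustrative sketch of what a per-task agent configuration might look like (every field name below is an assumption for illustration, not the controller's actual schema; consult the task documentation for the real keys):

```python
# Hypothetical agent config block -- field names are illustrative only.
agent_config = {
    "enabled": True,
    "hooks": [
        "post_training_summary",        # summarize local training results
        "pre_aggregation_outliers",     # flag anomalous site updates
        "post_aggregation_convergence", # check global convergence
        "failure_triage",               # diagnose failed runs
    ],
}
```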
For information about the overall Starfish architecture, see the main documentation.
If you're working with the complete Starfish mono repo, use the workbench for a unified setup:
```
cd ../workbench
make build
make up
```

See the main README and workbench documentation for details.
For standalone controller deployment or development:
This is the easiest way to get started. Docker Compose will set up the application, the Redis cache, and the Celery workers.
!!! note "FederatedUNet Support"
    The default image does not include TensorFlow. To enable image segmentation tasks, build with `docker-compose build --build-arg INSTALL_UNET=true`.
- **Configure environment variables**

  Copy `.env.example` to `.env` and update the values:

  ```
  cp .env.example .env
  ```

  Update the following in `.env`:

  ```
  SITE_UID=<generate-unique-uuid>
  ROUTER_URL=http://your-router-url:8000/starfish/api/v1
  ROUTER_USERNAME=your_username
  ROUTER_PASSWORD=your_password
  ```

- **Build the images**

  ```
  docker-compose build
  ```

- **Start the services**

  ```
  docker-compose up -d
  ```

- **Run database migrations**

  ```
  docker exec -it starfish-controller poetry run python3 manage.py migrate
  ```

- **Access the application**

  Open your browser and navigate to: http://localhost:8001/

- **Stop the services**

  ```
  docker-compose stop
  ```

  To stop and remove containers:

  ```
  docker-compose down
  ```
This method gives you more control but requires manually setting up Redis and the Python environment.
- Python 3.10.10
- Redis server
- Poetry for dependency management
- R 4.x (required for R-based tasks) with packages: `jsonlite`, `survival`, `mice`
- System libraries for R package compilation: `gfortran`, `libnlopt-dev`, `cmake`, `liblapack-dev`, `libblas-dev`
- pyenv (recommended for Python version management on macOS/Linux)
- pyenv-win (recommended for Python version management on Windows)
- **Install pyenv (if not already installed)**

  macOS:

  ```
  brew update
  brew install pyenv
  ```

  Linux:

  Follow the instructions here: https://github.com/pyenv/pyenv?tab=readme-ov-file#installation

  Windows:

  Follow the instructions here: https://github.com/pyenv/pyenv?tab=readme-ov-file#installation

  After installation, restart your PowerShell/Command Prompt.
- **Install and configure Python 3.10.10**

  macOS/Linux:

  ```
  pyenv install 3.10.10
  pyenv local 3.10.10
  ```

  Windows:

  ```
  pyenv install 3.10.10
  pyenv local 3.10.10
  ```
- **Create and activate virtual environment**

  macOS/Linux:

  ```
  python -m venv .venv
  source .venv/bin/activate
  ```

  Windows (PowerShell):

  ```
  python -m venv .venv
  .\.venv\Scripts\Activate.ps1
  ```
- **Install Poetry (if not already installed)**

  macOS/Linux:

  ```
  curl -sSL https://install.python-poetry.org | python3 -
  ```

  Windows (PowerShell):

  ```
  (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
  ```

  After installation, add Poetry to your PATH if it's not already added.
- **Install project dependencies**

  ```
  poetry install
  ```
- **Install and start Redis**

  macOS:

  ```
  brew install redis
  brew services start redis
  ```

  Linux (Ubuntu/Debian):

  ```
  sudo apt-get install redis-server
  sudo systemctl start redis
  ```

  Windows:

  Download Redis from https://redis.io/download or use WSL.
- **Configure environment variables**

  Copy `.env.example` to `.env` and update the values:

  ```
  cp .env.example .env
  ```

  Update Redis connection settings for local development:

  ```
  SITE_UID=<generate-unique-uuid>
  ROUTER_URL=http://your-router-url:8000/starfish/api/v1
  ROUTER_USERNAME=your_username
  ROUTER_PASSWORD=your_password
  CELERY_BROKER_URL=redis://localhost:6379
  CELERY_RESULT_BACKEND=redis://localhost:6379
  REDIS_HOST=localhost
  REDIS_PORT=6379
  REDIS_DB=0
  ```
- **Run database migrations**

  ```
  python3 manage.py migrate
  ```

  Note: On Windows, you might need to use `python` instead of `python3`:

  ```
  python manage.py migrate
  ```
- **Start Celery workers (in separate terminals)**

  Terminal 1 - Celery Beat (Scheduler):

  ```
  celery -A starfish beat -l DEBUG
  ```

  Terminal 2 - Run Worker:

  ```
  celery -A starfish worker -l DEBUG -Q starfish.run
  ```

  Terminal 3 - Processor Worker:

  ```
  celery -A starfish worker -l DEBUG --concurrency=1 -Q starfish.processor
  ```
- **Start the development server (in a new terminal)**

  ```
  python3 manage.py runserver
  ```

  Windows:

  ```
  python manage.py runserver
  ```
- **Access the application**

  Open your browser and navigate to: http://localhost:8001/

When finished, deactivate the virtual environment:

macOS/Linux:

```
deactivate
```

Windows:

```
deactivate
```

The controller supports FL tasks implemented in R via the `AbstractRTask` base class. R tasks use a Python-R bridge where:
- A Python wrapper class handles the FL lifecycle (data loading, artifact upload/download)
- Core ML logic lives in R scripts (`scripts/prepare_data.R`, `training.R`, `aggregate.R`)
- Communication between Python and R uses temporary JSON files
- R scripts are invoked via an `Rscript --vanilla` subprocess
Available R tasks: `RLogisticRegression`, `RCoxProportionalHazards`, `RKaplanMeier`, `RCensoredRegression`, `RPoissonRegression`, `RNegativeBinomialRegression`, `RMultipleImputation`
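The bridge pattern described above can be sketched in a few lines of Python. This is a minimal illustration only: the helper names and the convention of passing the input and output paths as the R script's two command-line arguments are assumptions for the sketch, not the controller's actual implementation.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def build_rscript_command(script: str, in_file: str, out_file: str) -> list:
    # --vanilla keeps R from loading site/user profiles or saved workspaces,
    # so the task runs in a clean, reproducible environment.
    return ["Rscript", "--vanilla", script, in_file, out_file]


def run_r_script(script: str, payload: dict) -> dict:
    """Write `payload` to a temp JSON file, run the R script, and read its
    JSON result back (assumes the script reads args[1], writes args[2])."""
    with tempfile.TemporaryDirectory() as tmp:
        in_file = Path(tmp) / "input.json"
        out_file = Path(tmp) / "output.json"
        in_file.write_text(json.dumps(payload))
        cmd = build_rscript_command(script, str(in_file), str(out_file))
        subprocess.run(cmd, check=True)
        return json.loads(out_file.read_text())
```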
```
# Ubuntu/Debian
sudo apt-get install -y r-base-core r-recommended gfortran libnlopt-dev cmake liblapack-dev libblas-dev
sudo Rscript -e "install.packages(c('jsonlite', 'survival', 'mice'), repos='https://cloud.r-project.org')"
```

```
# macOS
brew install r
Rscript -e "install.packages(c('jsonlite', 'survival', 'mice'), repos='https://cloud.r-project.org')"
```

R dependencies are automatically installed in the Docker image.
All regression tasks include a `diagnostics` sub-object in their mid-artifact output with privacy-safe summary statistics. The Python diagnostics module is at `starfish/controller/tasks/diagnostics.py` and the R equivalent at `starfish/controller/tasks/r_diagnostics_utils.R`.
See TASK_GUIDE.md for the full diagnostic field reference.
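For a rough sense of what "privacy-safe" means here — only aggregates leave the site, never row-level records — a summary helper might look like the following. The function and field names are illustrative assumptions; TASK_GUIDE.md documents the actual fields.

```python
import math


def summarize(values: list) -> dict:
    """Illustrative privacy-safe summary: returns only aggregate
    statistics, never the underlying row-level values."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance (n - 1 denominator); zero for a single value.
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return {
        "n": n,
        "mean": mean,
        "sd": math.sqrt(var),
        "min": min(values),
        "max": max(values),
    }
```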
Run the test suite:

```
python3 manage.py test
```

Windows:

```
python manage.py test
```

Format Python code using autopep8:

```
autopep8 --exclude='*/migrations/*' --in-place --recursive ./starfish/
```

Add a new dependency:

```
poetry add <package_name>
```

Add a development dependency:

```
poetry add --group=dev <package_name>
```

Remove a dependency:

```
poetry remove <package_name>
```

- Access to the git repository
- Docker and Docker Compose installed
- Access to a running Starfish Router instance
- Redis server (managed by Docker Compose)
- Internet access
- Properly configured firewall and network settings
Before deployment, configure the following files:
Configure application settings by copying `.env.example` to `.env`:

```
SITE_UID=<generate-unique-uuid>                     # Unique identifier for this site
ROUTER_URL=http://your-router:8000/starfish/api/v1  # URL to Starfish Router
ROUTER_USERNAME=your_username                       # Router username
ROUTER_PASSWORD=your_password                       # Router password
CELERY_BROKER_URL=redis://redis:6379                # Redis connection for Celery
CELERY_RESULT_BACKEND=redis://redis:6379            # Redis backend for results
REDIS_HOST=redis                                    # Redis host (service name)
REDIS_PORT=6379                                     # Redis port
REDIS_DB=0                                          # Redis database number
```

Important:

- Generate a unique `SITE_UID` for each deployment (use `uuidgen` or a similar tool)
- Update `ROUTER_URL` to point to your Starfish Router instance
- Use secure credentials for `ROUTER_USERNAME` and `ROUTER_PASSWORD`
- Set `REDIS_HOST=redis` (the service name in Docker Compose)
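If `uuidgen` isn't available, Python's standard library generates an equivalent value:

```python
import uuid

# A random (version 4) UUID is suitable as a per-site identifier.
site_uid = str(uuid.uuid4())
print(f"SITE_UID={site_uid}")
```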
Also note:

- Service port: 8001 by default, forwarded from the Docker container. Update it if it conflicts with an existing service.
- Volumes: The service runs inside the Docker container, but the mounted volumes persist intermediate files (logs and models). The default is `/starfish-controller/local`; update it if needed.
- Database: Redis is used as a cache store and pub/sub service. By default, `/opt/redis/data` is the mounted volume that stores the cache database data.
Ensure proper network configuration:
- Firewall: Configure to allow access from trusted sources (other sites and users)
- IP Whitelist: Restrict access to specific IP addresses if needed
- Port Access: Ensure the service port (default 8001) is accessible from authorized networks
- Router Connectivity: Ensure the controller can reach the Routing Server
- Additional Security: Consider using:
- Reverse proxy (e.g., Nginx)
- SSL/TLS certificates
- VPN for site-to-site communication
- **Configure environment**

  Copy and update the `.env` file:

  ```
  cp .env.example .env
  # Edit .env with your settings
  ```

- **Build the images**

  ```
  docker-compose build
  ```

- **Start the services**

  ```
  docker-compose up -d
  ```

- **Run database migrations**

  ```
  docker exec -it starfish-controller poetry run python3 manage.py migrate
  ```

- **Verify the deployment**

  Visit http://your-server-ip:8001/ (replace with your actual server address and port)

- **Register the site with the Router**

  The controller will automatically attempt to register with the Router using the credentials in `.env`. Check the logs to verify successful registration:

  ```
  docker-compose logs -f starfish-controller
  ```
View logs:

```
# All services
docker-compose logs -f

# Specific service
docker-compose logs -f starfish-controller
docker-compose logs -f controller-run-worker
```

Restart the services:

```
docker-compose restart
```

Stop the services:

```
docker-compose stop
```

Stop and remove the containers:

```
docker-compose down
```

Update to the latest version:

```
git pull
docker-compose build
docker-compose up -d
```

Back up the SQLite database:

```
# Backup SQLite database
docker cp starfish-controller:/app/db.sqlite3 ./backup_db_$(date +%Y%m%d).sqlite3
```

A common issue when using Docker Compose is:

```
ERROR: for redis Cannot start service redis: Ports are not available: exposing port TCP 0.0.0.0:6379 -> 0.0.0.0:0: listen tcp 0.0.0.0:6379: bind: address already in use
```

This happens because Redis running in the Docker container conflicts with a local Redis instance already occupying port 6379. Solution: stop the local Redis service.

```
sudo systemctl stop redis
```
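To check for the conflict before starting the stack, a small Python sketch (a convenience helper, not part of the controller) can probe the port:

```python
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on a successful connection, i.e. the
        # port is occupied by a listening service.
        return s.connect_ex((host, port)) == 0
```

If `port_in_use("127.0.0.1", 6379)` returns `True` before `docker-compose up`, a local Redis (or another service) already owns the port.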