# Ansible Automation
The Federated Learning Platform uses Ansible extensively for automated deployment and orchestration of federated learning components across multiple devices. This is a core feature that enables distributed training across edge devices.
## Overview
Ansible automates the deployment of:

- Flower Superlink (orchestrator/aggregator)
- Flower Supernodes (client coordinators)
- Client Applications (federated learning participants)
- Inference Servers (model serving endpoints)
## Directory Structure

```text
backend/ansible/
├── inventory/                      # Host inventory files
│   ├── aggregator.ini              # Aggregator/orchestrator hosts
│   ├── client.ini                  # Client device hosts
│   └── devices.ini                 # Combined device inventory
├── group_vars/                     # Variable definitions
│   ├── all.yml                     # Global variables
│   ├── aggregator.yml              # Aggregator-specific vars
│   └── client.yml                  # Client-specific vars
├── roles/                          # Ansible roles
│   ├── common/                     # Common setup tasks
│   │   └── tasks/main.yml
│   ├── aggregator/                 # Aggregator deployment
│   │   ├── tasks/main.yml
│   │   └── templates/docker-compose.yml.j2
│   └── client/                     # Client deployment
│       ├── tasks/main.yml
│       └── templates/docker-compose.yml.j2
├── deploy.yml                      # Main deployment playbook
├── start.yml                       # Service startup playbook
├── stop.yml                        # Service shutdown playbook
├── copy_dataset.yml                # Dataset distribution
└── start_inference_server.yml      # Inference server startup
```
## Core Playbooks

### 1. deploy.yml - Component Deployment

Installs dependencies, configures Docker, and deploys component files:

```yaml
---
- name: Deploy common components
  hosts: all
  roles:
    - common

- name: Deploy aggregator components
  hosts: aggregator
  roles:
    - aggregator

- name: Deploy client components
  hosts: client
  roles:
    - client
```
### 2. start.yml - Service Startup

Launches services using Docker Compose with profile-specific settings.
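The playbook body is not reproduced on this page; the following is a minimal sketch of the pattern, assuming the compose file was rendered into a deployment directory by `deploy.yml`. The `deploy_dir` and `compose_profile` variable names are illustrative assumptions, not the platform's actual variables:

```yaml
---
# Sketch only: deploy_dir and compose_profile are assumed variable names.
- name: Start federated learning services
  hosts: all
  become: true
  tasks:
    - name: Bring services up with the host's Compose profile
      ansible.builtin.command: >
        docker compose --profile {{ compose_profile }} up -d
      args:
        chdir: "{{ deploy_dir }}"
```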
### 3. stop.yml - Service Shutdown

Stops all services using Docker Compose.
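A matching shutdown sketch, with the same assumed `deploy_dir` variable:

```yaml
---
# Sketch only: deploy_dir is an assumed variable name.
- name: Stop federated learning services
  hosts: all
  become: true
  tasks:
    - name: Tear services down with Docker Compose
      ansible.builtin.command: docker compose down
      args:
        chdir: "{{ deploy_dir }}"
```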
## Inventory Configuration

### Aggregator Inventory (inventory/aggregator.ini)

```ini
[aggregator]
aggregator_host ansible_host=<DEVICE_IP> ansible_connection=ssh ansible_user=<USERNAME> ansible_ssh_private_key_file=/root/.ssh/id_ansible
```
### Client Inventory (inventory/client.ini)

```ini
[client]
client0 ansible_host=<DEVICE_IP_1>
client1 ansible_host=<DEVICE_IP_2>
clientN ansible_host=<DEVICE_IP_N>

[client:vars]
ansible_connection=ssh
ansible_user=<DEVICE_USERNAME>
ansible_ssh_private_key_file=/root/.ssh/id_ansible
```
## Backend Integration
The FastAPI backend integrates with Ansible through dedicated services:
### Ansible Service (app/services/ansible_service.py)
Key functions:
- run_ansible_command_with_logs() - Execute Ansible playbooks with real-time logging
- create_ansible_job() - Create and track Ansible job execution
- execute_and_update_job_status() - Monitor job progress
- upload_file_to_federated_learning() - Distribute ML models/datasets
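The core pattern behind a function like `run_ansible_command_with_logs()` is running a subprocess and forwarding its output line by line as it arrives. The sketch below is a generic illustration of that pattern, not the platform's actual implementation:

```python
import subprocess
from typing import Callable, List


def run_command_with_logs(cmd: List[str], on_line: Callable[[str], None]) -> int:
    """Run a command and stream each output line to a callback as it arrives.

    Illustrative sketch: in the platform this pattern would wrap an
    ansible-playbook invocation and forward lines to a log store or WebSocket.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:  # yields lines as the process emits them
        on_line(line.rstrip("\n"))
    return proc.wait()


# Usage (an 'echo' stands in for ansible-playbook here):
logs: List[str] = []
rc = run_command_with_logs(["echo", "PLAY [all]"], logs.append)
```

Streaming through a callback keeps the runner decoupled from the transport, so the same function can feed a database log table or a live WebSocket.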
### Ansible Routes (app/routes/ansible.py)
API endpoints for Ansible operations:
- POST /ansible/run - Execute Ansible commands
- GET /ansible/jobs - List Ansible job status
- POST /ansible/upload - Upload files for distribution
- GET /ansible/files - List distributed files
## Deployment Workflow

### 1. SSH Setup

Generate a dedicated SSH key pair for Ansible and install the public key on every target device.
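The inventory files below expect the private key at `/root/.ssh/id_ansible`, so a typical setup (assuming Ansible runs as root, e.g. inside the backend container; `<USERNAME>` and `<DEVICE_IP>` are placeholders) looks like:

```shell
# Generate a dedicated key pair for Ansible (no passphrase).
mkdir -p /root/.ssh
ssh-keygen -t ed25519 -f /root/.ssh/id_ansible -N "" -q

# Install the public key on each target device (placeholder host):
# ssh-copy-id -i /root/.ssh/id_ansible.pub <USERNAME>@<DEVICE_IP>
```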
### 2. Configure Inventory

Edit inventory files with target device IPs and credentials.
### 3. Deploy Components

```shell
# Deploy aggregator
ansible-playbook -i inventory/aggregator.ini deploy.yml --ask-become-pass

# Deploy clients
ansible-playbook -i inventory/client.ini deploy.yml --ask-become-pass

# Deploy all simultaneously
ansible-playbook -i inventory/aggregator.ini -i inventory/client.ini deploy.yml --ask-become-pass
```
### 4. Start Services

```shell
# Start aggregator
ansible-playbook -i inventory/aggregator.ini start.yml --ask-become-pass

# Start clients
ansible-playbook -i inventory/client.ini start.yml --ask-become-pass
```
### 5. Monitor via Web Interface
The FastAPI backend provides real-time monitoring of Ansible job execution through WebSocket connections.
## Aggregator Role

The aggregator role (roles/aggregator/tasks/main.yml) performs:

- Directory Setup
- Docker Compose Configuration
- ML Project Deployment
- Docker Image Building
- Service Startup
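The actual task file is not reproduced here; a minimal sketch of tasks matching the steps above, where the `app_dir` and `ml_project_src` variable names are assumptions:

```yaml
---
# Hypothetical sketch of roles/aggregator/tasks/main.yml.
- name: Create deployment directory
  ansible.builtin.file:
    path: "{{ app_dir }}"
    state: directory
    mode: "0755"

- name: Render docker-compose.yml from the role template
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ app_dir }}/docker-compose.yml"

- name: Copy the ML project to the aggregator
  ansible.builtin.copy:
    src: "{{ ml_project_src }}/"
    dest: "{{ app_dir }}/project/"

- name: Build images and start superlink/serverapp services
  ansible.builtin.command: docker compose up -d --build
  args:
    chdir: "{{ app_dir }}"
```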
## Client Role

The client role (roles/client/tasks/main.yml) performs:

- Client-Specific Configuration
- Dataset Management
- Multi-Image Building
- Client Service Startup
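A comparable sketch for the client side, using Ansible's per-host `inventory_hostname` to select each client's dataset partition; `app_dir` and `dataset_dir` are assumed variable names:

```yaml
---
# Hypothetical sketch of roles/client/tasks/main.yml.
- name: Render client-specific docker-compose.yml (superlink address, client id)
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ app_dir }}/docker-compose.yml"

- name: Copy this client's dataset partition
  ansible.builtin.copy:
    src: "{{ dataset_dir }}/{{ inventory_hostname }}/"
    dest: "{{ app_dir }}/data/"

- name: Build client images and start supernode/clientapp services
  ansible.builtin.command: docker compose up -d --build
  args:
    chdir: "{{ app_dir }}"
```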
## Advanced Features

### Dynamic Configuration
- Project-specific deployments with variable project paths
- Client-specific dataset paths for data partitioning
- Superlink address configuration for network topology
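Settings like these typically live in the `group_vars/` files listed in the directory structure above. A sketch of what they could contain; the variable names here are illustrative assumptions, not the platform's actual variables:

```yaml
# group_vars/all.yml (sketch)
project_path: /opt/fl/projects/current

# group_vars/client.yml (sketch)
superlink_address: "<AGGREGATOR_IP>:9092"
dataset_path: "/opt/fl/data/{{ inventory_hostname }}"
```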
### Error Handling & Retry Logic

Startup tasks verify that the expected containers are actually running and retry until they appear. For example, the aggregator startup check fails when the listing command errors or either expected container is missing, and stops retrying once both the serverapp and superlink containers are listed:

```yaml
failed_when: >
  (aggregator_containers.rc != 0) or
  not (('serverapp' in aggregator_containers.stdout) and
       ('superlink' in aggregator_containers.stdout))
until: >
  (aggregator_containers.rc != 0) or
  (('serverapp' in aggregator_containers.stdout) and
   ('superlink' in aggregator_containers.stdout))
retries: 5
```
### Real-time Job Monitoring

The backend tracks Ansible job execution with:

- Job status tracking (pending, running, completed, failed)
- Real-time log streaming via WebSocket
- Progress monitoring through database updates
## Production Considerations

### Security
- SSH key management for device access
- Encrypted variable storage using Ansible Vault
- Secure inventory management with proper file permissions
### Scalability
- Parallel deployment across multiple devices
- Batch operations for large device fleets
- Resource monitoring during deployment
### Reliability
- Health checks for deployed services
- Rollback capabilities for failed deployments
- Service dependency management
This Ansible automation is essential for the federated learning platform's core functionality, enabling seamless deployment and orchestration across distributed edge devices.