Ansible Automation

The Federated Learning Platform uses Ansible extensively to automate the deployment and orchestration of federated learning components across multiple devices; this automation is what enables distributed training on edge hardware.

Overview

Ansible automates the deployment of:

  • Flower Superlink (orchestrator/aggregator)
  • Flower Supernodes (client coordinators)
  • Client Applications (federated learning participants)
  • Inference Servers (model serving endpoints)

Directory Structure

backend/ansible/
├── inventory/                    # Host inventory files
│   ├── aggregator.ini           # Aggregator/orchestrator hosts
│   ├── client.ini              # Client device hosts  
│   └── devices.ini             # Combined device inventory
├── group_vars/                  # Variable definitions
│   ├── all.yml                 # Global variables
│   ├── aggregator.yml          # Aggregator-specific vars
│   └── client.yml              # Client-specific vars
├── roles/                       # Ansible roles
│   ├── common/                 # Common setup tasks
│   │   └── tasks/main.yml
│   ├── aggregator/             # Aggregator deployment
│   │   ├── tasks/main.yml
│   │   └── templates/docker-compose.yml.j2
│   └── client/                 # Client deployment
│       ├── tasks/main.yml
│       └── templates/docker-compose.yml.j2
├── deploy.yml                   # Main deployment playbook
├── start.yml                   # Service startup playbook
├── stop.yml                    # Service shutdown playbook
├── copy_dataset.yml            # Dataset distribution
└── start_inference_server.yml  # Inference server startup

Core Playbooks

1. deploy.yml - Component Deployment

Installs dependencies, configures Docker, and deploys component files:

---
- name: Deploy common components
  hosts: all
  roles:
    - common

- name: Deploy aggregator components  
  hosts: aggregator
  roles:
    - aggregator

- name: Deploy client components
  hosts: client
  roles:
    - client

2. start.yml - Service Startup

Launches services using Docker Compose with profile-specific settings.
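
The exact tasks live in the repository; a minimal sketch of what such a startup playbook might look like, assuming it changes into destination_dir and reuses the compose profiles from the roles (an illustration, not the actual file):

```yaml
---
- name: Start aggregator services
  hosts: aggregator
  tasks:
    - name: Bring services up with the aggregator profile
      ansible.builtin.shell: docker compose --profile aggregator up -d
      args:
        chdir: "{{ destination_dir }}"
```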

3. stop.yml - Service Shutdown

Stops all services using Docker Compose.
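
A corresponding shutdown playbook could be as simple as the following sketch (assumed structure, mirroring the startup playbook):

```yaml
---
- name: Stop all services
  hosts: all
  tasks:
    - name: Bring all services down
      ansible.builtin.shell: docker compose down
      args:
        chdir: "{{ destination_dir }}"
```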

Inventory Configuration

Aggregator Inventory (inventory/aggregator.ini)

[aggregator]
aggregator_host ansible_host=<DEVICE_IP> ansible_connection=ssh ansible_user=<USERNAME> ansible_ssh_private_key_file=/root/.ssh/id_ansible

Client Inventory (inventory/client.ini)

[client]
client0 ansible_host=<DEVICE_IP_1>
client1 ansible_host=<DEVICE_IP_2>
clientN ansible_host=<DEVICE_IP_N>

[client:vars]
ansible_connection=ssh
ansible_user=<DEVICE_USERNAME>
ansible_ssh_private_key_file=/root/.ssh/id_ansible

Backend Integration

The FastAPI backend integrates with Ansible through dedicated services:

Ansible Service (app/services/ansible_service.py)

Key functions:

  • run_ansible_command_with_logs() - execute Ansible playbooks with real-time logging
  • create_ansible_job() - create and track Ansible job execution
  • execute_and_update_job_status() - monitor job progress
  • upload_file_to_federated_learning() - distribute ML models/datasets
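
To illustrate how run_ansible_command_with_logs() can stream playbook output line by line, here is a hedged sketch (not the actual service code; the callback-based signature is an assumption):

```python
# Illustrative sketch: run a command and forward each stdout line to a
# callback (e.g. a WebSocket broadcaster), then return the exit code.
import subprocess
from typing import Callable, List


def run_ansible_command_with_logs(cmd: List[str], on_line: Callable[[str], None]) -> int:
    """Run cmd, invoking on_line for every stdout line; return the exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr into the same stream
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:       # yields lines as the process produces them
        on_line(line.rstrip("\n"))
    return proc.wait()
```

In the real service the command would be something like `["ansible-playbook", "-i", "inventory/client.ini", "deploy.yml"]` and the callback would push lines to WebSocket subscribers and the job log.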

Ansible Routes (app/routes/ansible.py)

API endpoints for Ansible operations:

  • POST /ansible/run - execute Ansible commands
  • GET /ansible/jobs - list Ansible job status
  • POST /ansible/upload - upload files for distribution
  • GET /ansible/files - list distributed files

Deployment Workflow

1. SSH Setup

# Setup SSH access to target devices
./setup-device-ssh.sh -u <username> -i <device_ip>

2. Configure Inventory

Edit inventory files with target device IPs and credentials.

3. Deploy Components

# Deploy aggregator
ansible-playbook -i inventory/aggregator.ini deploy.yml --ask-become-pass

# Deploy clients  
ansible-playbook -i inventory/client.ini deploy.yml --ask-become-pass

# Deploy all simultaneously
ansible-playbook -i inventory/aggregator.ini -i inventory/client.ini deploy.yml --ask-become-pass

4. Start Services

# Start aggregator
ansible-playbook -i inventory/aggregator.ini start.yml --ask-become-pass

# Start clients
ansible-playbook -i inventory/client.ini start.yml --ask-become-pass

5. Monitor via Web Interface

The FastAPI backend provides real-time monitoring of Ansible job execution through WebSocket connections.

Aggregator Role

The aggregator role (roles/aggregator/tasks/main.yml) performs:

  1. Directory Setup

    - name: Ensure destination directory exists
      file:
        path: "{{ destination_dir }}"
        state: directory
        mode: '0755'
    

  2. Docker Compose Configuration

    - name: Render docker-compose template on the device
      ansible.builtin.template:
        src: "docker-compose.yml.j2"
        dest: "{{ destination_dir }}/docker-compose.yml"
    

  3. ML Project Deployment

    - name: Copy mlproject folder
      ansible.builtin.copy:
        src: "{{ project_dir }}/fl-core/mlproject/{{ project_name }}/"
        dest: "{{ destination_dir }}/fl-core/mlproject/"
    

  4. Docker Image Building

    - name: Build Docker image
      community.docker.docker_image:
        name: flip/serverapp
        source: build
        build:
          path: "{{ destination_dir }}/fl-core"
          dockerfile: Dockerfile.serverapp
    

  5. Service Startup

    - name: Start aggregator service
      shell: |
        AGGREGATOR_COMPOSE='-f docker-compose.yml --profile aggregator up -d'
        docker compose $AGGREGATOR_COMPOSE || docker-compose $AGGREGATOR_COMPOSE
    
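The rendered docker-compose.yml.j2 puts the aggregator services behind the aggregator profile. A hypothetical sketch of such a template (service definitions, the upstream Flower image name, and the port are assumptions, not the actual file):

```yaml
# Hypothetical sketch of templates/docker-compose.yml.j2 -- not the real template
services:
  superlink:
    image: flwr/superlink:latest   # assumed upstream Flower image
    profiles: ["aggregator"]
    ports:
      - "9092:9092"                # assumed Fleet API port
  serverapp:
    image: flip/serverapp          # built by the role above
    profiles: ["aggregator"]
    depends_on:
      - superlink
```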

Client Role

The client role (roles/client/tasks/main.yml) performs:

  1. Client-Specific Configuration

    - name: Set client-specific variables
      set_fact:
        client_id: "{{ groups['client'].index(inventory_hostname) }}"
        superlink_address: "{{ superlink_address | default('192.168.1.100') }}"
    

  2. Dataset Management

    - name: Ensure client-specific dataset directory exists
      file:
        path: "{{ dataset_path }}"
        state: directory
    

  3. Multi-Image Building

    - name: Build clientapp Docker image on remote
      community.docker.docker_image:
        name: flip/clientapp
        source: build
    
    - name: Build inference Docker image on remote
      community.docker.docker_image:
        name: flip/inference
        source: build
    

  4. Client Service Startup

    - name: Start client services
      shell: |
        CLIENT_COMPOSE='--profile client up -d supernode clientapp'
        docker compose $CLIENT_COMPOSE || docker-compose $CLIENT_COMPOSE
    

Advanced Features

Dynamic Configuration

  • Project-specific deployments with variable project paths
  • Client-specific dataset paths for data partitioning
  • Superlink address configuration for network topology
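
These knobs are typically supplied through group_vars. A hypothetical group_vars/client.yml illustrating the variables referenced by the roles above (variable names come from this page, values are placeholders):

```yaml
# Hypothetical group_vars/client.yml -- names from the roles, values invented
destination_dir: /opt/flip
project_name: mnist                      # selects fl-core/mlproject/<project_name>
superlink_address: 192.168.1.100         # aggregator address for the supernodes
dataset_path: "/opt/flip/datasets/client_{{ client_id }}"
```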

Error Handling & Retry Logic

After startup, the playbooks verify that the expected containers are actually running, retrying the check up to five times and failing if the command errors or the serverapp and superlink containers never appear:

failed_when: >
  (aggregator_containers.rc != 0) or
  not (('serverapp' in aggregator_containers.stdout) and
  ('superlink' in aggregator_containers.stdout))
until: >
  (aggregator_containers.rc != 0) or
  (('serverapp' in aggregator_containers.stdout) and
  ('superlink' in aggregator_containers.stdout))
retries: 5

Real-time Job Monitoring

The backend tracks Ansible job execution with:

  • job status tracking (pending, running, completed, failed)
  • real-time log streaming via WebSocket
  • progress monitoring through database updates
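
The job lifecycle above can be sketched as a small state machine (illustrative only; the real backend persists jobs and logs to the database):

```python
# Illustrative job model: pending -> running -> completed/failed, with logs.
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Legal transitions between job statuses
VALID: Dict[str, Set[str]] = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


@dataclass
class AnsibleJob:
    job_id: str
    status: str = "pending"
    logs: List[str] = field(default_factory=list)

    def transition(self, new_status: str) -> None:
        """Move to new_status, rejecting illegal transitions."""
        if new_status not in VALID[self.status]:
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.status = new_status

    def log(self, line: str) -> None:
        """Append a log line (in the backend, also streamed via WebSocket)."""
        self.logs.append(line)
```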

Production Considerations

Security

  • SSH key management for device access
  • Encrypted variable storage using Ansible Vault
  • Secure inventory management with proper file permissions

Scalability

  • Parallel deployment across multiple devices
  • Batch operations for large device fleets
  • Resource monitoring during deployment

Reliability

  • Health checks for deployed services
  • Rollback capabilities for failed deployments
  • Service dependency management

This Ansible automation is essential for the federated learning platform's core functionality, enabling seamless deployment and orchestration across distributed edge devices.