Data Flow Architecture

This document describes how data flows through the Federated Learning Platform, including user interactions, federated learning processes, and system monitoring data.

Overall Data Flow

flowchart TD
    subgraph "User Interface Layer"
        USER[User]
        BROWSER[Web Browser]
    end

    subgraph "API Gateway Layer"
        NGINX[Nginx Reverse Proxy]
        CORS[CORS Middleware]
        AUTH[Authentication Middleware]
    end

    subgraph "Application Layer"
        FRONTEND[Next.js Frontend]
        BACKEND[FastAPI Backend]
        WS[WebSocket Service]
    end

    subgraph "Data Processing Layer"
        VALIDATION[Data Validation]
        BUSINESS_LOGIC[Business Logic]
        FILE_HANDLER[File Upload Handler]
    end

    subgraph "Federated Learning Layer"
        SUPERLINK[Flower Superlink]
        AGGREGATOR[FL Aggregator]
        CLIENTS[FL Clients]
    end

    subgraph "Persistence Layer"
        MONGODB[(MongoDB)]
        FILE_SYSTEM[File System]
        MODEL_STORAGE[Model Storage]
    end

    subgraph "Observability Layer"
        OTEL[OpenTelemetry]
        TEMPO[Tempo]
        GRAFANA[Grafana]
    end

    USER --> BROWSER
    BROWSER --> NGINX
    NGINX --> CORS
    CORS --> AUTH
    AUTH --> FRONTEND
    AUTH --> BACKEND

    FRONTEND --> BACKEND
    BACKEND --> WS
    BACKEND --> VALIDATION
    VALIDATION --> BUSINESS_LOGIC
    BUSINESS_LOGIC --> FILE_HANDLER

    BACKEND --> SUPERLINK
    SUPERLINK --> AGGREGATOR
    AGGREGATOR --> CLIENTS

    BUSINESS_LOGIC --> MONGODB
    FILE_HANDLER --> FILE_SYSTEM
    AGGREGATOR --> MODEL_STORAGE

    BACKEND --> OTEL
    AGGREGATOR --> OTEL
    CLIENTS --> OTEL
    OTEL --> TEMPO
    TEMPO --> GRAFANA

User Authentication Flow

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant AuthService
    participant Database
    participant JWT

    User->>Frontend: Enter Credentials
    Frontend->>Frontend: Validate Input
    Frontend->>Backend: POST /auth/login
    Backend->>AuthService: Validate Credentials
    AuthService->>Database: Query User
    Database-->>AuthService: User Record
    AuthService->>AuthService: Verify Password
    AuthService->>JWT: Generate Token
    JWT-->>AuthService: JWT Token
    AuthService-->>Backend: Token + User Data
    Backend-->>Frontend: Authentication Response
    Frontend->>Frontend: Store Token in Memory
    Frontend->>Frontend: Update Auth Context
    Frontend-->>User: Redirect to Dashboard

    Note over Frontend,Database: Subsequent API Calls

    Frontend->>Backend: API Request + JWT Header
    Backend->>AuthService: Validate JWT
    AuthService->>JWT: Verify Token
    JWT-->>AuthService: Token Claims
    AuthService-->>Backend: User Context
    Backend->>Backend: Process Request
    Backend-->>Frontend: API Response

Project Configuration Flow

flowchart TD
    subgraph "Configuration Upload"
        USER_UPLOAD[User Uploads<br/>ML Project ZIP]
        FRONTEND_UPLOAD[Frontend<br/>File Upload Component]
        BACKEND_UPLOAD[Backend<br/>File Upload Handler]
        FILE_VALIDATION[File Validation<br/>ZIP Structure Check]
        FILE_STORAGE[File System<br/>Storage]
    end

    subgraph "Configuration Processing"
        EXTRACT[Extract ZIP<br/>Contents]
        VALIDATE_CONFIG[Validate<br/>pyproject.toml]
        PARSE_CONFIG[Parse Configuration<br/>Dependencies & Settings]
        STORE_METADATA[Store Metadata<br/>in MongoDB]
    end

    subgraph "Deployment Preparation"
        ANSIBLE_TEMPLATE[Generate Ansible<br/>Templates]
        DOCKER_CONFIG[Generate Docker<br/>Configurations]
        INVENTORY_UPDATE[Update Ansible<br/>Inventory]
    end

    USER_UPLOAD --> FRONTEND_UPLOAD
    FRONTEND_UPLOAD --> BACKEND_UPLOAD
    BACKEND_UPLOAD --> FILE_VALIDATION
    FILE_VALIDATION --> FILE_STORAGE

    FILE_STORAGE --> EXTRACT
    EXTRACT --> VALIDATE_CONFIG
    VALIDATE_CONFIG --> PARSE_CONFIG
    PARSE_CONFIG --> STORE_METADATA

    STORE_METADATA --> ANSIBLE_TEMPLATE
    ANSIBLE_TEMPLATE --> DOCKER_CONFIG
    DOCKER_CONFIG --> INVENTORY_UPDATE

Federated Learning Training Flow

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant Superlink
    participant Aggregator
    participant Supernode
    participant Client
    participant Database
    participant Monitoring

    User->>Frontend: Start Training
    Frontend->>Backend: POST /training/start
    Backend->>Database: Create Training Job
    Database-->>Backend: Job ID
    Backend->>Superlink: Initialize FL Session
    Superlink->>Aggregator: Start Server App
    Aggregator->>Aggregator: Initialize Global Model
    Aggregator->>Monitoring: Log Initialization

    Backend->>Supernode: Deploy Client Configs
    Supernode->>Client: Start Client Apps
    Client->>Client: Load Local Data Partition
    Client->>Monitoring: Log Client Ready

    loop Training Rounds
        Aggregator->>Supernode: Broadcast Global Model
        Supernode->>Client: Send Model Parameters
        Client->>Client: Local Training
        Client->>Monitoring: Log Training Metrics
        Client->>Supernode: Send Model Updates
        Supernode->>Aggregator: Forward Model Updates
        Aggregator->>Aggregator: Aggregate Updates into Global Model
        Aggregator->>Database: Update Job Status
        Aggregator->>Backend: Training Progress
        Backend->>Frontend: WebSocket Update
        Frontend->>User: Display Progress
    end

    Aggregator->>Database: Mark Job Complete
    Aggregator->>Monitoring: Log Final Metrics
    Backend->>Frontend: Training Complete
    Frontend->>User: Show Results
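
To make the training loop concrete, here is a minimal sketch of one round: broadcast the global model, train locally, then aggregate. The document does not say which aggregation strategy the platform uses. The example assumes a weighted average in the style of FedAvg (Flower provides this as a built-in strategy), and `local_training` is a hypothetical stand-in for real client training.

```python
from typing import List, Sequence, Tuple

# (model parameters, number of local training examples)
ClientUpdate = Tuple[Sequence[float], int]


def aggregate(updates: List[ClientUpdate]) -> List[float]:
    """Average client parameters, weighting each client by its example count."""
    if not updates:
        raise ValueError("no client updates to aggregate")
    total_examples = sum(n for _, n in updates)
    if total_examples <= 0:
        raise ValueError("clients reported no training examples")
    size = len(updates[0][0])
    if any(len(params) != size for params, _ in updates):
        raise ValueError("client parameter vectors differ in length")
    return [
        sum(params[i] * n for params, n in updates) / total_examples
        for i in range(size)
    ]


def local_training(global_params: Sequence[float], local_target: Sequence[float],
                   lr: float = 0.5) -> List[float]:
    """Stand-in for client training: move each parameter part-way toward a local optimum."""
    return [w + lr * (t - w) for w, t in zip(global_params, local_target)]


def run_round(global_params: Sequence[float],
              clients: List[Tuple[Sequence[float], int]]) -> List[float]:
    """One round: every client trains on the broadcast model, then updates are averaged."""
    updates = [(local_training(global_params, target), n) for target, n in clients]
    return aggregate(updates)
```

Only parameters and example counts leave the client in this sketch; the local data that produced them never does.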

Real-Time Monitoring Data Flow

flowchart LR
    subgraph "Data Sources"
        FL_SERVER[FL Server<br/>Aggregator]
        FL_CLIENTS[FL Clients<br/>Training Nodes]
        BACKEND_API[Backend API<br/>FastAPI]
        FRONTEND_APP[Frontend App<br/>Next.js]
    end

    subgraph "Telemetry Collection"
        OTEL_COLLECTOR[OpenTelemetry<br/>Collector]
        TRACE_PROCESSOR[Trace<br/>Processor]
        METRIC_PROCESSOR[Metric<br/>Processor]
    end

    subgraph "Storage & Processing"
        TEMPO_STORAGE[Tempo<br/>Trace Storage]
        PROMETHEUS[Prometheus<br/>Metrics Storage]
        GRAFANA_DB[Grafana<br/>Dashboard DB]
    end

    subgraph "Visualization"
        GRAFANA_UI[Grafana<br/>Dashboards]
        FRONTEND_CHARTS[Frontend<br/>Real-time Charts]
        ALERTS[Alert<br/>Manager]
    end

    FL_SERVER --> OTEL_COLLECTOR
    FL_CLIENTS --> OTEL_COLLECTOR
    BACKEND_API --> OTEL_COLLECTOR
    FRONTEND_APP --> OTEL_COLLECTOR

    OTEL_COLLECTOR --> TRACE_PROCESSOR
    OTEL_COLLECTOR --> METRIC_PROCESSOR

    TRACE_PROCESSOR --> TEMPO_STORAGE
    METRIC_PROCESSOR --> PROMETHEUS

    TEMPO_STORAGE --> GRAFANA_UI
    PROMETHEUS --> GRAFANA_UI
    PROMETHEUS --> GRAFANA_DB

    GRAFANA_UI --> ALERTS
    BACKEND_API --> FRONTEND_CHARTS

Database Operations Flow

User Management Operations

flowchart TD
    subgraph "User Operations"
        CREATE_USER[Create User]
        LOGIN_USER[Login User]
        UPDATE_PROFILE[Update Profile]
        DELETE_USER[Delete User]
    end

    subgraph "Validation Layer"
        EMAIL_VALIDATION[Email Validation]
        PASSWORD_VALIDATION[Password Validation]
        DUPLICATE_CHECK[Duplicate Check]
        PERMISSION_CHECK[Permission Check]
    end

    subgraph "Security Layer"
        PASSWORD_HASH[Password Hashing]
        JWT_GENERATION[JWT Generation]
        TOKEN_VALIDATION[Token Validation]
        RATE_LIMITING[Rate Limiting]
    end

    subgraph "Database Layer"
        USERS_COLLECTION[(Users Collection)]
        SESSIONS_COLLECTION[(Sessions Collection)]
        AUDIT_LOG[(Audit Log)]
    end

    CREATE_USER --> EMAIL_VALIDATION
    CREATE_USER --> PASSWORD_VALIDATION
    EMAIL_VALIDATION --> DUPLICATE_CHECK
    PASSWORD_VALIDATION --> PASSWORD_HASH

    LOGIN_USER --> TOKEN_VALIDATION
    LOGIN_USER --> JWT_GENERATION

    UPDATE_PROFILE --> PERMISSION_CHECK
    DELETE_USER --> PERMISSION_CHECK

    PASSWORD_HASH --> USERS_COLLECTION
    JWT_GENERATION --> SESSIONS_COLLECTION
    PERMISSION_CHECK --> AUDIT_LOG

    DUPLICATE_CHECK --> RATE_LIMITING
    RATE_LIMITING --> USERS_COLLECTION
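
The Create User path in the diagram (email validation, duplicate check, password validation, password hashing, insert) can be sketched as follows. The PBKDF2 parameters, the minimum password length, the loose email pattern, and the plain `dict` standing in for the Users Collection are all assumptions; rate limiting is left out for brevity.

```python
import hashlib
import hmac
import os
import re
from typing import Optional

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose format check
PBKDF2_ITERATIONS = 200_000  # assumption; tune to your hardware and policy


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt$digest' (both hex) using PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def create_user(users: dict, email: str, password: str) -> dict:
    """Validate the input, reject duplicates, hash the password, and store the record."""
    email = email.strip().lower()
    if not EMAIL_RE.match(email):
        raise ValueError("invalid email address")
    if email in users:
        raise ValueError("email already registered")
    if len(password) < 8:
        raise ValueError("password must be at least 8 characters")
    users[email] = {"email": email, "password_hash": hash_password(password)}
    return users[email]
```

The stored record never contains the plaintext password, so the login path would call `verify_password` against `password_hash` before issuing a token.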

Training Job Operations

flowchart TD
    subgraph "Job Lifecycle"
        CREATE_JOB[Create Training Job]
        START_JOB[Start Training]
        MONITOR_JOB[Monitor Progress]
        COMPLETE_JOB[Complete Training]
        CLEANUP_JOB[Cleanup Resources]
    end

    subgraph "Data Operations"
        VALIDATE_CONFIG[Validate Configuration]
        STORE_CONFIG[Store Configuration]
        UPDATE_STATUS[Update Job Status]
        STORE_METRICS[Store Metrics]
        STORE_RESULTS[Store Results]
    end

    subgraph "Database Collections"
        TRAINING_JOBS[(Training Jobs)]
        CONFIGURATIONS[(Configurations)]
        METRICS[(Metrics)]
        RESULTS[(Results)]
    end

    CREATE_JOB --> VALIDATE_CONFIG
    VALIDATE_CONFIG --> STORE_CONFIG
    STORE_CONFIG --> TRAINING_JOBS
    STORE_CONFIG --> CONFIGURATIONS

    START_JOB --> UPDATE_STATUS
    MONITOR_JOB --> STORE_METRICS
    COMPLETE_JOB --> STORE_RESULTS

    UPDATE_STATUS --> TRAINING_JOBS
    STORE_METRICS --> METRICS
    STORE_RESULTS --> RESULTS

    CLEANUP_JOB --> TRAINING_JOBS
    CLEANUP_JOB --> CONFIGURATIONS
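
One way to keep the job lifecycle consistent is to allow only specific status transitions. The sketch below uses in-memory dictionaries in place of the MongoDB collections; the status names, the `num_rounds` field, and the `JobStore` class are hypothetical and only illustrate the create, start, monitor, and complete steps.

```python
from typing import Dict, List

# Allowed status transitions (assumed; the document does not list job states).
ALLOWED: Dict[str, set] = {
    "created": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


class JobStore:
    """In-memory stand-in for the Training Jobs, Configurations, Metrics and Results collections."""

    def __init__(self) -> None:
        self.jobs: Dict[str, dict] = {}
        self.configurations: Dict[str, dict] = {}
        self.metrics: List[dict] = []
        self.results: Dict[str, dict] = {}
        self._next_id = 1

    def create_job(self, config: dict) -> str:
        """Validate the configuration, then store the job and its configuration."""
        if config.get("num_rounds", 0) < 1:
            raise ValueError("config must set num_rounds >= 1")
        job_id = f"job-{self._next_id}"
        self._next_id += 1
        self.configurations[job_id] = dict(config)
        self.jobs[job_id] = {"status": "created"}
        return job_id

    def update_status(self, job_id: str, status: str) -> None:
        """Move a job to a new status, rejecting transitions that are not allowed."""
        current = self.jobs[job_id]["status"]
        if status not in ALLOWED[current]:
            raise ValueError(f"cannot move job from {current!r} to {status!r}")
        self.jobs[job_id]["status"] = status

    def store_metrics(self, job_id: str, round_number: int, values: dict) -> None:
        """Record per-round metrics; only running jobs may report them."""
        if self.jobs[job_id]["status"] != "running":
            raise ValueError("metrics can only be stored for a running job")
        self.metrics.append({"job_id": job_id, "round": round_number, **values})

    def complete(self, job_id: str, results: dict) -> None:
        """Mark the job complete and store its final results."""
        self.update_status(job_id, "completed")
        self.results[job_id] = dict(results)
```

Rejecting invalid transitions in one place means a late progress message cannot move a completed job back to running.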

File Upload and Processing Flow

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant FileValidator
    participant FileSystem
    participant Database
    participant AnsibleService

    User->>Frontend: Select ML Project ZIP
    Frontend->>Frontend: Client-side Validation
    Frontend->>Backend: Upload File (multipart/form-data)
    Backend->>FileValidator: Validate File Type
    FileValidator-->>Backend: Validation Result

    alt Valid File
        Backend->>FileSystem: Store Temporary File
        FileSystem-->>Backend: File Path
        Backend->>FileValidator: Extract & Validate Structure
        FileValidator->>FileValidator: Check pyproject.toml
        FileValidator->>FileValidator: Validate Dependencies
        FileValidator-->>Backend: Structure Valid

        Backend->>FileSystem: Move to Permanent Location
        Backend->>Database: Store File Metadata
        Database-->>Backend: Metadata Stored
        Backend->>AnsibleService: Generate Templates
        AnsibleService-->>Backend: Templates Ready
        Backend-->>Frontend: Upload Success
        Frontend-->>User: Success Notification
    else Invalid File
        Backend->>FileSystem: Delete Temporary File
        Backend-->>Frontend: Validation Error
        Frontend-->>User: Error Message
    end
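
The two branches of this sequence (keep the file when it is valid, delete the temporary copy when it is not) can be sketched with the standard library. The function name `handle_upload`, the specific structure checks, and the destination naming are assumptions, and the metadata and template steps are omitted.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def handle_upload(data: bytes, filename: str, permanent_dir: Path) -> Path:
    """Store the upload in a temporary file, validate it, then keep or discard it."""
    if not filename.lower().endswith(".zip"):
        raise ValueError("only .zip uploads are accepted")

    # Store the upload as a temporary file first.
    with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
        tmp.write(data)
        temp_path = Path(tmp.name)

    try:
        if not zipfile.is_zipfile(temp_path):
            raise ValueError("file is not a valid ZIP archive")
        with zipfile.ZipFile(temp_path) as archive:
            names = archive.namelist()
        # Reject absolute paths and parent-directory references before any extraction.
        if any(name.startswith("/") or ".." in Path(name).parts for name in names):
            raise ValueError("archive contains unsafe paths")
        if not any(Path(name).name == "pyproject.toml" for name in names):
            raise ValueError("archive does not contain pyproject.toml")

        # Valid branch: move the file to its permanent location.
        permanent_dir.mkdir(parents=True, exist_ok=True)
        destination = permanent_dir / Path(filename).name
        shutil.move(str(temp_path), str(destination))
        return destination
    except Exception:
        # Invalid branch: delete the temporary file, then report the error.
        temp_path.unlink(missing_ok=True)
        raise
```

In a FastAPI handler, the raised `ValueError` would typically be translated into a 4xx response carrying the validation message.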

WebSocket Communication Flow

sequenceDiagram
    participant Frontend
    participant WebSocketService
    participant Backend
    participant TrainingService
    participant Database
    participant FlowerService

    Frontend->>WebSocketService: Connect WebSocket
    WebSocketService->>Backend: Establish Connection
    Backend->>Backend: Authenticate Connection
    Backend-->>WebSocketService: Connection Established
    WebSocketService-->>Frontend: Connected

    Frontend->>WebSocketService: Subscribe to Training Updates
    WebSocketService->>Backend: Register Subscription
    Backend->>Backend: Add to Subscription List

    loop Training Progress Updates
        FlowerService->>TrainingService: Training Progress
        TrainingService->>Database: Update Job Status
        TrainingService->>Backend: Broadcast Update
        Backend->>WebSocketService: Send Update
        WebSocketService->>Frontend: Real-time Update
        Frontend->>Frontend: Update UI
    end

    Frontend->>WebSocketService: Unsubscribe
    WebSocketService->>Backend: Remove Subscription
    Backend->>Backend: Remove from List

    Frontend->>WebSocketService: Disconnect
    WebSocketService->>Backend: Close Connection
    Backend->>Backend: Cleanup Resources
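
The subscription list that the backend maintains can be modelled as a mapping from job ID to subscriber queues. The sketch below uses `asyncio.Queue` objects in place of real WebSocket connections; the `SubscriptionHub` class and its method names are hypothetical.

```python
import asyncio
from collections import defaultdict
from typing import Any, Dict, Set


class SubscriptionHub:
    """Tracks which connections are subscribed to which training job."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, job_id: str) -> asyncio.Queue:
        """Register a subscriber and return the queue it should read updates from."""
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[job_id].add(queue)
        return queue

    def unsubscribe(self, job_id: str, queue: asyncio.Queue) -> None:
        """Remove a subscriber and drop the job entry once nobody is listening."""
        self._subscribers[job_id].discard(queue)
        if not self._subscribers[job_id]:
            del self._subscribers[job_id]

    async def broadcast(self, job_id: str, update: Any) -> int:
        """Deliver an update to every subscriber of a job; return how many received it."""
        queues = list(self._subscribers.get(job_id, ()))
        for queue in queues:
            await queue.put(update)
        return len(queues)
```

In a real WebSocket handler, each connection would run a loop that awaits `queue.get()` and sends the update to the client, calling `unsubscribe` on disconnect so that resources are cleaned up.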

Error Handling and Recovery Flow

flowchart TD
    subgraph "Error Detection"
        API_ERROR[API Error]
        FL_ERROR[FL Training Error]
        NETWORK_ERROR[Network Error]
        VALIDATION_ERROR[Validation Error]
    end

    subgraph "Error Processing"
        ERROR_HANDLER[Error Handler]
        ERROR_LOGGER[Error Logger]
        ERROR_CLASSIFIER[Error Classifier]
        RECOVERY_STRATEGY[Recovery Strategy]
    end

    subgraph "Recovery Actions"
        RETRY_OPERATION[Retry Operation]
        FALLBACK_MODE[Fallback Mode]
        USER_NOTIFICATION[User Notification]
        ADMIN_ALERT[Admin Alert]
    end

    subgraph "Monitoring"
        ERROR_METRICS[Error Metrics]
        ALERT_SYSTEM[Alert System]
        DASHBOARD_UPDATE[Dashboard Update]
    end

    API_ERROR --> ERROR_HANDLER
    FL_ERROR --> ERROR_HANDLER
    NETWORK_ERROR --> ERROR_HANDLER
    VALIDATION_ERROR --> ERROR_HANDLER

    ERROR_HANDLER --> ERROR_LOGGER
    ERROR_HANDLER --> ERROR_CLASSIFIER
    ERROR_CLASSIFIER --> RECOVERY_STRATEGY

    RECOVERY_STRATEGY --> RETRY_OPERATION
    RECOVERY_STRATEGY --> FALLBACK_MODE
    RECOVERY_STRATEGY --> USER_NOTIFICATION
    RECOVERY_STRATEGY --> ADMIN_ALERT

    ERROR_LOGGER --> ERROR_METRICS
    ERROR_METRICS --> ALERT_SYSTEM
    ALERT_SYSTEM --> DASHBOARD_UPDATE
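
The classify-then-recover path can be illustrated with a small retry helper. The mapping from exception types to recovery actions, the backoff parameters, and the function names below are assumptions chosen for the example, not the platform's actual policy.

```python
import logging
import time

logger = logging.getLogger("error_flow")


def classify(error: Exception) -> str:
    """Map an exception to a recovery action (the categories are illustrative)."""
    if isinstance(error, (ConnectionError, TimeoutError)):
        return "retry"  # transient network problems
    if isinstance(error, ValueError):
        return "notify_user"  # validation problems the user can fix
    return "alert_admin"  # anything unexpected


def run_with_recovery(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "ok", "result": operation(), "attempts": attempt}
        except Exception as error:
            action = classify(error)
            logger.warning("attempt %d failed: %r -> %s", attempt, error, action)
            if action == "retry" and attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))  # 0.1 s, 0.2 s, 0.4 s, ...
                continue
            if action == "retry":
                action = "alert_admin"  # retries exhausted, so escalate
            return {"status": "failed", "action": action, "attempts": attempt}
```

Passing `sleep` as a parameter keeps the helper testable without real delays; the log records are where error metrics and alerts would hook in.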

Data Security and Privacy Flow

flowchart TD
    subgraph "Data Sources"
        USER_DATA[User Input Data]
        CLIENT_DATA[Client Training Data]
        MODEL_DATA[Model Parameters]
        SYSTEM_DATA[System Metrics]
    end

    subgraph "Security Layers"
        INPUT_VALIDATION[Input Validation]
        ENCRYPTION[Data Encryption]
        ACCESS_CONTROL[Access Control]
        AUDIT_LOGGING[Audit Logging]
    end

    subgraph "Privacy Protection"
        DATA_MINIMIZATION[Data Minimization]
        FEDERATED_LEARNING[Federated Learning<br/>No Raw Data Sharing]
        DIFFERENTIAL_PRIVACY[Differential Privacy<br/>Future Enhancement]
        SECURE_AGGREGATION[Secure Aggregation]
    end

    subgraph "Compliance"
        GDPR_COMPLIANCE[GDPR Compliance]
        DATA_RETENTION[Data Retention Policy]
        RIGHT_TO_DELETE[Right to Delete]
        CONSENT_MANAGEMENT[Consent Management]
    end

    USER_DATA --> INPUT_VALIDATION
    CLIENT_DATA --> FEDERATED_LEARNING
    MODEL_DATA --> SECURE_AGGREGATION
    SYSTEM_DATA --> AUDIT_LOGGING

    INPUT_VALIDATION --> ENCRYPTION
    FEDERATED_LEARNING --> DATA_MINIMIZATION
    SECURE_AGGREGATION --> ACCESS_CONTROL

    ENCRYPTION --> GDPR_COMPLIANCE
    DATA_MINIMIZATION --> DATA_RETENTION
    ACCESS_CONTROL --> RIGHT_TO_DELETE
    AUDIT_LOGGING --> CONSENT_MANAGEMENT
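
The document does not describe how Secure Aggregation is realized. As an illustration of the general idea only, the sketch below uses pairwise additive masking: each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum and the server learns only the aggregate. Everything here is an assumption, including the integer encoding and the modulus. The seeded `random.Random` merely stands in for per-pair secrets that a real protocol would derive through key agreement with a cryptographically secure generator.

```python
import random
from typing import Dict, List

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def pairwise_masks(client_ids: List[str], length: int, seed: int = 0) -> Dict[str, List[int]]:
    """Build one mask per client such that all masks sum to zero modulo MODULUS."""
    rng = random.Random(seed)  # NOT cryptographically secure; illustration only
    masks = {cid: [0] * length for cid in client_ids}
    ordered = sorted(client_ids)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k in range(length):
                masks[first][k] = (masks[first][k] + shared[k]) % MODULUS
                masks[second][k] = (masks[second][k] - shared[k]) % MODULUS
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """Client side: hide an integer-encoded update behind the client's mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate_masked(masked_updates: List[List[int]]) -> List[int]:
    """Server side: sum the masked updates; the pairwise masks cancel out."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]
```

The server in this sketch sees only masked vectors and their sum. Handling client dropouts, which practical protocols must address, is out of scope here.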

Performance Optimization Flow

flowchart LR
    subgraph "Performance Monitoring"
        METRICS_COLLECTION[Metrics Collection]
        PERFORMANCE_ANALYSIS[Performance Analysis]
        BOTTLENECK_DETECTION[Bottleneck Detection]
    end

    subgraph "Optimization Strategies"
        CACHING[Caching Strategy]
        CONNECTION_POOLING[Connection Pooling]
        ASYNC_PROCESSING[Async Processing]
        LOAD_BALANCING[Load Balancing]
    end

    subgraph "Implementation"
        REDIS_CACHE[Redis Cache]
        MONGO_POOL[MongoDB Pool]
        ASYNC_TASKS[Async Tasks]
        NGINX_LB[Nginx Load Balancer]
    end

    METRICS_COLLECTION --> PERFORMANCE_ANALYSIS
    PERFORMANCE_ANALYSIS --> BOTTLENECK_DETECTION

    BOTTLENECK_DETECTION --> CACHING
    BOTTLENECK_DETECTION --> CONNECTION_POOLING
    BOTTLENECK_DETECTION --> ASYNC_PROCESSING
    BOTTLENECK_DETECTION --> LOAD_BALANCING

    CACHING --> REDIS_CACHE
    CONNECTION_POOLING --> MONGO_POOL
    ASYNC_PROCESSING --> ASYNC_TASKS
    LOAD_BALANCING --> NGINX_LB
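
The caching strategy can be illustrated with a small time-to-live cache. The class below is an in-memory stand-in for the role the Redis cache plays in the diagram (comparable to setting a key with an expiry); the `TTLCache` name, its methods, and the injectable clock are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed time to live."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value, calling loader() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop an entry, for example after the underlying record changes."""
        self._entries.pop(key, None)
```

A typical use would wrap an expensive MongoDB query, for instance `cache.get_or_load(job_id, lambda: load_job(job_id))`, where `load_job` is a hypothetical query function.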

Next: Continue to Network Architecture to understand the network topology and communication protocols.