Data Flow Architecture
This document describes how data flows through the Federated Learning Platform, including user interactions, federated learning processes, and system monitoring data.
Overall Data Flow
flowchart TD
subgraph "User Interface Layer"
USER[User]
BROWSER[Web Browser]
end
subgraph "API Gateway Layer"
NGINX[Nginx Reverse Proxy]
CORS[CORS Middleware]
AUTH[Authentication Middleware]
end
subgraph "Application Layer"
FRONTEND[Next.js Frontend]
BACKEND[FastAPI Backend]
WS[WebSocket Service]
end
subgraph "Data Processing Layer"
VALIDATION[Data Validation]
BUSINESS_LOGIC[Business Logic]
FILE_HANDLER[File Upload Handler]
end
subgraph "Federated Learning Layer"
SUPERLINK[Flower Superlink]
AGGREGATOR[FL Aggregator]
CLIENTS[FL Clients]
end
subgraph "Persistence Layer"
MONGODB[(MongoDB)]
FILE_SYSTEM[File System]
MODEL_STORAGE[Model Storage]
end
subgraph "Observability Layer"
OTEL[OpenTelemetry]
TEMPO[Tempo]
GRAFANA[Grafana]
end
USER --> BROWSER
BROWSER --> NGINX
NGINX --> CORS
CORS --> AUTH
AUTH --> FRONTEND
AUTH --> BACKEND
FRONTEND --> BACKEND
BACKEND --> WS
BACKEND --> VALIDATION
VALIDATION --> BUSINESS_LOGIC
BUSINESS_LOGIC --> FILE_HANDLER
BACKEND --> SUPERLINK
SUPERLINK --> AGGREGATOR
AGGREGATOR --> CLIENTS
BUSINESS_LOGIC --> MONGODB
FILE_HANDLER --> FILE_SYSTEM
AGGREGATOR --> MODEL_STORAGE
BACKEND --> OTEL
AGGREGATOR --> OTEL
CLIENTS --> OTEL
OTEL --> TEMPO
TEMPO --> GRAFANA
User Authentication Flow
sequenceDiagram
participant User
participant Frontend
participant Backend
participant AuthService
participant Database
participant JWT
User->>Frontend: Enter Credentials
Frontend->>Frontend: Validate Input
Frontend->>Backend: POST /auth/login
Backend->>AuthService: Validate Credentials
AuthService->>Database: Query User
Database-->>AuthService: User Record
AuthService->>AuthService: Verify Password
AuthService->>JWT: Generate Token
JWT-->>AuthService: JWT Token
AuthService-->>Backend: Token + User Data
Backend-->>Frontend: Authentication Response
Frontend->>Frontend: Store Token in Memory
Frontend->>Frontend: Update Auth Context
Frontend-->>User: Redirect to Dashboard
Note over Frontend,Database: Subsequent API Calls
Frontend->>Backend: API Request + JWT Header
Backend->>AuthService: Validate JWT
AuthService->>JWT: Verify Token
JWT-->>AuthService: Token Claims
AuthService-->>Backend: User Context
Backend->>Backend: Process Request
Backend-->>Frontend: API Response
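The token steps in this sequence (Generate Token, Verify Token, Token Claims) can be sketched with the standard library alone. This is a minimal illustration assuming HS256 signing; the `SECRET` value, the claim names, and the function names are hypothetical and not taken from the platform's code, which may well use a JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical signing key; load from configuration in practice


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    """HMAC-SHA256 over 'header.payload', base64url-encoded."""
    digest = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def generate_token(user_id: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Build a signed token carrying the user id and an expiry timestamp."""
    issued = int(now if now is not None else time.time())
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps({"sub": user_id, "exp": issued + ttl_seconds}).encode("utf-8"))
    return f"{header}.{payload}.{_sign(header + '.' + payload)}"


def verify_token(token: str, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry are valid; raise ValueError otherwise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(header + "." + payload)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    current = now if now is not None else time.time()
    if claims["exp"] <= current:
        raise ValueError("token expired")
    return claims
```

The `now` parameter exists only to make expiry testable; a backend would call `verify_token(token)` on every request carrying the JWT header and pass the resulting claims on as the user context.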
Project Configuration Flow
flowchart TD
subgraph "Configuration Upload"
USER_UPLOAD[User Uploads<br/>ML Project ZIP]
FRONTEND_UPLOAD[Frontend<br/>File Upload Component]
BACKEND_UPLOAD[Backend<br/>File Upload Handler]
FILE_VALIDATION[File Validation<br/>ZIP Structure Check]
FILE_STORAGE[File System<br/>Storage]
end
subgraph "Configuration Processing"
EXTRACT[Extract ZIP<br/>Contents]
VALIDATE_CONFIG[Validate<br/>pyproject.toml]
PARSE_CONFIG[Parse Configuration<br/>Dependencies & Settings]
STORE_METADATA[Store Metadata<br/>in MongoDB]
end
subgraph "Deployment Preparation"
ANSIBLE_TEMPLATE[Generate Ansible<br/>Templates]
DOCKER_CONFIG[Generate Docker<br/>Configurations]
INVENTORY_UPDATE[Update Ansible<br/>Inventory]
end
USER_UPLOAD --> FRONTEND_UPLOAD
FRONTEND_UPLOAD --> BACKEND_UPLOAD
BACKEND_UPLOAD --> FILE_VALIDATION
FILE_VALIDATION --> FILE_STORAGE
FILE_STORAGE --> EXTRACT
EXTRACT --> VALIDATE_CONFIG
VALIDATE_CONFIG --> PARSE_CONFIG
PARSE_CONFIG --> STORE_METADATA
STORE_METADATA --> ANSIBLE_TEMPLATE
ANSIBLE_TEMPLATE --> DOCKER_CONFIG
DOCKER_CONFIG --> INVENTORY_UPDATE
Federated Learning Training Flow
sequenceDiagram
participant User
participant Frontend
participant Backend
participant Superlink
participant Aggregator
participant Supernode
participant Client
participant Database
participant Monitoring
User->>Frontend: Start Training
Frontend->>Backend: POST /training/start
Backend->>Database: Create Training Job
Database-->>Backend: Job ID
Backend->>Superlink: Initialize FL Session
Superlink->>Aggregator: Start Server App
Aggregator->>Aggregator: Initialize Global Model
Aggregator->>Monitoring: Log Initialization
Backend->>Supernode: Deploy Client Configs
Supernode->>Client: Start Client Apps
Client->>Client: Load Local Data Partition
Client->>Monitoring: Log Client Ready
loop Training Rounds
Aggregator->>Supernode: Broadcast Global Model
Supernode->>Client: Send Model Parameters
Client->>Client: Local Training
Client->>Monitoring: Log Training Metrics
Client->>Supernode: Send Model Updates
Supernode->>Aggregator: Forward Model Updates
Aggregator->>Aggregator: Aggregate Updates into Global Model
Aggregator->>Database: Update Job Status
Aggregator->>Backend: Training Progress
Backend->>Frontend: WebSocket Update
Frontend->>User: Display Progress
end
Aggregator->>Database: Mark Job Complete
Aggregator->>Monitoring: Log Final Metrics
Backend->>Frontend: Training Complete
Frontend->>User: Show Results
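The training-round loop above does not name an aggregation strategy. As an illustration only, the sketch below assumes weighted federated averaging (FedAvg), where each client's parameters are weighted by its number of local examples. Parameters are flattened to plain lists of floats, and the `run_rounds` helper and client callables are hypothetical stand-ins for the Supernode and client apps.

```python
from typing import Callable, List, Sequence, Tuple

# (flattened model parameters, number of local training examples)
ClientUpdate = Tuple[List[float], int]


def federated_average(updates: Sequence[ClientUpdate]) -> List[float]:
    """Combine client parameters, weighting each client by its example count."""
    if not updates:
        raise ValueError("no client updates to aggregate")
    total_examples = sum(count for _, count in updates)
    if total_examples <= 0:
        raise ValueError("clients reported no training examples")
    size = len(updates[0][0])
    if any(len(params) != size for params, _ in updates):
        raise ValueError("client updates have mismatched shapes")
    averaged = [0.0] * size
    for params, count in updates:
        weight = count / total_examples
        for index, value in enumerate(params):
            averaged[index] += weight * value
    return averaged


def run_rounds(
    global_model: List[float],
    clients: Sequence[Callable[[List[float]], ClientUpdate]],
    num_rounds: int,
) -> List[float]:
    """Broadcast the global model, collect local updates, aggregate, repeat."""
    for _ in range(num_rounds):
        # Each client receives its own copy so local training cannot mutate the global model.
        updates = [client(list(global_model)) for client in clients]
        global_model = federated_average(updates)
    return global_model
```

Only parameters and example counts cross the client boundary here; the local data partition stays inside each client callable, which is the property the diagram relies on.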
Real-Time Monitoring Data Flow
flowchart LR
subgraph "Data Sources"
FL_SERVER[FL Server<br/>Aggregator]
FL_CLIENTS[FL Clients<br/>Training Nodes]
BACKEND_API[Backend API<br/>FastAPI]
FRONTEND_APP[Frontend App<br/>Next.js]
end
subgraph "Telemetry Collection"
OTEL_COLLECTOR[OpenTelemetry<br/>Collector]
TRACE_PROCESSOR[Trace<br/>Processor]
METRIC_PROCESSOR[Metric<br/>Processor]
end
subgraph "Storage & Processing"
TEMPO_STORAGE[Tempo<br/>Trace Storage]
PROMETHEUS[Prometheus<br/>Metrics Storage]
GRAFANA_DB[Grafana<br/>Dashboard DB]
end
subgraph "Visualization"
GRAFANA_UI[Grafana<br/>Dashboards]
FRONTEND_CHARTS[Frontend<br/>Real-time Charts]
ALERTS[Alert<br/>Manager]
end
FL_SERVER --> OTEL_COLLECTOR
FL_CLIENTS --> OTEL_COLLECTOR
BACKEND_API --> OTEL_COLLECTOR
FRONTEND_APP --> OTEL_COLLECTOR
OTEL_COLLECTOR --> TRACE_PROCESSOR
OTEL_COLLECTOR --> METRIC_PROCESSOR
TRACE_PROCESSOR --> TEMPO_STORAGE
METRIC_PROCESSOR --> PROMETHEUS
TEMPO_STORAGE --> GRAFANA_UI
PROMETHEUS --> GRAFANA_UI
PROMETHEUS --> GRAFANA_DB
GRAFANA_UI --> ALERTS
BACKEND_API --> FRONTEND_CHARTS
Database Operations Flow
User Management Operations
flowchart TD
subgraph "User Operations"
CREATE_USER[Create User]
LOGIN_USER[Login User]
UPDATE_PROFILE[Update Profile]
DELETE_USER[Delete User]
end
subgraph "Validation Layer"
EMAIL_VALIDATION[Email Validation]
PASSWORD_VALIDATION[Password Validation]
DUPLICATE_CHECK[Duplicate Check]
PERMISSION_CHECK[Permission Check]
end
subgraph "Security Layer"
PASSWORD_HASH[Password Hashing]
JWT_GENERATION[JWT Generation]
TOKEN_VALIDATION[Token Validation]
RATE_LIMITING[Rate Limiting]
end
subgraph "Database Layer"
USERS_COLLECTION[(Users Collection)]
SESSIONS_COLLECTION[(Sessions Collection)]
AUDIT_LOG[(Audit Log)]
end
CREATE_USER --> EMAIL_VALIDATION
CREATE_USER --> PASSWORD_VALIDATION
EMAIL_VALIDATION --> DUPLICATE_CHECK
PASSWORD_VALIDATION --> PASSWORD_HASH
LOGIN_USER --> TOKEN_VALIDATION
LOGIN_USER --> JWT_GENERATION
UPDATE_PROFILE --> PERMISSION_CHECK
DELETE_USER --> PERMISSION_CHECK
PASSWORD_HASH --> USERS_COLLECTION
JWT_GENERATION --> SESSIONS_COLLECTION
PERMISSION_CHECK --> AUDIT_LOG
DUPLICATE_CHECK --> RATE_LIMITING
RATE_LIMITING --> USERS_COLLECTION
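The Create User path (email validation, password validation, duplicate check, password hashing, write to the users collection) can be sketched as follows. The diagram does not say which hashing scheme is used; this example assumes PBKDF2-HMAC-SHA256 from the standard library, and the in-memory `users` dictionary, the email pattern, the minimum password length, and the iteration count are all illustrative assumptions.

```python
import hashlib
import hmac
import os
import re
from typing import Dict, Optional

EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose check
ITERATIONS = 100_000  # illustrative work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Derive a salted PBKDF2 hash and pack it with its parameters for storage."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"pbkdf2_sha256${ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, hash_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(hash_hex))


def create_user(users: Dict[str, dict], email: str, password: str) -> None:
    """Validate input, reject duplicates, then store only the password hash."""
    email = email.strip().lower()
    if not EMAIL_PATTERN.match(email):
        raise ValueError("invalid email address")
    if len(password) < 8:
        raise ValueError("password must be at least 8 characters")
    if email in users:
        raise ValueError("email already registered")
    users[email] = {"email": email, "password_hash": hash_password(password)}
```

Rate limiting and audit logging from the diagram are omitted; in a MongoDB-backed service the duplicate check would normally be enforced by a unique index rather than by a lookup alone.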
Training Job Operations
flowchart TD
subgraph "Job Lifecycle"
CREATE_JOB[Create Training Job]
START_JOB[Start Training]
MONITOR_JOB[Monitor Progress]
COMPLETE_JOB[Complete Training]
CLEANUP_JOB[Cleanup Resources]
end
subgraph "Data Operations"
VALIDATE_CONFIG[Validate Configuration]
STORE_CONFIG[Store Configuration]
UPDATE_STATUS[Update Job Status]
STORE_METRICS[Store Metrics]
STORE_RESULTS[Store Results]
end
subgraph "Database Collections"
TRAINING_JOBS[(Training Jobs)]
CONFIGURATIONS[(Configurations)]
METRICS[(Metrics)]
RESULTS[(Results)]
end
CREATE_JOB --> VALIDATE_CONFIG
VALIDATE_CONFIG --> STORE_CONFIG
STORE_CONFIG --> TRAINING_JOBS
STORE_CONFIG --> CONFIGURATIONS
START_JOB --> UPDATE_STATUS
MONITOR_JOB --> STORE_METRICS
COMPLETE_JOB --> STORE_RESULTS
UPDATE_STATUS --> TRAINING_JOBS
STORE_METRICS --> METRICS
STORE_RESULTS --> RESULTS
CLEANUP_JOB --> TRAINING_JOBS
CLEANUP_JOB --> CONFIGURATIONS
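One way to keep the job lifecycle consistent is an explicit state machine that rejects out-of-order transitions. The sketch below is an assumption about how that could be modelled, not the platform's schema: the status names, the added `FAILED` state, and the `TrainingJob` class are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class JobStatus(str, Enum):
    CREATED = "created"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"  # not shown in the diagram; added so errors have somewhere to go
    CLEANED_UP = "cleaned_up"


# Which statuses may follow each status.
ALLOWED_TRANSITIONS: Dict[JobStatus, Set[JobStatus]] = {
    JobStatus.CREATED: {JobStatus.RUNNING, JobStatus.FAILED},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: {JobStatus.CLEANED_UP},
    JobStatus.FAILED: {JobStatus.CLEANED_UP},
    JobStatus.CLEANED_UP: set(),
}


@dataclass
class TrainingJob:
    job_id: str
    status: JobStatus = JobStatus.CREATED
    metrics: List[dict] = field(default_factory=list)

    def transition(self, new_status: JobStatus) -> None:
        """Move to a new status, refusing transitions the lifecycle does not allow."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot move job {self.job_id} from {self.status.value} to {new_status.value}"
            )
        self.status = new_status

    def record_metrics(self, round_number: int, values: Dict[str, float]) -> None:
        """Append per-round metrics; only meaningful while the job is running."""
        if self.status is not JobStatus.RUNNING:
            raise ValueError("metrics can only be recorded while the job is running")
        self.metrics.append({"round": round_number, **values})
```

Because `JobStatus` subclasses `str`, its values serialize directly into a document store, so the same enum can back both the status field in the training jobs collection and the API responses.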
File Upload and Processing Flow
sequenceDiagram
participant User
participant Frontend
participant Backend
participant FileValidator
participant FileSystem
participant Database
participant AnsibleService
User->>Frontend: Select ML Project ZIP
Frontend->>Frontend: Client-side Validation
Frontend->>Backend: Upload File (multipart/form-data)
Backend->>FileValidator: Validate File Type
FileValidator-->>Backend: Validation Result
alt Valid File
Backend->>FileSystem: Store Temporary File
FileSystem-->>Backend: File Path
Backend->>FileValidator: Extract & Validate Structure
FileValidator->>FileValidator: Check pyproject.toml
FileValidator->>FileValidator: Validate Dependencies
FileValidator-->>Backend: Structure Valid
Backend->>FileSystem: Move to Permanent Location
Backend->>Database: Store File Metadata
Database-->>Backend: Metadata Stored
Backend->>AnsibleService: Generate Templates
AnsibleService-->>Backend: Templates Ready
Backend-->>Frontend: Upload Success
Frontend-->>User: Success Notification
else Invalid File
Backend->>FileSystem: Delete Temporary File
Backend-->>Frontend: Validation Error
Frontend-->>User: Error Message
end
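The key property of this sequence is that an upload only reaches its permanent location after validation, and the temporary file is removed on any failure. A minimal sketch of that store, validate, then move pattern follows; the function name, the `.zip` extension check, and the root-level `pyproject.toml` requirement are assumptions made for illustration.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def store_upload(data: bytes, filename: str, permanent_dir: Path) -> Path:
    """Write the upload to a temp file, validate it, then move it into place."""
    # Cheap type check before anything touches the disk.
    if not filename.lower().endswith(".zip"):
        raise ValueError("only .zip uploads are accepted")
    permanent_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
        tmp.write(data)
        temp_path = Path(tmp.name)
    try:
        if not zipfile.is_zipfile(temp_path):
            raise ValueError("file is not a valid ZIP archive")
        with zipfile.ZipFile(temp_path) as archive:
            if "pyproject.toml" not in archive.namelist():
                raise ValueError("archive has no pyproject.toml at its root")
        # Path(...).name drops any directory components supplied by the client.
        target = permanent_dir / Path(filename).name
        shutil.move(str(temp_path), str(target))
        return target
    except Exception:
        # Mirrors the "Invalid File" branch: never leave the temp file behind.
        temp_path.unlink(missing_ok=True)
        raise
```

Storing metadata and generating Ansible templates would follow a successful return; keeping them outside this function means a database failure cannot leave a half-validated file in the permanent directory.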
WebSocket Communication Flow
sequenceDiagram
participant Frontend
participant WebSocketService
participant Backend
participant TrainingService
participant Database
participant FlowerService
Frontend->>WebSocketService: Connect WebSocket
WebSocketService->>Backend: Establish Connection
Backend->>Backend: Authenticate Connection
Backend-->>WebSocketService: Connection Established
WebSocketService-->>Frontend: Connected
Frontend->>WebSocketService: Subscribe to Training Updates
WebSocketService->>Backend: Register Subscription
Backend->>Backend: Add to Subscription List
loop Training Progress Updates
FlowerService->>TrainingService: Training Progress
TrainingService->>Database: Update Job Status
TrainingService->>Backend: Broadcast Update
Backend->>WebSocketService: Send Update
WebSocketService->>Frontend: Real-time Update
Frontend->>Frontend: Update UI
end
Frontend->>WebSocketService: Unsubscribe
WebSocketService->>Backend: Remove Subscription
Backend->>Backend: Remove from List
Frontend->>WebSocketService: Disconnect
WebSocketService->>Backend: Close Connection
Backend->>Backend: Cleanup Resources
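The subscribe, broadcast, and unsubscribe steps amount to a small publish/subscribe registry keyed by training job. The sketch below models each connection as an `asyncio.Queue` rather than a real WebSocket so that it runs without a web framework; the class and method names are hypothetical.

```python
import asyncio
from collections import defaultdict
from typing import DefaultDict, Set


class SubscriptionRegistry:
    """Tracks which connections want updates for which training job."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, Set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, job_id: str) -> asyncio.Queue:
        """Register a new subscriber and return the queue it should read from."""
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[job_id].add(queue)
        return queue

    def unsubscribe(self, job_id: str, queue: asyncio.Queue) -> None:
        """Remove a subscriber and drop the job entry once nobody is listening."""
        self._subscribers[job_id].discard(queue)
        if not self._subscribers[job_id]:
            del self._subscribers[job_id]

    async def broadcast(self, job_id: str, update: dict) -> int:
        """Deliver an update to every subscriber of the job; return how many received it."""
        # Copy first so a subscriber leaving mid-broadcast cannot break iteration.
        queues = list(self._subscribers.get(job_id, ()))
        for queue in queues:
            await queue.put(update)
        return len(queues)
```

In a WebSocket handler, one task per connection would await `queue.get()` and forward each item to the client, calling `unsubscribe` in a `finally` block so that dropped connections are cleaned up as in the last steps of the sequence.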
Error Handling and Recovery Flow
flowchart TD
subgraph "Error Detection"
API_ERROR[API Error]
FL_ERROR[FL Training Error]
NETWORK_ERROR[Network Error]
VALIDATION_ERROR[Validation Error]
end
subgraph "Error Processing"
ERROR_HANDLER[Error Handler]
ERROR_LOGGER[Error Logger]
ERROR_CLASSIFIER[Error Classifier]
RECOVERY_STRATEGY[Recovery Strategy]
end
subgraph "Recovery Actions"
RETRY_OPERATION[Retry Operation]
FALLBACK_MODE[Fallback Mode]
USER_NOTIFICATION[User Notification]
ADMIN_ALERT[Admin Alert]
end
subgraph "Monitoring"
ERROR_METRICS[Error Metrics]
ALERT_SYSTEM[Alert System]
DASHBOARD_UPDATE[Dashboard Update]
end
API_ERROR --> ERROR_HANDLER
FL_ERROR --> ERROR_HANDLER
NETWORK_ERROR --> ERROR_HANDLER
VALIDATION_ERROR --> ERROR_HANDLER
ERROR_HANDLER --> ERROR_LOGGER
ERROR_HANDLER --> ERROR_CLASSIFIER
ERROR_CLASSIFIER --> RECOVERY_STRATEGY
RECOVERY_STRATEGY --> RETRY_OPERATION
RECOVERY_STRATEGY --> FALLBACK_MODE
RECOVERY_STRATEGY --> USER_NOTIFICATION
RECOVERY_STRATEGY --> ADMIN_ALERT
ERROR_LOGGER --> ERROR_METRICS
ERROR_METRICS --> ALERT_SYSTEM
ALERT_SYSTEM --> DASHBOARD_UPDATE
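The classifier and recovery-strategy steps can be illustrated with a small retry helper. The mapping from exception types to strategies below is an assumption chosen for the example (network errors retry, validation errors go back to the user, everything else alerts an administrator), and fallback mode is omitted.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class Recovery(Enum):
    RETRY = "retry"
    NOTIFY_USER = "notify_user"
    ALERT_ADMIN = "alert_admin"


def classify(error: Exception) -> Recovery:
    """Map an exception to a recovery strategy (illustrative mapping)."""
    if isinstance(error, (ConnectionError, TimeoutError)):
        return Recovery.RETRY  # transient network problems
    if isinstance(error, ValueError):
        return Recovery.NOTIFY_USER  # bad input; retrying will not help
    return Recovery.ALERT_ADMIN  # anything unexpected


def run_with_recovery(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run an operation, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Non-retryable errors, and the final failed attempt, propagate to the caller.
            if classify(error) is not Recovery.RETRY or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so tests can run without waiting; a caller catching the re-raised exception can call `classify` again to decide between a user notification and an admin alert.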
Data Security and Privacy Flow
flowchart TD
subgraph "Data Sources"
USER_DATA[User Input Data]
CLIENT_DATA[Client Training Data]
MODEL_DATA[Model Parameters]
SYSTEM_DATA[System Metrics]
end
subgraph "Security Layers"
INPUT_VALIDATION[Input Validation]
ENCRYPTION[Data Encryption]
ACCESS_CONTROL[Access Control]
AUDIT_LOGGING[Audit Logging]
end
subgraph "Privacy Protection"
DATA_MINIMIZATION[Data Minimization]
FEDERATED_LEARNING[Federated Learning<br/>No Raw Data Sharing]
DIFFERENTIAL_PRIVACY[Differential Privacy<br/>Future Enhancement]
SECURE_AGGREGATION[Secure Aggregation]
end
subgraph "Compliance"
GDPR_COMPLIANCE[GDPR Compliance]
DATA_RETENTION[Data Retention Policy]
RIGHT_TO_DELETE[Right to Delete]
CONSENT_MANAGEMENT[Consent Management]
end
USER_DATA --> INPUT_VALIDATION
CLIENT_DATA --> FEDERATED_LEARNING
MODEL_DATA --> SECURE_AGGREGATION
SYSTEM_DATA --> AUDIT_LOGGING
INPUT_VALIDATION --> ENCRYPTION
FEDERATED_LEARNING --> DATA_MINIMIZATION
SECURE_AGGREGATION --> ACCESS_CONTROL
ENCRYPTION --> GDPR_COMPLIANCE
DATA_MINIMIZATION --> DATA_RETENTION
ACCESS_CONTROL --> RIGHT_TO_DELETE
AUDIT_LOGGING --> CONSENT_MANAGEMENT
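The diagram lists secure aggregation without describing a protocol. To show the general idea only, the sketch below uses pairwise additive masking: every pair of clients shares a random mask that one adds and the other subtracts, so individual updates are hidden but the masks cancel in the sum. This is a toy model under strong assumptions: updates are integers modulo `MODULUS`, a single seeded generator stands in for per-pair shared secrets, and client dropouts are not handled.

```python
import random
from typing import Dict, List

MODULUS = 2**32  # all arithmetic is done modulo this value


def mask_updates(updates: Dict[str, List[int]], seed: int = 0) -> Dict[str, List[int]]:
    """Return masked copies of each client's update; the masks cancel when summed."""
    if not updates:
        raise ValueError("no updates to mask")
    clients = sorted(updates)
    size = len(updates[clients[0]])
    masked = {client: [v % MODULUS for v in values] for client, values in updates.items()}
    rng = random.Random(seed)  # stand-in for secrets agreed between each pair of clients
    for index, first in enumerate(clients):
        for second in clients[index + 1:]:
            mask = [rng.randrange(MODULUS) for _ in range(size)]
            # The first client adds the pair's mask, the second subtracts the same mask.
            masked[first] = [(v + m) % MODULUS for v, m in zip(masked[first], mask)]
            masked[second] = [(v - m) % MODULUS for v, m in zip(masked[second], mask)]
    return masked


def aggregate(masked: Dict[str, List[int]]) -> List[int]:
    """Sum the masked updates; the result equals the sum of the original updates."""
    size = len(next(iter(masked.values())))
    return [sum(values[i] for values in masked.values()) % MODULUS for i in range(size)]
```

Production protocols derive the masks through key agreement between clients and add recovery for clients that drop out mid-round, so this should be read as an explanation of why the server can learn the sum without learning any single update, not as a usable implementation.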
Performance Optimization Flow
flowchart LR
subgraph "Performance Monitoring"
METRICS_COLLECTION[Metrics Collection]
PERFORMANCE_ANALYSIS[Performance Analysis]
BOTTLENECK_DETECTION[Bottleneck Detection]
end
subgraph "Optimization Strategies"
CACHING[Caching Strategy]
CONNECTION_POOLING[Connection Pooling]
ASYNC_PROCESSING[Async Processing]
LOAD_BALANCING[Load Balancing]
end
subgraph "Implementation"
REDIS_CACHE[Redis Cache]
MONGO_POOL[MongoDB Pool]
ASYNC_TASKS[Async Tasks]
NGINX_LB[Nginx Load Balancer]
end
METRICS_COLLECTION --> PERFORMANCE_ANALYSIS
PERFORMANCE_ANALYSIS --> BOTTLENECK_DETECTION
BOTTLENECK_DETECTION --> CACHING
BOTTLENECK_DETECTION --> CONNECTION_POOLING
BOTTLENECK_DETECTION --> ASYNC_PROCESSING
BOTTLENECK_DETECTION --> LOAD_BALANCING
CACHING --> REDIS_CACHE
CONNECTION_POOLING --> MONGO_POOL
ASYNC_PROCESSING --> ASYNC_TASKS
LOAD_BALANCING --> NGINX_LB
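The caching branch of this flow typically follows a cache-aside pattern: read from the cache, and on a miss load from the slower store and remember the result for a limited time. The sketch below uses an in-process dictionary as a stand-in for Redis so that it runs without external services; the class name and the injectable clock are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside helper with per-entry expiry (in-memory stand-in for Redis)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh; otherwise call loader and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example after the underlying record is updated."""
        self._entries.pop(key, None)
```

With Redis the same shape maps onto a read followed by a write with an expiry; the important design point is calling `invalidate` whenever the backing record changes, so that stale job status is not served for the rest of the TTL.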
Next: Continue to Network Architecture to understand the network topology and communication protocols.