FedPilot

Distributed & Trustworthy AI

Trustworthy AI at Scale

FedPilot is a Ray-backed platform for topology-aware federated learning and Trustworthy AI research. Define schemas, virtual nodes, and adaptation policies before the runtime materializes actors and placement groups.

Train globally, keep data local

Federated learning coordinates model updates across clients without centralizing raw data — the foundation of privacy-preserving distributed AI.
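As a rough illustration of how model updates can be combined without moving raw data, the sketch below uses FedAvg-style weighted averaging, one common aggregation rule. The source does not say which aggregator FedPilot uses; the function name `federated_average` and the toy values are assumptions made for this example.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local sample count.

    Only parameters and sample counts reach the server; raw data stays local.
    (Hypothetical helper, not part of FedPilot's API.)
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all client updates must have the same length")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Three hypothetical clients holding 10, 30, and 60 local samples.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]
global_weights = federated_average(updates, sizes)
```

Clients with more local samples pull the global parameters further toward their own update, which is why non-IID data usually calls for additional measures such as those listed below.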

Cross-silo & cross-device: Institutions to edge fleets
Non-IID & heterogeneity: FedProx, sampling, robust aggregation
Vision & NLP models: CNN, ResNet, BERT, custom PyTorch
Privacy budgets: DP-SGD, secure aggregation

Four core contributions

FedPilot treats distributed-systems concerns as first-class rather than hiding them behind FL-only abstractions.

Layered architecture

Schema, core, communication, infrastructure, and observability layers stay cleanly separated.

Systems abstractions

Lazy virtual-node materialization, topology-aware routing, and ICRF as a core primitive.
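Lazy materialization can be pictured as a handle that stores a recipe and only builds the real worker on first use. The sketch below is a minimal, framework-free illustration; the `VirtualNode` class, its `get` method, and the factory are hypothetical names for this example and are not FedPilot's actual API. In a Ray-backed runtime the factory would presumably create an actor, but that step is left abstract here.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class VirtualNode(Generic[T]):
    """Hypothetical handle that defers creating its backing worker."""

    def __init__(self, name: str, factory: Callable[[], T]) -> None:
        self.name = name
        self._factory = factory
        self._instance: Optional[T] = None
        self._materialized = False

    @property
    def materialized(self) -> bool:
        return self._materialized

    def get(self) -> T:
        # Build the worker on first access, then reuse the same instance.
        if not self._materialized:
            self._instance = self._factory()
            self._materialized = True
        return self._instance  # type: ignore[return-value]


created = []


def make_worker() -> dict:
    # Stand-in for an expensive step such as starting a remote worker.
    created.append("worker")
    return {"role": "trainer"}


node = VirtualNode("client-0", make_worker)
before = node.materialized   # nothing has been built yet
first = node.get()           # factory runs here
second = node.get()          # cached instance is returned
```

Declaring many such handles is cheap, so a topology can be described in full before any resources are reserved.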

Topology adaptation

Data-driven clustering from label distributions drives placement and horizontal scaling.
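To make the idea of clustering from label distributions concrete, the sketch below groups clients whose normalized label histograms are close in L1 distance. The greedy threshold rule, the function names, and the toy labels are assumptions chosen for brevity; the source does not specify which clustering algorithm FedPilot applies.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def label_distribution(labels: Sequence[int], num_classes: int) -> List[float]:
    """Normalized label histogram for one client."""
    if not labels:
        raise ValueError("client has no labels")
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(num_classes)]


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    return sum(abs(a - b) for a, b in zip(p, q))


def cluster_clients(
    distributions: Dict[str, List[float]], threshold: float
) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed distribution
    lies within `threshold`, otherwise start a new cluster."""
    clusters: List[Tuple[List[float], List[str]]] = []
    for client_id, dist in distributions.items():
        for seed, members in clusters:
            if l1_distance(seed, dist) <= threshold:
                members.append(client_id)
                break
        else:
            clusters.append((dist, [client_id]))
    return [members for _, members in clusters]


# Hypothetical clients: two dominated by class 0, two by class 2.
client_labels = {
    "client_a": [0, 0, 0, 1],
    "client_b": [0, 0, 1, 0],
    "client_c": [2, 2, 2, 1],
    "client_d": [2, 2, 2, 2],
}
dists = {cid: label_distribution(lbls, 3) for cid, lbls in client_labels.items()}
clusters = cluster_clients(dists, threshold=0.6)
```

The resulting groups could then feed a placement decision, for example co-locating clients with similar data on the same cluster.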

Grounded observability

OpenTelemetry, Prometheus, Grafana, and Streamlit capture pressure and network I/O as experiment artifacts.

Inter-Cluster Ray Fabric (ICRF)

The ICRF is the spine of multi-cluster federation: one logical graph, hybrid transport chosen automatically per hop.

Intra-cluster: Ray shared memory
Inter-cluster: HTTP / Ray Serve

Clustering wires the fabric; HybridAdjacencyMatrix encodes routes; HybridTopologyManager enforces them at runtime.
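The per-hop transport choice can be sketched as a matrix keyed by node pairs: nodes in the same cluster use shared memory, nodes in different clusters use HTTP. This is only an illustration of the idea; `Transport` and `build_transport_matrix` are hypothetical names, the fully connected graph is an assumption, and nothing here reflects the real interface of `HybridAdjacencyMatrix` or `HybridTopologyManager`.

```python
from enum import Enum
from typing import Dict, Optional


class Transport(Enum):
    SHARED_MEMORY = "ray-shared-memory"  # intra-cluster hop
    HTTP = "http-ray-serve"              # inter-cluster hop


def build_transport_matrix(
    node_cluster: Dict[str, str],
) -> Dict[str, Dict[str, Optional[Transport]]]:
    """Map every ordered node pair to a transport (hypothetical helper).

    Assumes a fully connected graph; a node has no route to itself (None).
    """
    matrix: Dict[str, Dict[str, Optional[Transport]]] = {}
    for src, src_cluster in node_cluster.items():
        row: Dict[str, Optional[Transport]] = {}
        for dst, dst_cluster in node_cluster.items():
            if src == dst:
                row[dst] = None
            elif src_cluster == dst_cluster:
                row[dst] = Transport.SHARED_MEMORY
            else:
                row[dst] = Transport.HTTP
        matrix[src] = row
    return matrix


# Hypothetical assignment produced by a clustering step.
assignment = {"n0": "cluster-a", "n1": "cluster-a", "n2": "cluster-b"}
routes = build_transport_matrix(assignment)
```

Looking up `routes["n0"]["n1"]` gives the shared-memory transport, while `routes["n0"]["n2"]` gives HTTP, so callers address one logical graph and never pick a transport themselves.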


Explore by operational layer

The docs mirror how FedPilot runs in production — from boot configuration through telemetry.

Ready to run an experiment?

Install FedPilot, configure a topology, and ship reproducible FL runs on a laptop or Ray cluster.