Model Registry & Datasets
FedPilot keeps data loading and model instantiation separate from the core execution logic. Every model and dataset is selected by a string key in config.yaml, so researchers can swap architectures and datasets without changing any training code.
Built-In Models & Performance Characteristics
FedPilot ships with ten built-in vision and language models, ranging from roughly 60K to 138M parameters, so experiments can scale from quick prototypes to heavy benchmarks:
| Model Type | Parameters | Typical Use Case | Training Speed | Memory (Batch 64) |
|---|---|---|---|---|
| cnn | ~200K | Baseline prototyping | Very Fast | ~600MB |
| lenet | ~60K | Embedded constraints | Very Fast | ~300MB |
| mobilenet | ~4M | Edge devices | Fast | ~1GB |
| resnet18 | ~11M | Standard benchmark | Fast | ~3GB |
| resnet50 | ~25M | Fine-grained tasks | Medium | ~6GB |
| vgg16 | ~138M | Heavy CV transfer | Slow | ~10GB |
| vit_small | ~22M | Vision Transformer | Medium | ~4GB |
| swin_base | ~87M | Advanced SOTA CV | Slow | ~8GB |
| bert | ~110M | NLP (Base) | Slow | ~6GB |
| albert | ~12M | Parameter-efficient NLP | Medium | ~3GB |
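As a rough illustration of how the table can guide model choice, the sketch below filters the built-in keys by a memory budget. The figures are copied from the table and are approximate; `MODEL_MEMORY_GB` and `models_within_budget` are hypothetical helper names, not part of FedPilot.

```python
# Approximate memory at batch size 64, in GB, copied from the table above.
# Both names below are hypothetical helpers, not FedPilot APIs.
MODEL_MEMORY_GB = {
    "cnn": 0.6,
    "lenet": 0.3,
    "mobilenet": 1.0,
    "resnet18": 3.0,
    "resnet50": 6.0,
    "vgg16": 10.0,
    "vit_small": 4.0,
    "swin_base": 8.0,
    "bert": 6.0,
    "albert": 3.0,
}


def models_within_budget(budget_gb):
    """Return model keys whose listed memory fits the budget, smallest first."""
    fitting = [(mem, key) for key, mem in MODEL_MEMORY_GB.items() if mem <= budget_gb]
    return [key for _, key in sorted(fitting)]
```

Calling `models_within_budget(1.0)` keeps only the three lightest entries; real usage will vary with input resolution and optimizer state, so treat the result as a first filter.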
Built-In Datasets
Datasets are automatically downloaded, partitioned according to the requested non-IID distribution, and served to the VirtualNode instances.
| Dataset | Type | Classes | Distribution Focus |
|---|---|---|---|
| mnist | Image (Gray) | 10 | IID Testing / Baseline |
| fashion-mnist | Image (Gray) | 10 | Standard Benchmark |
| cifar10 | Image (RGB) | 10 | Standard Benchmark |
| cifar100 | Image (RGB) | 100 | Advanced CV |
| femnist | Image (Gray) | 62 | Natural Non-IID (Writer-based) |
| svhn | Image (RGB) | 10 | Real-world digits |
| tiny-imagenet | Image (RGB) | 200 | Hard CV |
| shakespeare | Text | 80 chars | Natural Non-IID |
| bbc | Text | 5 | Text Classification |
| yahoo | Text | 10 | Large-scale NLP |
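The partitioning step is not specified in detail here. One widely used way to produce a non-IID split is Dirichlet label skew, sketched below with only the standard library. This illustrates the general technique and is not necessarily what FedPilot does internally; `dirichlet_partition` and its parameters are hypothetical names.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    A smaller alpha concentrates each class on fewer clients (more non-IID).
    Hypothetical helper for illustration; not a FedPilot API.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet(alpha) sample is a set of normalized Gamma(alpha, 1) draws.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        start = 0
        cumulative = 0.0
        for client_id, weight in enumerate(weights):
            cumulative += weight / total
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients
```

Every index lands on exactly one client, and a fixed `seed` makes the split reproducible across runs.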
Compatibility Matrix
Not all models work with all datasets because their input shapes differ (for example, 1D token sequences cannot be fed into a 2D CNN). Use the table below to check that the model and dataset in your config.yaml are compatible:
| Model | MNIST / F-MNIST | CIFAR-10 / 100 | Tiny-ImageNet / SVHN | Shakespeare / BBC |
|---|---|---|---|---|
| CNN / LeNet | ✅ | ✅ | ✅ | ❌ |
| ResNet / VGG / MobileNet | ✅ | ✅ | ✅ | ❌ |
| ViT / Swin | ✅ | ✅ | ✅ | ❌ |
| BERT / ALBERT | ❌ | ❌ | ❌ | ✅ |
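The matrix reduces to one rule: vision models pair with image datasets, and text models pair with text datasets. A minimal sketch of a pre-flight check follows. It assumes the ImageNet column refers to the tiny-imagenet key, and it covers only datasets that appear in the matrix, so femnist and yahoo are deliberately left out. `is_compatible` and the `dataset_type` argument name are hypothetical.

```python
# Modalities taken from the compatibility matrix above.
# Assumption: the ImageNet column corresponds to the "tiny-imagenet" key.
MODEL_MODALITY = {
    "cnn": "image", "lenet": "image", "mobilenet": "image",
    "resnet18": "image", "resnet50": "image", "vgg16": "image",
    "vit_small": "image", "swin_base": "image",
    "bert": "text", "albert": "text",
}
DATASET_MODALITY = {
    "mnist": "image", "fashion-mnist": "image",
    "cifar10": "image", "cifar100": "image",
    "tiny-imagenet": "image", "svhn": "image",
    "shakespeare": "text", "bbc": "text",
}


def is_compatible(model_type, dataset_type):
    """Return True if the matrix marks the model/dataset pair as supported."""
    if model_type not in MODEL_MODALITY:
        raise KeyError(f"model not covered by the matrix: {model_type!r}")
    if dataset_type not in DATASET_MODALITY:
        raise KeyError(f"dataset not covered by the matrix: {dataset_type!r}")
    return MODEL_MODALITY[model_type] == DATASET_MODALITY[dataset_type]
```

Raising on unknown keys, rather than returning False, keeps a typo in the config from being mistaken for a genuine incompatibility.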
Registering Custom Models
FedPilot uses a decorator-based registry pattern. To inject a custom PyTorch model into the platform without editing the framework core:
- Create a standard torch.nn.Module.
- Decorate it with @register_model("your_custom_key").
- In your config.yaml, set model_type: "your_custom_key".
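The decorator-based registry pattern behind these steps can be sketched in plain Python. The version below is a standalone illustration, not FedPilot's actual implementation: `_MODEL_REGISTRY`, `create_model`, and `ToyModel` are hypothetical names, and the local `register_model` only mimics the decorator that the framework provides.

```python
# Standalone illustration of a decorator-based registry.
# All names here are hypothetical stand-ins, not FedPilot internals.
_MODEL_REGISTRY = {}


def register_model(key):
    """Return a class decorator that stores the class under `key`."""
    def decorator(cls):
        if key in _MODEL_REGISTRY:
            raise ValueError(f"model key already registered: {key!r}")
        _MODEL_REGISTRY[key] = cls
        return cls  # the class itself is returned unchanged
    return decorator


def create_model(key, config):
    """Look up the class registered under `key` and instantiate it."""
    try:
        model_cls = _MODEL_REGISTRY[key]
    except KeyError:
        known = ", ".join(sorted(_MODEL_REGISTRY)) or "none"
        raise KeyError(f"unknown model_type {key!r}; registered: {known}") from None
    return model_cls(config)


@register_model("toy_model")
class ToyModel:
    def __init__(self, config):
        self.config = config
```

Because registration happens when the class definition is executed, the module containing the decorated class must be imported before the key is looked up.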
Example

    import torch.nn as nn

    from src.registries.models.model_registry import register_model


    @register_model("custom_autoencoder")
    class MyAutoencoder(nn.Module):
        def __init__(self, config):
            super().__init__()
            # The ConfigValidator is automatically passed to __init__
            self.encoder = nn.Linear(config.INPUT_DIMENSION, 128)
            self.decoder = nn.Linear(128, config.INPUT_DIMENSION)

        def forward(self, x):
            return self.decoder(self.encoder(x))

app_factory.py dynamically discovers and instantiates your model when the VirtualNodes materialize. Because the entire config object is passed to the constructor, your model can parameterize itself dynamically (e.g., scaling layer sizes based on the dataset choice).
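As a sketch of config-driven parameterization, the helper below derives layer widths from the dataset key. It is framework-free so it runs without PyTorch. `ToyConfig`, its field names `DATASET_TYPE` and `BOTTLENECK_RATIO`, and `layer_sizes` are all hypothetical; the real config object may expose different attributes.

```python
from dataclasses import dataclass

# Flattened input size per sample: height * width * channels.
FLAT_INPUT_SIZE = {
    "mnist": 28 * 28 * 1,
    "fashion-mnist": 28 * 28 * 1,
    "cifar10": 32 * 32 * 3,
    "cifar100": 32 * 32 * 3,
}


@dataclass
class ToyConfig:
    # Hypothetical stand-in for the config object passed to the model.
    DATASET_TYPE: str
    BOTTLENECK_RATIO: float = 0.125


def layer_sizes(config):
    """Return (input, hidden, output) widths for a simple autoencoder."""
    input_dim = FLAT_INPUT_SIZE[config.DATASET_TYPE]
    hidden_dim = max(1, int(input_dim * config.BOTTLENECK_RATIO))
    return input_dim, hidden_dim, input_dim
```

Inside a model's `__init__`, the returned widths could feed `nn.Linear(input_dim, hidden_dim)` and `nn.Linear(hidden_dim, input_dim)`, so switching from mnist to cifar10 in the config resizes the network without code changes.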
See also: Ray & Virtual Nodes · Configuration Reference