Configuration Guide¶

This guide covers how to write configuration files for the Segmentation Robustness Framework.

📋 Overview¶

The framework uses a YAML/JSON-based configuration system that allows you to define complete evaluation pipelines without writing code. The configuration system handles model loading, dataset setup, attack configuration, and pipeline parameters automatically.

🏗️ Configuration Structure¶

📝 Basic Configuration Layout¶

# Model configuration
model:
  type: "torchvision"
  config:
    name: "deeplabv3_resnet50"
    num_classes: 21
  weights_path: null
  weight_type: "full"
  adapter: null

# Dataset configuration
dataset:
  name: "voc"
  root: "./data/VOCdevkit/VOC2012"
  split: "val"
  image_shape: [512, 512]
  download: true

# Attack configurations
attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false

# Pipeline configuration
pipeline:
  batch_size: 8
  device: "cuda"
  output_dir: "./runs"
  auto_resize_masks: true
  output_formats: ["json", "csv"]
  metric_precision: 4
  num_workers: 0
  pin_memory: false
  persistent_workers: false

# Metrics configuration
metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"
    - {"name": "dice_score", "average": "micro"}
    - "custom_metric_name"

🤖 Model Configuration¶

🎯 Supported Model Types¶

The framework supports four model types:

torchvision - Pre-trained models from torchvision
smp - Models from Segmentation Models PyTorch
huggingface - Models from HuggingFace Transformers
custom - Your own custom models

🖼️ Torchvision Models¶

model:
  type: "torchvision"
  config:
    name: "deeplabv3_resnet50"  # Model name
    num_classes: 21              # Number of segmentation classes
  weights_path: null             # Optional: path to custom weights
  weight_type: "full"            # "full" or "encoder"
  adapter: null                  # Optional: custom adapter class

📋 Available Torchvision Models:

deeplabv3_resnet50
deeplabv3_resnet101
fcn_resnet50
fcn_resnet101
lraspp_mobilenet_v3_large

Note: For a complete list of available models and their specifications, refer to the PyTorch Torchvision documentation.

🏗️ SMP Models¶

model:
  type: "smp"
  config:
    architecture: "Unet"           # Model architecture
    encoder_name: "resnet34"       # Encoder name
    encoder_weights: "imagenet"    # Pre-trained weights
    in_channels: 3                 # Input channels
    classes: 21                    # Number of classes
  weights_path: null
  weight_type: "full"
  adapter: null

🏛️ Available SMP Architectures:

Unet
UnetPlusPlus
MAnet
Linknet
FPN
PSPNet
PAN
DeepLabV3
DeepLabV3Plus

🔧 Available SMP Encoders:

resnet18, resnet34, resnet50, resnet101, resnet152
densenet121, densenet169, densenet201
efficientnet-b0 through efficientnet-b7
mobilenet_v2
xception

Note: For a complete list of available architectures, encoders, and their specifications, refer to the Segmentation Models PyTorch documentation.

🤗 HuggingFace Models¶

model:
  type: "huggingface"
  config:
    model_name: "nvidia/segformer-b0-finetuned-ade-512-512"
    revision: "main"              # Optional: git revision
    trust_remote_code: false      # Optional: trust remote code
  weights_path: null
  weight_type: "full"
  adapter: null

📚 Available HuggingFace Models:

nvidia/segformer-b0-finetuned-ade-512-512
nvidia/segformer-b1-finetuned-ade-512-512
nvidia/segformer-b2-finetuned-ade-512-512
nvidia/segformer-b3-finetuned-ade-512-512
nvidia/segformer-b4-finetuned-ade-512-512
nvidia/segformer-b5-finetuned-ade-512-512

Note: For a complete list of available models and their specifications, refer to the HuggingFace Transformers documentation and the HuggingFace Hub.

🔧 Custom Models¶

model:
  type: "custom"
  config:
    model_class: "path.to.CustomModel"
    model_args:
      num_classes: 21
      pretrained: true
  weights_path: "./path/to/weights.pth"
  weight_type: "full"
  adapter: "path.to.CustomAdapter"

⚙️ Model Configuration Options¶

Option	Type	Default	Description
`type`	str	Required	Model type: "torchvision", "smp", "huggingface", "custom"
`config`	dict	Required	Model-specific configuration
`weights_path`	str	null	Path to custom model weights
`weight_type`	str	"full"	"full" or "encoder" for SMP models
`adapter`	str	null	Custom adapter class path

📊 Dataset Configuration¶

🗂️ Supported Datasets¶

The framework supports four built-in datasets:

voc - PASCAL VOC 2012
ade20k - MIT Scene Parsing
cityscapes - Urban Scene Understanding
stanford_background - Stanford Background Dataset

🏷️ VOC Dataset¶

dataset:
  name: "voc"
  root: "./data/VOCdevkit/VOC2012"  # Dataset root directory
  split: "val"                       # "train", "val", "trainval"
  image_shape: [512, 512]           # [height, width]
  download: true                     # Auto-download if not found

🏙️ ADE20K Dataset¶

dataset:
  name: "ade20k"
  root: "data/"
  split: "val"
  image_shape: [512, 512]
  download: true

🚗 Cityscapes Dataset¶

dataset:
  name: "cityscapes"
  root: "data/cityscapes"
  split: "val"
  mode: "fine"
  target_type: "semantic"
  image_shape: [512, 512]

🌳 Stanford Background Dataset¶

dataset:
  name: "stanford_background"
  root: "data/"
  image_shape: [512, 512]
  download: true

⚙️ Dataset Configuration Options¶

Option	Type	Default	Description
`name`	str	Required	Dataset name: "voc", "ade20k", "cityscapes", "stanford_background"
`root`	str	null	Dataset root directory (uses cache if null)
`split`	str	"val"	Dataset split: "train", "val", "trainval"
`image_shape`	list	Required	Target image size [height, width]
`download`	bool	false	Auto-download dataset if not found

⚔️ Attack Configuration¶

🎯 Supported Attacks¶

The framework supports multiple adversarial attacks:

fgsm - Fast Gradient Sign Method
pgd - Projected Gradient Descent
rfgsm - R-FGSM with random start
tpgd - Two-Phase Gradient Descent

⚡ FGSM Attack¶

attacks:
  - name: "fgsm"
    eps: 0.02                    # Maximum perturbation magnitude

🔄 PGD Attack¶

attacks:
  - name: "pgd"
    eps: 0.02                    # Maximum perturbation magnitude
    alpha: 0.01                  # Step size
    iters: 10                    # Number of iterations
    targeted: false               # Targeted or untargeted attack

🎲 RFGSM Attack¶

attacks:
  - name: "rfgsm"
    eps: 0.02                    # Maximum perturbation magnitude
    alpha: 0.01                  # Step size
    iters: 10                    # Number of iterations

⚖️ TPGD Attack¶

attacks:
  - name: "tpgd"
    eps: 0.02                    # Maximum perturbation magnitude
    alpha: 0.01                  # Step size
    iters: 10                    # Number of iterations
    targeted: false               # Targeted or untargeted attack

🔀 Multiple Attacks¶

attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "fgsm"
    eps: 0.05
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false
  - name: "pgd"
    eps: 0.05
    alpha: 0.02
    iters: 20
    targeted: false

⚙️ Attack Configuration Options¶

Option	Type	Default	Description
`name`	str	Required	Attack name: "fgsm", "pgd", "rfgsm", "tpgd"
`eps`	float	Required	Maximum perturbation magnitude
`alpha`	float	0.01	Step size (PGD, RFGSM, TPGD)
`iters`	int	10	Number of iterations (PGD, RFGSM, TPGD)
`targeted`	bool	false	Targeted attack (PGD, TPGD)

🔧 Pipeline Configuration¶

⚙️ Basic Pipeline Settings¶

pipeline:
  batch_size: 8                    # Batch size for evaluation
  device: "cuda"                   # Device: "cuda", "cpu", "mps"
  output_dir: "./runs"             # Output directory
  auto_resize_masks: true          # Auto-resize masks to model output
  output_formats: ["json", "csv"]  # Output formats
  metric_precision: 4              # Decimal places for metrics
  num_workers: 0                   # DataLoader workers
  pin_memory: false                # Pin memory for GPU
  persistent_workers: false         # Keep workers alive

📋 Pipeline Configuration Options¶

Option	Type	Default	Description
`batch_size`	int	8	Batch size for evaluation
`device`	str	"cpu"	Device: "cuda", "cpu", "mps"
`output_dir`	str	"./runs"	Output directory for results
`auto_resize_masks`	bool	true	Auto-resize masks to model output
`output_formats`	list	["json"]	Output formats: ["json", "csv"]
`metric_precision`	int	4	Decimal places for metric values
`num_workers`	int	0	Number of DataLoader workers
`pin_memory`	bool	false	Pin memory for faster GPU transfer
`persistent_workers`	bool	false	Keep workers alive between epochs

📈 Metrics Configuration¶

📊 Basic Metrics Setup¶

metrics:
  ignore_index: 255                # Ignore index for evaluation
  selected_metrics:
    - "mean_iou"                   # Mean Intersection over Union
    - "pixel_accuracy"             # Pixel accuracy
    - {"name": "dice_score", "average": "micro"}  # Dice score with micro averaging
    - "custom_metric_name"         # Custom metric

📋 Available Metrics¶

🔧 Built-in Metrics: - mean_iou - Mean Intersection over Union - pixel_accuracy - Overall pixel accuracy - precision - Precision (per-class or averaged) - recall - Recall (per-class or averaged) - dice_score - Dice coefficient (F1-score)

📊 Averaging Options: - macro - Macro averaging (default) - micro - Micro averaging

📝 Metric Configuration Examples¶

# All default metrics with macro averaging
metrics:
  ignore_index: 255
  selected_metrics: null  # Use all default metrics

# Specific metrics with different averaging
metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"
    - {"name": "precision", "average": "macro"}
    - {"name": "precision", "average": "micro"}
    - {"name": "recall", "average": "macro"}
    - {"name": "recall", "average": "micro"}
    - {"name": "dice_score", "average": "macro"}
    - {"name": "dice_score", "average": "micro"}

# Custom metrics only
metrics:
  ignore_index: 255
  selected_metrics:
    - "custom_iou_metric"
    - "custom_accuracy_metric"

⚙️ Metrics Configuration Options¶

Option	Type	Default	Description
`ignore_index`	int	255	Index to ignore in evaluation
`selected_metrics`	list	null	List of metrics to compute (null = all)
`include_pixel_accuracy`	bool	true	Include pixel accuracy in default metrics

📝 Complete Configuration Examples¶

🏷️ Basic VOC Evaluation¶

model:
  type: "torchvision"
  config:
    name: "deeplabv3_resnet50"
    num_classes: 21

dataset:
  name: "voc"
  root: "data/VOCdevkit/VOC2012"
  split: "val"
  image_shape: [512, 512]
  download: false

attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false

pipeline:
  batch_size: 4
  device: "cuda"
  output_dir: "./runs/voc_evaluation"
  auto_resize_masks: true
  output_formats: ["json", "csv"]

metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"
    - {"name": "dice_score", "average": "micro"}

🏙️ Advanced ADE20K Evaluation¶

model:
  type: "smp"
  config:
    architecture: "UnetPlusPlus"
    encoder_name: "efficientnet-b0"
    encoder_weights: "imagenet"
    in_channels: 3
    classes: 150

dataset:
  name: "ade20k"
  root: "data/ADEChallengeData2016"
  split: "val"
  image_shape: [512, 512]
  download: false

attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "fgsm"
    eps: 0.05
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false
  - name: "pgd"
    eps: 0.05
    alpha: 0.02
    iters: 20
    targeted: false
  - name: "rfgsm"
    eps: 0.02
    alpha: 0.01
    iters: 10

pipeline:
  batch_size: 2
  device: "cuda"
  output_dir: "./runs/ade20k_evaluation"
  auto_resize_masks: true
  output_formats: ["json", "csv"]
  metric_precision: 4
  num_workers: 2
  pin_memory: true
  persistent_workers: true

metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"
    - {"name": "precision", "average": "macro"}
    - {"name": "precision", "average": "micro"}
    - {"name": "recall", "average": "macro"}
    - {"name": "recall", "average": "micro"}
    - {"name": "dice_score", "average": "macro"}
    - {"name": "dice_score", "average": "micro"}

🤗 HuggingFace Model Evaluation¶

model:
  type: "huggingface"
  config:
    model_name: "nvidia/segformer-b0-finetuned-ade-512-512"
    revision: "main"
    trust_remote_code: false

dataset:
  name: "ade20k"
  root: "data/"
  split: "val"
  image_shape: [512, 512]
  download: true

attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false

pipeline:
  batch_size: 1  # Smaller batch size for large models
  device: "cuda"
  output_dir: "./runs/segformer_evaluation"
  auto_resize_masks: true
  output_formats: ["json"]

metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"

🔄 Using Configuration Files¶

📄 Loading from YAML¶

from segmentation_robustness_framework.pipeline.config import PipelineConfig

# Load configuration from YAML file
config = PipelineConfig.from_yaml("config.yaml")

# Create and run pipeline
pipeline = config.create_pipeline()
results = pipeline.run(save=True, show=False)

📋 Loading from JSON¶

from segmentation_robustness_framework.pipeline.config import PipelineConfig

# Load configuration from JSON file
config = PipelineConfig.from_json("config.json")

# Create and run pipeline
pipeline = config.create_pipeline()
results = pipeline.run(save=True, show=False)

📚 Loading from Dictionary¶

from segmentation_robustness_framework.pipeline.config import PipelineConfig

# Configuration as dictionary
config_dict = {
    "model": {
        "type": "torchvision",
        "config": {"name": "deeplabv3_resnet50", "num_classes": 21}
    },
    "dataset": {
        "name": "voc",
        "root": "data/",
        "split": "val",
        "image_shape": [512, 512],
        "download": True
    },
    "attacks": [{"name": "fgsm", "eps": 0.02}],
    "pipeline": {"batch_size": 4, "device": "cuda"}
}

# Create configuration from dictionary
config = PipelineConfig.from_dict(config_dict)

# Create and run pipeline
pipeline = config.create_pipeline()
results = pipeline.run()

💻 Command Line Usage¶

# Run evaluation from configuration file
python -m segmentation_robustness_framework.cli.main run config.yaml

# Run with custom output directory
python -m segmentation_robustness_framework.cli.main run config.yaml --output-dir ./custom_output

# Run with specific device
python -m segmentation_robustness_framework.cli.main run config.yaml --device cpu

⚠️ Common Configuration Issues¶

🔗 Model-Dataset Compatibility¶

Issue: Model and dataset have different numbers of classes

Solution: Ensure num_classes in model config matches dataset

# VOC has 21 classes
model:
  type: "torchvision"
  config:
    name: "deeplabv3_resnet50"
    num_classes: 21  # Must match VOC

dataset:
  name: "voc"  # Has 21 classes

💾 Memory Issues¶

Issue: CUDA out of memory

Solution: Reduce batch size or use CPU

pipeline:
  batch_size: 1  # Reduce from 8
  device: "cpu"   # Or use CPU instead of CUDA

⚔️ Attack Parameter Issues¶

Issue: Attack parameters are invalid

Solution: Check parameter ranges and relationships

attacks:
  - name: "pgd"
    eps: 0.02      # Must be positive
    alpha: 0.01    # Must be <= eps
    iters: 10      # Must be positive

📁 Dataset Path Issues¶

Issue: Dataset not found

Solution: Enable download or specify correct path

dataset:
  name: "voc"
  root: "./data/VOCdevkit/VOC2012"  # Correct path
  download: true                      # Auto-download if not found

🎯 Best Practices¶

📋 Configuration Organization¶

Use descriptive names for output directories
Group related configurations logically
Include comments for complex configurations
Use consistent formatting and indentation

⚡ Performance Optimization¶

Start with small batch sizes and increase gradually
Use GPU when available for faster evaluation
Enable pin_memory for GPU evaluations
Use multiple workers for data loading (but be careful with memory)

✅ Validation¶

Test configurations with small datasets first
Verify model-dataset compatibility before running
Check memory requirements for your hardware
Validate attack parameters are within reasonable ranges

🔧 Maintenance¶

Version control your configuration files
Document custom configurations with comments
Keep backup configurations for different scenarios
Update configurations when framework versions change

📚 Configuration Templates¶

🎯 Minimal Configuration¶

model:
  type: "torchvision"
  config:
    name: "deeplabv3_resnet50"
    num_classes: 21

dataset:
  name: "voc"
  split: "val"
  image_shape: [512, 512]
  download: true

attacks:
  - name: "fgsm"
    eps: 0.02

pipeline:
  batch_size: 4
  device: "cuda"

metrics:
  ignore_index: 255

🏭 Production Configuration¶

model:
  type: "smp"
  config:
    architecture: "UnetPlusPlus"
    encoder_name: "efficientnet-b0"
    encoder_weights: "imagenet"
    in_channels: 3
    classes: 21

dataset:
  name: "voc"
  root: "./data/VOCdevkit/VOC2012"
  split: "val"
  image_shape: [512, 512]
  download: false

attacks:
  - name: "fgsm"
    eps: 0.02
  - name: "fgsm"
    eps: 0.05
  - name: "pgd"
    eps: 0.02
    alpha: 0.01
    iters: 10
    targeted: false
  - name: "pgd"
    eps: 0.05
    alpha: 0.02
    iters: 20
    targeted: false

pipeline:
  batch_size: 8
  device: "cuda"
  output_dir: "./runs/production_evaluation"
  auto_resize_masks: true
  output_formats: ["json", "csv"]
  metric_precision: 4
  num_workers: 4
  pin_memory: true
  persistent_workers: true

metrics:
  ignore_index: 255
  selected_metrics:
    - "mean_iou"
    - "pixel_accuracy"
    - {"name": "precision", "average": "macro"}
    - {"name": "recall", "average": "macro"}
    - {"name": "dice_score", "average": "macro"}

Happy configuring! ⚙️✨