Attacks API
This page documents the adversarial attack components of the Segmentation Robustness Framework.
Attack Classes
segmentation_robustness_framework.attacks.attack
Classes
AdversarialAttack(model: nn.Module)
Bases: ABC
Define the base class for adversarial attacks.
Attributes:
| Name | Type | Description |
|---|---|---|
| model | Module | Segmentation model to be attacked. |
| device | str \| device | The device to use for the attack. |
Initialize the adversarial attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | Segmentation model to be attacked. | required |
Source code in segmentation_robustness_framework/attacks/attack.py
Functions
set_device(device: str | torch.device) -> None
Set the device for the attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | str \| device | The device to use for the attack. | required |
Source code in segmentation_robustness_framework/attacks/attack.py
apply(image: torch.Tensor, labels: torch.Tensor) -> torch.Tensor
abstractmethod
Perform an attack on the segmentation model.
This method should be implemented by subclasses to define the attack logic.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Tensor | The input image tensor to be perturbed. | required |
| labels | Tensor | The true or target labels for the image. | required |
Returns:
| Type | Description |
|---|---|
| Tensor | The perturbed image tensor. |
Source code in segmentation_robustness_framework/attacks/attack.py
segmentation_robustness_framework.attacks.fgsm
Classes
FGSM(model: nn.Module, eps: float = 2 / 255)
Bases: AdversarialAttack
Fast Gradient Sign Method (FGSM) from "Explaining and Harnessing Adversarial Examples". Paper: https://arxiv.org/abs/1412.6572
Attributes:
| Name | Type | Description |
|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. |
| eps | float | The magnitude of the perturbation. |
Initialize FGSM attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. | required |
| eps | float | The magnitude of the perturbation. Defaults to 2/255. | 2 / 255 |
Source code in segmentation_robustness_framework/attacks/fgsm.py
Functions
get_params() -> dict[str, float]
Get attack parameters.
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary containing attack parameters. |
apply(image: torch.Tensor, labels: torch.Tensor) -> torch.Tensor
Apply FGSM attack to input images.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Tensor | Input image tensor [B, C, H, W]. | required |
| labels | Tensor | Target labels tensor [B, H, W]. | required |
Returns:
| Type | Description |
|---|---|
| Tensor | Adversarial image tensor [B, C, H, W]. |
Source code in segmentation_robustness_framework/attacks/fgsm.py
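For reference, the single-step update described in the FGSM paper can be written as below. This is the textbook untargeted form, not a transcription of fgsm.py; the clipping to the range [0, 1] and the reading of $\mathcal{L}$ as a per-pixel cross-entropy loss are assumptions about the segmentation setting.

$$
x_{\mathrm{adv}} = \mathrm{clip}_{[0,1]}\!\left( x + \epsilon \cdot \mathrm{sign}\!\left( \nabla_x \mathcal{L}\big(f(x),\, y\big) \right) \right)
$$

Here $x$ is the image batch [B, C, H, W], $y$ is the label map [B, H, W], $f$ is the model, and $\epsilon$ corresponds to the eps parameter.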
segmentation_robustness_framework.attacks.pgd
Classes
PGD(model: nn.Module, eps: float = 2 / 255, alpha: float = 2 / 255, iters: int = 10, targeted: bool = False)
Bases: AdversarialAttack
Projected Gradient Descent (PGD) method from "Towards Deep Learning Models Resistant to Adversarial Attacks". Paper: https://arxiv.org/abs/1706.06083
Attributes:
| Name | Type | Description |
|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. |
| eps | float | The magnitude of the perturbation. |
| alpha | float | The step size for each iteration. |
| iters | int | The number of iterations. |
| targeted | bool | Indicates whether the attack is targeted or not. |
Initializes PGD attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. | required |
| eps | float | The magnitude of the perturbation. Defaults to 2/255. | 2 / 255 |
| alpha | float | The step size for each iteration. Defaults to 2/255. | 2 / 255 |
| iters | int | The number of iterations. Defaults to 10. | 10 |
| targeted | bool | If True, performs a targeted attack; otherwise, performs an untargeted attack. Defaults to False. | False |
Source code in segmentation_robustness_framework/attacks/pgd.py
Functions
get_params() -> dict[str, float]
Get attack parameters.
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary containing attack parameters. |
Source code in segmentation_robustness_framework/attacks/pgd.py
apply(images: torch.Tensor, labels: torch.Tensor) -> torch.Tensor
Apply PGD attack to a batch of images.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| images | Tensor | Batch of input images [B, C, H, W]. | required |
| labels | Tensor | Batch of target labels [B, H, W]. | required |
Returns:
| Type | Description |
|---|---|
| Tensor | Batch of adversarial images [B, C, H, W]. |
Source code in segmentation_robustness_framework/attacks/pgd.py
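As a reading aid, the iteration from the PGD paper can be summarized as below. This is a sketch of the standard $L_\infty$ formulation under stated assumptions, not a transcription of pgd.py: the sign convention $s$ for targeted attacks and the additional clipping to the valid pixel range are assumptions, and whether $x^{(0)}$ is the clean image or a random point inside the $\epsilon$-ball is not stated on this page.

$$
x^{(t+1)} = \Pi_{\epsilon}\!\left( x^{(t)} + s\,\alpha \cdot \mathrm{sign}\!\left( \nabla_{x^{(t)}} \mathcal{L}\big(f(x^{(t)}),\, y\big) \right) \right), \qquad t = 0, \dots, \mathrm{iters} - 1
$$

Here $\Pi_{\epsilon}$ projects onto the set $\{x' : \lVert x' - x \rVert_\infty \le \epsilon\}$, $\alpha$ is the step size, and $s = +1$ for an untargeted attack (increase the loss on the true labels) or $s = -1$ for a targeted attack (decrease the loss on the target labels).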
segmentation_robustness_framework.attacks.rfgsm
Classes
RFGSM(model: nn.Module, eps: float = 8 / 255, alpha: float = 2 / 255, iters: int = 10, targeted: bool = False)
Bases: AdversarialAttack
Random Fast Gradient Sign Method (R+FGSM) from the paper "Ensemble Adversarial Training: Attacks and Defenses". Paper: https://arxiv.org/abs/1705.07204
Attributes:
| Name | Type | Description |
|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. |
| eps | float | Strength of the attack or maximum perturbation. |
| alpha | float | Step size. |
| iters | int | Number of iterations. |
| targeted | bool | Indicates whether the attack is targeted or not. |
Initializes R+FGSM attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. | required |
| eps | float | Strength of the attack or maximum perturbation. Defaults to 8/255. | 8 / 255 |
| alpha | float | Step size. Defaults to 2/255. | 2 / 255 |
| iters | int | Number of iterations. Defaults to 10. | 10 |
| targeted | bool | If True, performs a targeted attack; otherwise, performs an untargeted attack. Defaults to False. | False |
Source code in segmentation_robustness_framework/attacks/rfgsm.py
Functions
get_params() -> dict[str, float]
Get attack parameters.
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Dictionary containing attack parameters. |
Source code in segmentation_robustness_framework/attacks/rfgsm.py
apply(images: torch.Tensor, labels: torch.Tensor) -> torch.Tensor
Apply R+FGSM attack to a batch of images.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| images | Tensor | Batch of input images [B, C, H, W]. | required |
| labels | Tensor | Batch of target labels [B, H, W]. | required |
Returns:
| Type | Description |
|---|---|
| Tensor | Batch of adversarial images [B, C, H, W]. |
Source code in segmentation_robustness_framework/attacks/rfgsm.py
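For orientation, the R+FGSM attack as defined in the cited paper is a random step followed by a single gradient-sign step, shown below. This is the paper's single-step form only; the class documented here also takes an iters parameter, so its implementation presumably repeats the gradient step, and rfgsm.py remains the authoritative reference.

$$
x' = x + \alpha \cdot \mathrm{sign}\big(\mathcal{N}(0, I)\big), \qquad
x_{\mathrm{adv}} = x' + (\epsilon - \alpha) \cdot \mathrm{sign}\!\left( \nabla_{x'} \mathcal{L}\big(f(x'),\, y\big) \right)
$$

Here $\mathcal{N}(0, I)$ is standard Gaussian noise with the same shape as $x$, $\alpha$ is the size of the random step, and $\epsilon$ is the total perturbation budget.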
segmentation_robustness_framework.attacks.tpgd
Classes
TPGD(model: nn.Module, eps: float = 8 / 255, alpha: float = 2 / 255, iters: int = 10)
Bases: AdversarialAttack
PGD based on KL-Divergence loss from the paper "Theoretically Principled Trade-off between Robustness and Accuracy". Paper: https://arxiv.org/abs/1901.08573
Attributes:
| Name | Type | Description |
|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. |
| eps | float | Strength of the attack or maximum perturbation. |
| alpha | float | Step size. |
| iters | int | Number of iterations. |
Initializes TPGD attack.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model that the adversarial attack will be applied to. | required |
| eps | float | Strength of the attack or maximum perturbation. | 8 / 255 |
| alpha | float | Step size. | 2 / 255 |
| iters | int | Number of iterations. | 10 |
Source code in segmentation_robustness_framework/attacks/tpgd.py
Functions
apply(images: torch.Tensor, labels: torch.Tensor = None) -> torch.Tensor
Apply TPGD attack to a batch of images.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| images | Tensor | Batch of input images [B, C, H, W]. | required |
| labels | Tensor | Batch of target labels [B, H, W]. Not used in TPGD. | None |
Returns:
| Type | Description |
|---|---|
| Tensor | Batch of adversarial images [B, C, H, W]. |
Source code in segmentation_robustness_framework/attacks/tpgd.py
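The reason labels are not needed can be seen from the objective in the cited paper, sketched below. This is the standard formulation of the KL-based inner maximization, written under the assumption that $p(\cdot)$ denotes the softmax of the model's logits; the starting point $x^{(0)}$ and any extra clipping are implementation details of tpgd.py that this page does not state.

$$
x^{(t+1)} = \Pi_{\epsilon}\!\left( x^{(t)} + \alpha \cdot \mathrm{sign}\!\left( \nabla_{x^{(t)}} \mathrm{KL}\big( p(x) \,\Vert\, p(x^{(t)}) \big) \right) \right)
$$

Here $\Pi_{\epsilon}$ projects onto $\{x' : \lVert x' - x \rVert_\infty \le \epsilon\}$. The attack pushes the prediction on the perturbed image away from the prediction on the clean image, so the ground-truth labels never enter the loss.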
Attack Overview
The framework provides four adversarial attacks (FGSM, PGD, R+FGSM, and TPGD) for testing the robustness of segmentation models. All attacks inherit from the AdversarialAttack base class.
AdversarialAttack Base Class
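All attacks implement the following interface. The outline is simplified; see the API reference above for the documented signatures:

```python
from abc import ABC, abstractmethod

import torch
import torch.nn as nn


class AdversarialAttack(ABC):
    """Base class for adversarial attacks (simplified outline)."""

    def __init__(self, model: nn.Module):
        self.model = model
        # The documented `device` attribute is omitted from this outline.

    def set_device(self, device: str | torch.device) -> None:
        """Set the device used by the attack."""
        self.device = device

    @abstractmethod
    def apply(self, image: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        """Return the perturbed version of `image`."""
```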
Available Attacks
FGSM (Fast Gradient Sign Method)
A single-step attack that perturbs the input along the sign of the loss gradient:

```python
from segmentation_robustness_framework.attacks import FGSM

# Create FGSM attack
attack = FGSM(model, eps=0.02)

# Apply attack
adversarial_x = attack.apply(x, y)
```
Parameters:
- eps: Maximum perturbation magnitude (default: 2/255 ≈ 0.008)
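Budgets such as 2/255 are usually quoted in 8-bit intensity levels, while values such as eps=0.02 are on the image scale. The helpers below are a small sketch for converting between the two; the function names are hypothetical (they are not part of the framework), and the conversion assumes images scaled to the range [0, 1].

```python
def eps_from_pixel_levels(levels: float, max_value: float = 255.0) -> float:
    """Convert a budget in 8-bit intensity levels to the [0, 1] image scale."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    return levels / max_value


def pixel_levels_from_eps(eps: float, max_value: float = 255.0) -> float:
    """Convert a budget on the [0, 1] image scale to 8-bit intensity levels."""
    if eps < 0:
        raise ValueError("eps must be non-negative")
    return eps * max_value


# The documented FGSM default of 2/255, on the [0, 1] scale.
default_eps = eps_from_pixel_levels(2)

# The eps=0.02 used in the example, expressed in intensity levels.
example_levels = pixel_levels_from_eps(0.02)

print(f"{default_eps:.4f}")     # 0.0078
print(f"{example_levels:.1f}")  # 5.1
```

In other words, eps=0.02 allows each pixel to move by roughly five intensity levels, about two and a half times the default budget.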
PGD (Projected Gradient Descent)
An iterative attack that takes repeated gradient-sign steps of size alpha and keeps the total perturbation within eps of the original image:

```python
from segmentation_robustness_framework.attacks import PGD

# Create PGD attack
attack = PGD(
    model=model,
    eps=0.02,
    alpha=0.02,
    iters=10,
    targeted=False,
)

# Apply attack
adversarial_x = attack.apply(x, y)
```
Parameters:
- eps: Maximum perturbation magnitude (default: 2/255 ≈ 0.008)
- alpha: Step size for each iteration (default: 2/255 ≈ 0.008)
- iters: Number of iterations (default: 10)
- targeted: Whether to perform targeted attack (default: False)
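To show how eps, alpha, iters, and targeted interact, here is a toy one-dimensional version of the PGD loop in plain Python. It is an illustration only, not the framework's implementation: `pgd_1d`, `sign`, and the quadratic toy loss are hypothetical, and a single float stands in for an image tensor.

```python
def sign(value: float) -> float:
    """Return -1.0, 0.0, or 1.0 depending on the sign of `value`."""
    return float((value > 0) - (value < 0))


def pgd_1d(x0, grad_fn, eps=2 / 255, alpha=2 / 255, iters=10, targeted=False):
    """Run sign-gradient PGD on a single pixel value in [0, 1]."""
    # Untargeted attacks ascend the loss; targeted attacks descend it.
    direction = -1.0 if targeted else 1.0
    x = x0
    for _ in range(iters):
        x = x + direction * alpha * sign(grad_fn(x))
        x = min(max(x, x0 - eps), x0 + eps)  # project back into the eps-ball
        x = min(max(x, 0.0), 1.0)            # keep a valid pixel value
    return x


def toy_grad(x: float) -> float:
    """Gradient of the toy loss (x - 0.5) ** 2."""
    return 2.0 * (x - 0.5)


untargeted = pgd_1d(0.6, toy_grad, eps=0.02, alpha=0.005, iters=10)
targeted = pgd_1d(0.6, toy_grad, eps=0.02, alpha=0.005, iters=10, targeted=True)
print(f"{untargeted:.3f} {targeted:.3f}")  # 0.620 0.580
```

With alpha=0.005 the iterate needs four steps to reach the edge of the eps=0.02 ball and is held there by the projection afterwards. With the documented defaults, where alpha equals eps, a single step already reaches the boundary.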
RFGSM (FGSM with Random Start)
An FGSM variant that adds a small random perturbation before taking gradient-sign steps:

```python
from segmentation_robustness_framework.attacks import RFGSM

# Create RFGSM attack
attack = RFGSM(
    model=model,
    eps=0.1,
    alpha=0.01,
    iters=10,
    targeted=False,
)

# Apply attack
adversarial_x = attack.apply(x, y)
```

Parameters:
- eps: Maximum perturbation magnitude (default: 8/255 ≈ 0.031)
- alpha: Step size for each iteration (default: 2/255 ≈ 0.008)
- iters: Number of iterations (default: 10)
- targeted: Whether to perform targeted attack (default: False)
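TPGD (PGD with KL-Divergence Loss)
A PGD variant that maximizes the KL divergence between the model's predictions on clean and perturbed images, so labels are optional:

```python
from segmentation_robustness_framework.attacks import TPGD

# Create TPGD attack
attack = TPGD(
    model=model,
    eps=0.1,
    alpha=0.01,
    iters=10,
)

# Apply attack (labels are not used by TPGD and may be omitted)
adversarial_x = attack.apply(x)
```

Parameters:
- eps: Maximum perturbation magnitude (default: 8/255 ≈ 0.031)
- alpha: Step size for each iteration (default: 2/255 ≈ 0.008)
- iters: Number of iterations (default: 10)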
Attack Configuration
Configure attacks in YAML configuration files:

```yaml
attacks:
  - name: fgsm
    eps: 0.02
  - name: pgd
    eps: 0.02
    alpha: 0.02
    iters: 10
    targeted: false
  - name: rfgsm
    eps: 0.02
    alpha: 0.02
    iters: 10
    targeted: false
  - name: tpgd
    eps: 0.02
    alpha: 0.02
    iters: 10
```
Custom Attacks
Create custom attacks by inheriting from AdversarialAttack:

```python
import torch
import torch.nn.functional as F

from segmentation_robustness_framework.attacks import AdversarialAttack


class MyCustomAttack(AdversarialAttack):
    def __init__(self, model, eps=0.1, custom_param=1.0):
        # The base class constructor takes only the model.
        super().__init__(model)
        self.eps = eps
        self.custom_param = custom_param

    def apply(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        """Apply your custom attack logic here."""
        # Work on a detached copy so the caller's tensor is not modified.
        x = x.clone().detach().requires_grad_(True)

        # Forward pass
        logits = self.model.logits(x)
        loss = F.cross_entropy(logits, y)

        # Backward pass
        loss.backward()

        # Create perturbation: a gradient-sign step of size custom_param * eps
        perturbation = self.custom_param * self.eps * x.grad.sign()

        # Apply perturbation and clip to the valid image range
        adversarial_x = torch.clamp(x + perturbation, 0, 1)
        return adversarial_x.detach()


# Use custom attack
attack = MyCustomAttack(model, eps=0.1, custom_param=0.5)
adversarial_x = attack.apply(x, y)
```
Attack Registration
Register custom attacks so that they can be referenced by name:

```python
import torch

from segmentation_robustness_framework.attacks import AdversarialAttack, register_attack


@register_attack("my_custom_attack")
class MyCustomAttack(AdversarialAttack):
    def __init__(self, model, eps=0.1, custom_param=1.0):
        # The base class constructor takes only the model.
        super().__init__(model)
        self.eps = eps
        self.custom_param = custom_param

    def apply(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Your attack implementation
        raise NotImplementedError


# Now you can use it in configuration
# attacks:
#   - name: my_custom_attack
#     eps: 0.1
#     custom_param: 0.5
```
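The general mechanism behind name-based registration can be sketched in a few lines of plain Python. The code below is a generic illustration of a decorator-based registry and is not the framework's `register_attack`; `ATTACK_REGISTRY`, `build_attack`, and `ToyAttack` are hypothetical names, and the real implementation may differ.

```python
from typing import Callable

# Hypothetical registry mapping configuration names to attack classes.
ATTACK_REGISTRY: dict[str, type] = {}


def register_attack(name: str) -> Callable[[type], type]:
    """Return a class decorator that stores the class under `name`."""
    def decorator(cls: type) -> type:
        if name in ATTACK_REGISTRY:
            raise ValueError(f"attack '{name}' is already registered")
        ATTACK_REGISTRY[name] = cls
        return cls
    return decorator


@register_attack("toy_attack")
class ToyAttack:
    """Stand-in class so the sketch runs without the framework installed."""

    def __init__(self, model, eps=0.1):
        self.model = model
        self.eps = eps


def build_attack(model, config: dict):
    """Instantiate the attack named in one configuration entry."""
    params = dict(config)      # copy so the caller's entry is not modified
    name = params.pop("name")  # the remaining keys become constructor arguments
    if name not in ATTACK_REGISTRY:
        raise KeyError(f"unknown attack '{name}'")
    return ATTACK_REGISTRY[name](model, **params)


attack = build_attack(model=None, config={"name": "toy_attack", "eps": 0.02})
print(type(attack).__name__, attack.eps)  # ToyAttack 0.02
```

Under this design, every key other than name in a configuration entry must match a constructor argument of the registered class, which is why custom_param can appear in the configuration comment above.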
Attack Usage in Pipeline
Pass attack instances to the pipeline, which applies each of them during evaluation:

```python
from segmentation_robustness_framework.attacks import FGSM, PGD
from segmentation_robustness_framework.pipeline import SegmentationRobustnessPipeline

# `model`, `dataset`, and `metrics` are assumed to be defined already;
# `metrics` must provide the `mean_iou` metric used below.

# Create attacks
attacks = [
    FGSM(model, eps=0.1),
    PGD(model, eps=0.1, alpha=0.01, iters=10),
]

# Use in pipeline
pipeline = SegmentationRobustnessPipeline(
    model=model,
    dataset=dataset,
    attacks=attacks,
    metrics=[metrics.mean_iou],
    batch_size=4,
    device="cuda",
)
results = pipeline.run()
```
Attack Evaluation
Compare clean and adversarial scores to see how much each attack degrades the model:

```python
# Compare clean vs adversarial performance
clean_iou = results['clean']['mean_iou']
fgsm_iou = results['attack_fgsm']['mean_iou']
pgd_iou = results['attack_pgd']['mean_iou']

print(f"Clean IoU: {clean_iou:.3f}")
print(f"FGSM IoU: {fgsm_iou:.3f}")
print(f"PGD IoU: {pgd_iou:.3f}")

# Robustness ratio: adversarial IoU relative to clean IoU (1.0 means no degradation)
fgsm_robustness = fgsm_iou / clean_iou
pgd_robustness = pgd_iou / clean_iou

print(f"FGSM Robustness: {fgsm_robustness:.3f}")
print(f"PGD Robustness: {pgd_robustness:.3f}")
```
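The same ratio can be computed for every attack at once. The helper below is a sketch that assumes the results layout used in the example (one `clean` entry plus one entry per attack, each holding a `mean_iou` value); `robustness_ratios` is a hypothetical name, and the numbers are made up for illustration.

```python
import math


def robustness_ratios(results: dict, metric: str = "mean_iou", clean_key: str = "clean") -> dict:
    """Return adversarial/clean ratios of `metric` for every non-clean entry."""
    clean_score = results[clean_key][metric]
    ratios = {}
    for name, scores in results.items():
        if name == clean_key:
            continue
        # Avoid dividing by zero when the clean score itself is zero.
        ratios[name] = scores[metric] / clean_score if clean_score > 0 else math.nan
    return ratios


# Made-up scores in the assumed layout, for illustration only.
example_results = {
    "clean": {"mean_iou": 0.8},
    "attack_fgsm": {"mean_iou": 0.4},
    "attack_pgd": {"mean_iou": 0.2},
}

ratios = robustness_ratios(example_results)
print(ratios)  # {'attack_fgsm': 0.5, 'attack_pgd': 0.25}
```

A ratio close to 1.0 means the attack barely changed the metric, while a ratio close to 0.0 means the model's performance collapsed under that attack.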
Performance Considerations
- GPU Acceleration: All attacks can run on a GPU; select the device with set_device
- Memory Efficiency: Optimized for batch processing
- Gradient Computation: Efficient gradient computation for iterative attacks
- Convergence: Automatic convergence detection for iterative attacks