Update DDP strategy to handle unused parameters consistently #1094

Open
dohyeonYoon wants to merge 4 commits into roboflow:develop from dohyeonYoon:develop
Conversation

dohyeonYoon commented Jun 6, 2026

Removed the segmentation-head condition from the DDP strategy selection, so that unused-parameter detection is applied consistently to both detection and segmentation models.
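The change described above can be pictured with a small sketch. This is not the code in `src/rfdetr/training/trainer.py`: the function names, the `segmentation_head` argument, and the single-device fallback of `"auto"` are all hypothetical and only illustrate the before/after behaviour. The only assumption about the real project is that the strategy strings follow PyTorch Lightning's naming, as the `strategy=` value in the test script suggests.

```python
# Hypothetical sketch: names and structure are illustrative only,
# not the actual implementation in the rfdetr trainer.

def pick_strategy_before(num_devices: int, segmentation_head: bool) -> str:
    """Behaviour as described before the PR: only segmentation models opted in."""
    if num_devices <= 1:
        return "auto"  # assumed single-device default; no DDP wrapper involved
    if segmentation_head:
        return "ddp_find_unused_parameters_true"
    return "ddp"  # detection models: unused parameters are not tolerated


def pick_strategy_after(num_devices: int, segmentation_head: bool) -> str:
    """Behaviour as described after the PR: the model type no longer matters."""
    if num_devices <= 1:
        return "auto"
    return "ddp_find_unused_parameters_true"


for seg in (False, True):
    print(seg, pick_strategy_before(2, seg), pick_strategy_after(2, seg))
```

In this sketch, the only case whose result changes is a multi-GPU detection model, which matches the scenario exercised in the test script.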

What does this PR do?

Related Issue(s): Related to #1093

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:

I ran the script below on 2 GPUs without setting `strategy`, and the error no longer occurs.

import os
from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="/home/dohyeon/Datasets/dataset_rfdetr_576",
    epochs=1,
    batch_size=4,
    grad_accum_steps=2,  # 2GPU × grad_accum_steps=2 × batch_size=4 = effective 16
    output_dir="output/strategy1",
    progress_bar="tqdm",
    devices=2,
    # strategy="ddp_find_unused_parameters_true",  # NOT set: this triggered the error before the fix
    num_workers=16,
    pin_memory=True,
    resolution=576,
    resume=None,
    checkpoint_interval=1,
    use_ema=True,
    tensorboard=True,
    seed=42,
)
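The effective batch size noted in the `grad_accum_steps` comment is the product of the three settings. A minimal sketch of that arithmetic follows; the helper name `effective_batch_size` is hypothetical and not part of rfdetr.

```python
def effective_batch_size(num_devices: int, batch_size: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return num_devices * batch_size * grad_accum_steps


# Values from the test script: devices=2, batch_size=4, grad_accum_steps=2
print(effective_batch_size(num_devices=2, batch_size=4, grad_accum_steps=2))  # 16
```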

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

This bug only occurs when training with multiple GPUs, for example using:

torchrun --nproc_per_node=2 train.py

It does not occur when running on a single GPU:

python train.py
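To tell the two launch modes apart from inside a script, one option is to read the `WORLD_SIZE` environment variable, which `torchrun` sets for each worker process. The sketch below is only an illustration under that assumption; the helper names are hypothetical, and a plain `python train.py` run is treated as a single process because the variable is absent.

```python
import os
from typing import Mapping


def launched_world_size(env: Mapping[str, str] = os.environ) -> int:
    """Number of processes in the launch; 1 when WORLD_SIZE is not set."""
    return int(env.get("WORLD_SIZE", "1"))


def is_multi_process(env: Mapping[str, str] = os.environ) -> bool:
    """True for a launch such as `torchrun --nproc_per_node=2 train.py`."""
    return launched_world_size(env) > 1


# Simulated environments instead of a real launch:
print(is_multi_process({"WORLD_SIZE": "2"}))
print(is_multi_process({}))
```

A check like this only reflects how the process was launched. It says nothing about whether the model actually has parameters that receive no gradient.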

CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.

pre-commit-ci Bot and others added 2 commits June 6, 2026 16:50
Removed segmentation_head condition for consistent DDP configuration across models.
Comment thread src/rfdetr/training/trainer.py
3 participants