[ICCV 2025] Region-Level Data Attribution for Text-to-Image Generative Models

Official PyTorch implementation of the Attribution Region Detector (ARD). See the paper for details: [Paper].

If you find this repository useful, please give it a star ⭐.

Region-Level Data Attribution for Text-to-Image Generative Models

Trong-Bang Nguyen, Phi-Le Nguyen, Simon Lucey, and Minh Hoai
ICCV 2025

Abstract:
Data attribution in text-to-image generative models is a crucial yet underexplored problem, particularly at the regional level, where identifying the most influential training regions for generated content can enhance transparency, copyright protection, and error diagnosis. Existing data attribution methods either operate at the whole-image level or lack scalability for large-scale generative models. In this work, we propose a novel framework for region-level data attribution. At its core is the Attribution Region (AR) detector, which localizes influential regions in training images used by the text-to-image generative model. To support this research, we construct a large-scale synthetic dataset with ground-truth region-level attributions, enabling both training and evaluation of our method. Empirical results show that our method outperforms existing attribution techniques in accurately tracing generated content back to training data. Additionally, we demonstrate practical applications, including identifying artifacts in generated images and suggesting improved replacements for generated content. Our dataset and framework will be released to advance further research in region-level data attribution for generative models.

AR-Detector Architecture

Preparation

Setup

We ran our model with Python 3.9.19 and CUDA 12.1. Other versions may also work, but we have not tested them.

Clone this repository.

git clone https://github.com/bangdabezt/AR-Detector.git 
cd AR-Detector/

Install the required dependencies.

conda env create -f environment.yml
conda activate AR_Detector
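
After activating the environment, it can help to confirm that the interpreter and PyTorch build match the settings listed above. The following is a minimal sketch, not part of this repository: the `check_environment` helper is a hypothetical name, and it assumes only that `environment.yml` installs PyTorch under the usual `torch` package name.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 9)):
    """Summarise the interpreter version and, if PyTorch is installed, its CUDA build.

    `expected_python` is an illustrative default taken from the settings above
    (Python 3.9.x); it is not enforced anywhere by the repository.
    """
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "python_matches": tuple(sys.version_info[:2]) == tuple(expected_python),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["torch"] = torch.__version__
        report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_build` does not read 12.1, the code may still run, since other versions were not ruled out above, but a mismatch is worth checking first when the model fails to load.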

Download Synthesized Dataset

Download Pre-trained CountGD and Backbones

Synthesize Attribution-Region Dataset

We will release our code for dataset generation soon.

ARD Testing

Instructions for testing ARD with the synthesized dataset.

Download Trained ARD Weights

ARD Training

Instructions for training ARD with the synthesized dataset.

ARD Inference

Instructions for running ARD on a single sample.

Citation

Acknowledgements

This repository is based on CountGD and Open-GroundingDino. Our dataset generation is based on Break-A-Scene, SAM, and GroundingDINO.
