This repository is the project page for the paper
“Evaluating Diffusion-based Super-Resolution for Trustworthy Quantitative Metallography”,
published at the NeurIPS 2025 AI4Mat workshop.
Recent advances in generative super-resolution (SR) have set state-of-the-art performance on natural-image upsampling. Super-resolution holds promise for improving metallographic analysis, but diffusion-based methods raise concerns about hallucinated structures that could bias quantitative results. We present the first systematic study of diffusion SR in quantitative metallography. Using OSEDiff with a fixed domain prompt (“metallographic image”), we generate a fourfold super-resolved version of the Texture Boundary in Metallography (TBM) dataset (SR-TBM) and train edge detectors on both the original and SR images. We evaluate both pixel-based edge-detection metrics and global grain-size estimation (Heyn intercept). We demonstrate that generative AI, under domain-specific constraints, can be used to dramatically improve metallographic image quality. A human expert audit found virtually no major hallucinations. Models trained on SR-TBM achieved comparable pixel-based performance while demonstrating a 47% reduction in grain-size error.
Edge detection for metallography on the TBM dataset. Two tracks: training and evaluating UAED on the original TBM images and on a super-resolved variant (SR_TBM).
Pretrained checkpoints and datasets are available for download at: https://zenodo.org/records/16918388
.
├── checkpoints/
│ ├── uaed_tbm.pth # trained on TBM_original
│ └── uaed_sr_tbm.pth # trained on SR_TBM
├── scripts/
│ ├── train.py # training entry point
│ ├── eval_on_folder.py # evaluation on an images+masks folder
│ ├── train_original/repro.sh
│ ├── train_sr/repro.sh
│ ├── eval_original_configs/run_command.txt
│ └── eval_sr_configs/run_command.txt
├── TBM_original/ # original-resolution dataset
│ ├── train/{images,images_1024,masks}
│ ├── val/{images,images_1024,masks}
│ └── test/{images,images_1024,masks}
└── SR_TBM/ # super-resolved dataset
    ├── train/{images,masks}
    ├── val/{images,masks}
    └── test/{images,masks}
Notes
- images_1024/ are linearly upscaled originals for parity with SR inputs.
- Filenames in images/ and masks/ are one-to-one.
Runs under the UAED environment. Follow the setup from the UAED repository and keep the same Python/CUDA stack.
- UAED GitHub: https://github.com/ZhouCX117/UAED_MuGE
Minimal extras commonly used here (install after the UAED setup):
pip install numpy pandas scikit-image imageio matplotlib tqdm
Original TBM
python scripts/eval_on_folder.py --images TBM_original/test/images_1024 --masks TBM_original/test/masks --ckpt checkpoints/uaed_tbm.pth --out outputs/tbm_eval
# Or replicate our exact command(s):
cat scripts/eval_original_configs/run_command.txt
SR_TBM
python scripts/eval_on_folder.py --images SR_TBM/test/images --masks SR_TBM/test/masks --ckpt checkpoints/uaed_sr_tbm.pth --out outputs/sr_tbm_eval
# Or:
cat scripts/eval_sr_configs/run_command.txt
Help
python scripts/eval_on_folder.py -h
Expected outputs
- Probability maps, uncertainty maps, thresholded masks, and overlays.
- CSV report with AP, ROC-AUC, thresholded metrics, and a pooled Heyn grain-size estimate.
- PR/ROC curve images.
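The Heyn grain-size estimate in the report is based on the lineal intercept idea: total test-line length divided by the number of grain-boundary crossings. The sketch below is not the code used by eval_on_folder.py; it is a minimal NumPy illustration that assumes horizontal test lines drawn across a binary edge mask. The function name heyn_mean_intercept and the parameters n_lines and pixel_size are hypothetical.

```python
import numpy as np


def heyn_mean_intercept(edge_mask, n_lines=10, pixel_size=1.0):
    """Mean lineal intercept length from a binary edge mask (illustrative only).

    edge_mask  : 2D array, non-zero where a grain boundary was detected.
    n_lines    : number of evenly spaced horizontal test lines (assumption).
    pixel_size : physical size of one pixel, e.g. in micrometres (assumption).
    """
    mask = np.asarray(edge_mask) > 0
    height, width = mask.shape

    # Pick evenly spaced rows to act as horizontal test lines.
    rows = np.linspace(0, height - 1, n_lines).round().astype(int)
    lines = mask[rows, :].astype(np.int8)

    # Each run of boundary pixels along a line counts as one crossing:
    # background-to-boundary transitions, plus lines that start on a boundary.
    rising = int((np.diff(lines, axis=1) == 1).sum())
    starts_on_boundary = int(lines[:, 0].sum())
    crossings = rising + starts_on_boundary

    if crossings == 0:
        return float("inf")  # no boundary was crossed by any test line

    total_length = len(rows) * width * pixel_size
    return total_length / crossings
```

To pool over several images, one plausible approach is to sum line lengths and crossings across all masks before dividing, rather than averaging per-image values. Whether the repository pools in exactly this way should be checked against the script itself.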
Original TBM
bash scripts/train_original/repro.sh
SR_TBM
bash scripts/train_sr/repro.sh
Direct help
python scripts/train.py -h
Outputs
- Checkpoints saved under checkpoints/ (or a run directory if configured).
- Validation figures similar to eval.
- Optional logs/TensorBoard if enabled in the scripts.
Folder convention:
DATA_ROOT/{train,val,test}/images/*.png
DATA_ROOT/{train,val,test}/masks/*.png
# TBM_original also has {train,val,test}/images_1024/*.png
Masks are single-channel PNGs (0/255 or 0/1). Filenames match between images/ and masks/.
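Before training or evaluating on a custom folder, it can help to confirm that the layout above holds. The following is a minimal sketch using only the standard library; it is not part of the repository, and the helper name check_split and its parameters are hypothetical. It only compares filenames and does not inspect mask pixel values.

```python
from pathlib import Path


def check_split(data_root, split, image_dir="images", mask_dir="masks"):
    """Compare PNG filenames in the image and mask folders of one split.

    Returns two sorted lists:
      (images that have no mask, masks that have no image).
    Both lists are empty when filenames are one-to-one.
    """
    root = Path(data_root) / split
    images = {p.name for p in (root / image_dir).glob("*.png")}
    masks = {p.name for p in (root / mask_dir).glob("*.png")}
    return sorted(images - masks), sorted(masks - images)
```

For example, calling check_split("TBM_original", "test", image_dir="images_1024") for each of train, val, and test should return two empty lists if the dataset was unpacked as described.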
- checkpoints/uaed_tbm.pth: UAED trained on TBM_original.
- checkpoints/uaed_sr_tbm.pth: UAED trained on SR_TBM.
Use --ckpt in eval_on_folder.py or the training script to select a model.
- Use the exact commands in:
  - scripts/train_original/repro.sh
  - scripts/train_sr/repro.sh
  - scripts/eval_*_configs/run_command.txt
- Fix seeds and keep hardware/software constant when possible.
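As an illustration of the seed advice, here is a minimal sketch of a seeding helper. It is an assumption about how one might fix seeds in a PyTorch-based setup, not the mechanism used in the repro scripts; the name seed_everything is hypothetical. NumPy and PyTorch are seeded only if they are installed.

```python
import os
import random


def seed_everything(seed=0):
    """Seed the common random number generators (illustrative helper)."""
    random.seed(seed)
    # Only affects subprocesses launched afterwards, not the running interpreter.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
        # Trade some speed for more repeatable cuDNN behaviour.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed
```

Even with fixed seeds, some GPU operations are not fully deterministic, so small run-to-run differences can remain across hardware or library versions.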
- UAED for the uncertainty-aware edge detection framework and training codebase. https://github.com/ZhouCX117/UAED_MuGE
- OSEDiff for one-step effective diffusion SR used to produce SR_TBM. https://github.com/cswry/OSEDiff
If this repository helps your work, please cite our accompanying paper(s) and the upstream projects UAED and OSEDiff.
Apache-2.0
Open an issue or pull request with a minimal reproducible example.
