While multi-step diffusion models have advanced both forward and inverse rendering, existing approaches often treat these problems independently, leading to cycle inconsistency and slow inference. In this work, we present Ouroboros, a framework composed of two single-step diffusion models that handle forward and inverse rendering with mutual reinforcement. Our approach extends intrinsic decomposition to both indoor and outdoor scenes and introduces a cycle consistency mechanism that ensures coherence between forward and inverse rendering outputs. Experimental results demonstrate state-of-the-art performance across diverse scenes, with substantially faster inference than other diffusion-based methods. We also demonstrate that Ouroboros transfers to video decomposition in a training-free manner, reducing temporal inconsistency in video sequences while maintaining high-quality per-frame inverse rendering.
The complete Ouroboros pipeline is organized around a cycle-consistent pair of single-step diffusion models, with an additional training-free extension for temporally consistent video inverse rendering.
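To make the cycle-consistency idea concrete, the sketch below uses two toy linear maps as stand-ins for the inverse model F (image → intrinsics) and the forward model G (intrinsics → image), and computes a symmetric round-trip penalty. This is a minimal illustration under assumed shapes and a plain mean-squared penalty, not the paper's actual models or training objective; all names here (`inverse_render`, `forward_render`, `cycle_loss`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two single-step models (hypothetical):
# F: image -> intrinsics, G: intrinsics -> image, both plain linear maps.
D_IMG, D_INT = 8, 6
W_f = rng.normal(scale=0.1, size=(D_INT, D_IMG))   # inverse model F
W_g = rng.normal(scale=0.1, size=(D_IMG, D_INT))   # forward model G

def inverse_render(x):
    """F(x): predict a flat vector of intrinsic maps from an 'image'."""
    return W_f @ x

def forward_render(z):
    """G(z): re-render an 'image' from predicted intrinsics."""
    return W_g @ z

def cycle_loss(x, z):
    """Symmetric cycle penalty: image round-trip x -> F -> G -> x
    plus intrinsics round-trip z -> G -> F -> z."""
    x_rec = forward_render(inverse_render(x))
    z_rec = inverse_render(forward_render(z))
    return float(np.mean((x_rec - x) ** 2) + np.mean((z_rec - z) ** 2))

x = rng.normal(size=D_IMG)   # stand-in image
z = rng.normal(size=D_INT)   # stand-in intrinsic maps
loss = cycle_loss(x, z)
print(f"cycle loss: {loss:.4f}")
```

Minimizing such a round-trip penalty jointly over both models is what couples forward and inverse rendering, so that each network's errors are visible to, and corrected by, the other.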
We evaluate Ouroboros on both forward rendering and inverse rendering across diverse indoor and outdoor scenes. The visual comparisons below highlight two key observations: first, the cycle-consistent single-step design produces cleaner intrinsic estimates and more faithful reconstructions than RGB↔X in forward rendering; second, the inverse rendering model generalizes across scene types while preserving plausible albedo, normal, roughness, metallicity, and irradiance predictions.
Beyond per-image quality, Ouroboros also supports a training-free video extension that improves temporal consistency, reflected in the stable appearance of the predicted intrinsic maps across frames.
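The page does not detail the video mechanism, but a generic way to see how training-free temporal stabilization can reduce flicker is to blend each frame's intrinsic prediction with a running average of previous frames. The `ema_stabilize` helper and the blend weight below are a hypothetical sketch of that generic idea, not the authors' method.

```python
import numpy as np

def ema_stabilize(per_frame_preds, alpha=0.8):
    """Blend each frame's intrinsic prediction with an exponential moving
    average of earlier frames (hypothetical training-free smoothing)."""
    smoothed, running = [], None
    for pred in per_frame_preds:
        running = pred if running is None else alpha * running + (1 - alpha) * pred
        smoothed.append(running)
    return smoothed

def jitter(seq):
    """Temporal jitter: mean absolute frame-to-frame difference."""
    return float(np.mean([np.abs(a - b).mean() for a, b in zip(seq, seq[1:])]))

# Flickering toy "albedo" sequence: constant signal plus per-frame noise.
rng = np.random.default_rng(1)
frames = [np.ones(16) + rng.normal(scale=0.3, size=16) for _ in range(30)]
smoothed = ema_stabilize(frames)

print(jitter(frames), jitter(smoothed))  # smoothing reduces frame-to-frame jitter
```

The trade-off in any such scheme is latency versus stability: heavier blending suppresses flicker but can lag behind genuine scene changes.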
Given predicted intrinsic maps, Ouroboros synthesizes images that remain close to the original appearance while maintaining consistency with inverse rendering outputs. Compared with RGB↔X, our method yields more coherent material attributes and more faithful reconstructions.
The inverse rendering results show robust decomposition across both indoor and outdoor scenes. The predicted albedo, normal, roughness, metallicity, and irradiance maps remain structurally consistent across diverse inputs.
@article{sun2025ouroboros,
  title={Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering},
  author={Sun, Shanlin and Wang, Yifan and Zhang, Hanwen and Xiong, Yifeng and Ren, Qin and Fang, Ruogu and Xie, Xiaohui and You, Chenyu},
  journal={arXiv preprint arXiv:2508.14461},
  year={2025}
}
For questions, feedback, or collaboration opportunities, please contact:
Email: shanlins@uci.edu, chenyu.you@stonybrook.edu