
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

NormalCrafter introduces a novel approach for surface normal estimation in videos, leveraging diffusion priors to achieve high spatial fidelity and temporal consistency over arbitrary-length sequences.

Key Highlights:

  • Video Diffusion Model Repurposing – Adapts Stable Video Diffusion (SVD) to predict normal maps instead of generating RGB frames, while retaining the model's temporal structure.
  • Semantic Feature Regularization (SFR) – Aligns intermediate diffusion features with DINO semantic embeddings, enhancing fine-grained geometric detail without inference overhead.
  • Two-Stage Training Protocol – First trains the full U-Net in latent space for long-term temporal modeling, then applies spatial fine-tuning in pixel space for high-resolution normal accuracy.
  • Fine-Tuned VAE Decoder – Improves normal map reconstruction quality by adapting the VAE decoder, reducing angular errors and boosting PSNR during training.
  • Zero-Shot Generalization – Achieves strong results on NYUv2 and iBims-1 (static images) and on ScanNet and Sintel (videos) without task-specific fine-tuning.
  • Superior Quantitative Results – Outperforms the DSINE, StableNormal, and Marigold-E2E-FT baselines on Sintel videos, with up to 1.6° lower mean angular error and 3.1% more pixels with angular error below 30°.
  • Temporal Stability – Produces smoother y-t slices compared to prior methods, eliminating flickering artifacts under large motion and dynamic scenes.
  • Efficient Semantic Enhancement – SFR operates only during training, adding no inference latency or memory cost.
  • Flexible Single-Image Compatibility – Handles single-frame normal estimation by setting the sequence length to one frame, while maintaining competitive accuracy on static images.
  • Extensive Validation – Evaluated on DAVIS, Sora-generated videos, NYUv2, iBims-1, ScanNet, and Sintel, confirming robustness across diverse environments.
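
The quantitative and temporal-stability highlights above rest on two simple measurements: the per-pixel angle between predicted and ground-truth normals (summarized as a mean and as the share of pixels under 30°), and a y-t slice, which follows one image column across all frames. The following is a minimal NumPy sketch of both, not NormalCrafter's evaluation code. The function names, the (T, H, W, 3) array layout, and the optional validity mask are assumptions made for illustration.

```python
import numpy as np


def angular_error_deg(pred, gt, eps=1e-8):
    """Per-pixel angle in degrees between two normal maps of shape (..., 3)."""
    # Normalize both maps to unit length; the floor avoids division by zero.
    pred = pred / np.maximum(np.linalg.norm(pred, axis=-1, keepdims=True), eps)
    gt = gt / np.maximum(np.linalg.norm(gt, axis=-1, keepdims=True), eps)
    # Clip the dot product so rounding cannot push it outside arccos's domain.
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))


def normal_metrics(pred, gt, valid=None, threshold_deg=30.0):
    """Mean angular error and percentage of pixels below threshold_deg.

    valid is a hypothetical optional boolean mask selecting pixels that
    have ground truth; when omitted, every pixel is counted.
    """
    err = angular_error_deg(pred, gt)
    if valid is not None:
        err = err[valid]
    return {
        "mean_deg": float(err.mean()),
        "pct_below": float((err < threshold_deg).mean() * 100.0),
    }


def yt_slice(video, x):
    """Take column x from every frame of a (T, H, W, C) video.

    Returns an (H, T, C) image: vertical axis is y, horizontal axis is time.
    Flicker shows up as jagged horizontal streaks in this image.
    """
    return np.transpose(video[:, :, x, :], (1, 0, 2))


if __name__ == "__main__":
    # Toy data: a 2-frame, 4x4 clip whose true normals all face the camera.
    gt = np.zeros((2, 4, 4, 3))
    gt[..., 2] = 1.0
    pred = gt.copy()
    pred[0, 0, 0] = [1.0, 0.0, 0.0]  # one pixel is off by 90 degrees
    print(normal_metrics(pred, gt))
    print(yt_slice(pred, x=0).shape)
```

Running the same metrics on a one-frame clip (T = 1) covers the single-image case, so the static and video benchmarks can share one code path under these assumptions.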

Project

The post NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors appeared first on OpenCV.