
Computer Vision


  • Code2Video: A Code-Centric Paradigm for Educational Video Generation

    Code2Video introduces a revolutionary framework for generating professional educational videos directly from executable Python code. Unlike pixel-based diffusion or text-to-video models, Code2Video treats code as the core generative medium, enabling precise visual control, transparency, and interpretability in long-form educational content. Developed by Show Lab (National University of Singapore), the system coordinates three collaborative agents, namely:…


  • CAP4D: 4D Avatars with Morphable Multi-View Diffusion Models

    CAP4D introduces a unified framework for generating photorealistic, animatable 4D portrait avatars from any number of reference images, even a single one. By combining Morphable Multi-View Diffusion Models (MMDMs) with 3D Gaussian Splatting, CAP4D enables real-time rendering and animation with state-of-the-art realism and identity consistency. Key Highlights Morphable Multi-View…


  • Test3R: Learning to Reconstruct 3D at Test Time

    Test3R is a novel and simple test-time learning technique that significantly improves 3D reconstruction quality. Unlike traditional pairwise methods such as DUSt3R, which often suffer from geometric inconsistencies and poor generalization, Test3R leverages image triplets and self-supervised optimization at inference to enforce cross-pair consistency. This makes it both robust and cost-efficient, requiring minimal overhead while delivering…


  • BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing

    BlenderFusion is a novel framework that merges 3D graphics editing with diffusion models to enable precise, 3D-aware visual compositing. Unlike prior approaches that struggle with multi-object and camera disentanglement, BlenderFusion leverages Blender for fine-grained control and a diffusion-based compositor for realism, bringing unprecedented flexibility to scene editing and generative compositing. Key Highlights: 3D-Grounded Control: Segments…


  • Step-by-Step Process to Remove Backgrounds from Images Using OpenCV

    Removing backgrounds from images is a common task in design and computer vision. Whether you’re prepping product shots, creating profile pictures, or building visual datasets, automating this process can save hours of manual work. In this blog, we will walk you through building a batch background removal tool using OpenCV. It’s fast, scalable, and outputs…


  • Gemma 3 Explained

    The Google DeepMind team has unveiled the latest evolution in its family of open models, Gemma 3, and it’s a monumental leap forward. While the AI space is crowded with updates, Gemma 3 isn’t just an incremental improvement; it’s a fundamental upgrade that makes state-of-the-art AI more powerful and accessible. Built on the same…


  • OpenCV Live: From Data Science to Storytelling

    Thursday on OpenCV Live! we’ve got author and data scientist Kristen Kehrer, who will tell us how her interest in computer vision led to writing a children’s book about using CV to let her kids know when the school bus was coming down their street. She holds a Master of Science degree in Applied…


  • OpenCV Community Survey 2025

    The OpenCV Community Survey for 2025 is open, and we’re asking for your participation! It’s a short, focused online survey open to the entire OpenCV community that will take just a few minutes to complete. Your answers to the survey questions are important to the future of OpenCV. Please help the OpenCV project and take…


  • OpenCV’s Participation in the GitHub Secure Open Source Fund

    Earlier this year OpenCV was selected to be part of the GitHub Secure Open Source Fund, which provides OSS maintainers with financial support to participate in a three-week program educating them on the latest tooling and methods for ensuring the safety of Open Source Software projects. We are honored to be part of the 71…


  • Application of VLM in Healthcare

    In the complex world of modern medicine, two forms of data reign supreme: the visual and the textual. On one side, a deluge of images: X-rays, MRIs, and pathology slides. On the other, an ocean of text: clinical notes, patient histories, and research papers. For centuries, the bridge between these two worlds existed only within…
