Category Added in a WPeMatico Campaign
BlockVid represents a major leap forward in long-video generation, tackling one of the field's hardest open problems: producing coherent, high-fidelity, minute-long clips without collapse, drift, or degradation over time. Developed by DAMO Academy, ZIP Lab, and Hupan Lab, BlockVid enhances the semi-autoregressive block diffusion paradigm with innovations that directly address KV-cache error…
Graphical User Interfaces (GUIs) are essential for interactive computer vision, enabling developers to visualize results, adjust parameters, and interact with applications in real time. While frameworks like PyQt and Tkinter are powerful, OpenCV’s HighGUI module offers a lightweight, cross-platform solution that integrates seamlessly with OpenCV, making it ideal for quick experiments, prototyping, and debugging. HighGUI…
Computer vision has evolved into one of the most approachable fields for anyone interested in practical AI, whether you’re a student, engineer, hobbyist, or maker. With the performance of the Raspberry Pi and the maturity of OpenCV, building reliable vision applications at home has become remarkably straightforward. This guide provides a complete walkthrough for getting…
OpenCV.js enables real-time webcam filters in the browser, allowing advanced computer vision effects without installations or native dependencies. From face blurring to artistic effects, it offers powerful visual processing. Powered by WebAssembly, OpenCV.js delivers near-native performance for smooth, complex transformations. This blog covers building live webcam filters with OpenCV.js, from setup to advanced effects, all…
With all the buzz surrounding AI recently, OpenCV has been quietly evolving, adding a range of powerful new features. The OpenCV DNN module, in particular, has matured beautifully, aging like fine wine. As of November 2025, we can see several exciting additions in the latest release. But does it still deliver the same impact as…
WorldGrow redefines 3D world generation by enabling infinite, continuous 3D scene creation through a hierarchical block-wise synthesis and inpainting pipeline. Developed by researchers from Shanghai Jiao Tong University, Huawei Inc., and Huazhong University of Science and Technology, it achieves unbounded, photorealistic, and geometrically coherent environments, paving the way for scalable virtual world modeling for games,…
Nano3D revolutionizes 3D asset editing by enabling training-free, part-level shape modifications like removal, addition, and replacement without any manual masks. Developed by researchers from Tsinghua University, Peking University, HKUST, CASIA, and ShengShu, Nano3D bridges the gap between text-driven 2D editing and 3D object manipulation. Unlike existing 3D editing methods that require time-consuming optimization or mask…
Triangle Splatting+ redefines 3D scene reconstruction and rendering by directly optimizing opaque triangles, the fundamental primitive of computer graphics, in a fully differentiable framework. Unlike Gaussian Splatting or NeRF-based approaches, it delivers real-time, game-engine-ready meshes without post-processing, enabling instant compatibility with engines like Unity or Unreal. Developed by researchers from the University of Liège, Simon…
Code2Video introduces a revolutionary framework for generating professional educational videos directly from executable Python code. Unlike pixel-based diffusion or text-to-video models, Code2Video treats code as the core generative medium, enabling precise visual control, transparency, and interpretability in long-form educational content. Developed by Show Lab (National University of Singapore), the system coordinates three collaborative agents, namely:…
CAP4D introduces a unified framework for generating photorealistic, animatable 4D portrait avatars from any number of reference images, even a single one. By combining Morphable Multi-View Diffusion Models (MMDMs) with 3D Gaussian Splatting, CAP4D enables real-time rendering and animation with state-of-the-art realism and identity consistency. Key Highlights Morphable Multi-View…