Early attempts in 3D generation focused on single-view reconstruction using category-specific models. Recent advancements utilize pre-trained image and video generators, particularly diffusion models, to enable open-domain generation. Fine-tuning on multi-view datasets improved results, but challenges persisted in generating complex compositions and interactions. Efforts to enhance compositionality in image generative models faced difficulties in transferring techniques…
Microscopic imaging is an indispensable tool for researchers and clinicians in modern medicine. This imaging technology allows detailed examination of biological structures at the cellular and molecular levels, enabling the study of tissue samples in disease diagnosis and pathology. By capturing these microscopic images, medical professionals can better understand disease mechanisms and progression,…
A major challenge in the field of Speech-Language Models (SLMs) is the lack of comprehensive evaluation metrics that go beyond basic textual content modeling. While SLMs have shown significant progress in generating coherent and grammatically correct speech, their ability to model acoustic features such as emotion, background noise, and speaker identity remains underexplored. Evaluating these…
AI Control assesses the safety of deployment protocols for untrusted AIs through red-teaming exercises involving a protocol designer and an adversary. As AI systems, like chatbots with access to tools such as code interpreters, become increasingly integrated into various tasks, ensuring their safe deployment becomes more complex. While prior research has focused on building robustly safe…
Large language models (LLMs) have achieved significant success in various language tasks, but steering their outputs to meet specific properties remains a challenge. Researchers are attempting to solve the problem of controlling LLM generations to satisfy desired characteristics across a wide range of applications. This includes reinforcement learning from human feedback (RLHF), red-teaming techniques, reasoning…
Stochastic optimization problems involve making decisions in environments with uncertainty. This uncertainty can arise from various sources, such as sensor noise, system disturbances, or unpredictable external factors. It can affect real-time control and planning in robotics and autonomy, where computational efficiency is crucial for handling complex dynamics and cost functions in ever-changing environments. The core problem…
Large language models (LLMs) have seen remarkable success in natural language processing (NLP). Large-scale deep learning models, especially transformer-based architectures, have grown exponentially in size and complexity, reaching billions to trillions of parameters. However, they pose major challenges in computational resources and memory usage. Even advanced GPUs struggle to handle models with trillions of parameters,…
With the success of LLMs in various tasks, search engines have begun using generative methods to answer user queries accurately with in-line citations. However, generating reliable and attributable answers, especially in open-ended information-seeking scenarios, poses challenges due to the complexity of questions and the broad scope of candidate-attributed answers. Existing methods typically focus…
Automatic speech recognition (ASR) has become a crucial area in artificial intelligence, focusing on the ability to transcribe spoken language into text. ASR technology is widely used in various applications such as virtual assistants, real-time transcription, and voice-activated systems. These systems are integral to how users interact with technology, providing hands-free operation and improving accessibility.…