Large Language Models (LLMs) have gained significant attention for their versatility, but their factualness remains a critical concern. Studies have revealed that LLMs can produce nonfactual, hallucinated, or outdated information, undermining reliability. Current evaluation methods, such as fact-checking and fact-QA, face several challenges. Fact-checking struggles to assess the factualness of generated content, while fact-QA encounters…
Multimodal Language Models MLLMs architectures have evolved to enhance text-image interactions through various techniques. Models like Flamingo, IDEFICS, BLIP-2, and Qwen-VL use learnable queries, while LLaVA and MGM employ projection-based interfaces. LLaMA-Adapter and LaVIN focus on parameter-efficient tuning. Dataset quality significantly impacts MLLM effectiveness, with recent studies refining visual instruction tuning datasets to improve performance…
Visual Simultaneous Localization and Mapping (SLAM) is a critical technology in robotics and computer vision that allows real-time state estimation for various applications. SLAM has become important for monocular depth estimation, view synthesis, and 3D human pose reconstruction tasks. However, these tasks face a critical challenge in applications in achieving high tracking accuracy with monocular…
Artificial intelligence, particularly natural language processing (NLP), has become a cornerstone in advancing technology, with large language models (LLMs) leading the charge. These models, such as those used for text summarization, automated customer support, and content creation, are designed to interpret and generate human-like text. However, the true potential of these LLMs is realized through…
The protein structure and sequence analysis field is critical in understanding how proteins function at a molecular level. Proteins are essential molecules composed of sequences of amino acids that fold into specific 3D shapes and structures, determining their functions in biological systems. Understanding the precise relationship between these sequences and their resulting structures is vital…
Audio, as a medium, holds immense potential for conveying complex information, making it essential for developing systems that can accurately interpret & respond to audio inputs. The field aims to create models that can comprehend a wide range of sounds, from spoken language to environmental noise, and use this understanding to facilitate more natural interactions…
Traditional molecular representations, primarily focused on covalent bonds, have neglected crucial aspects like delocalization and non-covalent interactions. Existing machine learning models have utilized information-sparse representations, limiting their ability to capture molecular complexity. While computational chemistry has developed robust quantum-mechanical methods, their application in machine learning has been constrained by calculation challenges for complex systems. Graph-based…
Deep learning has revolutionized various domains, with Transformers emerging as a dominant architecture. However, Transformers must improve the processing of lengthy sequences due to their quadratic computational complexity. Recently, a novel architecture named Mamba has shown promise in building foundation models with comparable abilities to Transformers while maintaining near-linear scalability with sequence length. This survey…
Knowledge Distillation (KD) has become a key technique in the field of Artificial Intelligence, especially in the context of Large Language Models (LLMs), for transferring the capabilities of proprietary models, like GPT-4, to open-source alternatives like LLaMA and Mistral. In addition to improving the performance of open-source models, this procedure is essential for compressing them…
Data analysis has become increasingly accessible due to the development of large language models (LLMs). These models have lowered the barrier for individuals with limited programming skills, enabling them to engage in complex data analysis through conversational interfaces. LLMs have opened new avenues for extracting meaningful insights from data by simplifying the process of generating…