Large Language Models (LLMs) are powerful tools for a wide range of applications owing to their broad knowledge and language-understanding capabilities. However, they are also vulnerable to exploitation, especially to jailbreaking attacks carried out over multi-round dialogues. These attacks exploit the complex, sequential nature of human-LLM interactions to subtly manipulate the model’s responses across multiple exchanges. By carefully building questions…
Developing web agents is a challenging area of AI research that has attracted significant attention in recent years. As the web becomes more dynamic and complex, it demands advanced capabilities from agents that interact autonomously with online platforms. One of the major challenges in building web agents is effectively testing, benchmarking, and evaluating their behavior…
The Allen Institute for AI (AI2) was founded in 2014 and has consistently advanced artificial intelligence research and applications. OLMo is a large language model (LLM) that AI2 introduced in February 2024. Unlike proprietary models, OLMo is fully open-source, with its pre-training data, training code, and model weights freely available to the public. This transparency is designed to…
The detailed study of the fly connectome has revolutionized neuroscience, offering insights into brain circuitry and its applications. Extending this progress to the mouse brain, which shares more structural similarities with the human brain, holds immense potential. It could provide the foundation for brain-inspired AI systems, enabling human-like capabilities such as continual learning, energy efficiency,…
Google DeepMind has introduced Genie 2, a multimodal AI model designed to bridge the gap between human creativity and AI. Genie 2 is poised to redefine the future of interactive content creation, particularly in video game development and virtual worlds. Building on the foundation of its predecessor, the original Genie, this new iteration demonstrates notable advancements, including…
Progress in large language models (LLMs) has rapidly advanced large multimodal models (LMMs), particularly in vision-language tasks. Videos are complex, information-rich sources crucial for understanding real-world scenarios. However, current video-language models face significant challenges in temporal localization and precise moment detection. Despite extensive training on video-captioning and question-answering datasets, these models struggle to identify and…
Medprompt, a run-time steering strategy, demonstrates the potential of guiding general-purpose LLMs to achieve state-of-the-art performance in specialized domains like medicine. By employing structured, multi-step prompting techniques such as chain-of-thought (CoT) reasoning, curated few-shot examples, and choice-shuffle ensembling, Medprompt bridges the gap between generalist and domain-specific models. This approach significantly enhances performance on medical benchmarks…
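To make the choice-shuffle ensembling idea concrete, here is a rough Python sketch of one plausible variant for multiple-choice questions. It is an illustration rather than Medprompt’s actual implementation, and `call_llm` is a hypothetical placeholder for whatever model API is being steered.

```python
import random
from collections import Counter

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model API call; expected to return the letter of the chosen option."""
    raise NotImplementedError("Wire this up to your model provider of choice.")

def choice_shuffle_ensemble(question: str, options: list[str], n_votes: int = 5) -> str:
    """Ask the same multiple-choice question several times with the options shuffled,
    then majority-vote over the answers mapped back to the original ordering."""
    votes = Counter()
    for _ in range(n_votes):
        order = list(range(len(options)))
        random.shuffle(order)
        letters = "ABCDE"[: len(options)]  # assumes at most five options, as in typical medical QA benchmarks
        shuffled = "\n".join(f"{letters[i]}. {options[j]}" for i, j in enumerate(order))
        prompt = (
            "Answer the following medical question. Think step by step, "
            "then reply with only the letter of the best option.\n\n"
            f"{question}\n{shuffled}\nAnswer:"
        )
        raw = call_llm(prompt).strip().upper()[:1]
        if raw in letters:
            # Map the shuffled letter back to the original option index before voting.
            votes[order[letters.index(raw)]] += 1
    best_index, _ = votes.most_common(1)[0]
    return options[best_index]
```

Shuffling the options before each call reduces the model’s position bias, and voting over the de-shuffled answers is what gives the ensemble its robustness; a production version would also fold in curated few-shot exemplars and full chain-of-thought prompts.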
Understanding protein sequences and their functions has long been a challenging aspect of protein research. Proteins, often described as the building blocks of life, are made up of long, complex sequences that determine their roles in biological systems. Despite advances in computational biology, interpreting these sequences in a biologically meaningful way is still a…
Astronomical research has transformed dramatically, evolving from limited observational capabilities to sophisticated data collection systems that capture cosmic phenomena with unprecedented precision. Modern telescopes now generate massive datasets spanning multiple wavelengths, revealing intricate details of celestial objects. The current astronomical landscape produces an astounding volume of scientific data, with observational technologies capturing everything from minute…
In the evolving landscape of artificial intelligence, language models are becoming increasingly integral to a variety of applications, from customer service to real-time data analysis. One key challenge, however, remains: preparing documents for ingestion into large language models (LLMs). Many existing LLMs require specific formats and well-structured data to function effectively. Parsing and transforming different…
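As a deliberately simplified illustration of one common preprocessing step, the sketch below normalizes plain text and splits it into overlapping chunks sized for LLM ingestion. Real pipelines must also handle PDFs, HTML, tables, and other layouts; the function and parameter names here are illustrative, not taken from any particular library.

```python
import re

def chunk_document(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    """Normalize whitespace and split a plain-text document into overlapping chunks."""
    # Collapse runs of whitespace so chunk boundaries are not dominated by formatting noise.
    cleaned = re.sub(r"\s+", " ", text).strip()
    chunks = []
    start = 0
    while start < len(cleaned):
        end = min(start + max_chars, len(cleaned))
        # Prefer to break at a sentence boundary in the second half of the window.
        cut = cleaned.rfind(". ", start, end)
        if cut == -1 or cut <= start + max_chars // 2:
            cut = end
        else:
            cut += 1  # keep the period with the preceding chunk
        chunks.append(cleaned[start:cut].strip())
        if cut >= len(cleaned):
            break
        start = max(cut - overlap, start + 1)  # step forward, retaining some overlap for context
    return chunks
```

The overlap between consecutive chunks helps preserve context that would otherwise be lost at arbitrary cut points, which is one reason well-structured input tends to improve downstream LLM performance.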