Explainable AI (XAI) has become a critical research domain as AI systems are increasingly deployed in essential sectors such as healthcare, finance, and criminal justice. These systems make decisions that can significantly affect people's lives; thus, it is necessary to understand why they produce the outputs they do.…
Health acoustics, encompassing sounds like coughs and breathing, hold valuable health information but remain underutilized in medical machine learning. Existing deep learning models for these acoustics are often task-specific, limiting their generalizability. Non-semantic speech attributes can aid in emotion recognition and in detecting diseases like Parkinson’s and Alzheimer’s. Recent advancements in self-supervised learning (SSL) promise to…
For optimal performance, AI models require high-quality data. Unfortunately, obtaining and organizing this data can be quite a challenge. Publicly available datasets may be inadequate, too broad, or too contaminated to be useful for some purposes. It can be challenging to find domain experts, which is a problem for many…
Natural Language Processing (NLP) has seen remarkable advancements, particularly in text generation techniques. Among these, Retrieval Augmented Generation (RAG) is a method that significantly improves the coherence, factual accuracy, and relevance of generated text by incorporating information retrieved from specific databases. This approach is especially crucial in specialized fields where precision and context are essential,…
Large-scale pretraining followed by task-specific fine-tuning has revolutionized language modeling and is now transforming computer vision. Extensive datasets like LAION-5B and JFT-300M enable pre-training beyond traditional benchmarks, expanding visual learning capabilities. Notable models such as DINOv2, MAWS, and AIM have made significant strides in self-supervised feature generation and masked autoencoder scaling. However, existing methods often…
AI21 Labs has made a significant stride in the AI landscape by releasing the Jamba 1.5 family of open models, comprising Jamba 1.5 Mini and Jamba 1.5 Large. These models, built on the novel SSM-Transformer architecture, represent a breakthrough in AI technology, particularly in handling long-context tasks. AI21 Labs aims to democratize access to these…
The main challenge in developing advanced visual language models (VLMs) lies in enabling these models to effectively process and understand long video sequences that contain extensive contextual information. Long-context understanding is crucial for applications such as detailed video analysis, autonomous systems, and real-world AI implementations where tasks require the comprehension of complex, multi-modal inputs over…
Tabular data, which dominates many domains, such as healthcare, finance, and social science applications, consists of rows and columns of structured features, which makes it easier to manage and analyze. However, the diversity of tabular data, including numerical, categorical, and textual features, poses major challenges to achieving robust and accurate predictive performance. Another area for improvement…
Physics simulation uses computational power to solve mathematical models that describe physical events. When dealing with complex geometries, fluid dynamics, or large-scale systems, the processing demands of these simulations can be enormous, but the insights they bring are vital. 3D physics simulations are time-consuming, costly, and cumbersome to run. Before even running…
Training large-scale deep models on broad datasets is becoming increasingly costly in terms of resources and environmental impact, owing to the exponential growth of model sizes and dataset scales in deep learning. A new, potentially game-changing approach is deep model fusion techniques, which combine the insights of several models into one…