The field of information retrieval has rapidly evolved due to the exponential growth of digital data. As the volume of unstructured data grows, efficient methods for searching and retrieving relevant information have become more crucial than ever. Traditional keyword-based search techniques often fail to capture the nuanced meaning of text, leading to inaccurate or irrelevant…
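To make the contrast concrete, here is a toy sketch of keyword matching versus embedding-style retrieval. The two-document corpus, the query, and the hand-made 3-d "embeddings" are illustrative stand-ins for a learned encoder, not any particular system:

```python
# Toy contrast between keyword matching and embedding-style retrieval.
# The vectors below are hypothetical values a learned encoder might
# assign; purely illustrative.
import math

docs = {
    "d1": "the automobile industry saw record sales",
    "d2": "how to bake sourdough bread at home",
}
doc_vecs = {"d1": [0.9, 0.1, 0.0], "d2": [0.0, 0.2, 0.9]}

query = "car market growth"
query_vec = [0.8, 0.3, 0.1]  # close to d1 despite zero keyword overlap

def keyword_score(query, text):
    """Fraction of query terms that literally appear in the document."""
    q, t = set(query.split()), set(text.split())
    return len(q & t) / len(q)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

for d, text in docs.items():
    print(d, "keyword:", keyword_score(query, text),
          "semantic:", round(cosine(query_vec, doc_vecs[d]), 3))
# "car" shares no tokens with "automobile", so both keyword scores are 0,
# while the vector similarity still ranks d1 clearly first.
```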
Self-correction mechanisms have become a significant topic of interest within artificial intelligence, particularly for Large Language Models (LLMs). Self-correction has traditionally been seen as a distinctively human trait, but researchers have begun investigating how it can be applied to LLMs to enhance their capabilities without requiring external inputs. This emerging area explores ways to enable LLMs…
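A minimal sketch of what such an intrinsic loop can look like, assuming a generic chat-completion call; the `llm` function and the prompts are placeholders, not the protocol of any specific paper:

```python
# Sketch of intrinsic self-correction: the same model drafts an answer,
# critiques it, and revises, with no external feedback signal.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def self_correct(question: str, max_rounds: int = 2) -> str:
    answer = llm(f"Answer the question:\n{question}")
    for _ in range(max_rounds):
        critique = llm(
            f"Question: {question}\nAnswer: {answer}\n"
            "List any factual or logical errors, or reply 'OK'."
        )
        if critique.strip() == "OK":
            break  # the model judges its own answer acceptable
        answer = llm(
            f"Question: {question}\nDraft: {answer}\n"
            f"Critique: {critique}\nWrite a corrected answer."
        )
    return answer
```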
Large Language Models (LLMs) have revolutionized artificial intelligence, impacting various scientific and engineering disciplines. The Transformer architecture, initially designed for machine translation, has become the foundation for GPT models, significantly advancing the field. However, current LLMs face challenges in their training approach, which primarily focuses on predicting the next token based on previous context while…
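For reference, the next-token objective mentioned above reduces to shifting the sequence by one position and minimizing cross-entropy. A minimal PyTorch sketch, with a toy embedding-plus-linear model standing in for the Transformer stack:

```python
# Minimal next-token prediction step: predict token t+1 from tokens up
# to t and minimize cross-entropy. Model size and data are placeholders.
import torch
import torch.nn as nn

vocab, d_model = 100, 32
embed = nn.Embedding(vocab, d_model)
head = nn.Linear(d_model, vocab)

tokens = torch.randint(0, vocab, (4, 16))        # (batch, seq_len)
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one position

hidden = embed(inputs)   # a real LLM would run Transformer blocks here
logits = head(hidden)    # (batch, seq_len - 1, vocab)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab), targets.reshape(-1)
)
loss.backward()
print(f"next-token loss: {loss.item():.3f}")
```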
Large Language Models (LLMs) based on Transformer architectures have revolutionized AI development. However, the complexity of their training process remains poorly understood. A significant challenge in this domain is the inconsistency in optimizer performance. While the Adam optimizer has become the standard for training Transformers, stochastic gradient descent with momentum (SGD), which is highly effective…
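To make the Adam-versus-SGD comparison concrete, here is a small, self-contained experiment that fits the same toy model with both optimizers; the model, data, and hyperparameters are illustrative, not those of any Transformer study:

```python
# Fit one toy model with Adam and again with SGD+momentum, from the
# same initialization and data, and compare final training loss.
import torch
import torch.nn as nn

def train(optimizer_cls, **opt_kwargs):
    torch.manual_seed(0)                 # identical init and data per run
    model = nn.Linear(10, 1)
    opt = optimizer_cls(model.parameters(), **opt_kwargs)
    x, y = torch.randn(256, 10), torch.randn(256, 1)
    for _ in range(100):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

print("Adam :", train(torch.optim.Adam, lr=1e-2))
print("SGD+m:", train(torch.optim.SGD, lr=1e-2, momentum=0.9))
```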
Recent research highlights that Transformers, though successful on tasks like arithmetic and algorithm execution, struggle with length generalization, i.e., handling inputs of lengths unseen during training. This is crucial for algorithmic tasks such as coding or reasoning, where input length often correlates with problem difficulty. Large language models face this limitation even when scaled due to…
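The evaluation protocol behind length generalization is simple to state: train on short inputs, test on strictly longer ones. A miniature sketch, with a toy sequence-reversal task and a stub standing in for a trained model:

```python
# Length-generalization protocol in miniature: evaluate on input lengths
# the model never saw during training. The task (reverse a digit list)
# and the stub predictor are illustrative; in practice the predictor
# would be a Transformer trained only on short sequences.
import random

def make_example(length):
    xs = [random.randint(0, 9) for _ in range(length)]
    return xs, list(reversed(xs))   # toy task: reverse the sequence

def model_predict(xs):
    return list(reversed(xs))       # stub for a model trained on lengths 1-8

for length in (8, 16, 32, 64):      # 8 is in-distribution; the rest unseen
    xs, ys = make_example(length)
    print(f"length {length:3d}: exact match = {model_predict(xs) == ys}")
```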
One of the critical challenges in the development and deployment of Large Language Models (LLMs) is ensuring that these models are aligned with human values. As LLMs are applied across diverse fields and tasks, the risk of these models operating in ways that may contradict ethical norms or propagate cultural biases becomes a significant concern.…
Cardiotocography (CTG) is a non-invasive method used to monitor fetal heart rate and uterine contractions during pregnancy. This data can help identify potential complications early on, such as fetal distress, preeclampsia, or preterm labor. However, interpreting CTG recordings can be subjective and prone to errors, leading to potential misdiagnosis and delayed intervention. It can be…
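Framed as machine learning, automated CTG interpretation becomes a supervised classification problem. A hedged sketch using scikit-learn; the 21 features and three fetal-state classes mirror the shape of common CTG datasets, but the data here is synthetic noise:

```python
# Sketch of CTG interpretation as supervised classification. A real
# pipeline would use extracted CTG features (baseline FHR, accelerations,
# decelerations, variability) labeled normal / suspect / pathologic;
# here both features and labels are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 21))      # 21 CTG-style features per recording
y = rng.integers(0, 3, size=500)    # 3 fetal-state classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")  # ~chance on noise
```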
Retrieval Augmented Generation (RAG) is an AI framework that optimizes the output of a Large Language Model (LLM) by referencing a credible knowledge base outside of its training sources. RAG combines the capabilities of LLMs with the strengths of traditional information retrieval systems such as databases to help AI write more accurate and relevant text.…
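A minimal sketch of the RAG loop just described: retrieve the most relevant passages from the external knowledge base and prepend them to the prompt. The overlap-based retriever stands in for a real vector store, and `llm` is a placeholder for any model call:

```python
# Minimal RAG pipeline: retrieve, then generate with the retrieved
# context in the prompt. Knowledge base and retriever are toy stand-ins.

knowledge_base = [
    "The Eiffel Tower is 330 metres tall.",
    "Photosynthesis converts light energy into chemical energy.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Word-overlap scoring as a stand-in for dense vector similarity.
    q = set(query.lower().split())
    score = lambda doc: len(q & set(doc.lower().split()))
    return sorted(knowledge_base, key=score, reverse=True)[:k]

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def rag_answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```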
Cardinality estimation (CE) is essential to many database-related tasks, such as query generation, cost estimation, and query optimization. Accurate CE is necessary to ensure optimal query planning and execution within a database system. Adopting machine learning (ML) techniques has introduced new possibilities for CE, allowing researchers to leverage ML models’ robust learning and representation capabilities.…
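One common ML formulation of CE is regression from query features to (log) row counts. A toy sketch, assuming a single uniform column and range predicates; the data and model choice are illustrative, not any specific learned estimator:

```python
# Learned cardinality estimation in miniature: regress from a range
# predicate [lo, hi] to the log of the number of matching rows.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
column = rng.uniform(0, 100, size=10_000)   # the "table": one column

# Training set: random range predicates with their true cardinalities.
lo = rng.uniform(0, 100, size=2_000)
hi = lo + rng.uniform(0, 50, size=2_000)
X = np.column_stack([lo, hi])
y = np.log1p([((column >= a) & (column <= b)).sum() for a, b in X])

model = GradientBoostingRegressor().fit(X, y)
est = np.expm1(model.predict([[20.0, 40.0]]))[0]
true = ((column >= 20) & (column <= 40)).sum()
print(f"estimated rows: {est:.0f}, true rows: {true}")
```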
Large language models (LLMs) have gained significant attention in machine learning, shifting the focus from optimizing generalization on small datasets to reducing approximation error on massive text corpora. This paradigm shift presents researchers with new challenges in model development and training methodologies. The primary objective has evolved from preventing overfitting through regularization techniques to effectively…