Advances in large language and multimodal speech-text models have laid a foundation for seamless, real-time, natural, and human-like voice interactions. Achieving this requires systems to process speech content, emotional tone, and audio cues while producing accurate and coherent responses. However, challenges remain in bridging the mismatch between speech and text sequences, the limited pre-training available for speech tasks…
Scientific metadata in research literature holds immense significance, as highlighted by flourishing research in scientometrics, the discipline dedicated to analyzing scholarly literature. Metadata improves the findability and accessibility of scientific documents by indexing and linking papers in a massive graph. Today, the research community has come to recognize the importance of metadata; however, awareness and consideration of it were…
Speech processing systems often struggle to deliver clear audio in noisy environments. This challenge impacts applications such as hearing aids, automatic speech recognition (ASR), and speaker verification. Conventional single-channel speech enhancement (SE) systems use neural network architectures like LSTMs, CNNs, and GANs, but these architectures have notable limitations. For instance, attention-based models such as Conformers,…
Biometric authentication has emerged as a promising solution to enhance security, offering a more robust defense against cyber threats. However, as technology advances, attackers develop increasingly sophisticated methods to bypass traditional security measures: easily guessed PINs and passwords can be cracked, and physical keys can simply be misplaced. These protections were once considered…
Blockchain systems face significant challenges in efficiently managing and updating state storage due to high write amplification (WA) and extensive I/O operations. In traditional architectures such as the Merkle Patricia Trie (MPT), frequent and expensive disk interactions introduce inefficiencies that restrict throughput and scalability. These problems are among the biggest bottlenecks for decentralized applications requiring…
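To make the write-amplification mechanics concrete, here is a minimal sketch, assuming a toy binary Merkle tree rather than Ethereum's actual hexary MPT (the `MerkleTree` class and its write counter are illustrative, not from any paper). Because every node's hash commits to its children, a single leaf update forces a rehash, and in a real system a disk write, of every node on the root-to-leaf path.

```python
import hashlib
import math

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class MerkleTree:
    """Toy binary Merkle tree over a fixed, power-of-two set of leaves.

    Illustrative only: the MPT is a hexary trie, but the write-amplification
    mechanism is the same -- every node's hash commits to its children, so
    one leaf update dirties the whole root-to-leaf path.
    """

    def __init__(self, leaves):
        n = len(leaves)
        assert n > 0 and n & (n - 1) == 0, "power-of-two leaf count for simplicity"
        self.n = n
        self.nodes = [b""] * (2 * n)  # nodes[1] is the root; leaf i sits at n + i
        self.writes = 0               # counts node writes (stand-in for disk I/O)
        for i, leaf in enumerate(leaves):
            self._write(n + i, sha(leaf))
        for i in range(n - 1, 0, -1):
            self._write(i, sha(self.nodes[2 * i] + self.nodes[2 * i + 1]))

    def _write(self, idx: int, digest: bytes) -> None:
        self.nodes[idx] = digest  # stand-in for persisting a node to disk
        self.writes += 1

    def update(self, i: int, new_leaf: bytes) -> None:
        """Change one leaf, then rehash (rewrite) every ancestor up to the root."""
        idx = self.n + i
        self._write(idx, sha(new_leaf))
        while idx > 1:
            idx //= 2
            self._write(idx, sha(self.nodes[2 * idx] + self.nodes[2 * idx + 1]))

tree = MerkleTree([f"acct{i}".encode() for i in range(1024)])
before = tree.writes
tree.update(7, b"acct7-new-balance")  # one logical state change
print(f"1 logical update -> {tree.writes - before} node writes "
      f"(depth log2(1024) + 1 = {int(math.log2(1024)) + 1})")
```

In this 1024-leaf toy, one logical update costs 11 node writes; a hexary trie shortens the path, but the same path-rewrite effect is what drives the WA and I/O load described above.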
Mathematical reasoning has long been a significant challenge for Large Language Models (LLMs). Errors in intermediate reasoning steps can undermine both the accuracy and reliability of final outputs, which is particularly problematic for applications requiring precision, such as education and scientific computation. Traditional evaluation methods, like the Best-of-N (BoN) strategy, often fail to capture the…
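For readers unfamiliar with the baseline, a minimal Best-of-N sketch follows; `generate` and `score` are hypothetical stand-ins for an LLM sampler and an answer-level scorer, not any specific paper's API. Because the score judges only the final answer, BoN can select a response whose intermediate steps are flawed, which is the weakness alluded to here.

```python
import random
from typing import Callable, List

def best_of_n(
    generate: Callable[[], str],    # hypothetical: samples one candidate solution
    score: Callable[[str], float],  # hypothetical: scores a finished answer
    n: int = 8,
) -> str:
    """Sample n candidates and return the highest-scoring one.

    The score judges only the final answer, so a response that reaches
    the right conclusion through flawed intermediate steps can still win.
    """
    candidates: List[str] = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Toy demo with stand-ins (no real LLM): pick among canned answers.
random.seed(0)
canned = ["42", "41", "43", "42 (via flawed steps)"]
best = best_of_n(
    generate=lambda: random.choice(canned),
    score=lambda a: 1.0 if a.startswith("42") else 0.0,
)
print(best)
```

Note that the answer-level scorer rates "42 (via flawed steps)" just as highly as a cleanly derived "42", which is exactly why step-level evaluation matters for precision-critical applications.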
Generating time series data is important for many applications, including data augmentation, synthetic dataset creation, and scenario simulation. However, when more than one category is involved, the process becomes considerably more complex because patterns vary across categories in the real world. With such wide variation in patterns among real-world categories, the complexity of the process tends…
LLMs, such as GPT-3.5 and GPT-4, have shown exceptional capabilities in language generation, comprehension, and translation tasks. Despite these advancements, their performance is inherently constrained by the availability of training data, much of which has already been utilized. To address this limitation, recent research explores self-improvement, in which LLMs generate their own synthetic training data. While using advanced…
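As a rough illustration of this recipe, here is a sketch in the spirit of filtered self-training methods such as STaR or ReST, not any single paper's algorithm; `model_generate`, `verifier`, and `finetune` are hypothetical stand-ins. The loop samples answers, keeps only the verified ones, and fine-tunes on the keepers.

```python
import random
from typing import Callable, List, Tuple

def self_improvement_round(
    model_generate: Callable[[str], str],               # hypothetical: sample an answer
    verifier: Callable[[str, str], bool],               # hypothetical: check an answer
    finetune: Callable[[List[Tuple[str, str]]], None],  # hypothetical: training step
    prompts: List[str],
    samples_per_prompt: int = 4,
) -> int:
    """One round of filtered self-training: sample answers, keep only the
    verified ones, and fine-tune the model on the keepers."""
    kept: List[Tuple[str, str]] = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            answer = model_generate(prompt)
            if verifier(prompt, answer):
                kept.append((prompt, answer))
    if kept:
        finetune(kept)
    return len(kept)

# Toy demo: a "model" that answers arithmetic prompts, sometimes incorrectly.
# eval() is safe here only because the prompts are fixed toy expressions.
random.seed(1)
n_kept = self_improvement_round(
    model_generate=lambda p: str(eval(p) + random.choice([0, 0, 1])),
    verifier=lambda p, a: str(eval(p)) == a,  # ground-truth check on toy prompts
    finetune=lambda data: None,               # no-op stand-in for a training step
    prompts=["2+2", "3+5"],
)
print(f"kept {n_kept} verified pairs for fine-tuning")
```

The quality of the verifier is the crux of this recipe: a weak filter lets flawed synthetic data back into training, which is where the concerns raised in this line of work arise.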
In today’s digital age, we are surrounded by enormous amounts of data, from social media interactions to e-commerce transactions and medical records. Making sense of this data to derive meaningful insights is a significant challenge. Traditional programming methods often fall short when dealing with complex and dynamic datasets, and manually crafted rule-based systems quickly become inefficient. For instance,…