Apple AI — itinai content

Guiding Instruction-based Image Editing via Multimodal Large Language Models

4 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to capture and follow. Multimodal large language models (MLLMs) show promising capabilities in cross-modal understanding and visual-aware response generation via LMs. We investigate how MLLMs…
Read more →
Conformal Prediction via Regression-as-Classification

4 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Conformal prediction (CP) for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of the issues can be addressed by estimating a distribution over the output, but in reality, such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent the challenges by converting regression…
Read more →
Pseudo-Generalized Dynamic View Synthesis from a Video

4 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Rendering scenes observed in a monocular video from novel viewpoints is a chal- lenging problem. For static scenes the community has studied both scene-specific optimization techniques, which optimize on every test scene, and generalized tech- niques, which only run a deep net forward pass on a test scene. In contrast, for dy- namic scenes, scene-specific…
Read more →
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

4 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices. Despite recent trends favoring alternative activation functions such as GELU or SiLU, known for increased computation, this study strongly advocates for reinstating ReLU activation in LLMs. We…
Read more →
Poly-View Contrastive Learning

3 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Contrastive learning typically matches pairs of related views among a number of unrelated negative views. Views can be generated (e.g. by augmentations) or be observed. We investigate matching when there are more than two related views which we call poly-view tasks, and derive new representation learning objectives using information maximization and sufficient statistics. We show…
Read more →
When can transformers reason with abstract symbols?

2 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

We investigate the capabilities of transformer models on relational reasoning tasks. In these tasks, models are trained on a set of strings encoding abstract relations, and are then tested out-of-distribution on data that contains symbols that did not appear in the training dataset. We prove that for any relational reasoning task in a large family…
Read more →
Manifold Diffusion Fields

2 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Score-based models have quickly become the de facto choice for generative modeling of images, text and more recently molecules. However, to adapt a score-based generative modeling to these domains the score network needs to be carefully designed, hampering its applicability to arbitrary data domains. In this paper we tackle this problem by taking a textit{functional}…
Read more →
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7× Faster Pre-training on Web-scale Image-Text Data

1 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text embeddings. However, pairwise similarity computation in contrastive loss between image and text pairs poses computational challenges. This paper presents a novel weakly supervised pre-training of vision models on web-scale image-text data. The proposed method reframes…
Read more →
Label-Efficient Sleep Staging Using Transformers Pre-trained with Position Prediction

1 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

Sleep staging is a clinically important task for diagnosing various sleep disorders but remains challenging to deploy at scale because it requires clinical expertise, among other reasons. Deep learning models can perform the task but at the expense of large labeled datasets, which are unfeasible to procure at scale. While self-supervised learning (SSL) can mitigate…
Read more →
Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference

1 мая, 2024

AI, AI Business, AI Education, AI Healthcare, AI Help, AI in Finance, AI Libs, AI Marketing, AI Product, AI Research, AI Sales, AI Staff, AI Startup, AI Tech, AI UX, Automation, DeepSense, Edge AI, Explainable AI, Natural Language Processing, NLP, No-code AI, Quantization, Transform AI, XAI

On-device machine learning (ML) moves computation from the cloud to personal devices, protecting user privacy and enabling intelligent user experiences. However, fitting models on devices with limited resources presents a major technical challenge: practitioners need to optimize models and balance hardware metrics such as model size, latency, and power. To help practitioners create efficient ML…
Read more →