Understanding the Target Audience
The target audience for building a context-aware multi-agent AI system using Nomic embeddings and Gemini LLM primarily consists of:
- AI researchers and developers looking to implement advanced AI solutions.
- Business professionals interested in leveraging AI for improved decision-making and operational efficiency.
- Data scientists and machine learning engineers aiming to enhance their models with semantic memory and reasoning capabilities.
Common pain points include:
- Difficulty in integrating multiple AI technologies into a cohesive system.
- Challenges in ensuring context-aware interactions for improved user experiences.
- Need for efficient memory and knowledge retrieval methods in AI applications.
Goals of the audience may include:
- Developing scalable and modular AI frameworks.
- Improving the accuracy and relevance of AI-generated responses.
- Enhancing user engagement through conversational AI.
Interests typically revolve around:
- Latest advancements in AI and machine learning technologies.
- Real-world applications and case studies of AI in business.
- Best practices for implementing AI in various sectors.
Communication preferences often favor:
- Technical documentation and tutorials that provide step-by-step guidance.
- Interactive content that allows for hands-on experimentation.
- Clear, concise explanations of complex concepts.
Tutorial: Building a Context-Aware Multi-Agent AI System
In this tutorial, we walk through the complete implementation of an advanced AI agent system powered by Nomic embeddings and Google’s Gemini. We design the architecture from the ground up, integrating semantic memory, contextual reasoning, and multi-agent orchestration into a single intelligent framework. Using LangChain, Faiss, and LangChain-Nomic, we equip our agents with the ability to store, retrieve, and reason over information using natural language queries. The goal is to demonstrate how we can build a modular and extensible AI system that supports both analytical research and friendly conversation.
Installation of Required Libraries
We begin by installing all the required libraries, including langchain-nomic, langchain-google-genai, and faiss-cpu, to support our agent’s embedding, reasoning, and vector search capabilities. We then import the necessary modules and securely set our Nomic and Google API keys using getpass to ensure smooth integration with the embedding and LLM services.
!pip install -qU langchain-nomic langchain-core langchain-community langchain-google-genai faiss-cpu numpy matplotlib
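The prose above mentions importing the necessary modules and setting the Nomic and Google API keys with getpass, but that code is not shown. A minimal sketch is below; the module paths follow current langchain-nomic, langchain-core, and langchain-google-genai releases, and the NOMIC_API_KEY / GOOGLE_API_KEY variable names are the ones those packages read by default, so adjust if your versions differ.

import getpass
import os
from dataclasses import dataclass
from typing import Any, Dict, List

import numpy as np
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_nomic import NomicEmbeddings

# Prompt for keys only if they are not already present in the environment
if not os.environ.get("NOMIC_API_KEY"):
    os.environ["NOMIC_API_KEY"] = getpass.getpass("Nomic API key: ")
if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Google API key: ")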
Code Implementation
We define the core structure of our intelligent agent by creating a memory system that mimics episodic and semantic recall. We integrate Nomic embeddings for semantic understanding and use Gemini LLM to generate contextual, personality-driven responses. With built-in capabilities like memory retrieval, knowledge search, and reasoning, we enable the agent to interact intelligently and learn from each conversation.
@dataclass
class AgentMemory:
    """Agent's episodic and semantic memory"""
    episodic: List[Dict[str, Any]]
    semantic: Dict[str, Any]
    working: Dict[str, Any]


class IntelligentAgent:
    """Advanced AI Agent with Nomic Embeddings for semantic reasoning"""

    def __init__(self, agent_name: str = "AIAgent", personality: str = "helpful"):
        self.name = agent_name
        self.personality = personality
        self.embeddings = NomicEmbeddings(
            model="nomic-embed-text-v1.5",
            dimensionality=384,
            inference_mode="remote"
        )
        self.llm = ChatGoogleGenerativeAI(
            model="gemini-1.5-flash",
            temperature=0.7,
            max_tokens=512
        )
        self.memory = AgentMemory(
            episodic=[],
            semantic={},
            working={}
        )
        self.knowledge_base = None
        self.vector_store = None
        self.capabilities = {
            "reasoning": True,
            "memory_retrieval": True,
            "knowledge_search": True,
            "context_awareness": True,
            "learning": True
        }
        print(f"{self.name} initialized with Nomic embeddings + Gemini LLM")
Adding Knowledge to the Agent
We implement methods to add knowledge to the agent’s semantic memory, allowing it to learn and adapt based on new information.
def add_knowledge(self, documents: List[str], metadata: List[Dict] = None):
    """Add knowledge to agent's semantic memory"""
    if metadata is None:
        metadata = [{"source": f"doc_{i}"} for i in range(len(documents))]
    docs = [Document(page_content=doc, metadata=meta)
            for doc, meta in zip(documents, metadata)]
    if self.vector_store is None:
        self.vector_store = InMemoryVectorStore.from_documents(docs, self.embeddings)
    else:
        self.vector_store.add_documents(docs)
    print(f"Added {len(documents)} documents to knowledge base")
Memory and Reasoning Capabilities
The agent is designed to remember interactions and retrieve similar past memories to enhance context-aware responses.
def remember_interaction(self, user_input: str, agent_response: str, context: Dict = None):
    """Store interaction in episodic memory"""
    memory_entry = {
        "timestamp": len(self.memory.episodic),
        "user_input": user_input,
        "agent_response": agent_response,
        "context": context or {},
        "embedding": self.embeddings.embed_query(f"{user_input} {agent_response}")
    }
    self.memory.episodic.append(memory_entry)

def retrieve_similar_memories(self, query: str, k: int = 3) -> List[Dict]:
    """Retrieve similar past interactions"""
    if not self.memory.episodic:
        return []
    query_embedding = self.embeddings.embed_query(query)
    similarities = []
    for memory in self.memory.episodic:
        similarity = np.dot(query_embedding, memory["embedding"])
        similarities.append((similarity, memory))
    similarities.sort(reverse=True, key=lambda x: x[0])
    return [mem for _, mem in similarities[:k]]
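A short, hypothetical exchange shows the two methods working together. Note that the raw dot product used above equals cosine similarity only if the embeddings are unit-normalized; if yours are not, you may want to normalize explicitly before comparing.

agent.remember_interaction(
    "What are vector embeddings?",
    "They represent text as points in a high-dimensional space.",
)
similar = agent.retrieve_similar_memories("Explain embeddings to me", k=1)
print(similar[0]["user_input"])   # "What are vector embeddings?"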
Multi-Agent System Architecture
We build a multi-agent system that intelligently routes queries to either the research or conversational agent based on semantic similarity. This architecture allows for scalable and specialized AI behavior.
class MultiAgentSystem:
    """Orchestrate multiple specialized agents"""

    def __init__(self):
        self.agents = {
            "research": ResearchAgent(),
            "chat": ConversationalAgent()
        }
        self.coordinator_embeddings = NomicEmbeddings(model="nomic-embed-text-v1.5", dimensionality=256)

    def route_query(self, query: str) -> str:
        """Route query to most appropriate agent"""
        agent_descriptions = {
            "research": "analysis, research, data, statistics, technical information",
            "chat": "conversation, questions, general discussion, casual talk"
        }
        query_embedding = self.coordinator_embeddings.embed_query(query)
        best_agent = "chat"
        best_similarity = 0
        for agent_name, description in agent_descriptions.items():
            desc_embedding = self.coordinator_embeddings.embed_query(description)
            similarity = np.dot(query_embedding, desc_embedding)
            if similarity > best_similarity:
                best_similarity = similarity
                best_agent = agent_name
        return best_agent
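Assuming ResearchAgent and ConversationalAgent (specialized subclasses built on IntelligentAgent elsewhere in the full notebook) are defined, routing can be exercised as follows; the returned labels depend on the embedding similarities and may vary:

system = MultiAgentSystem()
print(system.route_query("Analyze the latest statistics on GPU utilization"))  # likely "research"
print(system.route_query("Hi there, how is your day going?"))                  # likely "chat"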
Conclusion
We have developed a powerful and flexible AI agent framework that leverages Nomic embeddings for semantic understanding and Gemini LLM for contextual response generation. By demonstrating both research-focused and conversational interactions, we highlight the framework’s potential for building intelligent and responsive AI assistants.