
A Full Code Implementation to Design a Graph-Structured AI Agent with Gemini for Task Planning, Retrieval, Computation, and Self-Critique

Understanding the Target Audience

The target audience for this tutorial comprises AI developers, data scientists, and business managers interested in integrating advanced AI capabilities into their operations. They are likely to be familiar with programming and AI concepts, yet they seek practical applications that can enhance productivity and decision-making processes.

Pain Points

  • Difficulty in implementing complex AI systems due to a lack of clear guidance.
  • Challenges in ensuring the reliability and accuracy of AI outputs.
  • Need for modularity and flexibility in AI solutions to adapt to varying tasks.

Goals

  • To design AI agents that can effectively plan, retrieve information, compute results, and critique outputs.
  • To streamline workflows and improve task management using AI.
  • To leverage advanced AI models like Gemini for enhanced performance.

Interests

  • Exploring innovative AI frameworks and models.
  • Learning about practical implementations of AI in business contexts.
  • Understanding the integration of AI with existing tools and systems.

Communication Preferences

The audience prefers clear, structured content with practical examples and code snippets. They appreciate step-by-step tutorials that break down complex concepts into manageable parts.

Implementing a Graph-Structured AI Agent with Gemini

In this tutorial, we implement an advanced graph-based AI agent using the GraphAgent framework and the Gemini 1.5 Flash model. We define a directed graph of nodes, each responsible for a specific function: a planner to break down the task, a router to control flow, research and math nodes to provide external evidence and computation, a writer to synthesize the answer, and a critic to validate and refine the output. We integrate Gemini through a wrapper that handles structured JSON prompts, while local Python functions act as tools for safe math evaluation and document search. By executing this pipeline end-to-end, we demonstrate how reasoning, retrieval, and validation are modularized within a single cohesive system.
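Before writing any code, it helps to picture the control flow we are about to build. A minimal sketch of the graph's adjacency map (the node names mirror the description above; the concrete routing logic comes later in the tutorial) might look like:

```python
# Hypothetical adjacency map for the agent graph described above.
# The router picks between "research" and "math" at run time, and the
# critic can loop back to the writer for one more revision pass.
EDGES = {
    "plan":     ["route"],
    "route":    ["research", "math"],
    "research": ["write"],
    "math":     ["write"],
    "write":    ["critic"],
    "critic":   ["write", "end"],
}

def reachable(start: str) -> set:
    """Collect every node reachable from `start` by walking EDGES."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        stack.extend(EDGES.get(n, []))
    return seen

print(reachable("plan"))  # every node, including the terminal "end", is reachable
```

Because the critic edge points back to `write`, the graph is cyclic, which is exactly why the executor below caps iterations with a `max_steps` guard.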

Code Implementation

We begin by importing core Python libraries for data handling, timing, and safe evaluation, along with dataclasses and typing helpers to structure our state. We also load the google.generativeai client to access Gemini and, optionally, NetworkX for graph visualization.

import os, json, time, ast, math, getpass
from dataclasses import dataclass, field
from typing import Dict, List, Callable, Any
import google.generativeai as genai

try:
   import networkx as nx
except ImportError:
   nx = None

Model Configuration

def make_model(api_key: str, model_name: str = "gemini-1.5-flash"):
   genai.configure(api_key=api_key)
   return genai.GenerativeModel(model_name, system_instruction=(
       "You are GraphAgent, a principled planner-executor. "
       "Prefer structured, concise outputs; use provided tools when asked."
   ))

Calling the LLM

def call_llm(model, prompt: str, temperature=0.2) -> str:
   r = model.generate_content(prompt, generation_config={"temperature": temperature})
   return (r.text or "").strip()

Safe Math Evaluation

def safe_eval_math(expr: str) -> str:
   node = ast.parse(expr, mode="eval")
   # Whitelist only arithmetic node types. Note that the catch-all ast.AST
   # must NOT appear here: it is the base class of every AST node, so
   # including it would make the isinstance check pass for anything.
   allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
              ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.Mod,
              ast.USub, ast.UAdd, ast.FloorDiv)
   def check(n):
       if not isinstance(n, allowed):
           raise ValueError("Unsafe expression")
       for c in ast.iter_child_nodes(n):
           check(c)
   check(node)
   return str(eval(compile(node, "<expr>", "eval"), {"__builtins__": {}}, {}))
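A quick sanity check of the evaluator: ordinary arithmetic goes through, while anything outside the arithmetic whitelist (function calls, names, attribute access) raises. The snippet re-declares the function with the whitelist limited to arithmetic node types so it runs standalone:

```python
import ast

def safe_eval_math(expr: str) -> str:
    node = ast.parse(expr, mode="eval")
    # Only arithmetic node types are permitted.
    allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.Mod,
               ast.USub, ast.UAdd, ast.FloorDiv)
    def check(n):
        if not isinstance(n, allowed):
            raise ValueError("Unsafe expression")
        for c in ast.iter_child_nodes(n):
            check(c)
    check(node)
    return str(eval(compile(node, "<expr>", "eval"), {"__builtins__": {}}, {}))

print(safe_eval_math("5*7"))        # "35"
print(safe_eval_math("2**10 % 7"))  # 1024 % 7 = "2"
try:
    safe_eval_math("__import__('os')")  # Call node -> rejected
except ValueError as e:
    print("blocked:", e)
```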

Document Search

DOCS = [
   "Solar panels convert sunlight to electricity; capacity factor ~20%.",
   "Wind turbines harvest kinetic energy; onshore capacity factor ~35%.",
   "RAG = retrieval-augmented generation joins search with prompting.",
   "LangGraph enables cyclic graphs of agents; good for tool orchestration.",
]
def search_docs(q: str, k: int = 3) -> List[str]:
   ql = q.lower()
   scored = sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in ql.split()))
   return scored[:k]
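The retriever is deliberately naive: it scores each document by how many of the query's words appear in it as substrings, then returns the top k. A quick check of the ranking behavior (DOCS and search_docs are repeated so the snippet runs standalone):

```python
from typing import List

DOCS = [
    "Solar panels convert sunlight to electricity; capacity factor ~20%.",
    "Wind turbines harvest kinetic energy; onshore capacity factor ~35%.",
    "RAG = retrieval-augmented generation joins search with prompting.",
    "LangGraph enables cyclic graphs of agents; good for tool orchestration.",
]

def search_docs(q: str, k: int = 3) -> List[str]:
    ql = q.lower()
    # Negate the match count so sorted() puts the best match first.
    scored = sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in ql.split()))
    return scored[:k]

hits = search_docs("wind capacity factor", k=2)
print(hits[0])  # the wind-turbine document ranks first (3 word matches vs 2)
```

In a production system this node would be swapped for an embedding-based retriever, but the interface (query in, ranked passages out) stays the same.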

Node Functions

We first define the shared State dataclass that flows through the graph, then implement the node functions that read and update it:

@dataclass
class State:
   task: str
   plan: str = ""
   scratch: List[str] = field(default_factory=list)
   evidence: List[str] = field(default_factory=list)
   result: str = ""
   step: int = 0
   done: bool = False
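One detail worth noting: the list fields use `field(default_factory=list)` so that each State instance gets its own scratchpad and evidence list, rather than all instances sharing one mutable default. A quick demonstration (State is repeated so the snippet runs standalone):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class State:
    task: str
    plan: str = ""
    scratch: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    result: str = ""
    step: int = 0
    done: bool = False

a = State(task="compare solar vs wind")
b = State(task="compute 5*7")
a.scratch.append("PLAN: ...")
print(len(a.scratch), len(b.scratch))  # 1 0 -- the lists are independent
```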

def node_plan(state: State, model) -> str:
   prompt = f"""Plan step-by-step to solve the user task.
Task: {state.task}
Return JSON: {{"subtasks": ["..."], "tools": {{"search": true|false, "math": true|false}}, "success_criteria": ["..."]}}"""
   js = call_llm(model, prompt)
   try:
       plan = json.loads(js[js.find("{"): js.rfind("}")+1])
   except Exception:
       plan = {"subtasks": ["Research", "Synthesize"], "tools": {"search": True, "math": False}, "success_criteria": ["clear answer"]}
   state.plan = json.dumps(plan, indent=2)
   state.scratch.append("PLAN:\n"+state.plan)
   return "route"
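The `run_graph` executor below dispatches through a NODES registry that maps node names to functions. The remaining nodes (router, research, math, writer, critic) all share node_plan's `(state, model) -> next_node_name` signature; their full versions are in the linked repository. A minimal stub registry (illustrative stand-ins, not the real implementations) shows the shape of the wiring:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class State:  # same shape as the State defined earlier
    task: str
    plan: str = ""
    scratch: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    result: str = ""
    step: int = 0
    done: bool = False

# Stand-in nodes: the real router/research/math/writer/critic call Gemini
# and the local tools; these stubs only exercise the dispatch wiring.
def node_plan_stub(state, model):
    state.plan = '{"subtasks": ["Research", "Synthesize"]}'
    return "route"

def node_route(state, model):
    # Toy routing rule: digits in the task suggest arithmetic.
    return "math" if any(ch.isdigit() for ch in state.task) else "research"

def node_research(state, model):
    state.evidence.append("stub evidence")
    return "write"

def node_math(state, model):
    state.evidence.append("stub math result")
    return "write"

def node_write(state, model):
    state.result = "stub answer"
    return "critic"

def node_critic(state, model):
    state.done = True
    return "end"

NODES = {
    "plan": node_plan_stub, "route": node_route, "research": node_research,
    "math": node_math, "write": node_write, "critic": node_critic,
}

# Drive the registry with the same loop run_graph uses.
state, cur = State(task="compute 5*7"), "plan"
while not state.done and state.step < 12:
    state.step += 1
    cur = NODES[cur](state, None)
    if cur == "end":
        break
print(state.result, state.step)  # stub answer 5
```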

Execution of the Graph

def run_graph(task: str, api_key: str) -> State:
   model = make_model(api_key)
   state = State(task=task)
   cur = "plan"
   max_steps = 12
   while not state.done and state.step < max_steps:
       state.step += 1
       nxt = NODES[cur](state, model)
       if nxt == "end": break
       cur = nxt
   return state

Program Entry Point

if __name__ == "__main__":
   key = os.getenv("GEMINI_API_KEY") or getpass.getpass("Enter GEMINI_API_KEY: ")
   task = input("Enter your task: ").strip() or "Compare solar vs wind for reliability; compute 5*7."
   t0 = time.time()
   state = run_graph(task, key)
   dt = time.time() - t0
   print("\n=== GRAPH ===", ascii_graph())
   print(f"\nResult in {dt:.2f}s:\n{state.result}\n")
   print("---- Evidence ----")
   print("\n".join(state.evidence))
   print("\n---- Scratch (last 5) ----")
   print("\n".join(state.scratch[-5:]))

Conclusion

We demonstrate how a graph-structured agent enables the design of deterministic workflows around a probabilistic LLM. The planner node enforces task decomposition, the router dynamically selects between research and math, and the critic provides iterative improvement for factuality and clarity. Gemini acts as the central reasoning engine, while the graph nodes supply structure, safety checks, and transparent state management.

For the full code implementation and additional resources, please refer to our GitHub Page for Tutorials, Codes, and Notebooks.
