
A Coding Implementation of an Advanced Tool-Using AI Agent with Semantic Kernel and Gemini


Understanding the Target Audience

The target audience for this tutorial primarily consists of software developers, data scientists, and business managers interested in leveraging AI for enhancing operational efficiency. They are likely to be familiar with programming concepts and have experience with AI and machine learning frameworks. Their pain points include:

  • Integration Challenges: Difficulty in seamlessly integrating AI tools with existing systems.
  • Complexity: Overwhelmed by the complexity of AI frameworks and tools.
  • Resource Constraints: Limited resources for training and deploying AI solutions.

Their goals include:

  • Implementing AI solutions that improve decision-making processes.
  • Enhancing productivity through automation.
  • Staying updated with the latest advancements in AI technology.

Interests may revolve around practical applications of AI, coding tutorials, and business management strategies that incorporate AI tools. Communication preferences are likely to favor clear, concise, and structured content with practical examples and step-by-step guides.

Tutorial: Building an Advanced AI Agent Using Semantic Kernel and Gemini

In this tutorial, we will build an advanced AI agent using Semantic Kernel combined with Google’s Gemini free model, running it seamlessly on Google Colab. This agent will utilize various Semantic Kernel plugins as tools, such as web search, math evaluation, file I/O, and note-taking, allowing Gemini to orchestrate these functionalities through structured JSON outputs. The agent will plan, call tools, process observations, and deliver a final answer.

Setting Up the Environment

First, we need to install the required libraries and import essential modules:


!pip -q install semantic-kernel google-generativeai duckduckgo-search rich

import os, re, json, time, math, textwrap, getpass, pathlib, typing as T
from rich import print
import google.generativeai as genai
from duckduckgo_search import DDGS
import semantic_kernel as sk
from semantic_kernel.functions import kernel_function

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY") or getpass.getpass("Enter GEMINI_API_KEY: ")
genai.configure(api_key=GEMINI_API_KEY)
GEMINI_MODEL = "gemini-1.5-flash"
model = genai.GenerativeModel(GEMINI_MODEL)

We install the required libraries and import the essential modules, including Semantic Kernel, the Gemini SDK, and DuckDuckGo search. We then configure the Gemini API key, instantiate the model used to generate responses, and bring in Semantic Kernel's kernel_function decorator, which registers our custom methods as tools.

Defining the Agent Tools

Next, we define an AgentTools class as our Semantic Kernel toolset, giving the agent abilities like web search, safe math calculation, time retrieval, file read/write, and lightweight note storage:


class AgentTools:
    """Semantic Kernel-native toolset the agent can call."""

    def __init__(self):
        self._notes: list[str] = []

    @kernel_function(name="web_search", description="Search the web for fresh info; returns JSON list of {title,href,body}.")
    def web_search(self, query: str, k: int = 5) -> str:
        k = max(1, min(int(k), 10))
        hits = list(DDGS().text(query, max_results=k))
        return json.dumps(hits[:k], ensure_ascii=False)

    @kernel_function(name="calc", description="Evaluate a safe math expression, e.g., '41*73+5' or 'sin(pi/4)**2'.")
    def calc(self, expression: str) -> str:
        # Restrict eval to a whitelist of math constants and functions.
        allowed = {"__builtins__": {}}
        for name in ("pi", "e", "tau"):
            allowed[name] = getattr(math, name)
        for fn in ("sin", "cos", "tan", "asin", "sqrt", "log", "log10", "exp", "floor", "ceil"):
            allowed[fn] = getattr(math, fn)
        return str(eval(expression, allowed, {}))

    @kernel_function(name="now", description="Get the current local time string.")
    def now(self) -> str:
        return time.strftime("%Y-%m-%d %H:%M:%S")

    @kernel_function(name="write_file", description="Write text to a file path; returns saved path.")
    def write_file(self, path: str, content: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content, encoding="utf-8")
        return str(p)

    @kernel_function(name="read_file", description="Read text from a file path; returns first 4000 chars.")
    def read_file(self, path: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        return p.read_text(encoding="utf-8")[:4000]

    @kernel_function(name="add_note", description="Persist a short note into memory.")
    def add_note(self, note: str) -> str:
        self._notes.append(note.strip())
        return f"Notes stored: {len(self._notes)}"

    @kernel_function(name="search_notes", description="Search notes by keyword; returns top matches.")
    def search_notes(self, query: str) -> str:
        q = query.lower()
        hits = [n for n in self._notes if q in n.lower()]
        return json.dumps(hits[:10], ensure_ascii=False)

kernel = sk.Kernel()
tools = AgentTools()
kernel.add_plugin(tools, "agent_tools")

Listing Available Tools

We create a list_tools helper to collect all available tools, their descriptions, and signatures into a registry:


def list_tools() -> dict[str, dict]:
    registry = {}
    for name in ("web_search","calc","now","write_file","read_file","add_note","search_notes"):
        fn = getattr(tools, name)
        desc = getattr(fn, "description", "") or fn.__doc__ or ""
        sig = "()" if name in ("now",) else "(**kwargs)"
        registry[name] = {"callable": fn, "description": desc.strip(), "signature": sig}
    return registry

TOOLS = list_tools()
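With the registry in place, executing a tool call reduces to a dictionary lookup followed by a keyword-argument call. A minimal, self-contained sketch of that dispatch pattern (the `dispatch` helper and the stand-in `calc` tool here are illustrative, not part of the tutorial code):

```python
import math

# Stand-in tool and registry mirroring the {"callable", "description", "signature"}
# shape built by list_tools().
def calc(expression: str) -> str:
    allowed = {"__builtins__": {}, "pi": math.pi, "sqrt": math.sqrt}
    return str(eval(expression, allowed, {}))

TOOLS = {"calc": {"callable": calc, "description": "Safe math eval.", "signature": "(**kwargs)"}}

def dispatch(cmd: dict) -> str:
    # Execute a {"tool": ..., "args": {...}} command against the registry.
    name, args = cmd["tool"], cmd.get("args", {})
    if name not in TOOLS:
        return f"ToolError: unknown tool '{name}'"
    return TOOLS[name]["callable"](**args)

print(dispatch({"tool": "calc", "args": {"expression": "41*73+5"}}))  # 2998
```

The agent loop in the next section applies exactly this lookup-and-call pattern to the real registry.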

Running the Agent

We implement an iterative agent loop that feeds context to Gemini, enforces JSON-only tool calls, executes the requested tools, and returns a final answer:
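The loop relies on a SYSTEM prompt and an extract_json helper that are referenced but not defined above. A plausible sketch of both, reconstructed from how the loop uses them (the exact prompt wording is an assumption):

```python
import json
import re

# Assumed system prompt: the loop expects the model to emit exactly one JSON
# object per step, either a tool call or a final answer.
SYSTEM = (
    "You are a tool-using agent. At every step, reply with exactly ONE JSON object:\n"
    '{"tool": "<name>", "args": {...}} to call a tool, or\n'
    '{"final_answer": "..."} when you are done. No prose outside the JSON.'
)

def extract_json(text: str):
    """Extract the first JSON object from the model's reply, tolerating code fences."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```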


def run_agent(task: str, max_steps: int = 8, verbose: bool = True) -> str:
    transcript: list[dict] = [{"role":"system","parts":[SYSTEM]},
                               {"role":"user","parts":[task]}]
    observations = ""
    for step in range(1, max_steps+1):
        content = []
        for m in transcript:
            role = m["role"]
            for part in m["parts"]:
                content.append({"text": f"[{role.upper()}]\n{part}\n"})
        if observations:
            content.append({"text": f"[OBSERVATIONS]\n{observations[-4000:]}\n"})
        resp = model.generate_content(content, request_options={"timeout":60})
        text = resp.text or ""
        if verbose:
            print(f"\n[bold cyan]Step {step} - Model[/bold cyan]\n{textwrap.shorten(text, 1000)}")
        cmd = extract_json(text)
        if not cmd:
            transcript.append({"role":"user","parts":[
                "Please output strictly one JSON object per your rules."
            ]})
            continue
        if "final_answer" in cmd:
            return cmd["final_answer"]
        if "tool" in cmd:
            tname = cmd["tool"]
            args = cmd.get("args", {})
            if tname not in TOOLS:
                observations += f"\nToolError: unknown tool '{tname}'."
                continue
            try:
                out = TOOLS[tname]["callable"](**args)
                out_str = out if isinstance(out,str) else json.dumps(out, ensure_ascii=False)
                if len(out_str) > 4000:
                    out_str = out_str[:4000] + "...[truncated]"
                observations += f"\n[{tname}] {out_str}"
                transcript.append({"role":"user","parts":[f"Observation from {tname}:\n{out_str}"]})
            except Exception as e:
                observations += f"\nToolError {tname}: {e}"
                transcript.append({"role":"user","parts":[f"ToolError {tname}: {e}"]})
        else:
            transcript.append({"role":"user","parts":[
                "Your output must be a single JSON with either a tool call or final_answer."
            ]})
    return "Reached step limit. Observations so far:\n" + observations[-1500:]

Demo Task

We define a demo task that makes the agent search, compute, write a file, save notes, and report the current time:


DEMO = (
    "Find the top 3 concise facts about Chandrayaan-3 with sources, "
    "compute 41*73+5, store a 3-line summary into '/content/notes.txt', "
    "add the summary to notes, then show current time and return a clean final answer."
)

if __name__ == "__main__":
    print("[bold] Tools loaded:[/bold]", ", ".join(TOOLS.keys()))
    ans = run_agent(DEMO, max_steps=8, verbose=True)
    print("\n" + "="*80 + "\n[bold green]FINAL ANSWER[/bold green]\n" + ans + "\n")

Conclusion

Semantic Kernel and Gemini collaborate here to form a compact yet powerful agentic system that runs entirely within Colab: Gemini plans and emits structured tool calls, and the Semantic Kernel plugins execute them. This tutorial demonstrates that building a practical, advanced AI agent can be both simple and efficient with the right combination of frameworks.

