Understanding the Target Audience
The target audience for this tutorial primarily consists of software developers, data scientists, and business managers interested in leveraging AI for enhancing operational efficiency. They are likely to be familiar with programming concepts and have experience with AI and machine learning frameworks. Their pain points include:
- Integration Challenges: Difficulty in seamlessly integrating AI tools with existing systems.
- Complexity: Overwhelmed by the complexity of AI frameworks and tools.
- Resource Constraints: Limited resources for training and deploying AI solutions.
Their goals include:
- Implementing AI solutions that improve decision-making processes.
- Enhancing productivity through automation.
- Staying updated with the latest advancements in AI technology.
Interests may revolve around practical applications of AI, coding tutorials, and business management strategies that incorporate AI tools. Communication preferences are likely to favor clear, concise, and structured content with practical examples and step-by-step guides.
Tutorial: Building an Advanced AI Agent Using Semantic Kernel and Gemini
In this tutorial, we will build an advanced AI agent using Semantic Kernel combined with Google’s Gemini free model, running it seamlessly on Google Colab. This agent will utilize various Semantic Kernel plugins as tools, such as web search, math evaluation, file I/O, and note-taking, allowing Gemini to orchestrate these functionalities through structured JSON outputs. The agent will plan, call tools, process observations, and deliver a final answer.
Setting Up the Environment
First, we need to install the required libraries and import essential modules:
```python
!pip -q install semantic-kernel google-generativeai duckduckgo-search rich

import os, re, json, time, math, textwrap, getpass, pathlib, typing as T

from rich import print
import google.generativeai as genai
from duckduckgo_search import DDGS

# Semantic Kernel and its kernel_function decorator, used below to register tools.
import semantic_kernel as sk
from semantic_kernel.functions import kernel_function

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY") or getpass.getpass("Enter GEMINI_API_KEY: ")
genai.configure(api_key=GEMINI_API_KEY)

GEMINI_MODEL = "gemini-1.5-flash"
model = genai.GenerativeModel(GEMINI_MODEL)
```
We install the required libraries, import the essential modules (Semantic Kernel, the Gemini SDK, and DuckDuckGo search), and configure the Gemini API key and model. Semantic Kernel's kernel_function decorator, imported here, is what we will use to register our custom tools.
Defining the Agent Tools
Next, we define an AgentTools class as our Semantic Kernel toolset, giving the agent abilities like web search, safe math evaluation, time retrieval, file read/write, and lightweight note storage:
```python
class AgentTools:
    """Semantic Kernel-native toolset the agent can call."""

    def __init__(self):
        self._notes: list[str] = []

    @kernel_function(name="web_search", description="Search the web for fresh info; returns JSON list of {title,href,body}.")
    def web_search(self, query: str, k: int = 5) -> str:
        k = max(1, min(int(k), 10))
        hits = list(DDGS().text(query, max_results=k))
        return json.dumps(hits[:k], ensure_ascii=False)

    @kernel_function(name="calc", description="Evaluate a safe math expression, e.g., '41*73+5' or 'sin(pi/4)**2'.")
    def calc(self, expression: str) -> str:
        allowed = {"__builtins__": {}}
        for n in ("pi", "e", "tau"):
            allowed[n] = getattr(math, n)
        for fn in ("sin", "cos", "tan", "asin", "sqrt", "log", "log10", "exp", "floor", "ceil"):
            allowed[fn] = getattr(math, fn)
        return str(eval(expression, allowed, {}))

    @kernel_function(name="now", description="Get the current local time string.")
    def now(self) -> str:
        return time.strftime("%Y-%m-%d %H:%M:%S")

    @kernel_function(name="write_file", description="Write text to a file path; returns saved path.")
    def write_file(self, path: str, content: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content, encoding="utf-8")
        return str(p)

    @kernel_function(name="read_file", description="Read text from a file path; returns first 4000 chars.")
    def read_file(self, path: str) -> str:
        p = pathlib.Path(path).expanduser().resolve()
        return p.read_text(encoding="utf-8")[:4000]

    @kernel_function(name="add_note", description="Persist a short note into memory.")
    def add_note(self, note: str) -> str:
        self._notes.append(note.strip())
        return f"Notes stored: {len(self._notes)}"

    @kernel_function(name="search_notes", description="Search notes by keyword; returns top matches.")
    def search_notes(self, query: str) -> str:
        q = query.lower()
        hits = [n for n in self._notes if q in n.lower()]
        return json.dumps(hits[:10], ensure_ascii=False)

kernel = sk.Kernel()
tools = AgentTools()
kernel.add_plugin(tools, "agent_tools")
```
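To see why the calc tool is reasonably safe, the same restricted-eval pattern can be exercised standalone. Here safe_calc is a hypothetical stand-in for the plugin method, not part of the original listing:

```python
import math

# Restricted namespace: no builtins, only whitelisted math names.
allowed = {"__builtins__": {}}
for name in ("pi", "e", "tau"):
    allowed[name] = getattr(math, name)
for fn in ("sin", "cos", "tan", "asin", "sqrt", "log", "log10", "exp", "floor", "ceil"):
    allowed[fn] = getattr(math, fn)

def safe_calc(expression: str) -> str:
    # eval with empty __builtins__, so names like open or __import__ are unavailable
    return str(eval(expression, allowed, {}))

print(safe_calc("41*73+5"))      # 2998
print(safe_calc("sin(pi/4)**2"))
```

Passing an empty `__builtins__` dict as globals is what blocks access to dangerous built-in names; only the explicitly whitelisted math constants and functions resolve.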
Listing Available Tools
We create a list_tools helper to collect all available tools, their descriptions, and signatures into a registry:
```python
def list_tools() -> dict[str, dict]:
    registry = {}
    for name in ("web_search", "calc", "now", "write_file", "read_file", "add_note", "search_notes"):
        fn = getattr(tools, name)
        desc = getattr(fn, "description", "") or fn.__doc__ or ""
        sig = "()" if name in ("now",) else "(**kwargs)"
        registry[name] = {"callable": fn, "description": desc.strip(), "signature": sig}
    return registry

TOOLS = list_tools()
```
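The same registry idea can be sketched standalone with plain functions; now and calc below are hypothetical stand-ins for the plugin methods, kept minimal so the pattern is visible:

```python
import json, time

# Hypothetical stand-ins for two of the plugin methods.
def now() -> str:
    """Get the current local time string."""
    return time.strftime("%Y-%m-%d %H:%M:%S")

def calc(expression: str) -> str:
    """Evaluate a restricted math expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))

def build_registry(funcs: dict) -> dict:
    # Collect each callable with its docstring and a rough signature hint.
    registry = {}
    for name, fn in funcs.items():
        registry[name] = {
            "callable": fn,
            "description": (fn.__doc__ or "").strip(),
            "signature": "()" if name == "now" else "(**kwargs)",
        }
    return registry

REG = build_registry({"now": now, "calc": calc})
print(json.dumps({k: v["description"] for k, v in REG.items()}, indent=2))
```

Keeping the callable alongside its description lets the agent loop both advertise the tool to the model and dispatch to it from the same table.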
Running the Agent
We implement an iterative agent loop that feeds the running transcript and accumulated observations to Gemini, enforces JSON-only tool calls, executes the requested tools, and returns a final answer:
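The run_agent function references a SYSTEM prompt and an extract_json helper that the listing does not define. A minimal sketch of both follows; the exact prompt wording is our assumption, not taken from the original code:

```python
import json, re

# Hypothetical system prompt enforcing the JSON-only tool protocol.
SYSTEM = (
    "You are an agent with tools. On each turn, output exactly one JSON object: "
    'either {"tool": "<name>", "args": {...}} to call a tool, '
    'or {"final_answer": "..."} when you are done. No prose outside the JSON.'
)

def extract_json(text: str):
    """Return the first parseable JSON object found in text, else None."""
    # Try the whole reply first, then fall back to the widest {...} span.
    for candidate in (text.strip(), *re.findall(r"\{.*\}", text, flags=re.S)):
        try:
            obj = json.loads(candidate)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            continue
    return None
```

A greedy brace match is a crude but serviceable way to tolerate stray prose around the JSON; a production parser would balance braces properly.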
```python
def run_agent(task: str, max_steps: int = 8, verbose: bool = True) -> str:
    transcript: list[dict] = [{"role": "system", "parts": [SYSTEM]},
                              {"role": "user", "parts": [task]}]
    observations = ""
    for step in range(1, max_steps + 1):
        content = []
        for m in transcript:
            role = m["role"]
            for part in m["parts"]:
                content.append({"text": f"[{role.upper()}]\n{part}\n"})
        if observations:
            content.append({"text": f"[OBSERVATIONS]\n{observations[-4000:]}\n"})
        resp = model.generate_content(content, request_options={"timeout": 60})
        text = resp.text or ""
        if verbose:
            print(f"\n[bold cyan]Step {step} - Model[/bold cyan]\n{textwrap.shorten(text, 1000)}")
        cmd = extract_json(text)
        if not cmd:
            transcript.append({"role": "user", "parts": [
                "Please output strictly one JSON object per your rules."
            ]})
            continue
        if "final_answer" in cmd:
            return cmd["final_answer"]
        if "tool" in cmd:
            tname = cmd["tool"]
            args = cmd.get("args", {})
            if tname not in TOOLS:
                observations += f"\nToolError: unknown tool '{tname}'."
                continue
            try:
                out = TOOLS[tname]["callable"](**args)
                out_str = out if isinstance(out, str) else json.dumps(out, ensure_ascii=False)
                if len(out_str) > 4000:
                    out_str = out_str[:4000] + "...[truncated]"
                observations += f"\n[{tname}] {out_str}"
                transcript.append({"role": "user", "parts": [f"Observation from {tname}:\n{out_str}"]})
            except Exception as e:
                observations += f"\nToolError {tname}: {e}"
                transcript.append({"role": "user", "parts": [f"ToolError {tname}: {e}"]})
        else:
            transcript.append({"role": "user", "parts": [
                "Your output must be a single JSON with either a tool call or final_answer."
            ]})
    return "Reached step limit. Summarize findings:\n" + observations[-1500:]
```
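For reference, the loop accepts exactly two message shapes from the model: a tool call and a final answer. Hypothetical examples of each, parsed the way the loop would see them:

```python
import json

# A tool-call turn: the model asks the loop to run web_search with arguments.
tool_call = json.loads('{"tool": "web_search", "args": {"query": "Chandrayaan-3", "k": 3}}')
print(tool_call["tool"], tool_call["args"]["k"])

# A terminating turn: the model returns its final answer and the loop exits.
final = json.loads('{"final_answer": "Done."}')
print("final_answer" in final)
```

Any reply that parses to neither shape triggers the corrective user message and another iteration, which is what makes the protocol self-healing.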
Demo Task
We define a demo task that makes the agent search, compute, write a file, save notes, and report the current time:
```python
DEMO = (
    "Find the top 3 concise facts about Chandrayaan-3 with sources, "
    "compute 41*73+5, store a 3-line summary into '/content/notes.txt', "
    "add the summary to notes, then show current time and return a clean final answer."
)

if __name__ == "__main__":
    print("[bold] Tools loaded:[/bold]", ", ".join(TOOLS.keys()))
    ans = run_agent(DEMO, max_steps=8, verbose=True)
    print("\n" + "=" * 80 + "\n[bold green]FINAL ANSWER[/bold green]\n" + ans + "\n")
```
Conclusion
In conclusion, we have seen how Semantic Kernel and Gemini combine into a compact yet capable agentic system within Colab: Semantic Kernel supplies the tool registry and plugin plumbing, while Gemini handles planning and orchestration through structured JSON outputs. This tutorial demonstrates that building a practical, advanced AI agent can be both simple and efficient with the right combination of frameworks.