Recently, Google released an Agent Development Kit (ADK), a new framework to help developers create “agentic” generative AI applications quickly. In a previous blog post, I worked through the quickstart to learn the basics of ADK. In this blog post, I attempt to answer: can the ADK help me to create a Google Cloud code sample?
You can see the results of my efforts here in the
code_sample_generation/ directory. Read on to see how I did it!
Very broadly, I want to create a multi-agent system that uses tools to
compose a Google Cloud code sample. I intend to use a RAG technique
to fetch protocol buffer files before I generate the sample; that’s
one tool I need to create. Then, I want the generated code to be
evaluated using an LLM-as-a-judge technique; that’s a second tool.
The overall system will be orchestrated by a root agent and executed
by a Runner.
With that in mind, I’ll write a little mini-spec here to help guide my work.
The whole system is executed by a Runner, which manages events and session
data. The root agent itself will need to be an instance of either
SequentialAgent or LoopAgent. I’m going to use a SequentialAgent, for now,
simply to test the pipeline of subagents.
With these steps in mind, let’s see how far I can get.
First, I will build the RAG agent & tool. I customized the code from the quickstart to better encapsulate the features of the tool and agent.
For the tool, I’m just going to create a simple function that opens a
local copy of the Secret Manager proto file(s). In a future version
of this application, the get_protos function will fetch the proto files
from the GitHub repo where they reside.
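import pathlib

from google.adk.tools.tool_context import ToolContext


def get_protos(tool_context: ToolContext) -> str:
    """Returns the contents of the local Secret Manager proto file."""
    # read_text() opens, reads, and closes the file in a single call.
    proto_path = pathlib.Path(__file__).parent / "resources" / "secretmanager.proto"
    return proto_path.read_text()

The get_protos function, which returns the contents of a local proto file.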
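For the future version that fetches the proto files from GitHub, one possible shape is sketched below using only the standard library. This is an assumption about how that version might look, not the code in the repository: the `fetch_proto` name, the `opener` parameter (there so the function can be exercised without a network call), and the URL you pass in are all hypothetical.

```python
import urllib.request
from typing import Callable


def fetch_proto(
    url: str,
    timeout: float = 10.0,
    opener: Callable = urllib.request.urlopen,
) -> str:
    """Downloads a proto file and returns its contents as text.

    `url` is expected to point at the raw file contents; which URL to use
    is left to the caller (hypothetical, not taken from the original post).
    """
    # The response object is a context manager, so the connection is closed
    # as soon as the body has been read.
    with opener(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Injecting `opener` keeps the function easy to test: a fake opener that returns an in-memory bytes buffer stands in for the network.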
Next, I will create an agent that uses this function. I pass the get_protos
function in the tools field when I create the agent.
def init_rag_agent() -> Agent:
    """Creates the agent that wraps the get_protos tool."""
    instruction = textwrap.dedent("""
        You are the retrieval-augmented grounding agent.
        Your task is to download protocol buffer files from GitHub.
        Use the 'get_protos' tool to get the files.
        If the user provides files to use, make sure to pass them to the
        tool. Do not engage in any other conversation or tasks.""")
    # Adjacent string literals are joined as-is, so the space before the
    # closing quote on the first line is required.
    description = (
        "Downloads protocol buffer files from GitHub using the 'get_protos' "
        "tool."
    )
    try:
        rag_agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="rag_agent",
            instruction=instruction,
            description=description,
            tools=[get_protos],
        )
    except Exception as e:
        # rag_agent is unbound if construction fails, so report the model
        # constant instead, then re-raise rather than returning None.
        print(f"error: create RAG agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")
        raise
    print(f"'{rag_agent.name}' created using model '{rag_agent.model}'.")
    return rag_agent

The init_rag_agent function, which wraps the get_protos() tool in an agent.
Following from my ML engineer inclinations, I want to test these agents one at a time. The ADK has testing documentation, but the directions assume that I would be testing these files manually. I would prefer to write some small unit tests instead, tests that verify the behavior of each agent individually. I’ll need to put a pin in that for another blog post.
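As a rough idea of where such tests could start, here is a minimal sketch that tests a tool function in isolation with the standard unittest module. It does not test an agent or call a model. The `read_proto` function is a hypothetical stand-in for a tool like get_protos, rewritten to take a path so that the test can point it at a temporary file.

```python
import pathlib
import tempfile
import unittest


def read_proto(path: pathlib.Path) -> str:
    """Hypothetical stand-in for the tool under test."""
    return path.read_text()


class ReadProtoTest(unittest.TestCase):
    def test_returns_file_contents(self):
        # Write a tiny proto file to a temporary directory, then check that
        # the tool hands back exactly what is on disk.
        with tempfile.TemporaryDirectory() as tmp:
            proto = pathlib.Path(tmp) / "secretmanager.proto"
            proto.write_text('syntax = "proto3";\n')
            self.assertEqual(read_proto(proto), 'syntax = "proto3";\n')


def run_tests() -> unittest.TestResult:
    """Runs the test case programmatically and returns the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReadProtoTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Testing the agents themselves would also require stubbing out the model, which this sketch deliberately leaves alone.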
Like the RAG tool, I’m going to fake a response from an evaluation microservice. I could, of course, just have this agent do the evaluation; I’ll probably experiment with that in a future update to this code.
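def get_evaluation(code_sample: str, tool_context: ToolContext) -> str:
    """Returns a canned response in place of a real evaluation microservice."""
    # Use return, not yield: yield would turn this function into a generator.
    return json.dumps(
        {
            "score": 1.0,
            "explanation": "This code sample is great! No notes.",
        }
    )

The get_evaluation function, which returns a canned response from an evaluation microservice.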
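One Python detail matters for tool functions like this one: a function whose body contains yield returns a generator object when called, not the value itself, so a tool meant to hand back a single result should use return. The sketch below uses two hypothetical stand-in functions (neither is from the original code) to show the difference.

```python
import inspect
import json


def evaluation_with_yield():
    # Calling this function does not run the body; it returns a generator.
    yield json.dumps({"score": 1.0})


def evaluation_with_return() -> str:
    # Calling this function runs the body and returns the JSON string.
    return json.dumps({"score": 1.0})


generator_result = evaluation_with_yield()
string_result = evaluation_with_return()

print(inspect.isgenerator(generator_result))
print(json.loads(string_result)["score"])
```

The caller of the yield version has to iterate (or call next()) to get the JSON string out, which is rarely what a tool caller expects.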
While I was creating this agent, I forgot to update the instruction so that it was doing evaluations rather than RAG (bad copy/paste). The first couple of runs of this system failed with the root agent saying that it couldn’t do evaluations. It wasn’t until I fixed the typo in the instructions that I began to see results.
def init_evaluation_agent() -> Agent:
    """Creates the agent that wraps the get_evaluation tool."""
    instruction = textwrap.dedent("""
        You are the Evaluation Agent. Your task is to decide how
        good a code sample is. Use the 'get_evaluation' tool when a code sample
        has been generated.""")
    description = (
        "Evaluates generated code samples using the 'get_evaluation' tool."
    )
    try:
        agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="evaluation_agent",
            instruction=instruction,
            description=description,
            tools=[get_evaluation],
        )
    except Exception as e:
        # agent is unbound if construction fails, so avoid referencing it here.
        print(f"error: create evaluation agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")
        raise
    print(f"'{agent.name}' created using model '{agent.model}'.")
    return agent

The init_evaluation_agent function, which wraps the get_evaluation() tool in an agent.
Now for the third and final subagent: the agent that writes the code sample
itself. This agent doesn’t have a tool associated with it, as mentioned before.
Instead, this agent accepts the grounding from the rag_agent, writes the code
sample, and then passes the sample on to the evaluation_agent.
def init_generation_agent() -> Agent:
    """Creates the agent that writes the code sample; it has no tools."""
    instruction = (
        "You are the Generation Agent. Your task is to write a code sample "
        "in Node.js."
    )
    description = "Generates code samples in Node.js"
    try:
        agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="generation_agent",
            instruction=instruction,
            description=description,
        )
    except Exception as e:
        # agent is unbound if construction fails, so avoid referencing it here.
        print(f"error: create generation agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")
        raise
    print(f"'{agent.name}' created using model '{agent.model}'.")
    return agent

The init_generation_agent function, which writes the code sample.
Finally, with the subagents defined, I can build the overall execution
mechanism, which requires a root agent, a Runner, and a SessionService (at
a bare minimum). How all these pieces fit together was a bit murky for me at
first; I’ll try to explain as much as I can.
The first piece is the call_agent_async() function, which I’ve
cribbed from Step 1 of the ADK multi-agent tutorial. The
function listens for events generated by the Runner.run_async()
method, which returns an AsyncGenerator object. This
listening loop prints out the events from the runner, eventually stopping when
the final response from the model is received.
from google.genai import types


async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")
    content = types.Content(role="user", parts=[types.Part(text=query)])
    final_response_text = "Agent did not produce a final response."  # Default

    # run_async() yields events as the agents work; loop until the final one.
    async for event in runner.run_async(
        user_id=user_id, session_id=session_id, new_message=content
    ):
        print(
            f"  [Event] Author: {event.author}, Type: {type(event).__name__}, "
            f"Final: {event.is_final_response()}, Content: {event.content}"
        )
        if event.is_final_response():
            if event.content and event.content.parts:
                final_response_text = event.content.parts[0].text
            elif event.actions and event.actions.escalate:
                final_response_text = (
                    f"Agent escalated: {event.error_message or 'No specific message.'}"
                )
            break  # Stop listening once the final response arrives.
    print(f"<<< Agent Response: {final_response_text}")

The call_agent_async function, which listens to events raised by the Runner.
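To see the listening loop on its own, without the ADK or a model call, here is a small sketch built on a fake event stream. Everything in it is a hypothetical stand-in: `FakeEvent` and `fake_run_async` only imitate the shape of the events that Runner.run_async() yields.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an ADK event."""
    author: str
    text: str
    final: bool = False

    def is_final_response(self) -> bool:
        return self.final


async def fake_run_async() -> AsyncGenerator[FakeEvent, None]:
    """Imitates a runner by yielding a fixed sequence of events."""
    yield FakeEvent("rag_agent", "fetched protos")
    yield FakeEvent("generation_agent", "wrote sample")
    yield FakeEvent("evaluation_agent", "score: 1.0", final=True)
    yield FakeEvent("never_reached", "ignored")  # The loop exits before this.


async def last_response() -> str:
    final_text = "No final response."
    # async for pulls events one at a time as the generator produces them.
    async for event in fake_run_async():
        if event.is_final_response():
            final_text = event.text
            break
    return final_text


result = asyncio.run(last_response())
print(result)
```

The break is what ends the loop: any events after the final response are never consumed.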
As I alluded to earlier, the top-level abstraction for executing a multi-agent
system in the ADK is the Runner object. This object
initiates the session with the LLM agents, yields responses from the agents,
and processes the results. When constructing the Runner object, you must
provide it with an orchestration / root agent and a SessionService object.
In the following code example, I’ve created the root agent as an instance of
the SequentialAgent class. The subagents I defined earlier are passed
as a list to the sub_agents field.
Figuring out the SessionService was a bit of a challenge. I had assumed that
I would pass a Session object to the runner, after creating one from the
SessionService. As it turns out, no: you create a session through the
SessionService, append the user’s query to that session as an event, and
then pass the SessionService itself to the Runner.
Finally, I’ve created the run() function that instantiates the agents,
creates the SequentialAgent, and then executes the runner. This function
is called by main, wrapped within a call to asyncio.run().
async def run(query):
    rag_agent = init_rag_agent()
    generation_agent = init_generation_agent()
    evaluation_agent = init_evaluation_agent()
    description = (
        "Executes a sequence of getting source grounding files, writing "
        "code samples, and evaluating the code samples."
    )
    # A SequentialAgent runs its sub-agents in list order, so the list must
    # read: ground, then generate, then evaluate.
    code_pipeline_agent = SequentialAgent(
        name="CodePipelineAgent",
        sub_agents=[rag_agent, generation_agent, evaluation_agent],
        description=description,
    )
    session_service = InMemorySessionService()
    # Use the same app name, user ID, and session ID here and in the Runner
    # calls below so that the runner can find this session.
    session = session_service.create_session(
        app_name=APP_NAME,
        user_id="CodeGeneratorUser",
        session_id="CodeGeneratorSession",
        state={"query": query},
    )
    session_service.append_event(
        session, Event(author="user", content={"parts": [{"text": query}]})
    )
    # Or use InMemoryRunner
    runner_agent = Runner(
        agent=code_pipeline_agent,
        app_name=APP_NAME,
        session_service=session_service,
    )
    await call_agent_async(
        query=query,
        runner=runner_agent,
        user_id="CodeGeneratorUser",
        session_id="CodeGeneratorSession",
    )

The run function, which is called by the main function.
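For completeness, here is a minimal sketch of how a main function can drive an async run() coroutine with asyncio.run(). The `run` below is a hypothetical stand-in that only echoes its input, and the default query string is made up for illustration; neither is taken from the original code.

```python
import asyncio


async def run(query: str) -> str:
    """Hypothetical stand-in for the real run() coroutine."""
    await asyncio.sleep(0)  # Placeholder for the awaited agent calls.
    return f"handled: {query}"


def main(query: str = "Write a code sample that accesses a secret version.") -> str:
    # asyncio.run() creates an event loop, runs the coroutine to completion,
    # and closes the loop again.
    return asyncio.run(run(query))


if __name__ == "__main__":
    print(main())
```

Keeping main synchronous means that `python agent.py` needs no event-loop setup beyond the single asyncio.run() call.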
With all of this put together, I’m ready to run the code and get a code
sample! A simple invocation of python agent.py produces the following code
sample:
/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'your-project-id';
// const secretName = 'your-secret-name';
// const versionId = 'your-secret-version';

// Imports the Secret Manager library
const { SecretManagerServiceClient } = require('@google-cloud/secret-manager');

// Instantiates a client
const client = new SecretManagerServiceClient();

async function accessSecretVersion() {
  const projectId = process.env.PROJECT_ID;
  const secretName = process.env.SECRET_NAME;
  const versionId = process.env.SECRET_VERSION || 'latest';
  const name = `projects/${projectId}/secrets/${secretName}/versions/${versionId}`;
  try {
    const [version] = await client.accessSecretVersion({
      name: name,
    });
    // Extract the payload as a string.
    const payload = version.payload.data.toString();
    // WARNING: Do not print the secret in a production environment - use proper logging
    // Any log that includes the secret data should be disabled
    console.info(`Payload: ${payload}`);
    return payload;
  } catch (error) {
    console.error(`Failed to access secret version: ${error}`);
    throw error;
  }
}

accessSecretVersion();

The accessSecretVersion() code sample produced by the ADK.
Right after the code sample, I get a helpful evaluation of the code.
**Evaluation:**
* **Completeness:** The code provides a complete function to access a secret from Secret Manager. It handles the necessary setup, retrieves the secret, and extracts the payload. It also includes error handling.
* **Correctness:** The code is functionally correct. It uses the Secret Manager client to access the specified secret version and retrieves the payload data. The added `try...catch` block helps ensure that errors during secret retrieval are caught and handled gracefully. The use of environment variables makes the code more portable.
* **Clarity:** The code is well-structured and includes comments explaining each step. The use of meaningful variable names enhances readability.
* **Security:** The code includes a crucial warning about not printing the secret in a production environment. This highlights the importance of secure logging practices.
* **Error Handling:** The code includes a `try...catch` block to handle potential errors during secret access, which is good practice.
* **Best Practices:** The code follows Node.js best practices for asynchronous operations using `async/await`. It also leverages environment variables for configuration, which is a recommended practice for portability and security.
Overall, the code sample is well-written, secure, and adheres to best practices.
All-in-all, this is a pretty good start!
I’m pretty happy with the results of this experiment. I was able to create a multi-agent system that generates a code sample. With that said, I have some observations:
I needed to call load_dotenv to load my environment variables from the .env
file where they were defined. Otherwise, my system was unable to find my
Google Cloud project’s ID.
At one point, I got a response from the SequentialAgent saying that it was a
simple RAG agent rather than a code generation agent. I checked the
sub_agents field when I instantiated the SequentialAgent; all of the correct
subagents had been included. On a whim, I decided to change the user query so
that it explicitly asks for each of the tools to be used. With that change, I
finally saw the output I wanted. To me, this indicates that there is still a
bit of non-determinism in this system.
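On the first observation, one way to surface a missing project ID early is to check for the required environment variables at startup, whether they come from load_dotenv or from the shell. The sketch below is an assumption about how that check could look; the `require_env` helper is hypothetical, and the variable name in the usage note is an example rather than a name confirmed by the original code.

```python
import os


def require_env(*names: str) -> dict[str, str]:
    """Returns the named environment variables, or raises if any are unset."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        # Fail fast with a message that names every missing variable.
        raise RuntimeError(
            f"Missing environment variables: {', '.join(missing)}"
        )
    return {name: os.environ[name] for name in names}
```

Calling something like `require_env("GOOGLE_CLOUD_PROJECT")` right after loading the .env file would turn a confusing downstream failure into an immediate, readable error.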