Building a code generation system with ADK

2025-06-06

Recently, Google released the Agent Development Kit (ADK), a new framework to help developers create “agentic” generative AI applications quickly. In a previous blog post, I worked through the quickstart to learn the basics of ADK. In this blog post, I attempt to answer one question: can the ADK help me create Google Cloud code samples?

You can see the results of my efforts here in the code_sample_generation/ directory. Read on to see how I did it!

Plan my agentic code generation system

Very broadly, I want to create a multi-agent system that uses tools to compose a Google Cloud code sample. I intend to use a RAG technique to fetch protocol buffer files before generating the sample; that’s one tool I need to create. Then, I want whatever code the system generates to be evaluated using an LLM-as-a-judge technique; that’s a second tool. The overall system will be orchestrated by a root agent and executed by a Runner.

With that in mind, I’ll write a little mini-spec here to help guide my work.

  • Objective: Write a Node.js code sample that gets a secret from Google Cloud Secret Manager.
  • Components:
    1. Tool #1. Download the Secret Manager proto files from GitHub. These files will be used as grounding for the LLM when it writes the code samples. Getting the proto files will be my first tool. The real-world version of this tool would fetch the files from a GitHub repo; however, I’m going to just return a canned response for now.
    2. Tool #2. Evaluate the quality of the code sample. If the quality is low, then return to the generation step. This evaluation step will be my second tool. Like the first tool, I’m going to return a canned response for now.
    3. Generation Agent. I could wrap the generation task in a tool and have that executed by an agent; however, I could also just create an agent to write the code sample. Thus the third subagent in my system will generate a new Node.js code sample given the user prompt and the grounding context.
    4. Agents. Create agents for each tool. Reading the docs, it looks like the pattern for this type of application is to have a root agent that handles orchestration and a series of subagents.
    5. Orchestration: The root agent will be executed in a Runner, which manages events and session data. The root agent itself will need to be an instance of either SequentialAgent or LoopAgent. I’m going to use a SequentialAgent for now, to simply test the pipeline of subagents.
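
Before touching the ADK, the control flow in this mini-spec can be sketched in plain Python. Every function body below is a hypothetical stub standing in for the real tool or agent, and the 0.8 threshold and retry count are arbitrary choices of mine:

```python
# Plain-Python sketch of the planned pipeline; all bodies are stubs.

def fetch_protos() -> str:
    # Tool #1: grounding (canned response for now).
    return 'syntax = "proto3"; // ...'

def generate_sample(query: str, grounding: str) -> str:
    # Generation agent (stub): would call the LLM with query + grounding.
    return f"// Node.js sample for: {query}"

def evaluate_sample(sample: str) -> dict:
    # Tool #2: LLM-as-a-judge (canned response for now).
    return {"score": 1.0, "explanation": "No notes."}

def run_pipeline(query: str, threshold: float = 0.8, max_tries: int = 3) -> str:
    grounding = fetch_protos()
    sample = generate_sample(query, grounding)
    for _ in range(max_tries - 1):
        if evaluate_sample(sample)["score"] >= threshold:
            break
        # Low score: return to the generation step.
        sample = generate_sample(query, grounding)
    return sample
```

The ADK version replaces each stub with an agent or tool, but the shape of the loop is the same.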

With these steps in mind, let’s see how far I can get.

Create the RAG agent

First, I will build the RAG agent & tool. I customized the code from the quickstart to better encapsulate the features of the tool and agent.

For the tool, I’m just going to create a simple function that opens a local copy of the Secret Manager proto file(s). In a future version of this application, the get_protos function will fetch the proto files from the GitHub repo where they reside.

def get_protos(tool_context: ToolContext) -> str:
    # Read the local copy of the proto file; a future version will fetch
    # this from GitHub instead.
    proto_path = pathlib.Path(__file__).parent / "resources" / "secretmanager.proto"
    return proto_path.read_text()

Figure 1. The get_protos function, which returns the contents of a local proto file.
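
For reference, the future GitHub-fetching version could look something like this. This is just a sketch using urllib, and the exact raw-file URL the real version would use is an assumption on my part:

```python
import urllib.request

def fetch_proto(url: str) -> str:
    # Hypothetical future version of get_protos: fetch the proto source
    # from a URL (e.g. a raw.githubusercontent.com link) instead of disk.
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

A real implementation would also want a timeout and some retry logic before I trusted it in the pipeline.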

Next I will create an agent that uses this function. I pass the get_protos function in the tools field when I create the agent.

def init_rag_agent() -> Agent:
    instruction = textwrap.dedent("""
            You are the retrieval-augmented grounding agent.
            Your task is to download protocol buffer files from GitHub.
            Use the 'get_protos' tool to get the files.
            If the user provides files to use, make sure to pass them to the
            tool. Do not engage in any other conversation or tasks.""")
    description = (
        "Downloads protocol buffer files from GitHub using the 'get_protos' "
        "tool."
    )
    try:
        rag_agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="rag_agent",
            instruction=instruction,
            description=description,
            tools=[get_protos],
        )
        print(
            f"'{rag_agent.name}' created using model '{rag_agent.model}'."
        )
        return rag_agent
    except Exception as e:
        print(f"error: create RAG agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")

Figure 2. The init_rag_agent function, which wraps the get_protos() tool in an agent.

Following my ML-engineer inclinations, I want to test these agents one at a time. The ADK has testing documentation, but the directions assume that I would be testing these agents manually. I would prefer to write some small unit tests instead, tests that verify the behavior of each agent individually. I’ll need to put a pin in that for another blog post.
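
For what it’s worth, the kind of unit test I have in mind looks something like this. Here, get_protos_from is a hypothetical refactor of get_protos that takes the file path as a parameter, so the test can point it at a fixture rather than the real resources/ directory:

```python
import pathlib
import tempfile

def get_protos_from(path: pathlib.Path) -> str:
    # Hypothetical refactor of get_protos with the path injected, so tests
    # don't depend on the real resources/ directory.
    return path.read_text()

def test_get_protos_returns_proto_source():
    with tempfile.TemporaryDirectory() as tmp:
        fixture = pathlib.Path(tmp) / "secretmanager.proto"
        fixture.write_text('syntax = "proto3";\n')
        assert get_protos_from(fixture).startswith('syntax = "proto3";')
```

Testing the agent wrapping the tool, rather than the tool itself, is the part the ADK doesn’t make obvious yet.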

Create the evaluation agent

Like the RAG tool, I’m going to fake a response from an evaluation microservice. I could, of course, just have this agent do the evaluation; I’ll probably experiment with that in a future update to this code.

def get_evaluation(code_sample: str, tool_context: ToolContext) -> str:
    # Canned response; a future version will call a real evaluation service.
    return json.dumps(
        {
            "score": 1.0,
            "explanation": "This code sample is great! No notes.",
        }
    )

Figure 3. The get_evaluation function, which returns a canned response from an evaluation microservice.

While I was creating this agent, I forgot to update the instruction so that it described evaluation rather than RAG (a bad copy/paste). The first couple of runs of this system failed with the root agent saying that it couldn’t do evaluations. It wasn’t until I fixed the typo in the instructions that I began to see results.

def init_evaluation_agent():
    instruction = textwrap.dedent("""
        You are the Evaluation Agent. Your task is to decide how
        good a code sample is. Use the 'get_evaluation' tool when a code sample
        has been generated.""")
    description = (
        "Evaluates generated code samples using the 'get_evaluation' tool."
    )
    try:
        agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="evaluation_agent",
            instruction=instruction,
            description=description,
            tools=[get_evaluation],
        )
        print(f"'{agent.name}' created using model '{agent.model}'.")
        return agent
    except Exception as e:
        print(f"error: create evaluation agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")

Figure 4. The init_evaluation_agent function, which wraps the get_evaluation() tool in an agent.
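
Down the line, when the root agent becomes a LoopAgent, the decision to regenerate could key off the tool’s JSON payload. A minimal sketch of that check (the 0.8 threshold is an arbitrary choice of mine):

```python
import json

QUALITY_THRESHOLD = 0.8  # arbitrary cutoff for this sketch

def needs_regeneration(evaluation_json: str) -> bool:
    # Parse the payload from get_evaluation and decide whether the
    # pipeline should loop back to the generation step.
    evaluation = json.loads(evaluation_json)
    return evaluation["score"] < QUALITY_THRESHOLD
```

With the canned score of 1.0, needs_regeneration always returns False, which is exactly why a SequentialAgent is enough for this first pass.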

Create the code generation agent

Now for the third and final subagent: the agent that writes the code sample itself. This agent doesn’t have a tool associated with it, as mentioned before. Instead, this agent accepts the grounding from the rag_agent, writes the code sample, and then passes the sample on to the evaluation_agent.

def init_generation_agent():
    instruction = textwrap.dedent("""You are the Generation Agent. Your task is
                                  to write a code sample in Node.js.""")
    description = "Generates code samples in Node.js."
    try:
        agent = Agent(
            model=MODEL_GEMINI_2_0_FLASH,
            name="generation_agent",
            instruction=instruction,
            description=description,
        )
        print(f"'{agent.name}' created using model '{agent.model}'.")
        return agent
    except Exception as e:
        print(f"error: create generation agent with ({MODEL_GEMINI_2_0_FLASH}): {e}")

Figure 5. The init_generation_agent function, which writes the code sample.

Create a runner

Finally, with the subagents defined, I can build the overall execution mechanism, which requires a root agent, a Runner, and a SessionService (at a bare minimum). How all these pieces fit together was a bit murky for me at first; I’ll try to explain as much as I can.

The first piece is the call_agent_async() function, which I’ve cribbed from Step 1 of the ADK multi-agent tutorial. The function listens for events generated by the Runner.run_async() method, which returns an AsyncGenerator object. This listening loop prints out the events from the runner, eventually stopping when the final response from the model is received.

async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")

    content = types.Content(role="user", parts=[types.Part(text=query)])

    final_response_text = "Agent did not produce a final response."  # Default

    async for event in runner.run_async(
        user_id=user_id, session_id=session_id, new_message=content
    ):
        print(
            f"  [Event] Author: {event.author}, "
            f"Type: {type(event).__name__}, "
            f"Final: {event.is_final_response()}, "
            f"Content: {event.content}"
        )

        if event.is_final_response():
            if event.content and event.content.parts:
                final_response_text = event.content.parts[0].text
            elif (
                event.actions and event.actions.escalate
            ):
                final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
            break

    print(f"<<< Agent Response: {final_response_text}")

Figure 6. The call_agent_async function, which listens to events raised by the Runner.

As I alluded to earlier, the top-level abstraction for executing a multi-agent system in the ADK is the Runner object. This object initiates the session with the LLM agents, yields responses from the agents, and processes the results. When constructing the Runner object, you must provide it with an orchestration / root agent and a SessionService object.

In the following code example, I’ve created the root agent as an instance of the SequentialAgent class. Each of the subagents I defined earlier is passed in a list to the sub_agents field.

Figuring out the SessionService was a bit of a challenge. I had assumed that I would pass a Session object to the runner after creating one from the SessionService. As it turns out, no: you create a session, append the initial query to it as an event via the SessionService, and then pass the SessionService to the Runner.

Finally, I’ve created the run() function that instantiates the agents, creates the SequentialAgent, and then executes the runner. This function is called by main, wrapped within a call to asyncio.run().

async def run(query):

    rag_agent = init_rag_agent()
    evaluation_agent = init_evaluation_agent()
    generation_agent = init_generation_agent()

    description = textwrap.dedent("""Executes a sequence of getting source
    grounding files, writing code samples, and evaluating the code samples.""")
    code_pipeline_agent = SequentialAgent(
        name="CodePipelineAgent",
        # Order matters: grounding, then generation, then evaluation.
        sub_agents=[rag_agent, generation_agent, evaluation_agent],
        description=description,
    )

    session_service = InMemorySessionService()
    session = session_service.create_session(
        app_name=APP_NAME,
        user_id="CodeGeneratorUser",
        session_id="CodeGeneratorSession",
        state={"query": query},
    )
    session_service.append_event(
        session, Event(author="user", content={"parts": [{"text": query}]})
    )
    # Or use InMemoryRunner
    runner_agent = Runner(
        agent=code_pipeline_agent,
        app_name=APP_NAME,
        session_service=session_service,
    )

    await call_agent_async(
        query=query,
        runner=runner_agent,
        user_id="CodeGeneratorUser",
        session_id="CodeGeneratorSession",
    )

Figure 8. The run function, which is called by the main function.

Run the code!

With all of this put together, I’m ready to run the code and get a code sample! A simple invocation of python agent.py produces the following code sample:

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'your-project-id';
// const secretName = 'your-secret-name';
// const versionId = 'your-secret-version';

// Imports the Secret Manager library
const { SecretManagerServiceClient } = require('@google-cloud/secret-manager');

// Instantiates a client
const client = new SecretManagerServiceClient();

async function accessSecretVersion() {
  const projectId = process.env.PROJECT_ID;
  const secretName = process.env.SECRET_NAME;
  const versionId = process.env.SECRET_VERSION || 'latest';


  const name = `projects/${projectId}/secrets/${secretName}/versions/${versionId}`;

  try {
    const [version] = await client.accessSecretVersion({
      name: name,
    });

    // Extract the payload as a string.
    const payload = version.payload.data.toString();

    // WARNING: Do not print the secret in a production environment - use proper logging
    // Any log that includes the secret data should be disabled
    console.info(`Payload: ${payload}`);
    return payload;

  } catch (error) {
    console.error(`Failed to access secret version: ${error}`);
    throw error;
  }
}

accessSecretVersion();

Figure 9. The accessSecretVersion() code sample produced by the ADK.

Right after the code sample, I get a helpful evaluation of the code.

**Evaluation:**

*   **Completeness:** The code provides a complete function to access a secret from Secret Manager. It handles the necessary setup, retrieves the secret, and extracts the payload. It also includes error handling.
*   **Correctness:**  The code is functionally correct. It uses the Secret Manager client to access the specified secret version and retrieves the payload data. The added `try...catch` block helps ensure that errors during secret retrieval are caught and handled gracefully. The use of environment variables makes the code more portable.
*   **Clarity:** The code is well-structured and includes comments explaining each step. The use of meaningful variable names enhances readability.
*   **Security:** The code includes a crucial warning about not printing the secret in a production environment. This highlights the importance of secure logging practices.
*   **Error Handling:** The code includes a `try...catch` block to handle potential errors during secret access, which is good practice.
*   **Best Practices:** The code follows Node.js best practices for asynchronous operations using `async/await`. It also leverages environment variables for configuration, which is a recommended practice for portability and security.

Overall, the code sample is well-written, secure, and adheres to best practices.

Figure 10. An LLM-evaluation of the code sample produced by the code generation system.

All-in-all, this is a pretty good start!

Final thoughts

I’m pretty happy with the results of this experiment. I was able to create a multi-agent system that generates a code sample. With that said, I have some observations:

  • It took me a couple of tries to realize that I needed to call load_dotenv to load my environment variables from the .env file where they were defined. Otherwise, my system was unable to find my Google Cloud project’s ID.
  • I would like to create some unit tests for each of the agents. I’m a big believer in TDD, so I’d like to be able to test the agents individually. I wish there was a better out-of-the-box technique for testing these agents.
  • System instructions and user prompts matter! Another hiccup I encountered was that I got errors from the SequentialAgent saying that it was a simple RAG agent rather than a code generation agent. I checked the sub_agents field when I instantiated the SequentialAgent; all of the correct subagents had been included. On a whim, I decided to update the user query so that it explicitly asks for each of the tools to be used. With that change, I finally saw the output I wanted. To me, this indicates that there is still a bit of non-determinism in this system.
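
As an aside on that first point: load_dotenv comes from the python-dotenv package. If you’d rather not take the dependency, a minimal stand-in for the simple KEY=VALUE case is easy to sketch (this handles far less than the real library, which also supports quoting and interpolation):

```python
import os
import pathlib

def load_env_file(path: str = ".env") -> None:
    # Minimal stand-in for python-dotenv's load_dotenv(): handles plain
    # KEY=VALUE lines only; skips blanks and comments; never overwrites
    # variables already set in the environment.
    env_file = pathlib.Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

In practice I’ll stick with python-dotenv, but writing the stand-in made the failure mode obvious: without that call, nothing in .env ever reaches os.environ.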