PydanticAI
Build an agent that can write and run code with PydanticAI
Overview
In this guide we’ll build an agent that can write and run code safely using PydanticAI and Riza.
PydanticAI is a Python agent framework that simplifies building LLM-powered applications. Among other conveniences, PydanticAI’s abstractions greatly simplify tool use (also known as function calling).
A common use case for function calling is running code written by an LLM. However, code that has not been reviewed should be treated as untrusted and run only in a safe, isolated environment.
Riza’s Code Interpreter provides a safe environment for executing untrusted code, which makes it a great fit for tool calling. With Riza you can execute arbitrary code in a sandboxed environment via a simple API call. For example, here’s Hello World with Riza:
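The original code listing isn’t reproduced here, but a minimal Hello World with Riza might look like the following, assuming Riza’s Python SDK (the `rizaio` package) and a `RIZA_API_KEY` set in your environment:

```python
from rizaio import Riza

# The client reads the RIZA_API_KEY environment variable by default.
client = Riza()

# Execute a snippet of Python in Riza's sandbox via a single API call.
# The language identifier and response shape follow Riza's exec API.
result = client.command.exec(
    language="PYTHON",
    code='print("Hello, World!")',
)

print(result.stdout)
```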
Let’s build an agent that can write and run code with PydanticAI and Riza.
The complete script
Below is the full script. In the rest of this guide we’ll explain each component.
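Since the original listing isn’t reproduced here, the following is a sketch of what the complete script might look like, assembled from the components described in the rest of this guide. The model name, system prompt wording, and helper names are assumptions:

```python
from pydantic_ai import Agent, ModelRetry
from rizaio import Riza

# Define the agent: a model plus a system prompt (model name is an assumption).
code_agent = Agent(
    "openai:gpt-4o",
    system_prompt=(
        "You are a coding assistant. When a question requires computation, "
        "write Python code and run it with the execute_code tool. "
        "Always print() the final result to stdout."
    ),
)


@code_agent.tool_plain
def execute_code(code: str) -> str:
    """Execute Python code in a sandbox and return its stdout output.

    The code must print its result to stdout.
    """
    print(f"Executing code:\n{code}")
    riza = Riza()
    result = riza.command.exec(
        language="PYTHON",
        code=code,
        http={"allow": [{"host": "*"}]},  # optionally allow HTTP requests
    )
    if result.exit_code != 0:
        # An error occurred; ask the model to fix it.
        raise ModelRetry(result.stderr)
    if result.stdout == "":
        raise ModelRetry(
            "The code ran without error but printed nothing to stdout. "
            "Include print statements so the result appears on stdout."
        )
    return result.stdout


def log_messages(result):
    # Write the full message history to a JSON file for debugging.
    with open("all_messages.json", "wb") as f:
        f.write(result.all_messages_json())


if __name__ == "__main__":
    result = code_agent.run_sync("Introduce yourself.")
    while True:
        print(result.data)
        log_messages(result)
        user_message = input("> ")
        result = code_agent.run_sync(
            user_message, message_history=result.all_messages()
        )
```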
Set up your environment and run the script
To run this script, first create and activate a virtual environment:
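For example, using Python’s built-in `venv` module:

```shell
python -m venv venv
source venv/bin/activate
```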
Install the `pydantic-ai` and `riza` Python libraries:
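Assuming Riza’s Python SDK is published on PyPI as `rizaio`:

```shell
pip install pydantic-ai rizaio
```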
Get API keys from both OpenAI and Riza.
- Get an OpenAI API key from the OpenAI Console.
- Get a Riza API key from the Riza Dashboard.
Set these API keys as environment variables in your terminal:
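The environment variable names below are the conventional ones each client library reads; substitute your own key values:

```shell
export OPENAI_API_KEY="your-openai-api-key"
export RIZA_API_KEY="your-riza-api-key"
```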
Copy and paste the above script into a file named `coding_agent.py` and run it:
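```shell
python coding_agent.py
```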
How the Script Works
There are four main components to this script:
- The PydanticAI agent
- The `execute_code` tool
- An optional logging function
- A loop to ask for user input, run the agent, and log messages
Define a PydanticAI Agent
You can define a PydanticAI agent with just a model and a system prompt:
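For instance (the model name and prompt wording here are assumptions, not the guide’s exact values):

```python
from pydantic_ai import Agent

# A minimal agent: just a model identifier and a system prompt.
code_agent = Agent(
    "openai:gpt-4o",
    system_prompt=(
        "You are a coding assistant. When a question requires computation, "
        "write Python code and run it with the execute_code tool."
    ),
)
```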
Agents can be much more complex, but a simple agent works for this guide. You can find the full PydanticAI Agent documentation here.
Add an execute_code() tool to the PydanticAI Agent
We can add function tools to our agent using decorators.
There are two decorators you can use to add a tool to an agent:
- `@agent.tool` is used for tools that need access to the agent context, such as the message history.
- `@agent.tool_plain` is used for tools that do not need access to the agent context.
Our `execute_code` tool only requires a single parameter, the code to execute, so we’ll use the `@code_agent.tool_plain` decorator:
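A sketch of the tool, assuming Riza’s `rizaio` SDK; the exact language identifier and retry messages are assumptions based on the description that follows:

```python
from pydantic_ai import ModelRetry
from rizaio import Riza


@code_agent.tool_plain
def execute_code(code: str) -> str:
    """Execute Python code in a sandbox and return its stdout output.

    The code must print its result to stdout.
    """
    # Log the code the agent wants to execute.
    print(f"Executing code:\n{code}")

    riza = Riza()
    result = riza.command.exec(
        language="PYTHON",
        code=code,
        http={"allow": [{"host": "*"}]},  # optionally allow HTTP requests
    )

    if result.exit_code != 0:
        # Execution failed; pass the error back so the model can fix it.
        raise ModelRetry(result.stderr)
    if result.stdout == "":
        # Ran cleanly but printed nothing; nudge the model to print.
        raise ModelRetry(
            "The code ran without error but printed nothing to stdout. "
            "Include print statements so the result appears on stdout."
        )
    return result.stdout
```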
We’ll add a docstring to help the LLM understand how and when to use the tool.
For logging, we’ll print the code that the agent wants to execute.
To run code, we create a Riza client, then call the `exec()` method with the code to execute. Optionally, we can pass in `http={"allow": [{"host": "*"}]}` to allow the code to make HTTP requests.
When code is executed on Riza, it returns a `result` object with three properties: `exit_code`, `stdout`, and `stderr`.
If `exit_code` is not `0`, an error occurred. We raise a `ModelRetry` and pass along the error, and PydanticAI will tell the model to try to fix it. (More on `ModelRetry` in the PydanticAI documentation.)
If `exit_code` is `0`, the execution did not throw an error. If `stdout` is empty, this often means the LLM generated code that did not `print()` to stdout, a common cause of silent failure when LLMs generate code. We raise a `ModelRetry` and pass along a message encouraging the LLM to include print statements in the code.
If `exit_code` is `0` and `stdout` is not empty, the code executed successfully and we return `stdout`.
Log Messages
We use a utility function to log the messages to an `all_messages.json` file. This is not necessary, but it is helpful for debugging and for understanding the agent’s behavior, which can be hidden behind PydanticAI’s abstractions.
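A minimal sketch of such a helper, assuming the result object exposes PydanticAI’s `all_messages_json()` method, which returns the message history as JSON bytes:

```python
def log_messages(result):
    # Serialize the run's full message history to a JSON file.
    with open("all_messages.json", "wb") as f:
        f.write(result.all_messages_json())
```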
Run the PydanticAI Agent
An agent’s `run_sync` method runs the agent and returns a `RunResult` object. Two of `RunResult`’s members are:

- `result.data`: the result of the run.
- `result.all_messages()`: the full message history.
We start our interaction by asking the agent to introduce itself, then enter a loop. In the loop we print `result.data` and ask the user for a new message. On subsequent runs, we call `run_sync` again with `message_history=result.all_messages()` to carry the multi-turn conversation between the agent and the user. After each run we log the messages to `all_messages.json`.
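A sketch of that loop, assuming the `code_agent` and `log_messages` names used elsewhere in this guide:

```python
# First run: ask the agent to introduce itself.
result = code_agent.run_sync("Introduce yourself.")

while True:
    print(result.data)
    log_messages(result)
    user_message = input("> ")
    # Pass the prior message history so the conversation is multi-turn.
    result = code_agent.run_sync(
        user_message, message_history=result.all_messages()
    )
```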
Example Run: What time is it?
A simple way to invoke the agent’s ability to write and execute code is to ask `what time is it?`. Some LLM providers inject the current date into the system prompt, but not the time. To accurately report the current time, the agent will need to run a Python script.
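The code the model writes will vary from run to run, but it might look something like this hypothetical example:

```python
from datetime import datetime

# Print the current time so it appears on stdout for the tool to return.
print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
```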
Notice the `model-structured-response` and `tool-return` messages. These are interactions between the model and its tools that PydanticAI handles automatically. You can see in the `model-structured-response` that the tool call is `execute_code` and the argument is code that uses the `datetime` module to get the current time.
The `tool-return` message contains the output of the tool call, which is passed back to the model; the model uses the result to generate a final response. Notice the difference between the format of the `tool-return` message and the final `model-text-response` message.
Example Run: What is 100 factorial?
While LLMs have gotten better at math, they struggle with large numbers. To calculate 100!, the agent needs to run a Python script.
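The generated code might be as simple as this hypothetical example, which uses the standard library’s `math.factorial`:

```python
import math

# 100! is a 158-digit integer, far larger than an LLM can
# reliably compute in-context, but trivial for Python.
print(math.factorial(100))
```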
Here you can see the tool call to Riza with Python code to evaluate 100!, the result from Riza, and the final response from the LLM.
Conclusion
The script in this guide demonstrates how to use PydanticAI’s agents with Riza’s Code Interpreter API to safely execute Python code generated by an LLM. By leveraging these tools you can create powerful applications that write and run code dynamically, while maintaining a secure execution environment.
Remember to always treat LLM-generated code as untrusted and use appropriate safeguards, such as Riza’s sandboxed environment, when executing it.