Overview
In this guide we’ll build an agent that can write and run code safely using PydanticAI and Riza. PydanticAI is a Python agent framework that simplifies building LLM-powered applications. Among other conveniences, PydanticAI’s abstractions greatly simplify tool use (function calling). A common use case for function calling is to run code written by an LLM. However, code that has not been reviewed should be treated as untrusted and only be run in a safe, isolated environment. Riza’s Code Interpreter provides a safe environment for executing untrusted code, which makes it a great fit for tool calling. With Riza you can execute arbitrary code in a sandboxed environment via a simple API call. For example, here’s Hello World with Riza:
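(A minimal sketch, assuming the rizaio Python SDK and a RIZA_API_KEY environment variable.)

```python
from rizaio import Riza

client = Riza()  # reads RIZA_API_KEY from the environment
resp = client.command.exec(
    language="python",
    code='print("Hello, world!")',
)
print(resp.stdout)  # Hello, world!
```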
The complete script
Below is the full script, code_agent.py. In the rest of this guide we’ll explain each component.
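(A sketch of the complete script, assuming pydantic-ai’s Agent, ModelRetry, and run_sync APIs and the rizaio SDK; the model name, system prompt wording, and the log_messages helper name are illustrative assumptions.)

```python
# code_agent.py
from pydantic_ai import Agent, ModelRetry
from rizaio import Riza

riza = Riza()  # reads RIZA_API_KEY from the environment

code_agent = Agent(
    "openai:gpt-4o",
    system_prompt=(
        "Solve the user's request by writing Python code and executing it "
        "with the execute_code tool. Always print() the final result."
    ),
)

@code_agent.tool_plain
def execute_code(code: str) -> str:
    """Execute Python code in Riza's sandbox and return its stdout."""
    result = riza.command.exec(
        language="python",
        code=code,
        http={"allow": [{"host": "*"}]},  # optional: allow outbound HTTP requests
    )
    if result.exit_code != 0:
        # Pass the error back so the model can try to fix its code
        raise ModelRetry(result.stderr)
    if result.stdout == "":
        raise ModelRetry(
            "The code ran but printed nothing. Include print() statements "
            "so the result appears on stdout."
        )
    return result.stdout

def log_messages(result) -> None:
    """Dump the run's full message history to all_messages.json for debugging."""
    with open("all_messages.json", "wb") as f:
        f.write(result.all_messages_json())

if __name__ == "__main__":
    result = code_agent.run_sync(input("> "))
    while True:
        print(result.data)
        log_messages(result)
        result = code_agent.run_sync(
            input("> "), message_history=result.all_messages()
        )
```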
Set up your environment and run the script
To run this script, first create and activate a virtual environment:
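```bash
python -m venv .venv
source .venv/bin/activate
```

Then install the pydantic-ai and rizaio Python libraries:

```bash
pip install pydantic-ai rizaio
```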
- Get an OpenAI API key from the OpenAI Console.
- Get a Riza API key from the Riza Dashboard.
Export both keys as environment variables, then save the script as code_agent.py and run it:
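```bash
# Variable names assume the default OpenAI and Riza SDK configuration
export OPENAI_API_KEY="sk-..."
export RIZA_API_KEY="..."
python code_agent.py
```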
How the Script Works
There are four main components to this script:
- The PydanticAI agent
- The execute_code tool
- An optional logging function
- A loop to ask for user input, run the agent, and log messages
Define a PydanticAI Agent
You can define a PydanticAI agent with just a model and a system prompt:
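(A sketch; the model name and prompt wording are assumptions.)

```python
from pydantic_ai import Agent

code_agent = Agent(
    "openai:gpt-4o",
    system_prompt=(
        "Solve the user's request by writing Python code and executing it "
        "with the execute_code tool. Always print() the final result."
    ),
)
```

Add an execute_code() tool to the PydanticAI Agent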
We can add function tools to our agent using decorators. There are two decorators you can use to add a tool to an agent:
- @agent.tool is used for tools that need access to the agent context, such as the message history.
- @agent.tool_plain is used for tools that do not need access to the agent context.
Our execute_code tool only requires a single parameter, the code to execute, so we’ll use the @code_agent.tool_plain decorator:
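(A sketch assuming the rizaio client; the wording of the retry messages is illustrative.)

```python
from pydantic_ai import ModelRetry
from rizaio import Riza

riza = Riza()

@code_agent.tool_plain
def execute_code(code: str) -> str:
    """Execute Python code in Riza's sandbox and return its stdout."""
    result = riza.command.exec(
        language="python",
        code=code,
        http={"allow": [{"host": "*"}]},  # optional: allow outbound HTTP requests
    )
    if result.exit_code != 0:
        # Pass the error back so the model can try to fix its code
        raise ModelRetry(result.stderr)
    if result.stdout == "":
        raise ModelRetry(
            "The code ran but printed nothing. Include print() statements "
            "so the result appears on stdout."
        )
    return result.stdout
```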
Inside the tool we call the Riza client’s exec() method with the code to execute. Optionally, we can pass in http={"allow": [{"host": "*"}]} to allow the code to make HTTP requests.
The call returns a result object with three properties: exit_code, stdout, and stderr.
If exit_code is not 0, that means an error occurred. We raise a ModelRetry and pass along the error. PydanticAI will tell the model to try to fix the error. (More on ModelRetry)
If exit_code is 0, the execution did not throw an error. But if stdout is empty, the LLM likely generated code that did not print() its result to stdout, a common cause of silent failures. We raise a ModelRetry and pass along a message encouraging the LLM to include print statements in its code.
If exit_code is 0 and stdout is not empty, the code executed successfully and we return stdout.
Log Messages
We use a utility function to log the messages to an all_messages.json file. This is not necessary, but it is helpful for debugging and understanding the agent’s behavior, which can otherwise be hidden behind PydanticAI’s abstractions.
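(A sketch; the helper name is an assumption, and all_messages_json() is pydantic-ai’s JSON serialization of the message history.)

```python
def log_messages(result) -> None:
    """Dump the run's full message history to all_messages.json."""
    with open("all_messages.json", "wb") as f:
        f.write(result.all_messages_json())
```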
Run the PydanticAI Agent
An agent’s run_sync method runs the agent and returns a RunResult object. Two useful parts of RunResult are:
- result.data: The result of the run.
- result.all_messages(): The full message history.
On the first run, we print result.data and ask the user for a new message.
On subsequent runs, we call run_sync again with message_history=result.all_messages() to track a multi-step conversation between the agent and the user.
After each run we log the messages to all_messages.json.
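Putting it together, the loop might look like this (a sketch; the prompt strings and the log_messages helper from above are assumptions):

```python
if __name__ == "__main__":
    result = code_agent.run_sync(input("> "))
    while True:
        print(result.data)
        log_messages(result)
        result = code_agent.run_sync(
            input("> "), message_history=result.all_messages()
        )
```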
Example Run: What time is it?
A simple way to exercise the agent’s ability to write and execute code is to ask: "what time is it?". Some LLM providers inject the current date into the system prompt, but not the time. To accurately report the current time, the agent will need to run a Python script.
In the logged messages you’ll see model-structured-response and tool-return messages. These are interactions between the model and its tools that PydanticAI handles automatically. You can see in the model-structured-response that the tool call is execute_code and the argument is code that uses the datetime module to get the current time.
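The generated code might look something like this (illustrative; the exact code varies from run to run):

```python
from datetime import datetime

print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
```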
The tool-return message contains the output of the tool call, which is passed back to the model to generate a final response. Notice the difference in format between the tool-return message and the final model-text-response message.