Data Analysis

In this guide, we’ll show you how to use Riza to generate charts and graphs from custom data. We’ll prompt an LLM to write the code to analyze the data and create a data visualization, and execute that code using Riza.

Why use Riza?

In general, LLMs are good at writing code, but they can’t execute the code they write.

A common use case for Riza is to safely execute code written by LLMs.

For example, you can ask an LLM to write code to analyze specific data, to generate graphs, or to extract data from a website or document. The code written by the LLM is “untrusted” and might contain harmful side-effects. You can protect your systems by executing that code on Riza instead of in your production environment.

For a more in-depth data analysis example, see our guide on building a data analyst AI agent with LangGraph, Browserbase, and Riza.

Scenario: Understand different datasets quickly

Many government websites provide public datasets. For example, the city of San Francisco provides the annual salary of each city employee.

These datasets have different data models, and manually analyzing each one is time-consuming. To spot trends faster, we can use an LLM to generate the analysis code for a given dataset.

Solution: Automatically compute stats and plot charts

We’ll build a script that automatically computes statistics and plots a chart for a given dataset. In this script, we’ll prompt an LLM to generate code to produce a chart, and we’ll safely execute that code using Riza.

Example code and data

Get the full code and data for this example in our GitHub.

The data we’ve prepared is a subset of the full San Francisco employee salary data. This subset is anonymized and includes only individuals working in Fire Services.

Here’s an example chart generated by our script:

Before you begin, sign up for Riza and Anthropic API access. We use Anthropic here, but nothing in this guide depends on it, and it’s straightforward to adapt the implementation to another LLM provider.

Step 1: Read in data from CSV

First, we’ll read in the data from our CSV.

INPUT_CSV_FILEPATH = "/path/to/salary_data.csv"

def read_file(filepath):
    with open(filepath, "r", encoding="utf-8") as file:
        content = file.read()
    return content

def main():
    full_rows = read_file(INPUT_CSV_FILEPATH)

Step 2: Generate data analysis code with LLM

In this step, we’ll pass a few lines of the CSV we just read to Anthropic, and ask it to generate custom code to calculate some statistics and produce a chart.

First, install the Anthropic SDK:

pip install anthropic

Import and initialize the Anthropic client. Note that there are multiple ways to set your API key; use one of them:

import anthropic

# Option 1: Pass your API key directly:
anthropic_client = anthropic.Anthropic(api_key="YOUR_API_KEY")

# Option 2: Set the `ANTHROPIC_API_KEY` environment variable
anthropic_client = anthropic.Anthropic() # Will use `ANTHROPIC_API_KEY`

We’ll now add a generate_code() function, along with a prompt for the LLM:

PROMPT = """
You are given a dataset of the yearly salary of SF city employees (who work in Fire Services) over many years.

Write a Python function that calculates the minimum, maximum, mean, median, and mode of the salaries per year, and plots them. The function should generate a chart and return the chart as a base64-encoded PNG image.

The function signature is:

def execute(input):

`input` is a Python object. The full data is available as text at `input["data"]`. The data is in CSV format.

Here are the rules for writing code:
- The function should return an object that has 1 field: "image". The "image" data should be the chart as a base64-encoded PNG image.
- Use only the Python standard library and built-in modules. In addition, you can use `pandas`, `matplotlib`, and `seaborn`.

Finally, here is an excerpt of the CSV data:

{}
"""

def generate_code(csv_sample):
    message = anthropic_client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=2048,
        system="You are an expert programmer. When given a programming task, " +
           "you will only output the final code, without any explanation. " +
           "Do NOT put the code in a codeblock.",
        messages=[
            {
                "role": "user",
                "content": PROMPT.format(csv_sample),
            }
        ]
    )
    code = message.content[0].text
    # Uncomment these lines to see generated code
    # print("GENERATED CODE: ")
    # print(code)
    return code

Finally, we’ll call generate_code(csv_sample) in main(). We’ll only send a few rows of our CSV data to the LLM, because that’s all it needs to understand the shape of the data:

import itertools

def first_n_lines(text, n):
    return "\n".join(itertools.islice(text.splitlines(), n))

def main():
    full_rows = read_file(INPUT_CSV_FILEPATH)

    first_rows = first_n_lines(full_rows, 10)
    python_code = generate_code(first_rows)
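Even when the system prompt asks for no code block, a model may occasionally wrap its answer in a Markdown fence, which would then fail as Python. The following is a minimal sketch of one way to guard against that. `strip_code_fences` is a hypothetical helper, not part of the original script or of any SDK.

```python
import re

# Hypothetical helper (an assumption, not part of the original script).
# Matches one surrounding Markdown fence, with an optional language tag,
# and captures the code between the opening and closing fence.
_FENCE_RE = re.compile(r"^```[\w+-]*[ \t]*\n(.*?)\n?```\s*$", re.DOTALL)

def strip_code_fences(text):
    """Return the code inside a surrounding fence, or the text unchanged."""
    stripped = text.strip()
    match = _FENCE_RE.match(stripped)
    if match:
        return match.group(1)
    return stripped
```

One way to use it is to pass the model's text through the helper before returning it from generate_code(), for example `code = strip_code_fences(message.content[0].text)`.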

Key components of the prompt

Note that in our prompt above, we explicitly ask the LLM to do a few things:

- Write Python code. We plan to execute this code in a Python runtime on Riza.
- Write a function that reads data from an object and returns an object. We plan to use Riza’s Execute Function API to run this code. The Execute Function API lets us pass in an input object and receive an output object.
- Use the Python standard library, plus pandas, matplotlib, and seaborn. We’re asking the LLM to write code to analyze and visualize data, so we want it to be able to use these popular libraries. By default, Riza provides access to standard libraries. To use additional libraries, you can create a custom runtime. We’ll do that in the next step.
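To make the function contract concrete, here is a simplified stand-in with the same execute(input) signature that reads CSV text from `input["data"]`. It is only a sketch: it uses the standard library alone and returns the per-year statistics as plain data under a "stats" field instead of the "image" field the prompt asks for. The column names `year` and `salary` are assumptions; the real dataset's headers may differ, and the code the LLM writes will look different.

```python
import csv
import io
import statistics
from collections import defaultdict

# Assumed column names (hypothetical); adjust to the real CSV headers.
YEAR_COLUMN = "year"
SALARY_COLUMN = "salary"

def execute(input):
    """Simplified stand-in: returns per-year stats instead of a chart."""
    # The full CSV arrives as text, so wrap it in a file-like object.
    rows = csv.DictReader(io.StringIO(input["data"]))

    # Group salaries by year.
    salaries_by_year = defaultdict(list)
    for row in rows:
        salaries_by_year[row[YEAR_COLUMN]].append(float(row[SALARY_COLUMN]))

    # Compute the five statistics named in the prompt for each year.
    stats = {}
    for year, salaries in sorted(salaries_by_year.items()):
        stats[year] = {
            "min": min(salaries),
            "max": max(salaries),
            "mean": statistics.mean(salaries),
            "median": statistics.median(salaries),
            "mode": statistics.mode(salaries),
        }
    return {"stats": stats}
```

The important part is the shape: one object in, one object out. The generated code follows the same pattern but renders a chart and returns it as a base64-encoded PNG.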

Step 3: Execute the code on Riza

Now that we have LLM-generated code, we’re ready to run it on Riza and finish our script.

Step 3a. Create custom runtime

As we mentioned above, we allowed the LLM to use pandas, matplotlib, and seaborn in its analysis code. To make these libraries available on Riza, we’ll create a custom runtime.

Follow these steps:

1. In the Riza dashboard, select Custom Runtimes.
2. Click Create runtime.
3. In the runtime creation form, provide the following values:

   Field             Value
   Language          Python
   requirements.txt  pandas
                     matplotlib
                     seaborn

4. Click Create runtime.
5. Wait for the Status of your runtime revision to become “Succeeded”.
6. Copy the ID of your runtime revision (not the runtime) to use in the next step.

Step 3b. Call the Riza API

Now, let’s add the final pieces of code to finish our script.

First, install the Riza API client library:

pip install rizaio

Import and initialize the Riza client. Note that there are multiple ways to set your API key:

from rizaio import Riza

# Option 1: Pass your API key directly:
riza_client = Riza(api_key="your Riza API key")

# Option 2: Set the `RIZA_API_KEY` environment variable
riza_client = Riza() # Will use `RIZA_API_KEY`
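If you rely on environment variables for both clients, it can help to check that they are set before making any API calls. The sketch below assumes the two variable names used in this guide. `missing_api_keys` is a hypothetical helper, not part of either SDK.

```python
import os

# Hypothetical helper (an assumption, not part of either SDK).
# Returns the names of required variables that are unset or empty.
def missing_api_keys(env=None, required=("ANTHROPIC_API_KEY", "RIZA_API_KEY")):
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

For example, calling `missing_api_keys()` at the top of main() and exiting with a clear message when the list is non-empty avoids a less obvious authentication error later in the script.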

Let’s add a function, run_code(), that calls the Riza Execute Function API and uses our custom runtime. Make sure to fill in your own runtime revision ID:

RUNTIME_REVISION_ID="your_runtime_revision_id"

# ... other functions ...

def run_code(code, input_data):
    print("Running code on Riza...")
    result = riza_client.command.exec_func(
        language="python",
        runtime_revision_id=RUNTIME_REVISION_ID,
        input=input_data,
        code=code,
    )
    # Stop on failure; otherwise main() would try to read a missing output.
    if result.execution.exit_code != 0:
        raise RuntimeError(
            "Code did not execute successfully. Error:\n" + result.execution.stderr
        )
    if result.output_status != "valid":
        raise RuntimeError(f"Unsuccessful output status: {result.output_status}")
    return result.output
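LLM-generated code does not always run correctly on the first try, so one option is to regenerate and rerun it a few times before giving up. The following is a minimal sketch of that idea. `generate_and_run` is a hypothetical wrapper, and `generate` and `run` are callables you supply. It treats both an exception and an output without an "image" field as a failed attempt.

```python
# Hypothetical wrapper (an assumption, not part of the Riza or Anthropic SDKs).
# `generate` returns code as a string; `run` executes it and returns the output.
def generate_and_run(generate, run, max_attempts=3):
    last_error = None
    for attempt in range(1, max_attempts + 1):
        code = generate()
        try:
            output = run(code)
        except Exception as error:
            # Execution failed outright; remember why and try fresh code.
            last_error = error
            continue
        if isinstance(output, dict) and "image" in output:
            return output
        # Execution finished but did not produce the expected field.
        last_error = ValueError(f"attempt {attempt}: output has no 'image' field")
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error
```

With the functions from this guide, a call might look like `generate_and_run(lambda: generate_code(first_rows), lambda code: run_code(code, input_data))`. Each attempt costs one LLM call and one execution, so keep `max_attempts` small.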

Finally, we’ll update our main() function to run the generated code, and save the resulting image:

import anthropic
from rizaio import Riza
import itertools
import base64

OUTPUT_GRAPH_FILEPATH = "/path/to/local_image.png"

# ... other functions ...

def save_image_to_file(base64_encoded_image, filepath):
    with open(filepath, "wb") as image_file:
        image_file.write(base64.b64decode(base64_encoded_image))

def main():
    full_rows = read_file(INPUT_CSV_FILEPATH)

    first_rows = first_n_lines(full_rows, 10)
    python_code = generate_code(first_rows)

    input_data = {
        "data": full_rows,
    }
    output = run_code(python_code, input_data)
    save_image_to_file(output["image"], OUTPUT_GRAPH_FILEPATH)

if __name__ == "__main__":
    main()

The script is now complete. Run it to produce a chart of salary statistics per year.
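Because the image comes from generated code, it can be worth confirming that the returned value really is a PNG before writing it to disk. The sketch below decodes the base64 string and checks the standard 8-byte PNG signature. `decode_png` is a hypothetical helper, not part of the original script.

```python
import base64
import binascii

# Every PNG file starts with these 8 bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Hypothetical helper (an assumption, not part of the original script).
def decode_png(base64_encoded_image):
    """Decode a base64 string and verify that the result looks like a PNG."""
    try:
        image_bytes = base64.b64decode(base64_encoded_image)
    except (binascii.Error, ValueError, TypeError) as error:
        raise ValueError("image field is not valid base64") from error
    if not image_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return image_bytes
```

One way to use it is to call `decode_png(output["image"])` in main() and write the returned bytes to the output file. The signature check only catches obviously wrong output; it does not prove the chart itself is correct.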

Next steps

Get the full code for this example in our GitHub.
Try out the API.
Learn how to use the Riza API with tool use APIs from OpenAI, Anthropic and Google.
Check out the roadmap to see what we’re working on next.
