In this guide, we’ll show you how to build an AI agent that extracts dynamic data from a website, analyzes key changes in the data, and generates a relevant chart to accompany the analysis.

We’ll use the following technologies:

  • LangGraph to orchestrate the agent
  • Browserbase to scrape the website
  • An LLM to write code that extracts and transforms the data from the website, and code that creates a data visualization
  • Riza to safely execute the LLM-written code

Why use Riza?

In general, LLMs are good at writing code, but they can’t execute the code they write.

A common use case for Riza is to safely execute code written by LLMs.

For example, you can ask an LLM to write code to analyze specific data, to generate graphs, or to extract data from a website or document. The code written by the LLM is “untrusted” and might contain harmful side-effects. You can protect your systems by executing that code on Riza instead of in your production environment.
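For example, here’s a minimal sketch of running untrusted, LLM-written code on Riza with the rizaio Python client (the hardcoded code string below is a stand-in for LLM output):

from rizaio import Riza

client = Riza()  # reads RIZA_API_KEY from the environment

# Pretend this string came back from an LLM. We never exec() it locally;
# it runs in Riza's isolated sandbox instead.
untrusted_code = 'print("hello from an isolated sandbox")'

result = client.command.exec(language="python", code=untrusted_code)
print(result.stdout)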

Scenario: Analyze price changes & produce report with chart

In this demo, we’ll build an agent that monitors changes in gas prices in different states across the U.S., and generates a report.

There are a few challenges in our scenario:

  • The data updates regularly and is time-consuming to analyze by hand.
  • The data we want lives on a website that does not have an API.
  • We want to compare new data against previous data, highlight notable changes, and generate an appropriate chart that goes along with the dynamic analysis. That means we don’t know what kind of chart we want to create each time we generate the report.

Solution: Data analyst AI agent

To solve these challenges, we’ll build an AI agent that can:

  1. Navigate to the website and extract the HTML.
  2. Extract and transform the data in the HTML into a CSV.
  3. Detect changes in the data.
  4. Perform data analysis only if there are changes, producing a written summary and a visual chart.

Example code and data

Get the full code and data for this example from our GitHub repository.

Here’s an example analysis and chart generated by the agent:

🔔 Change summary for https://gasprices.aaa.com/state-gas-price-averages/:
# Gas Price Update Analysis

## National Overview
The data shows significant variation in gas prices across the U.S. states, with California having the highest regular gas prices at $4.779 per gallon and Mississippi having the lowest at $2.650 per gallon.

## Price Range Analysis
- **Regular gas**: $2.650 (Mississippi) to $4.779 (California) - a difference of $2.129
- **Midgrade gas**: $3.069 (Mississippi) to $4.977 (California) - a difference of $1.908
- **Premium gas**: $3.426 (Mississippi) to $5.162 (California) - a difference of $1.736
- **Diesel**: $3.115 (Texas) to $5.248 (Hawaii) - a difference of $2.133

## Regional Patterns
- **West Coast states** have the highest prices overall:
  - California: $4.779 (regular)
  - Hawaii: $4.491 (regular), $5.248 (diesel - highest in nation)
  - Washington: $4.258 (regular)
  - Oregon: $3.898 (regular)

- **Southern states** generally have the lowest prices:
  - Mississippi: $2.650 (regular)
  - Louisiana: $2.707 (regular)
  - Tennessee: $2.725 (regular)
  - Alabama: $2.739 (regular)
  - Texas: $2.742 (regular), $3.115 (diesel - lowest in nation)

## Price Differentials
- The average price difference between regular and premium gas is approximately $0.90 per gallon
- Diesel prices are generally higher than regular gas prices in most states, with an average difference of about $0.40 per gallon
- The Northeast region shows some of the largest spreads between regular and premium gas prices

## Notable Outliers
- California's regular gas price ($4.779) is 80% higher than Mississippi's ($2.650)
- Hawaii has the highest diesel price ($5.248), which is 68% higher than Texas's diesel price ($3.115)
- The District of Columbia has one of the largest spreads between regular and premium gas at $1.004 per gallon

This data reflects significant regional economic differences and varying state tax policies on fuel.

📈 View the accompanying chart: <path to local file>

Prerequisites

Before we start, you’ll need:

  • An Anthropic API key
  • A Browserbase API key and project ID
  • A Riza API key

You can adapt this guide to use another LLM provider. There is no special reason we chose Anthropic for this use case.

Step 1: Set environment variables & install dependencies

In your project root, create a .env file with the following variables:

.env
ANTHROPIC_API_KEY=your_anthropic_api_key
BROWSERBASE_API_KEY=your_browserbase_api_key
BROWSERBASE_PROJECT_ID=your_project_id
RIZA_API_KEY=your_riza_api_key
RIZA_RUNTIME_REVISION_ID=we_will_fill_this_in_step_5_of_this_guide

Then install required dependencies:

  1. Run uv init
  2. Run uv add browserbase langchain langchain-anthropic langgraph playwright python-dotenv rizaio

Step 2: Define AI agent state

Let’s first create a class that will hold the state used throughout our LangGraph workflow. Create a state.py file that defines a TrackerState class:

state.py
from typing_extensions import TypedDict, NotRequired

class TrackerState(TypedDict):
    url: str
    storage_folder_path: str
    current_html: NotRequired[str]
    current_csv: NotRequired[str]
    previous_csv: NotRequired[str]
    diff: NotRequired[str]
    summary: NotRequired[str]
    chart_path: NotRequired[str]

This state object will be passed from step to step in our workflow. The required fields are the ones we must provide at the start of the workflow: url, the site to scrape, and storage_folder_path, the local folder where we want to store our scraped data and saved charts. The other fields will be populated by the steps in the workflow.
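For example, here’s a toy sketch of the pattern our real nodes will follow: a node receives the current state and returns an updated copy with its own fields filled in.

from state import TrackerState

initial_state: TrackerState = {
    "url": "https://gasprices.aaa.com/state-gas-price-averages/",
    "storage_folder_path": "./.output",
}

def example_node(state: TrackerState) -> TrackerState:
    # Copy the state and add the field this node is responsible for.
    return {**state, "current_html": "<table>...</table>"}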

Step 3: Define LangGraph workflow

Let’s create graph.py, where we will define our LangGraph workflow. Here, we’ll define the steps (aka “nodes”) we want in the workflow, and the order of those steps.

Note that we haven’t implemented these nodes yet; we’ll do that next.

graph.py
from langgraph.graph import StateGraph, START, END
from state import TrackerState
from nodes.scrape_prices import scrape_prices_node
from nodes.extract_price_data import extract_price_data_node
from nodes.check_if_changed import check_if_changed_node
from nodes.summarize_change import summarize_change_node
from nodes.create_chart import create_chart_node
from nodes.store_and_notify import store_notify_node

builder = StateGraph(TrackerState)

builder.add_node("ScrapePrices", scrape_prices_node)
builder.add_node("ExtractPriceData", extract_price_data_node)
builder.add_node("CheckIfChanged", check_if_changed_node)
builder.add_node("SummarizeChange", summarize_change_node)
builder.add_node("CreateChart", create_chart_node)
builder.add_node("StoreAndNotify", store_notify_node)

builder.add_edge(START, "ScrapePrices")
builder.add_edge("ScrapePrices", "ExtractPriceData")
builder.add_edge("ExtractPriceData", "CheckIfChanged")
builder.add_conditional_edges(
    "CheckIfChanged",
    lambda state: "changed" if state["diff"] else "no_change",
    {
        "changed": "SummarizeChange",
        "no_change": END
    }
)
builder.add_edge("SummarizeChange", "CreateChart")
builder.add_edge("CreateChart", "StoreAndNotify")
builder.add_edge("StoreAndNotify", END)

graph = builder.compile()

This sets up our workflow with:

  • Six core nodes that perform different tasks
  • A conditional edge that only processes diffs if changes are detected
  • A logical flow from scraping to notifying
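Optionally, you can sanity-check the wiring by printing the compiled graph as a Mermaid diagram (LangGraph exposes this via get_graph(); the exact output varies by version):

print(graph.get_graph().draw_mermaid())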

Now, let’s implement each node.

Step 4: Scrape gas prices with Browserbase

Let’s create nodes/scrape_prices.py. This node will scrape HTML from our gas price website (this page from AAA) using Browserbase. Browserbase provides a managed browser that can navigate to the website and extract the table of gas prices.

nodes/scrape_prices.py
import os
from dotenv import load_dotenv
from browserbase import Browserbase
from playwright.sync_api import sync_playwright, Playwright
from state import TrackerState

load_dotenv()
bb = Browserbase(api_key=os.environ["BROWSERBASE_API_KEY"])


def _run(playwright: Playwright, url: str) -> str:
    session = bb.sessions.create(project_id=os.environ["BROWSERBASE_PROJECT_ID"])

    chromium = playwright.chromium
    browser = chromium.connect_over_cdp(session.connect_url)
    context = browser.contexts[0]
    page = context.pages[0]

    table_html = ""

    try:
        page.goto(url)
        print(page.title())

        # Wait for the table to be visible
        page.wait_for_selector("table")

        # Get the table element
        table_element = page.query_selector("table")

        # Extract the HTML content of the table
        table_html = table_element.inner_html()

    finally:
        page.close()
        browser.close()
        print(f"Session complete! View replay at https://browserbase.com/sessions/{session.id}")

    return table_html


def scrape_prices_node(state: TrackerState) -> TrackerState:
    url = state["url"]
    table_html = ""
    with sync_playwright() as playwright:
        table_html = _run(playwright, url)
    return {**state, "current_html": table_html}

This node sets the current_html field in our state to the newly-extracted HTML.

Step 5: Transform HTML to CSV with LLM + Riza

Let’s create nodes/extract_price_data.py. This node will extract the data from our newly-scraped HTML, and transform it into a CSV. The CSV format is more compact and easier to manipulate for data analysis.

Why LLM + Riza

While we could handwrite code to extract and transform the data from the HTML, that code could break if the gas price website renames the headers in its HTML table, or changes the HTML structure in some other way.

To make this step more resilient to website design changes, we’ll instead prompt an LLM to generate extraction code tailored to the actual HTML we just scraped, and then run that code on Riza.
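For contrast, a hardcoded extractor might look like the sketch below (it assumes a simple five-column table layout, which the real AAA page may not match). Any change to the column order or markup silently breaks it:

from bs4 import BeautifulSoup

def extract_prices_hardcoded(table_html: str) -> list[dict]:
    # Brittle: assumes a fixed column order of state, regular, midgrade,
    # premium, diesel, and exactly five <td> cells per row.
    soup = BeautifulSoup(table_html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == 5:
            headers = ["state", "regular", "midgrade", "premium", "diesel"]
            rows.append(dict(zip(headers, cells)))
    return rows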

Step 5a: Create a Riza custom runtime with beautifulsoup4 & plotly

Since we want the LLM to write code to extract data from HTML, let’s allow it to use beautifulsoup4, a popular Python library for parsing HTML. To make beautifulsoup4 available on Riza, we’ll create a custom runtime.

Later on (in Step 8), we’ll also want to run LLM-written code to generate a chart. So let’s also add plotly, a popular Python charting library, to our custom runtime.

Follow these steps:

  1. In the Riza dashboard, select Custom Runtimes.
  2. Click Create runtime.
  3. In the runtime creation form, provide the following values:
     • Language: Python
     • requirements.txt: beautifulsoup4 and plotly (each on its own line)
  4. Click Create runtime.
  5. Wait for the Status of your runtime revision to become “Succeeded”.
  6. Copy the ID of your runtime revision (not the runtime) and set it as the RIZA_RUNTIME_REVISION_ID in the .env file you created in Step 1.
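Optionally, you can smoke-test the new runtime with a trivial exec_func call that just imports both libraries (this is the same Riza API we’ll use in the next step; note that beautifulsoup4 imports as bs4):

import os
from dotenv import load_dotenv
from rizaio import Riza

load_dotenv()
client = Riza(api_key=os.getenv("RIZA_API_KEY"))

# Run a no-op function on the custom runtime to confirm both libraries resolve.
result = client.command.exec_func(
    language="python",
    runtime_revision_id=os.getenv("RIZA_RUNTIME_REVISION_ID"),
    code='def execute(input):\n    import bs4, plotly\n    return {"ok": True}',
    input={},
)
print(result.output)  # expect: {'ok': True}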

Step 5b: Implement the node

Now let’s implement the logic that prompts an LLM to write code to transform the HTML to a CSV, and then runs that code on Riza.

The code is below. Note how we:

  1. Create a helper function, _run_code(), that calls the Riza Execute Function API and uses our custom runtime.
  2. Define a prompt that asks the LLM to generate code that can be run by the Riza Execute Function API.
  3. Pass the current_html stored in our LangGraph state both to the LLM and to the Riza function call.
  4. Update the LangGraph state to store the extracted CSV (under the field current_csv).
nodes/extract_price_data.py
import os
from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain.prompts import PromptTemplate
from rizaio import Riza
from state import TrackerState

load_dotenv()

riza_client = Riza(api_key=os.getenv("RIZA_API_KEY"))

RIZA_RUNTIME_REVISION_ID = os.getenv("RIZA_RUNTIME_REVISION_ID")

llm = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    temperature=0,
    anthropic_api_key=os.getenv("ANTHROPIC_API_KEY")
)
prompt = PromptTemplate.from_template(
    """
    You are given an HTML table of gas prices in different US states. Write a Python function that extracts the data, and returns it as a CSV with the following headings:
    - state
    - regular_price
    - midgrade_price
    - premium_price
    - diesel_price

    IMPORTANT: Only output the final code, without any explanation. Do NOT put the code in a codeblock.

    Here are the rules for writing the Python function:
    - The function should return an object that has 1 field: "csv". The "csv" data should be the CSV content as a string.
    - Use only the Python standard library and built-in modules. In addition, you can use `beautifulsoup4`.
    - The function signature must be:

    def execute(input):

    `input` is a Python object.
    The HTML table is available as text at `input["html_table"]`.

    Here is the html_table:

    {html_table}

    """
)

extractor = prompt | llm


def _run_code(code, input_data):
    print("Running code on Riza...")
    result = riza_client.command.exec_func(
        language="python",
        runtime_revision_id=RIZA_RUNTIME_REVISION_ID,
        input=input_data,
        code=code,
    )
    if result.execution.exit_code != 0:
        print("Code did not execute successfully. Error:")
        print(result.execution.stderr)
    elif result.output_status != "valid":
        print("Unsuccessful output status:")
        print(result.output_status)
    return result.output


def extract_price_data_node(state: TrackerState) -> TrackerState:
    response = extractor.invoke({
        "html_table": state["current_html"],
    })

    python_code = response.content
    print("Python code: ")
    print(python_code)

    input_data = {
        "html_table": state["current_html"],
    }
    output = _run_code(python_code, input_data)
    print("Output of running the code:")
    print(output)

    return {**state, "current_csv": output["csv"]}
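If everything works, the current_csv stored in state should look roughly like this (the prices shown here are taken from the sample analysis above; diesel values elided):

state,regular_price,midgrade_price,premium_price,diesel_price
California,4.779,4.977,5.162,...
Mississippi,2.650,3.069,3.426,...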

Step 6: Check if gas prices changed

Now, let’s create nodes/check_if_changed.py. This node will determine if the price data has changed. (Recall that in our LangGraph workflow, we specify that if the price data has not changed, the workflow should immediately end.)

Since this diffing logic is not the focus of our demo, we provide just a basic implementation that uses the built-in difflib Python library.

utils/diff.py
import difflib

def get_diff(old, new):
    if not old:
        return new
    d = difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm=""
    )
    return "\n".join(d)
utils/storage.py
import os
import hashlib

def _get_csv_path(url, folder_path) -> str:
    hashed = hashlib.md5(url.encode()).hexdigest()
    return f"{folder_path}/{hashed}.csv"

def load_previous_csv(url, folder_path) -> str:
    path = _get_csv_path(url, folder_path)
    if os.path.exists(path):
        with open(path, "r") as f:
            return f.read()
    return ""

def save_current_csv(url, content, folder_path) -> str:
    os.makedirs(folder_path, exist_ok=True)
    path = _get_csv_path(url, folder_path)
    with open(path, "w") as f:
        f.write(content)
    return path

# Later, we'll add more functions to storage.py
nodes/check_if_changed.py
from utils.storage import load_previous_csv
from utils.diff import get_diff
from state import TrackerState

def check_if_changed_node(state: TrackerState) -> TrackerState:
    url = state["url"]
    current = state["current_csv"]
    storage_folder_path = state["storage_folder_path"]
    previous = load_previous_csv(url, storage_folder_path)
    diff = get_diff(previous, current)
    return {**state, "previous_csv": previous, "diff": diff}
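To make the diff concrete, here’s what get_diff returns for a one-cell price change (illustrative input):

from utils.diff import get_diff

old = "state,regular_price\nTexas,2.742"
new = "state,regular_price\nTexas,2.750"
print(get_diff(old, new))
# Produces something like:
# ---
# +++
# @@ -1,2 +1,2 @@
#  state,regular_price
# -Texas,2.742
# +Texas,2.750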

Step 7: Summarize changes with LLM

Next, let’s create nodes/summarize_change.py. As we’ve specified in our LangGraph workflow, this node (and the following nodes) will only run if the previous step detected a change in gas prices. The Summarize Change node will prompt an LLM to act as a data analyst and highlight notable changes in the data.

In our demo, we’ll use a single LLM call to do this analysis, because the analysis for this data will be fairly simple. (For more complex data analyses, we’d likely want to turn this node into an agent that can write code and run the code on Riza, too.)

nodes/summarize_change.py
import os
from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain.prompts import PromptTemplate
from state import TrackerState

load_dotenv()
llm = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    temperature=0,
    anthropic_api_key=os.getenv("ANTHROPIC_API_KEY")
)
prompt = PromptTemplate.from_template(
    """
    You're an analyst tracking changes in gas prices.
    You have received the following updates to gas prices across different U.S. states.
    Summarize the important updates in this diff. Include a numerical analysis of notable changes:

    {diff}
    """
)
summarizer = prompt | llm


def summarize_change_node(state: TrackerState) -> TrackerState:
    summary = summarizer.invoke({"diff": state["diff"]})
    return {**state, "summary": summary.content}

Step 8: Create relevant chart with LLM + Riza

Now, let’s create nodes/create_chart.py. This node will generate a chart that’s relevant to the analysis that the LLM produced in the previous step.

Why LLM + Riza

Since we don’t know what the analysis will be each time, we can’t implement this node with code that generates a pre-defined type of chart. Instead, the agent needs to make a judgment call about the type of chart to generate, and then create that chart.

This is a good example of a pattern that’s solved by using an LLM + Riza. We prompt the LLM to make the judgment call about what kind of chart to generate based on the analysis. We then ask the LLM to write code to generate that chart, and run that code on Riza.

Implement the node

Now let’s implement the logic that prompts an LLM to write code to generate an appropriate chart, and then runs that code on Riza.

To make it easier to write the code, let’s allow the LLM to use plotly, a popular Python charting library. Recall that in Step 5, we already created a Riza custom runtime that includes plotly. We’ll reuse that custom runtime here.

The code is below. Note how we:

  1. Create a helper function, _run_code(), that calls the Riza Execute Function API and uses our custom runtime.
  2. Define a prompt that asks the LLM to generate code that can be run by the Riza Execute Function API.
  3. Pass the summary, previous_csv, and current_csv stored in our LangGraph state to the LLM, and the previous_csv and current_csv to the Riza function call.
  4. Save the chart image to a local file.
  5. Update the LangGraph state to store the location of the chart (under the field chart_path).
nodes/create_chart.py
import os
from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain.prompts import PromptTemplate
from rizaio import Riza
from state import TrackerState
from utils.storage import save_image

load_dotenv()

riza_client = Riza(api_key=os.getenv("RIZA_API_KEY"))

RIZA_RUNTIME_REVISION_ID = os.getenv("RIZA_RUNTIME_REVISION_ID")

llm = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    temperature=0,
    anthropic_api_key=os.getenv("ANTHROPIC_API_KEY")
)
prompt = PromptTemplate.from_template(
    """
    You're an analyst tracking changes in gas prices across different U.S. states.

    I want to create a chart / graph / infographic that illustrates the changes described in the "Summary" below. I've also provided the CSV datasets of the gas price data from yesterday, and gas price data from today. The "Summary" was written based on this data.

    Please figure out what chart / graph / infographic to create, then write a Python function to create this graph. Keep it simple.

    IMPORTANT: Only output the final code, without any explanation. Do NOT put the code in a codeblock.

    Here are the rules for writing the Python function:
    - The function should generate a chart and return the chart as a base64-encoded PNG image.
    - The function should return an object that has 1 field: "image". The "image" data should be the chart as a base64-encoded PNG image.
    - Use only the Python standard library and built-in modules. In addition, you can use `plotly`.
    - The function signature must be:

    def execute(input):

    `input` is a Python object.
    Yesterday's CSV data is available as text at `input["yesterday"]`.
    Today's CSV data is available as text at `input["today"]`.


    Here is the gas price summary, and raw data of gas prices yesterday and today:


    == Summary ==
    {summary}


    == Previous data (yesterday) ==
    {previous_data}


    == New data (today) ==
    {current_data}

    """
)

grapher = prompt | llm


def _run_code(code, input_data):
    print("Running code on Riza...")
    result = riza_client.command.exec_func(
        language="python",
        runtime_revision_id=RIZA_RUNTIME_REVISION_ID,
        input=input_data,
        code=code,
    )
    if result.execution.exit_code != 0:
        print("Code did not execute successfully. Error:")
        print(result.execution.stderr)
    elif result.output_status != "valid":
        print("Unsuccessful output status:")
        print(result.output_status)
    return result.output


def create_chart_node(state: TrackerState) -> TrackerState:
    response = grapher.invoke({
        "summary": state["summary"],
        "previous_data": state["previous_csv"],
        "current_data": state["current_csv"],
    })

    python_code = response.content
    print("Python code: ")
    print(python_code)

    input_data = {
        "yesterday": state["previous_csv"],
        "today": state["current_csv"],
    }
    output = _run_code(python_code, input_data)
    print("Output of running the code:")
    print(output)

    image_path = save_image(state["url"], output["image"], state["storage_folder_path"])

    return {**state, "chart_path": image_path}

To support the final step, where we save the image to a local file, let’s add the following functions to utils/storage.py:

utils/storage.py
import base64

# ... previous code

def _get_image_path(url, folder_path) -> str:
    hashed = hashlib.md5(url.encode()).hexdigest()
    return f"{folder_path}/{hashed}.png"

def save_image(url, base64_encoded_image, folder_path) -> str:
    os.makedirs(folder_path, exist_ok=True)
    path = _get_image_path(url, folder_path)
    with open(path, "wb") as image_file:
        image_file.write(base64.b64decode(base64_encoded_image))
    return path

Step 9: Store data and notify user

Let’s implement our last step: nodes/store_and_notify.py. This node will store the new CSV data (replacing the previous CSV data), and “notify” the user. Since this logic is not the focus of our demo, we provide just a basic implementation that saves the data to a local file, and prints the output to the console.

nodes/store_and_notify.py
from state import TrackerState
from utils.storage import save_current_csv

def store_notify_node(state: TrackerState) -> TrackerState:
    url = state["url"]
    save_current_csv(url, state["current_csv"], state["storage_folder_path"])

    # For now, just print the output. Later could send email or Slack.
    print(f"\n🔔 Change summary for {url}:")
    print(state["summary"])
    print(f"\n📈 View the accompanying chart: {state["chart_path"]}")
    return state

Step 10: Create main.py

Finally, let’s create a main.py file that will kick off this workflow.

We import the LangGraph graph, and kick it off with the two pieces of state required at the start of the workflow: the URL of the gas price site, and the path to a local folder that you want to use to store the output files.

main.py
import os
from graph import graph

# Store all output in a local `.output` folder.
script_directory = os.path.dirname(os.path.abspath(__file__))
STORAGE_FOLDER_PATH = os.path.join(script_directory, ".output")

GAS_PRICE_SITE = "https://gasprices.aaa.com/state-gas-price-averages/"

if __name__ == "__main__":
    graph.invoke({
        "url": GAS_PRICE_SITE,
        "storage_folder_path": STORAGE_FOLDER_PATH,
    })

This agent is now complete. You can run the agent using uv run main.py.

Summary: Benefits of Riza in AI agents

Integrating Riza’s code interpreter with LangGraph lets you build an AI agent that dynamically operates on the specific data it encounters. Here are the two concrete examples we saw in this demo.

Resilient data extraction and transformation

Using Riza, we made our data extraction and transformation (Step 5) resilient to changes in the website’s HTML structure and header names. Instead of hardcoding the extract + transform logic, we dynamically generate and run code that does the extract + transform on the specific HTML the agent has just encountered.

Create custom charts for dynamic reports

Using Riza, we made it possible to generate any appropriate chart to accompany a dynamic data analysis (Step 8). We built an agent that provides relevant insights, along with the most relevant visual chart, without resorting to complex logic that maps each type of analysis to a pre-defined chart type and chart-creation code.

Instead, without us having to write much code, the system can:

  • Analyze price data to identify trends
  • Choose an appropriate visualization type (bar chart, map, etc.), and what data points to include (which may be a subset of the full data)
  • Generate and execute custom code to produce the chart

Next steps