Groq provides a fast inference platform for openly available LLMs capable of code generation and tool use, such as Llama 3.1, Gemma 2, and Mixtral. Groq has also fine-tuned Llama 3 to create its own tool-use-specific models.

In this guide we’ll show you how to use Riza’s Code Interpreter API to safely execute Python generated by Groq’s tool-use models.

Getting started

To generate Python with Groq’s models you’ll need an API key from the Groq Console.

To execute Python using Riza’s Code Interpreter API you’ll need an API key from the Riza Dashboard.

Make these API keys available in your shell environment:

export GROQ_API_KEY="your_groq_api_key"
export RIZA_API_KEY="your_riza_api_key"

In this guide we’ll use the Groq and Riza Python API client libraries.
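
Both clients read these environment variables automatically when constructed with no arguments. If you’d rather pass keys explicitly, the Groq client accepts an api_key argument, and the rizaio client should accept the same (a minimal sketch; confirm against each library’s documentation):

import os
from groq import Groq
from rizaio import Riza

# Pass API keys explicitly instead of relying on the clients'
# automatic environment variable lookup.
groq = Groq(api_key=os.environ["GROQ_API_KEY"])
riza = Riza(api_key=os.environ["RIZA_API_KEY"])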

Python environment setup

Create a virtualenv and activate it:

python3 -m venv venv
source venv/bin/activate

Install the groq and rizaio packages with pip:

pip install groq rizaio

Generating and executing Python code

There are several ways to get Groq’s models to generate code, and we won’t cover all of them here. In particular, models are sensitive to the exact wording of prompts, and we make no claims about the prompts in this guide other than that they worked for us at the time of writing. Our goal is to provide a basic set of prompts to get you started.

At a high level there are two ways to get LLMs to generate runnable code: (1) via “tool use,” also known as “function calling,” and (2) via direct prompting. Tool use is a powerful concept that allows a model to independently “decide” to write code without being asked, and in general we recommend this method where appropriate. If you need more control over your prompt, or you’re using a model that doesn’t support tool use, use direct prompting.

For all methods we start with the following scaffolded code.py file:

import os
import sys
import json
from groq import Groq
from rizaio import Riza

groq = Groq()
riza = Riza()

model = "gemma2-9b-it"
system_message = ""
user_message = str(sys.argv[1])

response = ""

# TODO

print("response:")
print(response)

Connect a code interpreter via tool use (aka function calling)

The Groq API offers tool use for some of its models. Some model providers call this feature “function calling,” but the concept is the same. When you prompt the model, you can describe “tools” that may be relevant to achieving a user’s goal. The model can decide to “use” a tool by returning its name and input parameters to you, and you decide what to do with that information. Typically you’ll have written a function that you run with the provided input, and you then either show the result to your user or send it back to the model as part of another prompt.

Below we describe Riza’s Python interpreter to the model as a tool it can use, where the expected input is a string of Python code. Note that with this method we won’t need to mention Python or code generation in the system or user prompt messages, although you may find it useful to experiment.

We then implement a basic tool use control flow to execute Python with Riza when the model asks us to use the tool.

Add the following system prompt message to the scaffolded code.py:

system_message = "You are a helpful assistant."

Describe the tool

Next we describe the Riza Python interpreter as a tool that the LLM can use:

tools = [
    {
        "type": "function",
        "function": {
            "name": "exec_python",
            "description": "Execute Python to solve problems. Always print output to stdout.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "The Python code to execute.",
                    }
                },
                "required": ["code"],
            },
        },
    },
]

Make the initial request to the Groq API

We’ll pass the tools and tool_choice parameters in our initial request:

try:
    chat_response = groq.chat.completions.create(
        model = model,
        tools = tools,
        tool_choice = "auto",
        messages = [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    )
except Exception as err:
    print(f"error: groq request failed, {err}", file=sys.stderr)
    exit()

Note that "auto" is the default value for tool_choice, and signals to the model that it can choose whether or not to use a tool. Setting this to "required" will force the model to always use a tool, which gives you a similar level of control to the alternative direct prompting method described below.

We wrap this request in a simple try/except because the Groq API occasionally fails to handle tool use requests and raises groq.BadRequestError with a message similar to "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.".
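
If you’d rather not give up after a single failure, one option is to catch groq.BadRequestError specifically and retry a bounded number of times. A minimal sketch:

from groq import BadRequestError

chat_response = None
for attempt in range(3):
    try:
        chat_response = groq.chat.completions.create(
            model = model,
            tools = tools,
            tool_choice = "auto",
            messages = [
                {"role": "system", "content": system_message},
                {"role": "user", "content": user_message},
            ],
        )
        break
    except BadRequestError as err:
        # Transient tool use failures sometimes succeed on retry.
        print(f"warning: attempt {attempt + 1} failed, {err}", file=sys.stderr)

if chat_response is None:
    print("error: groq request failed after 3 attempts", file=sys.stderr)
    exit()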

Handle the Groq API response

If the model decides not to use the tool there’s nothing more to do:

if chat_response.choices[0].finish_reason != "tool_calls":
    response = chat_response.choices[0].message.content
    print("response:")
    print(response)
    exit()
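
Checking finish_reason works with the Groq API, but a slightly more defensive variant is to inspect the tool_calls attribute directly, which covers both a missing and an empty list of calls. A sketch:

# Defensive variant of the check above.
if not chat_response.choices[0].message.tool_calls:
    response = chat_response.choices[0].message.content
    print("response:")
    print(response)
    exit()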

When the model does decide to use the tool, we extract the code:

tool_call = chat_response.choices[0].message.tool_calls[0]
try:
    code = json.loads(tool_call.function.arguments)["code"]
except KeyError:
    print("error: tool call didn't include a required parameter", file=sys.stderr)
    exit()
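
Note that a model can return more than one tool call in a single response; for simplicity we handle only the first. If you want to handle all of them, a sketch like the following collects every code string the model sent:

# Gather code from every exec_python call, not just the first.
code_snippets = []
for call in chat_response.choices[0].message.tool_calls:
    if call.function.name != "exec_python":
        continue  # ignore any tool we didn't define
    try:
        code_snippets.append(json.loads(call.function.arguments)["code"])
    except (KeyError, json.JSONDecodeError):
        print("error: malformed tool call arguments", file=sys.stderr)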

Execute generated Python with Riza and handle the response

We send the extracted code as input to the Riza Code Interpreter API for execution:

riza_response = riza.command.exec(language="PYTHON", code=code)

exec_python_output = riza_response.stdout
if riza_response.exit_code > 0:
    exec_python_output = riza_response.stderr
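
Like the Groq call above, this request goes over the network and can fail. Wrapping it in the same style of try/except keeps the program from crashing on a transient error:

try:
    riza_response = riza.command.exec(language="PYTHON", code=code)
except Exception as err:
    print(f"error: riza request failed with message {err}", file=sys.stderr)
    exit()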

Send the follow-up request to the Groq API

After getting the output from Riza, we send it back to the Groq API along with the previous messages (note we also include the initial response message) in order to get a final result:

final_chat_response = groq.chat.completions.create(
    model = model,
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
        chat_response.choices[0].message,
        {"role": "tool", "name": "exec_python", "content": exec_python_output, "tool_call_id": tool_call.id}
    ],
)

response = final_chat_response.choices[0].message.content

The complete example

Here’s the final code.py after making all of the above additions:

import os
import sys
import json
from groq import Groq
from rizaio import Riza

groq = Groq()
riza = Riza()

model = "gemma2-9b-it"
system_message = "You are a helpful assistant."
user_message = str(sys.argv[1])

response = ""

tools = [
    {
        "type": "function",
        "function": {
            "name": "exec_python",
            "description": "Execute Python to solve problems. Always print output to stdout.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "The Python code to execute.",
                    }
                },
                "required": ["code"],
            },
        },
    },
]

try:
    chat_response = groq.chat.completions.create(
        model = model,
        tools = tools,
        tool_choice = "auto",
        messages = [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    )
except Exception as err:
    print(f"error: groq request failed with message {err}", file=sys.stderr)
    exit()

if chat_response.choices[0].finish_reason != "tool_calls":
    response = chat_response.choices[0].message.content
    print("response:")
    print(response)
    exit()

tool_call = chat_response.choices[0].message.tool_calls[0]
try:
    code = json.loads(tool_call.function.arguments)["code"]
except KeyError:
    print("error: tool call didn't include a required parameter", file=sys.stderr)
    exit()

print("making a request to Riza with the following code:")
print(code)

riza_response = riza.command.exec(language="PYTHON", code=code)

exec_python_output = riza_response.stdout
if riza_response.exit_code > 0:
    exec_python_output = riza_response.stderr

final_chat_response = groq.chat.completions.create(
    model = model,
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
        chat_response.choices[0].message,
        {"role": "tool", "name": "exec_python", "content": exec_python_output, "tool_call_id": tool_call.id}
    ],
)

response = final_chat_response.choices[0].message.content

print("response:")
print(response)

Run it

Try getting today’s date, or generating the first 50 Fibonacci numbers:

python code.py "What's today's date?"
python code.py "What are the first 50 Fibonacci numbers?"

In both cases the model will almost certainly decide to use the exec_python tool, supplying a valid Python script whose output gives it the information it needs to produce a useful response.

Asking for something that doesn’t require code won’t trigger tool use:

python code.py "Write a haiku about San Francisco."

If the model writes code that attempts to use the network (e.g. makes an HTTP request) execution will fail inside the Riza runtime environment by default. This is by design. See the Next steps section below for more information about configuring Riza to allow network access.

Connect a code interpreter via direct prompting

When tool use isn’t available, or if you’d prefer to be explicit in your prompt, you can directly prompt a Groq-hosted model to generate code, then extract the code from the model’s response and execute it via Riza.

Using an LLM for code execution this way is probably best suited to the parts of an application that don’t accept prompts directly from users, except in the narrow case where the user is expected to ask for code generation and immediate execution.

The Groq API supports returning structured output using JSON mode for most models, and we recommend this method to ensure reliable extraction of code from the model’s response.

Add the following system prompt message to the scaffolded code.py:

system_message = """You are a Python programmer that returns code within JSON objects.
All JSON objects must have the following schema: {"code": "The Python code to execute."}."""

Groq recommends explicitly asking for JSON and a specific “schema” within your system prompt.

Make the request to the Groq API using JSON mode

Set the response_format parameter with the value {"type": "json_object"} to enable JSON mode:

try:
    chat_response = groq.chat.completions.create(
        model = model,
        response_format = {"type": "json_object"},
        messages = [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    )
except Exception as err:
    print(f"error: groq request failed with message {err}", file=sys.stderr)
    exit()

Handle the Groq API response

Use the json module to parse the model’s response:

# setting the response_format as above ensures this string is valid JSON
json_content = chat_response.choices[0].message.content
try:
    code = json.loads(json_content)["code"]
except KeyError:
    print("error: json string didn't include a required parameter", file=sys.stderr)
    exit()
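
JSON mode has been reliable in our experience, but if you want to be defensive you can also catch json.JSONDecodeError so a malformed response degrades gracefully instead of raising:

try:
    code = json.loads(json_content)["code"]
except json.JSONDecodeError:
    print("error: model response wasn't valid JSON", file=sys.stderr)
    exit()
except KeyError:
    print("error: json string didn't include a required parameter", file=sys.stderr)
    exit()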

Execute generated Python with Riza and handle the response

Once we have LLM-generated Python we’ll send it to Riza for execution. This is much safer than a naive implementation relying on exec() or similar direct local execution.

riza_response = riza.command.exec(language="PYTHON", code=code)

if riza_response.exit_code > 0:
    raise RuntimeError(riza_response.stderr)

response = riza_response.stdout

Once we have the output from Riza we can do whatever we want with it. In this case we simply print the output and end the program.
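
If you wanted a natural-language answer rather than the raw script output, one option is to send the output back to the model in a follow-up request, much like the tool use flow above. A sketch (note we use a plain assistant system prompt here, not the code-generation one):

# Ask the model to turn the raw script output into a readable answer.
final_chat_response = groq.chat.completions.create(
    model = model,
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_message},
        {"role": "user", "content": f"A script produced this output:\n{response}\nUse it to answer the original question."},
    ],
)
response = final_chat_response.choices[0].message.content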

The complete example

Here’s the final code.py after making all of the above additions:

import os
import sys
import json
from groq import Groq
from rizaio import Riza

groq = Groq()
riza = Riza()

model = "gemma2-9b-it"
system_message = """You are a Python programmer that returns code within JSON objects.
All JSON objects must have the following schema: {"code": "The Python code to execute."}."""
user_message = str(sys.argv[1])

response = ""

try:
    chat_response = groq.chat.completions.create(
        model = model,
        response_format = {"type": "json_object"},
        messages = [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    )
except Exception as err:
    print(f"error: groq request failed with message {err}", file=sys.stderr)
    exit()

# setting the response_format as above ensures this string is valid JSON
json_content = chat_response.choices[0].message.content
try:
    code = json.loads(json_content)["code"]
except KeyError:
    print("error: json string didn't include a required parameter", file=sys.stderr)
    exit()

print("making a request to Riza with the following code:")
print(code)

riza_response = riza.command.exec(language="PYTHON", code=code)

if riza_response.exit_code > 0:
    raise RuntimeError(riza_response.stderr)

response = riza_response.stdout

print("response:")
print(response)

Run it

As with tool use, you can try getting today’s date or generating the first 50 Fibonacci numbers, but your prompts may need to be more explicit about writing code:

python code.py "Write a script to print today's date."
python code.py "Write a script to print the first 50 Fibonacci numbers."

In both cases the model will almost surely write a valid Python script that produces the required output for our program to print.

If the model writes code that attempts to use the network (e.g. makes an HTTP request) execution will fail inside the Riza runtime environment by default. This is by design. See the Next steps section below for more information about configuring Riza to allow network access.

Next steps

If you vary the user prompt and ask a wide variety of questions, you’ll notice that the model often tries to make HTTP requests within the Python code it writes. By default, Riza’s isolated Python runtime environment doesn’t allow access to network I/O. Read about how to allow network access here.
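
At the time of writing, Riza supports per-request HTTP allow lists. The parameter shape below reflects our understanding of that feature and should be treated as a sketch; confirm the exact name and structure against Riza’s current documentation:

# Assumed shape of a per-request HTTP allow list; "api.example.com" is
# a placeholder host. Verify this against Riza's docs before relying on it.
riza_response = riza.command.exec(
    language="PYTHON",
    code=code,
    http={"allow": [{"host": "api.example.com"}]},
)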