OpenAI Assistants API: Building Stateful AI Agents with Code Interpreter and File Search

Introduction: OpenAI’s Assistants API, launched at DevDay 2023, represents a significant evolution in how developers build AI-powered applications. Unlike the stateless Chat Completions API, Assistants provides a managed, stateful runtime for building sophisticated AI agents with built-in tools like Code Interpreter and File Search. The API handles conversation threading, file management, and tool execution, allowing developers to focus on application logic rather than infrastructure. This guide covers everything from basic assistant creation to advanced patterns with function calling and RAG.

[Figure: OpenAI Assistants API architecture, a stateful AI agent runtime]

Capabilities and Features

The Assistants API provides powerful capabilities for building AI agents:

  • Stateful Conversations: Managed threads that persist conversation history automatically
  • Code Interpreter: Sandboxed Python environment for data analysis, visualization, and file processing
  • File Search (RAG): Built-in vector store with automatic chunking and retrieval
  • Function Calling: Define custom tools that the assistant can invoke
  • Streaming: Real-time streaming of assistant responses and tool outputs
  • File Handling: Upload and process files up to 512MB per file (see the attachment sketch after this list)
  • Multiple Models: Support for GPT-4o, GPT-4 Turbo, and GPT-3.5 Turbo
  • Parallel Tool Use: Execute multiple tools simultaneously
  • Run Management: Control execution with cancellation and status polling
  • Annotations: Automatic citations and file references in responses
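
The File Handling and Code Interpreter capabilities compose: you first upload a file via the Files API, then attach it to a message. A minimal sketch, assuming an existing thread (created as shown later) and a placeholder CSV path:

from openai import OpenAI

client = OpenAI()

# Upload a local file for use with assistants (the path is a placeholder)
data_file = client.files.create(
    file=open("sales_2024.csv", "rb"),
    purpose="assistants"
)

# Attach it to a message so Code Interpreter can read it
client.beta.threads.messages.create(
    thread_id=thread.id,  # assumes a thread created as shown later
    role="user",
    content="Analyze the attached sales data",
    attachments=[
        {"file_id": data_file.id, "tools": [{"type": "code_interpreter"}]}
    ]
)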

Getting Started

Install the OpenAI Python SDK and set up your environment:

# Install OpenAI SDK
pip install openai

# Set environment variable
export OPENAI_API_KEY="your-api-key"

# Or pass the key directly in code
from openai import OpenAI
client = OpenAI(api_key="your-api-key")
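
To confirm the key is picked up correctly, any cheap authenticated call works as a sanity check; listing models is a common choice:

# Sanity check: an authenticated call that fails fast if the key is wrong
models = client.models.list()
print(f"API key OK, {len(models.data)} models available")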

Creating Your First Assistant

Build a data analysis assistant with Code Interpreter:

from openai import OpenAI
import time

client = OpenAI()

# Create an assistant with Code Interpreter
assistant = client.beta.assistants.create(
    name="Data Analyst",
    instructions="""You are an expert data analyst. When given data:
    1. Analyze it thoroughly using Python
    2. Create visualizations when helpful
    3. Provide clear insights and recommendations
    Always explain your methodology and findings clearly.""",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}]
)

print(f"Created assistant: {assistant.id}")

# Create a thread for the conversation
thread = client.beta.threads.create()
print(f"Created thread: {thread.id}")

# Add a message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Generate sample sales data for 12 months and analyze the trends"
)

# Run the assistant
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Poll for completion
while run.status in ["queued", "in_progress"]:
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )
    print(f"Status: {run.status}")

# Get the response (messages are returned newest first)
messages = client.beta.threads.messages.list(thread_id=thread.id)

for msg in messages.data:
    if msg.role == "assistant":
        for content in msg.content:
            if content.type == "text":
                print(f"Assistant: {content.text.value}")
            elif content.type == "image_file":
                print(f"Generated image: {content.image_file.file_id}")

File Search with Vector Stores

Build a RAG-powered assistant that can search through your documents:

from openai import OpenAI

client = OpenAI()

# Create a vector store
vector_store = client.beta.vector_stores.create(
    name="Product Documentation"
)

# Upload files to the vector store and poll until indexing completes
file_paths = ["docs/api-reference.pdf", "docs/user-guide.pdf", "docs/faq.md"]
file_streams = [open(path, "rb") for path in file_paths]

file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id,
    files=file_streams
)

# Close the file handles once the upload finishes
for stream in file_streams:
    stream.close()

print(f"Uploaded {file_batch.file_counts.completed} files")

# Create assistant with file search
assistant = client.beta.assistants.create(
    name="Documentation Assistant",
    instructions="""You are a helpful documentation assistant.
    Answer questions based on the uploaded documentation.
    Always cite the source document when providing information.
    If you can't find the answer in the docs, say so clearly.""",
    model="gpt-4o",
    tools=[{"type": "file_search"}],
    tool_resources={
        "file_search": {
            "vector_store_ids": [vector_store.id]
        }
    }
)

# Query the assistant
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How do I authenticate with the API?"
)

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# The newest message (index 0) is the assistant's reply
messages = client.beta.threads.messages.list(thread_id=thread.id)
response = messages.data[0].content[0].text

print(f"Answer: {response.value}")

# Print citations
for annotation in response.annotations:
    if annotation.type == "file_citation":
        print(f"Citation: {annotation.file_citation.file_id}")

Function Calling for Custom Tools

Extend your assistant with custom functions:

from openai import OpenAI
import json
import time

client = OpenAI()

# Define custom tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., San Francisco"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Search product catalog",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "category": {"type": "string"},
                    "max_price": {"type": "number"}
                },
                "required": ["query"]
            }
        }
    }
]

# Create assistant with custom tools
assistant = client.beta.assistants.create(
    name="Shopping Assistant",
    instructions="Help users find products and check weather for delivery estimates.",
    model="gpt-4o",
    tools=tools
)

# Function implementations
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Simulate weather API call
    return {"location": location, "temperature": 22, "unit": unit, "condition": "sunny"}

def search_products(query: str, category: str | None = None, max_price: float | None = None) -> list:
    # Simulate product search
    return [
        {"name": f"{query} Pro", "price": 99.99, "rating": 4.5},
        {"name": f"{query} Basic", "price": 49.99, "rating": 4.2}
    ]

# Handle function calls
def handle_tool_calls(run, thread_id):
    tool_outputs = []
    
    for tool_call in run.required_action.submit_tool_outputs.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        
        if function_name == "get_weather":
            result = get_weather(**arguments)
        elif function_name == "search_products":
            result = search_products(**arguments)
        else:
            result = {"error": "Unknown function"}
        
        tool_outputs.append({
            "tool_call_id": tool_call.id,
            "output": json.dumps(result)
        })
    
    return client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id,
        run_id=run.id,
        tool_outputs=tool_outputs
    )

# Run conversation with function calling
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Find me a laptop under $100 and check the weather in Seattle"
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Poll and handle tool calls (sleep between polls to avoid a busy loop)
while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    
    if run.status == "requires_action":
        run = handle_tool_calls(run, thread.id)
    elif run.status == "completed":
        break
    elif run.status in ["failed", "cancelled", "expired"]:
        print(f"Run failed: {run.status}")
        break
    else:
        time.sleep(1)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
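
The SDK also ships a submit_tool_outputs_and_poll helper that submits the outputs and polls the run to a terminal state in one call, which removes most of the manual loop above. The return statement in handle_tool_calls could become:

# Convenience variant: submit outputs and poll to completion in one call
return client.beta.threads.runs.submit_tool_outputs_and_poll(
    thread_id=thread_id,
    run_id=run.id,
    tool_outputs=tool_outputs
)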

Streaming Responses

Instead of polling for completion, you can stream tokens and tool events as they are generated:
from openai import OpenAI

client = OpenAI()

# Stream assistant responses
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id
) as stream:
    for event in stream:
        if event.event == "thread.message.delta":
            for delta in event.data.delta.content:
                if delta.type == "text":
                    print(delta.text.value, end="", flush=True)
        elif event.event == "thread.run.step.delta":
            # Handle tool call streaming
            if event.data.delta.step_details:
                print(f"\nTool: {event.data.delta.step_details}")

Benchmarks and Performance

Assistants API performance characteristics:

| Operation             | Latency (p50) | Latency (p99) | Cost Factor   |
|-----------------------|---------------|---------------|---------------|
| Create Assistant      | 200ms         | 500ms         | Free          |
| Create Thread         | 150ms         | 400ms         | Free          |
| Simple Run (GPT-4o)   | 2-5s          | 10s           | 1x tokens     |
| Code Interpreter Run  | 5-15s         | 60s           | $0.03/session |
| File Search Query     | 3-8s          | 20s           | $0.10/GB/day  |
| Vector Store Upload   | 1-5s/file     | 30s/file      | $0.10/GB      |
| Streaming First Token | 500ms         | 2s            | Same as run   |

When to Use Assistants API

Best suited for:

  • Applications requiring persistent conversation state
  • Data analysis tasks with Code Interpreter
  • Document Q&A with built-in RAG (File Search)
  • Rapid prototyping without infrastructure setup
  • Applications needing file processing capabilities
  • Teams wanting managed AI agent infrastructure

Consider alternatives when:

  • Need fine-grained control over RAG pipeline (use LangChain/LlamaIndex)
  • Building complex multi-agent systems (use LangGraph/CrewAI)
  • Require custom embedding models (use your own vector store)
  • Cost-sensitive applications (Chat Completions is cheaper; see the sketch after this list)
  • Need to use non-OpenAI models
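
For comparison, the stateless Chat Completions equivalent of a single-turn query looks like this; you manage conversation history yourself, but avoid per-session tool charges and run overhead:

# Stateless alternative: conversation history is your responsibility
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How do I authenticate with the API?"}
    ]
)
print(completion.choices[0].message.content)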

Conclusion

The OpenAI Assistants API dramatically simplifies building AI agents by providing managed infrastructure for conversations, file handling, and tool execution. Its built-in Code Interpreter and File Search tools eliminate the need for custom implementations of common patterns. While it trades some flexibility for convenience, the Assistants API is an excellent choice for teams wanting to quickly build and deploy AI-powered applications without managing complex infrastructure. For production applications requiring OpenAI models with persistent state and built-in tools, the Assistants API offers the fastest path from concept to deployment.


