OpenAI Assistants API: Building Stateful AI Agents with Code Interpreter and File Search

Introduction: OpenAI’s Assistants API, launched at DevDay 2023, represents a significant evolution in how developers build AI-powered applications. Unlike the stateless Chat Completions API, Assistants provides a managed, stateful runtime for building sophisticated AI agents with built-in tools like Code Interpreter and File Search. The API handles conversation threading, file management, and tool execution, allowing developers to focus on application logic rather than infrastructure. This guide covers everything from basic assistant creation to advanced patterns with function calling and RAG.

[Figure: OpenAI Assistants API architecture, a stateful AI agent runtime]

Capabilities and Features

The Assistants API provides powerful capabilities for building AI agents:

  • Stateful Conversations: Managed threads that persist conversation history automatically
  • Code Interpreter: Sandboxed Python environment for data analysis, visualization, and file processing
  • File Search (RAG): Built-in vector store with automatic chunking and retrieval
  • Function Calling: Define custom tools that the assistant can invoke
  • Streaming: Real-time streaming of assistant responses and tool outputs
  • File Handling: Upload and process files up to 512MB per file (see the attachment sketch after this list)
  • Multiple Models: Support for GPT-4o, GPT-4 Turbo, and GPT-3.5 Turbo
  • Parallel Tool Use: Execute multiple tools simultaneously
  • Run Management: Control execution with cancellation and status polling
  • Annotations: Automatic citations and file references in responses
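
The File Handling and Code Interpreter capabilities compose: you first upload a file via the Files API, then attach it to a message. A minimal sketch, assuming an existing thread (created as shown later) and a placeholder CSV path:

from openai import OpenAI

client = OpenAI()

# Upload a local file for use with assistants (the path is a placeholder)
data_file = client.files.create(
    file=open("sales_2024.csv", "rb"),
    purpose="assistants"
)

# Attach it to a message so Code Interpreter can read it
client.beta.threads.messages.create(
    thread_id=thread.id,  # assumes a thread created as shown later
    role="user",
    content="Analyze the attached sales data",
    attachments=[
        {"file_id": data_file.id, "tools": [{"type": "code_interpreter"}]}
    ]
)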

Getting Started

Install the OpenAI Python SDK and set up your environment:

# Install OpenAI SDK
pip install openai

# Set environment variable
export OPENAI_API_KEY="your-api-key"

# Or pass the key directly in code
from openai import OpenAI
client = OpenAI(api_key="your-api-key")
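
To confirm the key is picked up correctly, any cheap authenticated call works as a sanity check; listing models is a common choice:

# Sanity check: an authenticated call that fails fast if the key is wrong
models = client.models.list()
print(f"API key OK, {len(models.data)} models available")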

Creating Your First Assistant

Build a data analysis assistant with Code Interpreter:

from openai import OpenAI
import time

client = OpenAI()

# Create an assistant with Code Interpreter
assistant = client.beta.assistants.create(
    name="Data Analyst",
    instructions="""You are an expert data analyst. When given data:
    1. Analyze it thoroughly using Python
    2. Create visualizations when helpful
    3. Provide clear insights and recommendations
    Always explain your methodology and findings clearly.""",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}]
)

print(f"Created assistant: {assistant.id}")

# Create a thread for the conversation
thread = client.beta.threads.create()
print(f"Created thread: {thread.id}")

# Add a message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Generate sample sales data for 12 months and analyze the trends"
)

# Run the assistant
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Poll for completion
while run.status in ["queued", "in_progress"]:
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )
    print(f"Status: {run.status}")

# Get the response (messages are returned newest first)
messages = client.beta.threads.messages.list(thread_id=thread.id)

for msg in messages.data:
    if msg.role == "assistant":
        for content in msg.content:
            if content.type == "text":
                print(f"Assistant: {content.text.value}")
            elif content.type == "image_file":
                print(f"Generated image: {content.image_file.file_id}")

File Search with Vector Stores

Build a RAG-powered assistant that can search through your documents:

from openai import OpenAI

client = OpenAI()

# Create a vector store
vector_store = client.beta.vector_stores.create(
    name="Product Documentation"
)

# Upload files to the vector store and poll until indexing completes
file_paths = ["docs/api-reference.pdf", "docs/user-guide.pdf", "docs/faq.md"]
file_streams = [open(path, "rb") for path in file_paths]

file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id,
    files=file_streams
)

# Close the file handles once the upload finishes
for stream in file_streams:
    stream.close()

print(f"Uploaded {file_batch.file_counts.completed} files")

# Create assistant with file search
assistant = client.beta.assistants.create(
    name="Documentation Assistant",
    instructions="""You are a helpful documentation assistant.
    Answer questions based on the uploaded documentation.
    Always cite the source document when providing information.
    If you can't find the answer in the docs, say so clearly.""",
    model="gpt-4o",
    tools=[{"type": "file_search"}],
    tool_resources={
        "file_search": {
            "vector_store_ids": [vector_store.id]
        }
    }
)

# Query the assistant
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How do I authenticate with the API?"
)

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# The newest message (index 0) is the assistant's reply
messages = client.beta.threads.messages.list(thread_id=thread.id)
response = messages.data[0].content[0].text

print(f"Answer: {response.value}")

# Print citations
for annotation in response.annotations:
    if annotation.type == "file_citation":
        print(f"Citation: {annotation.file_citation.file_id}")

Function Calling for Custom Tools

Extend your assistant with custom functions:

from openai import OpenAI
import json
import time

client = OpenAI()

# Define custom tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., San Francisco"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Search product catalog",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "category": {"type": "string"},
                    "max_price": {"type": "number"}
                },
                "required": ["query"]
            }
        }
    }
]

# Create assistant with custom tools
assistant = client.beta.assistants.create(
    name="Shopping Assistant",
    instructions="Help users find products and check weather for delivery estimates.",
    model="gpt-4o",
    tools=tools
)

# Function implementations
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Simulate weather API call
    return {"location": location, "temperature": 22, "unit": unit, "condition": "sunny"}

def search_products(query: str, category: str | None = None, max_price: float | None = None) -> list:
    # Simulate product search
    return [
        {"name": f"{query} Pro", "price": 99.99, "rating": 4.5},
        {"name": f"{query} Basic", "price": 49.99, "rating": 4.2}
    ]

# Handle function calls
def handle_tool_calls(run, thread_id):
    tool_outputs = []
    
    for tool_call in run.required_action.submit_tool_outputs.tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        
        if function_name == "get_weather":
            result = get_weather(**arguments)
        elif function_name == "search_products":
            result = search_products(**arguments)
        else:
            result = {"error": "Unknown function"}
        
        tool_outputs.append({
            "tool_call_id": tool_call.id,
            "output": json.dumps(result)
        })
    
    return client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id,
        run_id=run.id,
        tool_outputs=tool_outputs
    )

# Run conversation with function calling
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Find me a laptop under $100 and check the weather in Seattle"
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Poll and handle tool calls (sleep between polls to avoid a busy loop)
while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    
    if run.status == "requires_action":
        run = handle_tool_calls(run, thread.id)
    elif run.status == "completed":
        break
    elif run.status in ["failed", "cancelled", "expired"]:
        print(f"Run failed: {run.status}")
        break
    else:
        time.sleep(1)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
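
The SDK also ships a submit_tool_outputs_and_poll helper that submits the outputs and polls the run to a terminal state in one call, which removes most of the manual loop above. The return statement in handle_tool_calls could become:

# Convenience variant: submit outputs and poll to completion in one call
return client.beta.threads.runs.submit_tool_outputs_and_poll(
    thread_id=thread_id,
    run_id=run.id,
    tool_outputs=tool_outputs
)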

Streaming Responses

Instead of polling for completion, you can stream tokens and tool events as they are generated:
from openai import OpenAI

client = OpenAI()

# Stream assistant responses
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id
) as stream:
    for event in stream:
        if event.event == "thread.message.delta":
            for delta in event.data.delta.content:
                if delta.type == "text":
                    print(delta.text.value, end="", flush=True)
        elif event.event == "thread.run.step.delta":
            # Handle tool call streaming
            if event.data.delta.step_details:
                print(f"\nTool: {event.data.delta.step_details}")

Benchmarks and Performance

Assistants API performance characteristics:

| Operation             | Latency (p50) | Latency (p99) | Cost Factor   |
|-----------------------|---------------|---------------|---------------|
| Create Assistant      | 200ms         | 500ms         | Free          |
| Create Thread         | 150ms         | 400ms         | Free          |
| Simple Run (GPT-4o)   | 2-5s          | 10s           | 1x tokens     |
| Code Interpreter Run  | 5-15s         | 60s           | $0.03/session |
| File Search Query     | 3-8s          | 20s           | $0.10/GB/day  |
| Vector Store Upload   | 1-5s/file     | 30s/file      | $0.10/GB      |
| Streaming First Token | 500ms         | 2s            | Same as run   |

When to Use Assistants API

Best suited for:

  • Applications requiring persistent conversation state
  • Data analysis tasks with Code Interpreter
  • Document Q&A with built-in RAG (File Search)
  • Rapid prototyping without infrastructure setup
  • Applications needing file processing capabilities
  • Teams wanting managed AI agent infrastructure

Consider alternatives when:

  • Need fine-grained control over RAG pipeline (use LangChain/LlamaIndex)
  • Building complex multi-agent systems (use LangGraph/CrewAI)
  • Require custom embedding models (use your own vector store)
  • Cost-sensitive applications (Chat Completions is cheaper; see the sketch after this list)
  • Need to use non-OpenAI models
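
For comparison, the stateless Chat Completions equivalent of a single-turn query looks like this; you manage conversation history yourself, but avoid per-session tool charges and run overhead:

# Stateless alternative: conversation history is your responsibility
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How do I authenticate with the API?"}
    ]
)
print(completion.choices[0].message.content)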

Conclusion

The OpenAI Assistants API dramatically simplifies building AI agents by providing managed infrastructure for conversations, file handling, and tool execution. Its built-in Code Interpreter and File Search tools eliminate the need for custom implementations of common patterns. While it trades some flexibility for convenience, the Assistants API is an excellent choice for teams wanting to quickly build and deploy AI-powered applications without managing complex infrastructure. For production applications requiring OpenAI models with persistent state and built-in tools, the Assistants API offers the fastest path from concept to deployment.


