Cache LLM responses to avoid redundant API calls and reduce costs. The snippet below uses exact-match caching (a hash of the model and prompt); extending it toward semantic, similarity-based matching is sketched further down.
Code Snippet
import hashlib
from typing import Optional


class LLMCache:
    """In-memory cache keyed by a hash of the model and the exact prompt."""

    def __init__(self):
        self._cache = {}

    def _hash_prompt(self, prompt: str, model: str) -> str:
        """Create a deterministic hash for the cache key."""
        content = f"{model}:{prompt}"
        return hashlib.sha256(content.encode()).hexdigest()

    def get(self, prompt: str, model: str) -> Optional[str]:
        key = self._hash_prompt(prompt, model)
        return self._cache.get(key)

    def set(self, prompt: str, model: str, response: str) -> None:
        key = self._hash_prompt(prompt, model)
        self._cache[key] = response


cache = LLMCache()


def cached_llm_call(prompt: str, model: str = "gpt-4") -> str:
    # Check the cache first
    cached = cache.get(prompt, model)
    if cached is not None:  # don't treat an empty cached response as a miss
        return cached

    # Make the actual API call (call_openai_api stands in for your provider's SDK call)
    response = call_openai_api(prompt, model)

    # Cache the response for future use
    cache.set(prompt, model, response)
    return response
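The snippet assumes a call_openai_api helper that is not defined in the post. A minimal sketch of what it could look like, assuming the official openai Python SDK (v1+) with OPENAI_API_KEY set in the environment:

from openai import OpenAI

_client = OpenAI()  # reads OPENAI_API_KEY from the environment


def call_openai_api(prompt: str, model: str) -> str:
    # Single-turn chat completion; adapt the messages and parameters to your use case
    response = _client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

With that in place, calling cached_llm_call with the same prompt twice sends only one request over the network.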
Why This Helps
- Reduces API costs for repeated queries; the savings scale with your cache hit rate
- Near-instant responses for cached prompts (no network round trip)
- Enables offline development and testing once responses have been cached
How to Test
- Call the same prompt twice and verify the second call is served from the cache (see the test sketch below)
- Monitor API call counts before and after enabling the cache
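A minimal test sketch, assuming the code above lives in a hypothetical module named llm_cache and using unittest.mock to stub out the API call:

from unittest.mock import patch

import llm_cache  # hypothetical module containing cached_llm_call and call_openai_api


def test_second_call_is_a_cache_hit():
    with patch.object(llm_cache, "call_openai_api", return_value="stubbed answer") as mock_api:
        first = llm_cache.cached_llm_call("What is semantic caching?")
        second = llm_cache.cached_llm_call("What is semantic caching?")

    assert first == second == "stubbed answer"
    assert mock_api.call_count == 1  # only the first call reached the (stubbed) API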
When to Use
Any application with repeated or similar prompts: chatbots, content generation, analysis pipelines. Keep in mind that the hash-based cache above only matches identical prompts; catching merely similar prompts requires an embedding-based lookup, sketched below.
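A hedged sketch of that semantic lookup, assuming an embed_fn you supply (for example, a call to an embeddings endpoint) and cosine similarity over stored prompt vectors; the 0.95 threshold is an assumption to tune for your domain:

import math

SIMILARITY_THRESHOLD = 0.95  # assumption: tune per domain and embedding model


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticLLMCache:
    """Stores (embedding, response) pairs and serves the closest match above a threshold."""

    def __init__(self, embed_fn):
        self._embed = embed_fn   # hypothetical: any function mapping text -> list[float]
        self._entries = []       # list of (embedding, response) tuples

    def get(self, prompt: str):
        query = self._embed(prompt)
        best_response, best_score = None, 0.0
        for embedding, response in self._entries:
            score = _cosine(query, embedding)
            if score > best_score:
                best_response, best_score = response, score
        return best_response if best_score >= SIMILARITY_THRESHOLD else None

    def set(self, prompt: str, response: str):
        self._entries.append((self._embed(prompt), response))

The linear scan is fine for a few thousand entries; beyond that, a vector store can handle the nearest-neighbour search for you.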
Performance/Security Notes
The in-memory dict above is lost on restart and not shared between processes; use Redis (or another shared store) for production caching, and set a TTL so time-sensitive content expires (see the sketch below). Be careful about caching responses that contain user-specific or sensitive data unless the cache is scoped per user.
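A minimal sketch of the same cache backed by Redis with a TTL, assuming the redis-py client and a Redis instance on localhost; the host, port, and one-hour default are assumptions:

import hashlib
from typing import Optional

import redis  # assumption: redis-py is installed and a Redis server is reachable


class RedisLLMCache:
    def __init__(self, ttl_seconds: int = 3600):
        self._client = redis.Redis(host="localhost", port=6379, decode_responses=True)
        self._ttl = ttl_seconds

    def _key(self, prompt: str, model: str) -> str:
        return "llm:" + hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get(self, prompt: str, model: str) -> Optional[str]:
        return self._client.get(self._key(prompt, model))

    def set(self, prompt: str, model: str, response: str) -> None:
        # SETEX stores the value with an expiry, so stale answers age out automatically
        self._client.setex(self._key(prompt, model), self._ttl, response)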
Try this tip in your next project and share your results in the comments!