Implement semantic search using text embeddings for more relevant results than keyword matching.
Code Snippet
```python
from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text: str) -> list[float]:
    """Generate an embedding for text using OpenAI."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Calculate cosine similarity between two vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Index documents
documents = ["Python is great for data science", "JavaScript powers the web"]
doc_embeddings = [get_embedding(doc) for doc in documents]

# Search
query = "machine learning programming"
query_embedding = get_embedding(query)

# Find the most similar document
similarities = [cosine_similarity(query_embedding, de) for de in doc_embeddings]
best_match_idx = np.argmax(similarities)
print(f"Best match: {documents[best_match_idx]}")
```
Why This Helps
- Finds semantically similar content, not just keyword matches
- Works across languages and synonyms
- Foundation for RAG applications
How to Test
- Test with synonyms and paraphrases
- Compare results to keyword search
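Before spending API calls, the similarity math itself can be sanity-checked offline with hand-built vectors: identical vectors should score 1.0, orthogonal vectors 0.0. A self-contained sketch (the function is reproduced from the snippet above so this runs standalone):

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Same formula as in the snippet, reproduced so this check runs standalone."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

assert abs(cosine_similarity([1, 0], [1, 0]) - 1.0) < 1e-9  # identical -> 1.0
assert abs(cosine_similarity([1, 0], [0, 1])) < 1e-9        # orthogonal -> 0.0
assert cosine_similarity([1, 2], [2, 4]) > 0.999            # parallel -> ~1.0
print("cosine_similarity checks passed")
```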
When to Use
Search functionality, recommendation systems, document similarity, RAG pipelines.
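For search and recommendation use cases you usually want the top-k matches rather than the single best one; with NumPy that is a small extension of the `argmax` step. A sketch using toy precomputed scores in place of real `cosine_similarity` output (no API calls):

```python
import numpy as np

documents = ["doc A", "doc B", "doc C"]
# Toy similarity scores standing in for real cosine_similarity output
similarities = np.array([0.12, 0.87, 0.45])

k = 2
top_k_idx = np.argsort(similarities)[::-1][:k]  # indices sorted by score, highest first
top_docs = [documents[i] for i in top_k_idx]
print(top_docs)  # -> ['doc B', 'doc C']
```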
Performance/Security Notes
Use a vector database (Pinecone, Weaviate) in production rather than in-memory lists, and cache embeddings so you don't pay to re-embed unchanged text. Each embedding call costs money and latency.
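A minimal in-memory cache keyed on a hash of the text could look like the sketch below. `cached_embedding` and the stub embedder are illustrative names, not part of any library; in practice you would persist the dict to disk or a key-value store:

```python
import hashlib

def cached_embedding(text: str, embed_fn, cache: dict) -> list[float]:
    """Return a cached embedding, calling embed_fn only on cache misses."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = embed_fn(text)
    return cache[key]

# Demo with a stub embedder that records its calls (no API needed)
calls = []
def fake_embed(text):
    calls.append(text)
    return [0.1, 0.2, 0.3]

cache = {}
cached_embedding("hello", fake_embed, cache)
cached_embedding("hello", fake_embed, cache)  # second call served from cache
print(len(calls))  # -> 1
```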
Try this tip in your next project and share your results in the comments!