In January 2026, Microsoft announced the acquisition of Osmos, an agentic AI data engineering platform that automates complex data transformation, integration, and quality tasks. This acquisition signals Microsoft’s commitment to bringing autonomous AI agents into the data engineering workflow within Microsoft Fabric. For data engineers struggling with repetitive ETL development, schema mapping, and data quality […]
Tag: ETL
Tips and Tricks – Use Embeddings for Semantic Search
Implement semantic search using text embeddings for more relevant results than keyword matching.
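As a rough illustration of the idea, the sketch below ranks documents by cosine similarity between embedding vectors. The three-dimensional vectors, the document names, and the `semantic_search` helper are all hypothetical and chosen for readability. In practice the vectors would come from an embedding model, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
DOC_VECS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "office hours": [0.0, 0.1, 0.9],
}

# Stand-in for the embedding of a query such as "how do I get my money back".
QUERY_VEC = [0.8, 0.2, 0.0]

results = semantic_search(QUERY_VEC, DOC_VECS)
```

The query shares no keywords with "refund policy", yet that document ranks first because its vector points in nearly the same direction. That is the property keyword matching lacks.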
Tips and Tricks – Use dbt for Maintainable Data Transformations
Build modular, tested, documented data transformations with dbt.
Tips and Tricks – Partition Large Tables for Query Performance
Use table partitioning to dramatically speed up queries on large datasets.
Data Quality for AI: Ensuring High-Quality Training Data
Data quality determines AI model performance. After managing data quality for 100+ AI projects, I’ve learned what matters. Here’s the complete guide to ensuring high-quality training data.

Figure 1: Data Quality Framework

Why Data Quality Matters

Data quality directly impacts model performance:
Accuracy: Poor data leads to poor predictions
Bias: Biased data creates biased models
[…]
ETL for Vector Embeddings: Preparing Data for RAG
Preparing data for RAG requires specialized ETL pipelines. After building pipelines for 50+ RAG systems, I’ve learned what works. Here’s the complete guide to ETL for vector embeddings.
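One transform step such a pipeline commonly includes is splitting documents into overlapping chunks before they are embedded. The sketch below is a minimal version of that step and is not taken from the post. The character-based splitting, the `chunk_text` and `build_records` helpers, and the record fields are all assumptions made for illustration. The embedding and loading steps are left out.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    The overlap keeps sentences cut at a boundary visible in both
    neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def build_records(doc_id, text, chunk_size=200, overlap=50):
    """Attach a stable id and source reference to every chunk."""
    return [
        {"id": f"{doc_id}-{i}", "doc_id": doc_id, "text": chunk}
        for i, chunk in enumerate(chunk_text(text, chunk_size, overlap))
    ]


# Tiny sizes so the overlap is easy to see.
records = build_records("doc1", "abcdefghij", chunk_size=4, overlap=2)
```

Each record keeps its `doc_id`, so a retrieved chunk can be traced back to its source document. Splitting by tokens or sentences instead of characters is a common alternative.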