Emerging Technologies – Page 33 – C4: Container, Code, Cloud & Context

Global Traffic Distribution with Google Cloud Load Balancing and CDN: Enterprise Edge Architecture

Posted on April 8, 2025 by Nithin Mohan TK 10 min read

Introduction: Google Cloud Load Balancing and Cloud CDN provide enterprise-grade traffic distribution and content delivery for global applications. This comprehensive guide explores load balancing architectures, from HTTP(S) load balancers and TCP/UDP proxies to internal load balancing and traffic management policies. After implementing global load balancing for applications serving billions of requests daily, I’ve found Google’s […]


Quantization Methods for LLMs: GPTQ, AWQ, and BitsAndBytes

Posted on April 8, 2025 by Nithin Mohan TK 5 min read

Last year, I needed to run a 13B parameter model on a 16GB GPU. Full precision required 52GB. After testing GPTQ, AWQ, and BitsAndBytes, I reduced memory to 7GB with minimal accuracy loss. After quantizing 30+ models, I’ve learned which method works best for each scenario. Here’s the complete guide to LLM quantization. Figure 1: […]
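The memory figures in this excerpt follow from bytes-per-parameter arithmetic. Here is a rough sketch, assuming "full precision" means 32-bit floats (4 bytes per parameter) and the quantized model uses 4-bit weights; the excerpt states neither explicitly. The helper name `weight_memory_gb` is hypothetical. It counts weights only, so activations, KV cache, and quantization metadata are ignored, which may be why the reported 7GB sits a little above the weight-only estimate.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights only, in decimal gigabytes.

    Hypothetical helper for illustration; ignores activations, KV cache,
    and per-group quantization metadata (scales, zero points).
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


PARAMS_13B = 13e9  # 13 billion parameters, as in the excerpt

fp32_gb = weight_memory_gb(PARAMS_13B, 32)  # assumed "full precision"
fp16_gb = weight_memory_gb(PARAMS_13B, 16)  # half precision, for comparison
int4_gb = weight_memory_gb(PARAMS_13B, 4)   # assumed 4-bit quantization

print(f"FP32: {fp32_gb:.1f} GB, FP16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")
```

Under these assumptions the 32-bit estimate matches the 52GB quoted above, and only the 4-bit estimate fits comfortably on a 16GB GPU.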


AKS pod managed identity

Posted on April 7, 2025 by Nithin Mohan TK 6 min read

Kubernetes has become one of the most popular container orchestration tools, and Azure Kubernetes Service (AKS) is a managed Kubernetes service provided by Microsoft Azure. With the increasing use of Kubernetes and AKS, there is a growing need to improve the security and management of access to cloud resources. AKS pod managed identity is a […]


Advanced RAG Patterns: From Naive Retrieval to Production-Grade Systems (Part 1 of 2)

Posted on April 7, 2025 by Nithin Mohan TK 12 min read

Introduction: Retrieval-Augmented Generation (RAG) has become the go-to architecture for building LLM applications that need access to private or current information. By retrieving relevant documents and including them in the prompt, RAG grounds LLM responses in factual content, reducing hallucinations and enabling knowledge that wasn’t in the training data. But naive RAG implementations often disappoint—the […]
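The retrieve-then-prompt step described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the article's implementation: it scores documents by plain word overlap as a stand-in for embedding similarity, and the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical. No LLM is called; the sketch stops at the assembled prompt.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most terms with the query.

    Word overlap is an assumption made for brevity; a real system would
    typically rank by embedding similarity instead.
    """
    query_terms = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Place the retrieved documents in the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    "HL7 v2 is a messaging standard.",
    "Azure Front Door is a global load balancer.",
    "RAG retrieves documents before generation.",
]
prompt = build_prompt("What does RAG do with documents?", docs, k=1)
print(prompt)
```

Ranking this naively is what the excerpt calls naive retrieval: the model only sees whatever the scoring function happens to surface.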


HL7 v3: Understanding RIM and Why v3 Failed to Replace v2

Posted on April 7, 2025 by Nithin Mohan TK 6 min read

Executive Summary: HL7 v3 was designed in the 1990s as the successor to HL7 v2, promising a rigorous, model-driven approach based on the Reference Information Model (RIM). Despite 20+ years of development and standardization, v3 never achieved widespread adoption. Understanding why v3 failed, and where it still matters, is crucial for architects navigating healthcare interoperability standards. […]


Azure Front Door: A Solutions Architect’s Guide to Global Load Balancing and CDN

Posted on April 6, 2025 by Nithin Mohan TK 8 min read

Executive Summary: In an era where milliseconds of latency can translate to millions in lost revenue, global load balancing has evolved from a nice-to-have to a critical infrastructure component. Azure Front Door represents Microsoft’s answer to the challenge of delivering applications globally with enterprise-grade security and performance. Configuration Example: { "name": "my-frontdoor", "properties": { "enabledState": […]

