Let Models Choose Models: Embedding-Driven Smart Routing for LLMs

In an AI architecture where multiple models coexist, such as GPT-4, GPT-4o, lightweight models, and vertical-domain models, one core question is how the system can automatically select the most suitable model when the caller does not specify a model ID. This article introduces an engineering-friendly approach: use an embedding model to infer user intent, perform semantic matching at the gateway layer, and dynamically route each request to the most suitable upstream model service....

May 26, 2026 · 5 min

Kubernetes Ingress NGINX Retirement: Comprehensive Migration Plan and Practice Guide to Gateway API

On November 11, 2025, the official Kubernetes blog announced that the Ingress NGINX project has entered its retirement phase and that maintenance will cease entirely in March 2026. The announcement signals that Kubernetes cluster ingress and traffic management are moving into the Gateway API era. For teams currently using Ingress NGINX, this is not just a technical upgrade but a risk management task that should be planned as soon as possible....

December 1, 2025 · 5 min