<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Smart on Wilson Wu</title><link>https://wilsonwu.me/en/tags/smart/</link><description>Recent content in Smart on Wilson Wu</description><generator>Hugo -- 0.127.0</generator><language>en-US</language><lastBuildDate>Tue, 26 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://wilsonwu.me/en/tags/smart/index.xml" rel="self" type="application/rss+xml"/><item><title>Let Models Choose Models: Embedding-Driven Smart Routing for LLMs</title><link>https://wilsonwu.me/en/blog/2026/llm-smart-router/</link><pubDate>Tue, 26 May 2026 00:00:00 +0000</pubDate><guid>https://wilsonwu.me/en/blog/2026/llm-smart-router/</guid><description>In an AI architecture where multiple models coexist, such as GPT-4, GPT-4o, lightweight models, and vertical-domain models, one core question is:
How can the system automatically select the most suitable model when the caller does not explicitly specify a model ID?
This article introduces an engineering-friendly approach:
Use an embedding model to encode the user's intent, perform semantic matching at the gateway layer, and dynamically route each request to the most suitable upstream model service.</description></item></channel></rss>