Project summary

Router systems have become a popular way of matching user queries with LLMs: they send each query to the model expected to offer the best trade-off between response quality and cost. This project explores whether an LLM's parameters can be optimized to increase the frequency and quality of the queries a router system directs to it, akin to search engine optimization for webpages. The goal is to demonstrate a method that can be applied to a variety of models to improve how favorably a router treats them.

Project abstract

Using a game-theoretic framework, this project investigates whether the parameters of a Large Language Model (LLM) can be optimized to increase the number and quality of queries routed to it by a router system. The research questions are: 1) the feasibility of fine-tuning small models to receive more queries, 2) the impact of such optimizations on user welfare and system costs, and 3) the computational cost of training models with these optimizations. The study adopts a theoretical approach, using a Stackelberg game formulation to model the interaction between the model and the router, and aims to derive practical strategies for optimizing a model in response to router behavior.
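
A minimal sketch of the Stackelberg formulation follows; the notation is illustrative rather than taken from the proposal. The model provider is the leader, choosing parameters \theta of its model m_\theta; the router is the follower, assigning each query to the candidate model with the best quality-cost trade-off. The provider's problem can then be written as a bilevel program:

\max_{\theta} \; \mathbb{E}_{q \sim Q}\!\left[ \mathbb{1}\{ \pi^{*}(q;\theta) = m_\theta \} \right]
\quad \text{s.t.} \quad \pi^{*}(q;\theta) \in \arg\max_{m \in \mathcal{M}(\theta)} \; s(m, q) - \lambda\, c(m),

where \mathcal{M}(\theta) is the pool of candidate models containing m_\theta, s(m, q) is the router's predicted quality of model m on query q, c(m) is the per-query cost of calling m, and \lambda is the router's cost sensitivity. Under this reading, research question 2 amounts to tracking the expected realized quality s(\pi^{*}(q;\theta), q) and cost c(\pi^{*}(q;\theta)) induced by the leader's optimization.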