Switchpoint Router
GraphRAG-Router: Learning Cost-Efficient Routing over GraphRAGs and LLMs with Reinforcement Learning
arXiv:2604.16401v1 Announce Type: cross Abstract: Graph-based retrieval-augmented generation (GraphRAG) has recently emerged as a powerful paradigm for knowledge-intensive question answering, especially for tasks that require structured evidence organization and multi-hop reasoning. However, existin
ACE-Router: Generalizing History-Aware Routing from MCP Tools to the Agent Web
arXiv:2601.08276v2 Announce Type: replace Abstract: With the rise of the Agent Web and Model Context Protocol (MCP), the agent ecosystem is evolving into an open collaborative network, exponentially increasing accessible tools. However, current architectures face severe scalability and generality bo
SinkRouter: Sink-Aware Routing for Efficient Long-Context Decoding in Large Language and Multimodal Models
arXiv:2604.16883v1 Announce Type: new Abstract: In long-context decoding for LLMs and LMMs, attention becomes increasingly memory-bound because each decoding step must load a large amount of KV-cache data from GPU memory. Existing acceleration strategies often trade efficiency for accuracy by relyin
SecureRouter: Encrypted Routing for Efficient Secure Inference
arXiv:2604.15499v1 Announce Type: cross Abstract: Cryptographically secure neural network inference typically relies on secure computing techniques such as Secure Multi-Party Computation (MPC), enabling cloud servers to process client inputs without decrypting them. Although prior privacy-preserving
Information Router for Mitigating Modality Dominance in Vision-Language Models
arXiv:2604.16264v1 Announce Type: cross Abstract: Vision-language models (VLMs) have demonstrated strong performance across a wide range of benchmarks, yet they often suffer from modality dominance, where predictions rely disproportionately on a single modality. Prior approaches primarily address th
Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
arXiv:2604.15022v1 Announce Type: cross Abstract: Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, the routing strategy introduces a new security concern that adversaries may manipulate the router to consistent