TOPS: First-Principles Visual Token Pruning via Constructing Token Optimal Preservation Sets for Efficient MLLM Inference

arxiv

Published June 26, 2026 at 4:00 AM

Publisher summary (verbatim)

arXiv:2606.27161v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) have achieved strong multimodal reasoning capabilities, but their efficiency is limited by the large number of visual tokens, which introduces substantial computational overhead. Visual token pruning offers a na

Originally published on arxiv