arXiv — March 30, 2026
GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding
arXiv:2603.25841v1 Announce Type: cross
Abstract: Current multimodal large language models (MLLMs) cannot effectively use eye-gaze information for video understanding, even when gaze cues are supplied as visual overlays or text descriptions. We introduce GazeQwen, a parameter-efficient approach
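The abstract is truncated before the method is described, so the paper's actual mechanism is unknown here. As a purely illustrative sketch, one common parameter-efficient way to condition hidden states on a side signal such as gaze is FiLM-style modulation: a small projection of the gaze features produces a per-channel scale and shift applied to the model's hidden states. All names and dimensions below are assumptions, not GazeQwen's implementation.

```python
import numpy as np

# Hypothetical FiLM-style gaze conditioning; NOT the paper's actual method.
rng = np.random.default_rng(0)

d_gaze, d_hidden = 4, 8  # arbitrary illustrative dimensions
# Small "learned" projections from gaze features to scale/shift offsets.
W_gamma = rng.normal(size=(d_gaze, d_hidden)) * 0.01
W_beta = rng.normal(size=(d_gaze, d_hidden)) * 0.01

def modulate(hidden, gaze):
    """Scale-and-shift hidden states conditioned on gaze features."""
    gamma = gaze @ W_gamma   # per-channel scale offset
    beta = gaze @ W_beta     # per-channel shift
    # Near-identity when gaze features are small, so the base model
    # is only lightly perturbed -- the usual parameter-efficient pattern.
    return (1.0 + gamma) * hidden + beta

hidden = rng.normal(size=(3, d_hidden))  # three token hidden states
gaze = rng.normal(size=(1, d_gaze))      # one gaze embedding, broadcast over tokens
out = modulate(hidden, gaze)
print(out.shape)  # (3, 8)
```

Note that with zero gaze input the modulation reduces to the identity, which is a typical design choice for lightweight adapters: the pretrained model's behavior is recovered when the side signal is absent.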