GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
Abstract: Recent advances in large language models (LLMs) have enabled increasingly capable chatbots. However, most existing systems focus on single-user settings and do not generalize well to multi-user group chat interactions, where agents must intervene more proactively and accurately under complex, evolving contexts. Existing approaches typically rely on LLMs for both intervention reasoning and response generation, leading to high token consumption, limited scalability, and potential privacy risks. To address these challenges, we propose GroupGPT, a token-efficient and privacy-preserving agentic framework for multi-user chat assistants. GroupGPT adopts an edge-cloud model collaboration architecture that decouples intervention timing from response generation, enabling efficient and accurate decision-making while preserving user privacy through on-device processing of sensitive information. The framework also supports multimodal inputs, including memes, images, videos, and voice messages. To support evaluation of timing accuracy and response quality, we further introduce MUIR, a benchmark dataset for multi-user chat assistant intervention reasoning. MUIR contains 2,500 annotated group chat segments with intervention labels and rationales. We evaluate a range of models on MUIR, spanning open-source to proprietary variants, including both LLMs and their smaller counterparts. Extensive experiments demonstrate that GroupGPT generates accurate and well-timed responses, achieving an average score of 4.72/5.0 in LLM-based evaluation, and is well-received by users across diverse group chat scenarios. Moreover, GroupGPT reduces token usage by up to 3× compared to baselines, while sanitizing user messages for privacy before cloud transmission. Code is available at: this https URL
Comments: 14 pages, 8 figures
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2603.01059 [cs.CL] (or arXiv:2603.01059v3 [cs.CL] for this version), https://doi.org/10.48550/arXiv.2603.01059
Submission history — From: Zhuokang Shen
[v1] Sun, 1 Mar 2026 11:29:25 UTC (1,723 KB)
[v2] Tue, 17 Mar 2026 14:00:13 UTC (1,928 KB)
[v3] Thu, 9 Apr 2026 00:42:09 UTC (2,015 KB)