arxiv Jun 2
arXiv:2606.01397v1 Announce Type: cross Abstract: A fixed-wing UAV must hold airspeed, altitude, and heading references under wind, gusts, and turbulence, channels coupled so that correcting one can degrade another. Classical autopilots stabilize the airframe well but adapt poorly when a hard crossw
arxiv Jun 2
arXiv:2606.01099v1 Announce Type: cross Abstract: Command understanding systems in smart home ecosystems can automate device control and substantially improve user experience. However, while they perform well on precise utterances (e.g., "turn on the bedroom light"), they struggle with ambiguous or
arxiv May 19
arXiv:2605.14133v2 Announce Type: replace Abstract: Interactive agent benchmarks face a tension between scalable construction and realistic workflow evaluation. Hand-authored tasks are expensive to extend and revise, while static prompt evaluation misses failures that only appear when agents operate
arxiv May 13
arXiv:2511.22963v3 Announce Type: replace-cross Abstract: Enabling humanoid robots to follow free-form natural language commands is a critical step toward seamless human-robot interaction and general-purpose embodied AI. However, existing methods remain limited, often constrained to simple instructi
arxiv May 11
arXiv:2410.06355v3 Announce Type: replace-cross Abstract: This paper presents UNCOM, a novel hybrid framework for interpreting natural human commands in tabletop scenarios. The system integrates multiple sources of information -- speech, gestures, and scene context -- to extract structured, actionab
arxiv Apr 30
arXiv:2603.04337v2 Announce Type: replace-cross Abstract: Constructing computer-aided design (CAD) models is labor-intensive but essential for engineering and manufacturing. Recent advances in Large Language Models (LLMs) have inspired LLM-based CAD generation by representing CAD as command sequ
arxiv Apr 16
arXiv:2511.22364v2 Announce Type: replace-cross Abstract: Open-vocabulary mobile manipulation (OVMM) requires robots to follow language instructions, navigate, and manipulate while updating their world representation under dynamic environmental changes. However, most prior approaches update their wo
arxiv Apr 10
arXiv:2506.19420v2 Announce Type: replace Abstract: Multimodal sarcasm understanding is a high-order cognitive task. Although large language models (LLMs) have shown impressive performance on many downstream NLP tasks, growing evidence suggests that they struggle with sarcasm understanding. In this
arxiv Apr 9
arXiv:2604.07171v1 Announce Type: new Abstract: Decision-making in military aviation Prognostics and Health Management (PHM) faces significant challenges due to the "curse of dimensionality" in large-scale fleet operations, combined with sparse feedback and stochastic mission profiles. To address th
arxiv Apr 7
arXiv:2604.04233v1 Announce Type: cross Abstract: Human-robot collaboration in industrial settings requires precise and reliable communication to enhance operational efficiency. While Large Language Models (LLMs) understand general language, they often lack the domain-specific rigidity needed for sa
theverge Apr 2
Google is launching another update to its Home app, which is supposed to make controlling your smart home with its Gemini AI assistant "more natural and reliable," according to this week's release notes. With the update, you can describe the type of lighting you want, such as "the color of the ocean
arxiv Mar 31
arXiv:2603.27273v1 Announce Type: cross Abstract: Modular autonomous driving systems must coordinate global progress objectives with local safety-driven reactions under imperfect sensing and strict real-time constraints. This paper presents a ROS2-native arbitration module that continuously fuses th