arxiv22h ago
arXiv:2606.11639v1 Announce Type: new Abstract: The popularization of automatic speech recognition (ASR) systems has increased exploration of the demographic biases related to race, age, gender, and accent, often formed from imbalanced training data. Most of these studies focused on standard graphem
arxiv6d agobullish
arXiv:2603.01421v3 Announce Type: replace Abstract: While large language models accelerate scientific discovery, existing agents face severe limitations in adaptability, domain generalization, and multimodal scalability, often struggling to autonomously process raw, domain-specific experimental data
arxivJun 2
arXiv:2606.00815v1 Announce Type: new Abstract: Electroencephalography (EEG) supports a variety of brain-computer interface (BCI) tasks ranging from brain-state monitoring to human-LLM interactions. EEG foundation models are emerging, but evaluation remains fragmented due to heterogeneous datasets a
arxivMay 29bullish
arXiv:2605.29955v1 Announce Type: new Abstract: We present AutoformBot, a multi-agent system for building an Autoformalized Textbook Library At Scale (Atlas) in Lean 4. AutoformBot orchestrates thousands of LLM agents, equipped with formal verification tools, dependency-aware task scheduling, and co
arxivMay 26
arXiv:2605.24542v1 Announce Type: cross Abstract: This paper examines the erosion of Public Key Cryptography (PKC) security under adaptive adversarial optimisation driven by artificial intelligence. The problem addressed is the growing mismatch between algorithm-centric cryptographic security models
arxivMay 19bullish
arXiv:2605.18613v1 Announce Type: cross Abstract: Latent representations are at the heart of the majority of modern generative models. In the audio domain they are typically produced by a neural-audio-codec autoencoder. In this work we introduce SAME (Semantically-Aligned Music autoEncoder), an auto
arxivMay 15bullish
arXiv:2506.11067v3 Announce Type: replace Abstract: Objective: Develop a cost-effective, large language model (LLM)-based pipeline for automatically extracting Review of Systems (ROS) entities from clinical notes. Materials and Methods: The pipeline extracts ROS section from the clinical note using
arxivMay 14
arXiv:2605.12730v1 Announce Type: new Abstract: Existing AI systems for modeling human behavior operate at the level of individuals or detect events after they occur. As a result, they systematically fail to capture the collective dynamics that determine whether a group remains stable or transitions
arxivMay 14bullish
arXiv:2604.10720v2 Announce Type: replace Abstract: Artificial students -- models that simulate how learners act and respond within educational systems -- are a promising tool for evaluating tutoring strategies and feedback mechanisms at scale. However, most existing approaches rely on prompting lar
arxivMay 1bullish
arXiv:2604.24021v2 Announce Type: replace Abstract: We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proofs remains an outstanding challenge for LLMs. Th
arxivApr 30
arXiv:2604.26157v1 Announce Type: cross Abstract: Structural generalization in semantic parsing requires systems to apply learned compositional rules to novel structural combinations. Existing approaches either rely on hand-written algebraic rules (AM-Parser) or fail to generalize structurally (Tran
arxivApr 27bullish
arXiv:2604.22260v1 Announce Type: cross Abstract: Urban transportation systems face growing safety challenges that require scalable intelligence for emerging smart mobility infrastructures. While recent advances in foundation models and large-scale multimodal datasets have strengthened perception an
arxivApr 22
arXiv:2604.19354v1 Announce Type: new Abstract: Large Language Model (LLM) agents are increasingly proposed for autonomous cybersecurity tasks, but their capabilities in realistic offensive settings remain poorly understood. We present DeepRed, an open-source benchmark for evaluating LLM-based agent
arxivApr 13
arXiv:2604.08935v1 Announce Type: cross Abstract: Causal discovery from observational data remains a fundamental challenge in machine learning and statistics, particularly when variables represent inherently positive quantities such as gene expression levels, asset prices, company revenues, or popul
arxivApr 4bullish
arXiv:2407.15828v2 Announce Type: replace Abstract: Spoken dialogue is essential for human-AI interactions, providing expressive capabilities beyond text. Developing effective spoken dialogue systems (SDSs) requires large-scale, high-quality, and diverse spoken dialogue corpora. However, existing da
arxivApr 4bullish
arXiv:2604.01538v1 Announce Type: new Abstract: Large language models have been adopted in the medical domain for clinical documentation to reduce clinician burden. However, studies have reported that LLMs often "forget" a significant amount of instruction-following ability when fine-tuned using a t
arxivApr 3
arXiv:2604.00249v1 Announce Type: new Abstract: Single-agent large language model (LLM) systems struggle to simultaneously support diverse conversational functions and maintain safety in behavioral health communication. We propose a safety-aware, role-orchestrated multi-agent LLM framework designed
arxivApr 3
arXiv:2604.02244v1 Announce Type: cross Abstract: This is an extended version of our publication Learning state machines from data streams: A generic strategy and an improved heuristic, International Conference on Grammatical Inference (ICGI) 2023, Rabat, Morocco. It has been extended with a formal
arxivApr 3bullish
arXiv:2604.00137v1 Announce Type: new Abstract: Tool-integrated LLMs can retrieve, compute, and take real-world actions via external tools, but reliability remains a key bottleneck. We argue that failures stem from both tool-use accuracy (how well an agent invokes a tool) and intrinsic tool accuracy
arxivApr 1bullish
arXiv:2603.28336v2 Announce Type: replace Abstract: Systematic literature reviews in the social sciences overwhelmingly follow arborescent logics -- hierarchical keyword filtering, linear screening, and taxonomic classification -- that suppress the lateral connections, ruptures, and emergent pattern