Tag

#artificial-intelligence

86 articles tagged #artificial-intelligence

arxiv · 4d ago · bullish

DN-Hypo-Pipeline: An AI-Driven Workflow for Generating Hypotheses using Large Language Models and Scientific Explanations

arXiv:2606.08532v5 Announce Type: replace Abstract: Modern artificial intelligence excels at prediction but cannot explain. From large language models to AI-for-science systems, today's machines answer what by recombining patterns already present in the human literature, yet they cannot reason out w

TR · 1 model · #explanation #scientific-discovery #hypothesis-generation

arxiv · 4d ago

Inducing Comparability of Factorised Probability Distributions

arXiv:2607.20502v1 Announce Type: new Abstract: To allow for principled comparison between two probabilistic graphical models defined over non-identical variable sets, they have to be lifted to a common measurable space. To this end, we propose an extension scheme for any two given models and establ

#probabilistic-graphical-models #artificial-intelligence #measure-theory

arxiv · 4d ago

A Counterfactual Cause in Situation Calculus

arXiv:2501.06857v3 Announce Type: replace Abstract: Perhaps the most popular modern formulation of actual causality is the HP account by Halpern and Pearl. Recent advancement has focused on extension of HP account to lift its limited expressiveness, in particular, Batusov and Soutchanski proposed a

#artificial-intelligence #causality #formalism

arxiv · 4d ago · bullish

Generative Artificial Intelligence in Bioinformatics: A Systematic Review of Models, Applications, and Methodological Advances

arXiv:2511.03354v2 Announce Type: replace-cross Abstract: Generative artificial intelligence (GenAI) is transforming bioinformatics by advancing genomics, proteomics, transcriptomics, structural biology, and drug discovery. Following the Preferred Reporting Items for Systematic Reviews and Meta-Anal

#bioinformatics #genomics #artificial-intelligence

arxiv · 5d ago · bullish

You Live More Than Once: Towards Hierarchical Skill Meta-Evolving

arXiv:2605.28390v2 Announce Type: replace Abstract: Test-time skill evolving is regarded as a new paradigm for enhancing deployed agentic systems. Existing works mainly focus on hard-coded skill evolving strategies or parametric learning that rely on expensive parameter updates in the underlying LLM

#artificial-intelligence #continual-learning #meta-learning

arxiv · 6d ago

Skillware: A Software Ontology and Engineering Lifecycle for Persistent Behavioral Artifacts

arXiv:2607.18970v1 Announce Type: cross Abstract: Agent Skills have become persistent behavioral artifacts across independent AI agent systems. They combine natural-language task specifications with metadata and optional references, scripts, assets, hooks, package manifests, tests, and companion int

#software-engineering #artificial-intelligence #agent-systems

arxiv · Jul 13

iLENS: Interpretable LLM-Guided Mixture-of-Experts for Neuroimaging Survival Analysis

arXiv:2607.08778v1 Announce Type: cross Abstract: Alzheimer's Disease (AD) is a complex neurodegenerative disorder that continues to impact millions of people worldwide. Predicting AD conversion during the prodromal stage remains critical for disease understanding and patient care. As such, survival

IL · 1 model · #machine-learning #artificial-intelligence #healthcare

arxiv · Jul 10

Idiobionics: The Unification of Privacy and Intelligent Robotic Prostheses

arXiv:2607.07775v1 Announce Type: new Abstract: The human body is at the center of a growing family of technologies designed to tightly and persistently couple biological and digital systems. Robotic prostheses are a representative example of this tight coupling. Also referred to as bionic limbs, ro

#robotics #prosthetics #security

arxiv · Jul 10

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

arXiv:2607.07760v1 Announce Type: new Abstract: We outline an adversarial social epistemology (ASE) for densely interactive communicative landscapes in which public assertions are scaffolded by chains of testimony, inference, institutional certification, and tacit trust. In such landscapes, agents h

#social-networks #misinformation #artificial-intelligence

arxiv · Jul 10

Understanding Axes of Difficulty For Long Context Tasks Via PredicateLongBench

arXiv:2607.08284v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated rapidly improving long-context capabilities, prompting a wave of benchmarks designed to evaluate them. However, existing long-context evaluations - from Needle-in-a-Haystack (NIAH) tests to more recent mul

#benchmark #long-context #evaluation

arxiv · Jul 10 · bullish

Multi-agent Autoformalization of Tensor Network Theory

arXiv:2607.07857v1 Announce Type: cross Abstract: We build a team of specialized large language-model agents and present an agent-driven workflow for research-level formalization in theoretical physics, with the autoformalization of the fundamental theorem of matrix-product states as a demonstration

#autoformalization #quantum-physics #artificial-intelligence

arxiv · Jul 10 · bullish

A Vision Toward Energy-Efficient Domain-Specific Artificial Intelligence Models and Agents

arXiv:2510.22052v2 Announce Type: replace Abstract: The field of artificial intelligence (AI) has taken a tight hold on broad aspects of society, industry, business, and governance in ways that dictate the prosperity and might of the world's economies. The AI market size is projected to grow from {\

OP · 1 model · #artificial-intelligence #machine-learning #energy-efficiency

arxiv · Jul 3

When Should Service Agents Reconsider? Difficulty-Routed Control in Customer-Service Operations

arXiv:2607.01426v1 Announce Type: new Abstract: Autonomous customer-service agents are shifting from conversational interfaces toward operational execution roles: they retrieve firm records, apply service policies, and execute backend writes such as refunds, cancellations, exchanges, order modificat

#autonomous-agents #customer-service #service-control

arxiv · Jul 2

From Signals to Structure: How Memory Architecture Drives Language Emergence in LLM Agents

arXiv:2607.00233v1 Announce Type: new Abstract: How do two agents invent a shared language from scratch? In a Lewis signaling game, a sender and receiver must coordinate on a code using only their interaction history. We study five memory architectures across varying channel configurations with LLM

LL · 1 model · #language-invention #multiagent-systems #information-theory

arxiv · Jul 1

When Regulation Has Memory: Hysteresis and Control Burden in Artificial Agency

arXiv:2606.30975v1 Announce Type: new Abstract: Adaptive agents are usually judged by what they do, but an agent can appear stable while the internal effort required to keep it stable is increasing. This hidden regulatory burden matters for artificial agents operating under noise, delay, or changing

#artificial-intelligence #regulation #control-theory

arxiv · Jun 30

Situation Perception: A Necessary Primitive to Artificial Superintelligence

arXiv:2606.30481v1 Announce Type: cross Abstract: Current large language models are extraordinary statistical engines. They compress vast amounts of text into useful patterns and can explain science, write code, imitate reasoning, and participate in philosophical conversation. Yet pattern mastery is

#artificial-intelligence #language-models #superintelligence

arxiv · Jun 27

Agentic Analysis for Agentic Infrastructure: An LLM-Powered Pipeline for Comparative Governance of DAO and Corporate AI Protocols

arXiv:2606.26203v1 Announce Type: new Abstract: As AI agent protocols proliferate, the governance structures shaping their interoperability standards remain empirically underexamined. We introduce an LLM-powered comparative pipeline for large-scale governance discourse analysis, integrating automate

LL · 1 model · #governance #interoperability #artificial-intelligence

arxiv · Jun 27 · bullish

Event-Aware Instructed Assistant for Referring Video Segmentation

arXiv:2606.26994v1 Announce Type: cross Abstract: Existing referring video segmentation methods often treat a video as a single event consisting of multiple images, overlooking the fact that a video typically contains multiple distinct events. Under such a mechanism, the model needs to directly unde

EV · 1 model · #video-segmentation #computer-vision #artificial-intelligence

arxiv · Jun 27

Unbiased Canonical Set-Valued Oracles Via Lattice Theory

arXiv:2606.26418v1 Announce Type: new Abstract: A non-agentic "oracle" AI that estimates probabilities of future events faces a self-reference problem: once its answer is learned and acted upon, it can change the very probability it was asked to report. One response, advocated for the Scientist AI p

#artificial-intelligence #machine-learning #self-reference

arxiv · Jun 25 · bullish

Blockwise Policy-Drift Gating for On-Policy Distillation

arXiv:2606.24084v1 Announce Type: cross Abstract: On-policy distillation (OPD) trains a student policy using teacher signals computed on trajectories sampled by the student itself. Recent work shows that sampled-token OPD can be fragile on long-horizon reasoning tasks and that local teacher-support

#machine-learning #artificial-intelligence #computation

arxiv · Jun 20

DRFLOW: A Deep Research Benchmark for Personalized Workflow Prediction

arXiv:2606.18191v2 Announce Type: replace Abstract: Deep research (DR) systems are increasingly used for complex information-seeking tasks, but existing works mainly focus on generating reports and summaries. In contrast, many enterprise tasks instead require an agent to identify concrete workflows

DR · 1 model · #workflow #benchmark #personalization

arxiv · Jun 20

Human-AI Agent Interaction in a Business Context

arXiv:2606.18716v1 Announce Type: cross Abstract: As AI agents are increasingly integrated into core business processes, understanding and designing effective interaction patterns between humans and AI agents becomes crucial for value creation. This study identifies and evaluates principles and crit

#human-computer-interaction #artificial-intelligence #user-experience

arxiv · Jun 18 · bullish

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

arXiv:2606.01139v3 Announce Type: replace Abstract: Agent skills are procedural artifacts that enable LLM agents to execute workflows, verify constraints, and recover from failures. Existing self-evolving methods refine skills using accumulated trajectories. However, they struggle in cold-start sett

LL · 1 model · #procedural-knowledge #skill-refining #execution-grounded

arxiv · Jun 18

SWE-Future: Forecast-Conditioned Data Synthesis for Future-Oriented Software Engineering Agents

arXiv:2606.18733v1 Announce Type: cross Abstract: Realistic coding-agent benchmarks often replay public GitHub issues and pull requests, making them vulnerable to overlap with model pretraining, fine-tuning, synthetic-data generation, or benchmark-driven model selection. Fully synthetic tasks avoid

#software-engineering #artificial-intelligence #benchmark

arxiv · Jun 17

Talking to Your Data: Exploring Embodied Conversation as an Interface for Personal Health Reflection

arXiv:2606.17767v1 Announce Type: cross Abstract: Personal health data from wearables are typically presented through dashboards of charts and summary statistics, requiring users to actively interpret patterns and implications. We explore an alternative interaction paradigm: engaging with personal h

#health-data #human-computer-interaction #artificial-intelligence

arxiv · Jun 15

History of the Muddy Children Puzzle

arXiv:2606.13703v1 Announce Type: new Abstract: The Muddy Children Puzzle is a puzzle about knowledge and ignorance that has been inspiring for the development of epistemic logic. Who came up with it first? This is unclear. We trace the origin of the Muddy Children Puzzle through logical and literar

#logic #puzzle #artificial-intelligence

arxiv · Jun 12 · bullish

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility

arXiv:2606.13608v1 Announce Type: new Abstract: Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric harnesses that require heavy integration, create test-production mismatch, and limit fair comparison across diverse

#benchmark #evaluation #artificial-intelligence

arxiv · Jun 12

Order Is Not Control

arXiv:2606.12923v1 Announce Type: cross Abstract: AI alignment, interpretability, steering, and neural perturbation studies identify order-inducing objects. We argue that order is not control. Control requires a receiver-gated response law: a denominator-indexed operator mapping material state, acti

LL · 1 model · #machine-learning #artificial-intelligence #interpretability

arxiv · Jun 12

Position: Generative Engine Optimization Creates Underexamined Risks, Governance Must Target Concentration, Disclosure, and Academic Blind Spots

arXiv:2606.12439v1 Announce Type: cross Abstract: Large language model (LLM) answer engines are increasingly used for information seeking, shifting visibility from ranked lists to synthesized answers. This enables Generative Engine Optimization (GEO), which targets LLM answer engines' evidence pool

#optimization #governance #artificial-intelligence

arxiv · Jun 12

Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics

arXiv:2606.12476v1 Announce Type: cross Abstract: Token-level hallucination detectors are evaluated as classifiers, by AUC over all tokens, yet a streaming monitor is judged by its reaction time: the number of tokens that pass between the onset of a hallucination and the alarm. We formulate hallucin

#machine-learning #artificial-intelligence #natural-language-processing

arxiv · Jun 12 · bullish

Arbor: Tree Search as a Cognition Layer for Autonomous Agents

arXiv:2606.12563v1 Announce Type: new Abstract: Arbor is a multi-agent framework that introduces structured tree search as a cognition layer for autonomous agents operating in large, stateful action spaces. Prior autonomous optimization systems operate on isolated targets with stateless evaluation.

#autonomous-agents #optimization #artificial-intelligence

arxiv · Jun 12 · bullish

MDForge: Agentic Molecular Dynamics Pipeline Design under Sparse Simulator Feedback

arXiv:2606.12916v1 Announce Type: new Abstract: Molecular dynamics (MD) is the canonical in-silico method for atomistic molecular science, simulating molecular behavior from first-principle physics. Designing an MD pipeline for a new system requires substantial expert knowledge: running it on even o

MD · 1 model · #molecular-dynamics #llm #pipeline-automation

arxiv · Jun 11

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

arXiv:2604.25018v2 Announce Type: replace-cross Abstract: The Internet of Everything (IoE) represents an evolution of the Internet of Things (IoT) by integrating people, data, processes, and things into a unified intelligent ecosystem. IoE aims to enhance automation, decision-making, and service eff

#iot #emerging-technologies #artificial-intelligence

arxiv · Jun 10 · bullish

Regimes: An Auditable, Held-Out-Gated Improvement Loop Demonstrated on LongMemEval with ActiveGraph

arXiv:2606.10241v1 Announce Type: new Abstract: Autonomous improvement loops are hard to trust because the improvement process is usually external scaffolding bolted onto the agent: failures go unlogged, diagnoses cannot be replayed, and promote-or-discard decisions land in a side database rather th

AC · 1 model · #autonomous-improvement #event-sourced #auditable

arxiv · Jun 10 · bullish

Business World Model

arXiv:2606.10044v1 Announce Type: new Abstract: Businesses are increasingly adopting AI-enabled tools to improve productivity, reduce costs, and enhance products and services. However, the transformative potential of AI extends beyond automating predefined tasks: it lies in enabling intelligent syst

#artificial-intelligence #business #planning

arxiv · Jun 10 · bullish

RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty

arXiv:2602.12424v2 Announce Type: replace-cross Abstract: Benchmarks establish a standardized evaluation framework to systematically assess the performance of large language models (LLMs), facilitating objective comparisons and driving advancements in the field. However, existing benchmarks fail to

#evaluation #benchmark #language-models

arxiv · Jun 6

Fault tolerance estimation in digital circuits with visualised generative networks

arXiv:2605.15212v2 Announce Type: replace-cross Abstract: We propose a new numerical method to estimate the fault tolerance of failure modes in digital circuit structures with a generative network sampling technique. From a random input of generated bitwise configurations of ideally digitalised anal

GE · 1 model · #hardware #artificial-intelligence #circuit-design

arxiv · Jun 6 · bullish

Reformulating Neural Operators in $d+1$ Dimensions for Embedding Evolution

arXiv:2505.11766v4 Announce Type: replace-cross Abstract: Neural Operators (NOs) are powerful architectures for learning mappings between function spaces. While most advances focus on refining kernel parameterizations over the $d$-dimensional physical domain, the evolution of lifted embeddings remai

#machine-learning #artificial-intelligence #quantum-physics

arxiv · Jun 6 · bullish

MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action

arXiv:2606.06245v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) policies remain brittle in long-horizon and high-uncertainty control, where one-pass action decoding provides limited inference-time deliberation. Explicit chain-of-thought can increase reasoning depth, but introduces tok

MP · 1 model · #robotics #artificial-intelligence #research

arxiv · Jun 6

Entropy-Based Evaluation of AI Agents: A Lightweight Framework for Measuring Behavioral Patterns

arXiv:2606.05872v1 Announce Type: new Abstract: AI agents are commonly evaluated using task success, reward, latency, and cost. These metrics are useful, but they often miss important aspects of agent behavior: whether an agent explores too much, repeats itself too rigidly, uses tools effectively, r

#evaluation #metrics #artificial-intelligence

arxiv · Jun 5

Abduction Prover in Isabelle/HOL

arXiv:2606.04877v1 Announce Type: cross Abstract: Proof assistants based on expressive logics suffer limited automation for proof search, raising the cost of formal verification based on proof assistants. We address this problem by introducing the Abduction Prover for Isabelle/HOL. Given a challengi

#formal-verification #proof-assistants #artificial-intelligence

arxiv · Jun 1 · bearish

Multi-Agent Teams Hold Experts Back

arXiv:2602.01011v4 Announce Type: replace-cross Abstract: Multi-agent LLM systems are increasingly deployed as autonomous collaborators, where agents interact freely rather than execute fixed, pre-specified workflows. In such settings, effective coordination cannot be fully designed in advance and m

#multiagent-systems #artificial-intelligence #machine-learning

arxiv · May 29 · bullish

Extreme dynamic symmetry enables omnidirectional and multifunctional robots

arXiv:2605.29254v1 Announce Type: cross Abstract: Symmetry is a central organizing principle in natural systems, yet its use as a unifying design strategy in robotics has largely remained limited to geometric form. We show that symmetry can instead be leveraged at the level of dynamic actuation capa

#robotics #artificial-intelligence #research

arxiv · May 29

The Best of the Two Worlds: Harmonizing Semantic and Hash IDs for Sequential Recommendation

arXiv:2512.10388v2 Announce Type: replace-cross Abstract: Conventional Sequential Recommender Systems (SRS) typically assign unique hash IDs (HID) to construct item embeddings, which mainly capture collaborative signals from historical user-item interactions. However, such embeddings are vulnerable

#recommendation-systems #information-retrieval #artificial-intelligence

arxiv · May 28 · bullish

Learning Compositional Latent Structure with Vector Networks

arXiv:2605.28007v1 Announce Type: cross Abstract: Deep networks are powerful function approximators, but they typically store many different computations in shared weight matrices, making it difficult to selectively reuse or adapt parts of them when a familiar structure appears in novel combinations

VE · 1 model · #machine-learning #artificial-intelligence #neural-networks

arxiv · May 28 · bullish

Smaller, Younger, and More Impactful: How AI-Assisted Writing Transforms Research Teams

arXiv:2605.27404v1 Announce Type: cross Abstract: The era of Big Science has long been defined by increasingly large and specialized research teams pushing the frontiers of knowledge. However, recent advances in artificial intelligence (AI), particularly large language models (LLMs), are beginning t

#artificial-intelligence #research #academic-writing

arxiv · May 28 · bullish

LaneRoPE: Positional Encoding for Collaborative Parallel Reasoning and Generation

arXiv:2605.27570v1 Announce Type: new Abstract: Parallel LLM test-time scaling techniques (e.g., best-of-$N$) require drawing $N>1$ sequences conditioned on the same input prompt. These methods boost accuracy while exploiting the computational efficiency of batching $N$ generations. However, each se

LA · 1 model · #research #llm #parallel-processing

arxiv · May 27

Is Agent Memory a Database? Rethinking Data Foundations for Long-Term AI Agent Memory

arXiv:2605.26252v1 Announce Type: new Abstract: Long-running AI agents need persistent memory. Memory supports learning across sessions, reduces repeated context injection, and enables auditing of past decisions. Current agent memory systems and database paradigms treat memory as storage. They local

#artificial-intelligence #databases #memory-management

arxiv · May 22 · bullish

Towards Autonomous Mechanistic Reasoning in Virtual Cells

arXiv:2604.11661v3 Announce Type: replace-cross Abstract: Large language models (LLMs) have recently gained significant attention as a promising approach to accelerate scientific discovery. However, their application in open-ended scientific domains such as biology remains limited, primarily due to

#machine-learning #artificial-intelligence #biology

arxiv · May 21

EMO-BOOST: Emotion-Augmented Audio-Visual Features for Improved Generalization in Deepfake Detection

arXiv:2605.19630v1 Announce Type: new Abstract: With every advancement in generative AI models, forensics is under increasing pressure. The constant emergence of new generation techniques makes it impossible to collect data for each manipulation to train a deepfake detection model. Thus, generalizin

EMEM · 2 models · #deepfakes #detection #research

arxiv · May 16

MediaClaw: Multimodal Intelligent-Agent Platform Technical Report

arXiv:2605.14771v1 Announce Type: new Abstract: MediaClaw is a multimodal agent platform built on the OpenClaw ecosystem. Its core design follows a three-layer architecture of unified abstraction, pluginized extension, and workflow orchestration. The system is intended to address practical deploymen

#multimodal #architecture #artificial-intelligence

arxiv · May 16

Interestingness as an Inductive Heuristic for Future Compression Progress

arXiv:2605.14831v1 Announce Type: new Abstract: One of the bottlenecks on the way towards recursively self-improving systems is the challenge of interestingness: the ability to prospectively identify which tasks or data hold the potential for future progress. We formalize interestingness as an induc

#artificial-intelligence #machine-learning #complexity-theory

arxiv · May 16

Monitoring Data-aware Temporal Properties (Extended Version)

arXiv:2605.14666v1 Announce Type: new Abstract: Dynamic systems in AI are often complex and heterogeneous, so that an internal specification is not accessible and verification techniques such as model checking are not applicable. Monitoring is in such cases an attractive alternative, as it evaluates

#monitoring #verification #artificial-intelligence

arxiv · May 13 · bullish

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

arXiv:2605.09192v1 Announce Type: new Abstract: Agent skills can remarkably improve task success rates by using human-written procedural documents, but their quality is difficult to assess without environment-grounded verification. Existing skill generation methods heavily rely on preference logs ra

#artificial-intelligence #skill-generation #distillation

arxiv · May 11 · bullish

EviDep: Trustworthy Multimodal Depression Estimation via Disentangled Evidential Learning

arXiv:2604.16579v2 Announce Type: replace-cross Abstract: Automated multimodal depression estimation in unconstrained environments is inherently challenged by naturalistic noise and complex behavioral variability. Prevailing deterministic methods, however, produce uncalibrated point estimates withou

EV · 1 model · #machine-learning #artificial-intelligence #mental-health

arxiv · May 8

SkillRet: A Large-Scale Benchmark for Skill Retrieval in LLM Agents

arXiv:2605.05726v1 Announce Type: new Abstract: As LLM agents are increasingly deployed with large libraries of reusable skills, selecting the right skill for a user request has become a critical systems challenge. In small libraries, users may invoke skills explicitly by name, but this assumption b

#benchmark #llm #retrieval

arxiv · May 8 · bullish

GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Image Generation

arXiv:2605.06641v1 Announce Type: new Abstract: Developing ceramic glazes is a costly, time-consuming process of trial and error due to complex chemistry, placing a significant burden on independent artists. While recent advances in multimodal AI offer a modern solution, the field lacks the large-sc

#ceramic-glazes #material-design #artificial-intelligence

arxiv · May 8

BUILD-AND-FIND: An Effort-Aware Protocol for Evaluating Agent-Managed Codebases

arXiv:2605.06136v1 Announce Type: cross Abstract: Most coding-agent benchmarks ask whether generated code behaves correctly. That remains essential, but repository-level engineering is increasingly agent-managed: one agent writes a repository, and later agents inspect, audit, or extend it as working

#benchmark #software-engineering #artificial-intelligence

arxiv · May 8

Prediction and Empowerment: A Theory of Agency through Bridge Interfaces

arXiv:2605.06346v1 Announce Type: new Abstract: We study agency under partial observability in deterministic physical or simulated worlds, where apparent randomness arises from uncertainty over initial conditions, fixed law bits, and unrolled exogenous noise. We model sensing and actuation as bridge

#artificial-intelligence #research #deterministic-models

arxiv · May 8

Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning

arXiv:2603.18257v2 Announce Type: replace-cross Abstract: When an RL agent's observations contain distractors driven by the same confounders as its true state, observational data alone cannot identify which dimensions the agent controls. In our benchmarks, even state-conditioned observational select

SA · 1 model · #machine-learning #artificial-intelligence #reinforcement-learning

arxiv · May 8

Structural Instability of Feature Composition

arXiv:2605.05223v1 Announce Type: cross Abstract: Sparse Autoencoders (SAEs) have emerged as a powerful paradigm for disentangling feature superposition in transformer-based architectures, enabling precise control via activation steering. However, the theoretical foundations of compositional steerin

#machine-learning #artificial-intelligence #research

arxiv · Apr 30 · bullish

Delineating Knowledge Boundaries for Honest Large Vision-Language Models

arXiv:2604.26419v1 Announce Type: cross Abstract: Large Vision-Language Models (VLMs) have achieved remarkable multimodal performance yet remain prone to factual hallucinations, particularly in long-tail or specialized domains. Moreover, current models exhibit a weak capacity to refuse queries that

#computer-vision #artificial-intelligence #trustworthiness

arxiv · Apr 29

Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions

arXiv:2604.24293v1 Announce Type: cross Abstract: Graph neural ordinary differential equations (Graph ODEs) extend graph learning from discrete message-passing layers to continuous-time representation flows. While it supports adaptive long-range propagation, we show that Graph ODEs with strictly pos

GRHY · 2 models · #graph-learning #machine-learning #artificial-intelligence

arxiv · Apr 24 · bullish

The Last Harness You'll Ever Build

arXiv:2604.21003v1 Announce Type: new Abstract: AI agents are increasingly deployed on complex, domain-specific workflows -- navigating enterprise web applications that require dozens of clicks and form fills, orchestrating multi-step research pipelines that span search, extraction, and synthesis, a

#automation #meta-learning #artificial-intelligence

arxiv · Apr 24 · bearish

Brief chatbot interactions produce lasting changes in human moral values

arXiv:2604.21430v1 Announce Type: new Abstract: Moral judgements form the foundation of human social behavior and societal systems. While Artificial Intelligence chatbots increasingly serve as personal advisors, their influence on moral judgments remains largely unexplored. Here, we examined whether

#artificial-intelligence #ethics #manipulation

arxiv · Apr 24

Post-AGI Economies: Autonomy and the First Fundamental Theorem of Welfare Economics

arXiv:2604.21216v1 Announce Type: cross Abstract: The First Fundamental Theorem of Welfare Economics assumes that welfare-bearing agents are autonomous and implicitly relies on a binary distinction between autonomy and instrumentality. Welfare subjects are those who have autonomy and therefore the c

#economics #autonomy #artificial-intelligence

arxiv · Apr 24

Reasoning on the Manifold: Bidirectional Consistency for Self-Verification in Diffusion Language Models

arXiv:2604.16565v2 Announce Type: replace-cross Abstract: While Diffusion Large Language Models (dLLMs) offer structural advantages for global planning, efficiently verifying that they arrive at correct answers via valid reasoning traces remains a critical challenge. In this work, we propose a geome

#machine-learning #artificial-intelligence #research

arxiv · Apr 24

Formalising the Logit Shift Induced by LoRA: A Technical Note

arXiv:2604.20313v1 Announce Type: new Abstract: This technical note provides a first-order formalisation of the logit shift and fact-margin change induced by Low-Rank Adaptation (LoRA). Using a first-order Fréchet approximation around the base model trajectory, we show that the multi-layer LoRA ef

LO1 model #machine-learning #artificial-intelligence #research Read on arxiv →

arxiv Apr 24 bullish

GS-Quant: Granular Semantic and Generative Structural Quantization for Knowledge Graph Completion

arXiv:2604.21649v1 Announce Type: new Abstract: Large Language Models (LLMs) have shown immense potential in Knowledge Graph Completion (KGC), yet bridging the modality gap between continuous graph embeddings and discrete LLM tokens remains a critical challenge. While recent quantization-based appro

#knowledge-graph #natural-language-processing #quantization Read on arxiv →

arxiv Apr 23 bullish

DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format

arXiv:2511.17265v2 Announce Type: replace-cross Abstract: Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defense. Massive-scale AI models, which mimic th

#hardware #artificial-intelligence #edge-computing Read on arxiv →

arxiv Apr 21

Experience Compression Spectrum: Unifying Memory, Skills, and Rules in LLM Agents

arXiv:2604.15877v1 Announce Type: new Abstract: As LLM agents scale to long-horizon, multi-session deployments, efficiently managing accumulated experience becomes a critical bottleneck. Agent memory systems and agent skill discovery both address this challenge -- extracting reusable knowledge from

#artificial-intelligence #multiagent-systems #knowledge-management Read on arxiv →

arxiv Apr 17

Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs

arXiv:2509.05367v4 Announce Type: replace-cross Abstract: Large Language Model safety alignment predominantly operates on a binary assumption that requests are either safe or unsafe. This classification proves insufficient when models encounter ethical dilemmas, where the capacity to reason through

#safety #security #cryptography Read on arxiv →

arxiv Apr 17

Integration of Deep Reinforcement Learning and Agent-based Simulation to Explore Strategies Counteracting Information Disorder

arXiv:2604.13047v1 Announce Type: cross Abstract: In recent years, the spread of fake news has triggered a growing interest in Information Disorders (ID) on social media, a phenomenon that has become a focal point of research across fields ranging from complexity theory and computer science to cogni

AG DE 2 models #misinformation #social-simulation #artificial-intelligence Read on arxiv →

arxiv Apr 16 bullish

Public Profile Matters: A Scalable Integrated Approach to Recommend Citations in the Wild

arXiv:2603.17361v2 Announce Type: replace-cross Abstract: Proper citation of relevant literature is essential for contextualising and validating scientific contributions. While current citation recommendation systems leverage local and global textual information, they often overlook the nuances of t

PR DA 2 models #information-retrieval #citation-recommendation #artificial-intelligence Read on arxiv →

arxiv Apr 16

PrivacyReasoner: Can LLM Emulate a Human-like Privacy Mind?

arXiv:2601.09152v2 Announce Type: replace Abstract: Prior work on LLM-based privacy focuses on norm judgment over synthetic vignettes, rather than how people think about a specific data practice and formulate their opinions. We address this gap by designing PrivacyReasoner, an agent architecture gro

LL PR 2 models #privacy #llm #artificial-intelligence Read on arxiv →

arxiv Apr 16

Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference

arXiv:2604.12168v1 Announce Type: cross Abstract: The applications of Generative Artificial Intelligence (GenAI) and their intersections with data-driven fields, such as healthcare, finance, transportation, and information security, have led to significant improvements in service efficiency and low

DE 1 model #security #cryptography #homomorphic-encryption Read on arxiv →

arxiv Apr 16 bullish

Human-Centric Topic Modeling with Goal-Prompted Contrastive Learning and Optimal Transport

arXiv:2604.12663v1 Announce Type: new Abstract: Existing topic modeling methods, from LDA to recent neural and LLM-based approaches, which focus mainly on statistical coherence, often produce redundant or off-target topics that miss the user's underlying intent. We introduce Human-centric Topic Mode

GC LL 2 models #topic-modeling #natural-language-processing #artificial-intelligence Read on arxiv →

arxiv Apr 16

Efficiency of Proportional Mechanisms in Online Auto-Bidding Advertising

arXiv:2604.12799v1 Announce Type: cross Abstract: The rise of automated bidding strategies in online advertising presents new challenges in designing and analyzing efficient auction mechanisms. In this paper, we focus on proportional mechanisms within the context of auto-bidding and study the effici

#auction-mechanisms #game-theory #artificial-intelligence Read on arxiv →

arxiv Apr 14 bullish

Active Inference with a Self-Prior in the Mirror-Mark Task

arXiv:2604.09673v1 Announce Type: cross Abstract: The mirror self-recognition test evaluates whether a subject touches a mark on its own body that is visible only in a mirror, and is widely used as an indicator of self-awareness. In this study, we present a computational model in which this behavior

TR 1 model #self-awareness #machine-learning #artificial-intelligence Read on arxiv →

arxiv Apr 14

Seven simple steps for log analysis in AI systems

arXiv:2604.09563v1 Announce Type: new Abstract: AI systems produce large volumes of logs as they interact with tools and users. Analysing these logs can help understand model capabilities, propensities, and behaviours, or assess whether an evaluation worked as intended. Researchers have started deve

#log-analysis #research #artificial-intelligence Read on arxiv →

arxiv Apr 13 bullish

Memory Intelligence Agent

arXiv:2604.04503v3 Announce Type: replace Abstract: Deep research agents (DRAs) integrate LLM reasoning with external tools. Memory systems enable DRAs to leverage historical experiences, which are essential for efficient reasoning and autonomous evolution. Existing methods rely on retrieving simila

ME 1 model #artificial-intelligence #multiagent-systems #memory-intelligence Read on arxiv →

arxiv Apr 13 bullish

Litmus (Re)Agent: A Benchmark and Agentic System for Predictive Evaluation of Multilingual Models

arXiv:2604.08970v1 Announce Type: cross Abstract: We study predictive multilingual evaluation: estimating how well a model will perform on a task in a target language when direct benchmark results are missing. This problem is common in multilingual deployment, where evaluation coverage is sparse and

LI 1 model #multilingual #evaluation #benchmark Read on arxiv →

arxiv Apr 10 bullish

The Art of Building Verifiers for Computer Use Agents

arXiv:2604.06240v1 Announce Type: cross Abstract: Verifying the success of computer use agent (CUA) trajectories is a critical challenge: without reliable verification, neither evaluation nor training signal can be trusted. In this paper, we present lessons learned from building a best-in-class veri

UN WE WE 3 models #verification #evaluation #artificial-intelligence Read on arxiv →

arxiv Apr 10 bullish

Information as Structural Alignment: A Dynamical Theory of Continual Learning

arXiv:2604.07108v1 Announce Type: cross Abstract: Catastrophic forgetting is not an engineering failure. It is a mathematical consequence of storing knowledge as global parameter superposition. Existing methods, such as regularization, replay, and frozen subnetworks, add external mechanisms to a sha

ST GO 2 models #continual-learning #catastrophic-forgetting #machine-learning Read on arxiv →

arxiv Apr 7 bullish

Agentization of Digital Assets for the Agentic Web: Concepts, Techniques, and Benchmark

arXiv:2604.04226v1 Announce Type: cross Abstract: Agentic Web, as a new paradigm that redefines the internet through autonomous, goal-driven interactions, plays an important role in group intelligence. As the foundational semantic primitives of the Agentic Web, digital assets encapsulate interactive

#multiagent #artificial-intelligence #benchmark Read on arxiv →

arxiv Apr 3 bullish

Prompt-Guided Prefiltering for VLM Image Compression

arXiv:2604.00314v1 Announce Type: cross Abstract: The rapid progress of large Vision-Language Models (VLMs) has enabled a wide range of applications, such as image understanding and Visual Question Answering (VQA). Query images are often uploaded to the cloud, where VLMs are typically hosted, hence

#image-compression #vision-language #efficiency Read on arxiv →