arxiv · Jul 18 · bullish

Reachability-Aware Pretraining for Efficient Target-Oriented Path Exploration in Temporal Knowledge Graph Reasoning

arXiv:2607.14886v1 Announce Type: new Abstract: Temporal Knowledge Graph (TKG) reasoning under the extrapolation setting focuses on forecasting future time-stamped events (facts) from historical data in a temporal knowledge graph. Existing approaches, reinforcement learning (RL)-based multi-hop reas

RA · 1 model #temporal-knowledge-graph #reinforcement-learning #pretraining

arxiv · Jul 14

Interpreting Latent CoT Reasoning as Dynamical Systems

arXiv:2607.09698v1 Announce Type: new Abstract: Recent latent reasoning methods, such as CODI and COCONUT, face a fundamental interpretability problem: they maintain multiple superimposed candidate traces in the hidden space at each step, unlike explicit CoT, which follows a single transparent reas

COCOCO · 4 models · +1 #interpretability #reasoning #dynamical-systems

arxiv · Jul 2 · bullish

Efficient Multilingual Reasoning Transfer via Progressive Code-Switching

arXiv:2607.00485v1 Announce Type: new Abstract: Large reasoning models (LRMs) have achieved strong reasoning capabilities in English, yet their performance degrades significantly when required to reason in other languages. A natural solution is to transfer the model's English reasoning ability to ta

#language-models #transfer-learning #reasoning

arxiv · Jul 2 · bullish

Graph-Native Reinforcement Learning Enables Traceable Scientific Hypothesis Generation through Conceptual Recombination

arXiv:2607.00924v1 Announce Type: new Abstract: Accelerating materials discovery requires AI systems that can generate scientifically valid hypotheses through multi-step, domain-grounded reasoning. Standard large language models often produce fluent but weakly traceable responses to open-ended mater

GR · 1 model #materials-science #graph-native #reasoning

arxiv · Jul 1

When Does Learning to Stop Help? A Cost-Aware Study of Early Exits in Reasoning Models

arXiv:2606.30852v1 Announce Type: new Abstract: Reasoning models spend different amounts of useful computation across instances, but it remains unclear when a learned stopping rule improves over simple confidence or convergence thresholds. We study this question with LearnStop, a hidden-state-free c

QW · 1 model #reasoning #language-models #stopping-rules

arxiv · Jun 12 · bullish

LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning

arXiv:2604.27960v2 Announce Type: replace Abstract: Recent large language models (LLMs) have achieved impressive reasoning milestones but continue to struggle with high computational costs, logical inconsistencies, and sharp performance degradation on high-complexity problems. While neuro-symbolic m

LL · 1 model #neuro-symbolic #reasoning #nonmonotonic

arxiv · Jun 6 · bullish

ReTreVal: Reasoning Tree with Validation and Cross-Problem Memory for Large Language Models

arXiv:2601.02880v2 Announce Type: replace Abstract: Every existing inference-time reasoning framework discards all failure context at problem boundaries, leaving a model solving problem 500 no wiser than it was on problem 1. We present ReTreVal (Reasoning Tree with Validation), a training-free frame

REZESE · 3 models #inference-time #reasoning #llm

arxiv · Jun 6 · bullish

Toward Culturally Aligned LLMs through Ontology-Guided Multi-Agent Reasoning

arXiv:2601.21700v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) increasingly support culturally sensitive decision making, yet often exhibit misalignment due to skewed pretraining data and the absence of structured value representations. Existing methods can steer outputs, but

#ontology #multimodal #cultural-sensitivity

arxiv · May 29 · bullish

CosmicFish-HRM: Adaptive Reasoning via Hierarchical Recurrent Mechanisms in Compact Language Models

arXiv:2605.28919v1 Announce Type: cross Abstract: Large language models have achieved strong reasoning capabilities, though often at the cost of massive parameter counts and expensive inference. In this work, we explore a different direction: adaptive reasoning depth in compact language models. We p

CO · 1 model #compact-models #reasoning #autoregressive

arxiv · May 29 · bullish

Unlocking the Working Memory of Large Language Models for Latent Reasoning

arXiv:2605.30343v1 Announce Type: cross Abstract: To improve the reasoning capabilities of large language models, test-time compute is typically scaled by generating intermediate tokens before the final answer. However, this couples reasoning to autoregressive generation and thereby conflates intern

#reasoning #language-models #working-memory

arxiv · May 29 · bearish

The Price Reversal Phenomenon: When Cheaper Reasoning Models Cost More

arXiv:2603.23971v2 Announce Type: replace-cross Abstract: Developers and consumers increasingly choose reasoning models (RMs) based on their listed API prices. However, how accurately do these prices reflect actual inference costs? We conduct the first systematic study of this question, evaluating 8

GEGP · 2 models #cost #pricing #reasoning

arxiv · May 26

Residual Drift Dominates Contradiction in Multi-Turn Constraint Reasoning

arXiv:2605.23940v1 Announce Type: new Abstract: How do multi-turn reasoning systems fail? The expected answer is logical contradiction, in which the system's maintained state becomes unsatisfiable. We show that the dominant mode is instead satisfiable drift, where the internal state stays consistent

MU · 1 model #reasoning #benchmark #multi-turn

arxiv · May 25 · bullish

Scaling-Aware Adapter for Structure-Grounded LLM Reasoning

arXiv:2602.02780v3 Announce Type: replace Abstract: Large language models (LLMs) are enabling reasoning over 2D and 3D structures, yet existing methods remain modality-specific and typically compress structural inputs through sequence-based tokenization or fixed-length query connectors. Such archite

CU · 1 model #large-language-models #multimodal #reasoning

arxiv · May 22 · bullish

DeFacto: Counterfactual Thinking with Images for Enforcing Evidence-Grounded and Faithful Reasoning

arXiv:2509.20912v4 Announce Type: replace Abstract: Recent advances in multimodal language models (MLLMs) have made thinking with images a dominant paradigm for multimodal reasoning. However, existing methods still fail to ensure evidence-answer consistency, where correct answers must be supported b

#multimodal #reasoning #counterfactual

arxiv · May 13 · bullish

LLM-Guided Monte Carlo Tree Search over Knowledge Graphs: Composing Mechanistic Explanations for Drug-Disease Pairs

arXiv:2605.09542v1 Announce Type: new Abstract: Extracting multi-step explanations from knowledge graphs poses a combinatorial challenge requiring both heuristic guidance (as candidates proliferate with depth) and credit assignment (as path quality emerges over extended sequences). Frontier LLMs, st

FRTE · 2 models #neuro-symbolic #knowledge-graphs #reasoning

arxiv · May 13 · bullish

Drop the Act: Probe-Filtered RL for Faithful Chain-of-Thought Reasoning

arXiv:2605.11467v1 Announce Type: new Abstract: Reasoning models post-hoc rationalize answers they have already committed to internally, producing chains of *reasoning theater*: deliberative-looking steps that contribute nothing to correctness. This wastes inference tokens, pollutes interpretability

MEQWCL · 3 models #reasoning #reinforcement-learning #interpretability

arxiv · May 5

Evaluating Legal Reasoning Traces with Legal Issue Tree Rubrics

arXiv:2512.01020v2 Announce Type: replace Abstract: Evaluating the quality of LLM-generated reasoning traces in expert domains (e.g., law) is essential for ensuring credibility and explainability, yet remains challenging due to the inherent complexity of such reasoning tasks. We introduce LEGIT (LEG

LL · 1 model #evaluation #reasoning #legal

arxiv · Apr 21 · bullish

Revisiting Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

arXiv:2510.10959v3 Announce Type: replace-cross Abstract: Reasoning ability has become a defining capability of Large Language Models (LLMs), with Reinforcement Learning with Verifiable Rewards (RLVR) emerging as a key paradigm to enhance it. However, RLVR training often suffers from policy entropy

LA · 1 model #machine-learning #reasoning #reinforcement-learning

arxiv · Apr 17 · bullish

Training-Free Test-Time Contrastive Learning for Large Language Models

arXiv:2604.13552v1 Announce Type: cross Abstract: Large language models (LLMs) demonstrate strong reasoning capabilities, but their performance often degrades under distribution shift. Existing test-time adaptation (TTA) methods rely on gradient-based updates that require white-box access and need s

#adaptation #reasoning #language-models

arxiv · Apr 8 · bullish

MMEmb-R1: Reasoning-Enhanced Multimodal Embedding with Pair-Aware Selection and Adaptive Control

arXiv:2604.06156v1 Announce Type: cross Abstract: MLLMs have been successfully applied to multimodal embedding tasks, yet their generative reasoning capabilities remain underutilized. Directly incorporating chain-of-thought reasoning into embedding learning introduces two fundamental challenges. Fir

MM · 1 model #multimodal-embedding #reasoning #computer-vision

arxiv · Apr 7 · bullish

TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization

arXiv:2601.22776v2 Announce Type: replace Abstract: Multi-turn tool-integrated reasoning enables Large Language Models (LLMs) to solve complex tasks through iterative information retrieval. However, current reinforcement learning (RL) frameworks for search-augmented reasoning predominantly rely on s

QWQW · 2 models #reinforcement-learning #large-language-models #reasoning