DataBubble
Published June 19, 2026 at 4:00 AM

Temporal Self-Imitation Learning

Source: arxiv.org
Publisher summary (verbatim)

arXiv:2606.19752v1 (announce type: cross)

Abstract: Long-horizon robot manipulation policies trained with reward shaping can still exploit dense rewards through inefficient interaction, while rare efficient behaviors may be forgotten during training. We argue that temporal efficiency itself provides a
