DataBubble
arxiv
Published April 6, 2026 at 4:00 AM
▼bearish

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

Source: arxiv.org
Publisher summary (verbatim)

arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments. Unlike chat systems, they maintain state across interactions and translate intermediate outputs into concrete actions. T


Mentioned models (7)
  • Claude Code
  • OpenClaw
  • IFlow
  • Qwen3-Coder
  • Kimi
  • GLM
  • DeepSeek
Tags (4)
#safety, #benchmark, #autonomous agents, #language models


Originally published on arxiv