Truncated Rectified Flow Policy for Reinforcement Learning with One-Step Sampling
arXiv:2604.09159v1 Announce Type: new Abstract: Maximum entropy reinforcement learning (MaxEnt RL) has become a standard framework for sequential decision making, yet its standard Gaussian policy parameterization is inherently unimodal, limiting its ability to model complex multimodal action distributions…
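The unimodality limitation this abstract points to can be seen with a minimal sketch (illustrative only, not the paper's method): a maximum-likelihood Gaussian fit to a bimodal action distribution centers between the two modes, so the fitted policy's most likely action is one the target almost never takes.

```python
import random
import statistics

# Toy bimodal "optimal action" distribution: two good actions near -1 and +1.
random.seed(0)
actions = [random.gauss(-1.0, 0.1) for _ in range(500)] + \
          [random.gauss(+1.0, 0.1) for _ in range(500)]

# A Gaussian policy fit by maximum likelihood collapses to the overall
# mean and standard deviation of the data.
mu = statistics.fmean(actions)
sigma = statistics.pstdev(actions)
print(f"fit: mu={mu:.2f}, sigma={sigma:.2f}")

# mu lands near 0, between the modes: the single-Gaussian policy's most
# likely action is one the bimodal target rarely takes, and sigma inflates
# to cover both modes at once.
```

This is exactly the failure mode that motivates multimodal policy classes such as flow-based policies.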
ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion
arXiv:2604.09450v1 Announce Type: cross Abstract: Chest X-ray report generation (CXR-RG) has the potential to substantially alleviate radiologists' workload. However, conventional autoregressive vision-language models (VLMs) suffer from high inference latency due to sequential token decoding. Diffusion…
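The latency argument in this abstract is easy to make concrete. A hypothetical sketch (dummy functions standing in for model forward passes, not ECHO's actual architecture): autoregressive decoding needs one forward pass per generated token, while a one-step block generator emits the whole sequence from a single pass.

```python
def autoregressive_decode(prompt, n_tokens, step_fn):
    """Generate n_tokens one at a time; each token costs a forward pass."""
    seq, calls = list(prompt), 0
    for _ in range(n_tokens):
        seq.append(step_fn(seq))  # forward pass conditioned on the prefix
        calls += 1
    return seq, calls

def one_step_decode(prompt, n_tokens, block_fn):
    """Emit all n_tokens from a single forward pass."""
    return list(prompt) + block_fn(prompt, n_tokens), 1

# Dummy stand-ins for the model: produce placeholder token ids.
step = lambda seq: len(seq)
block = lambda prompt, n: list(range(n))

_, ar_calls = autoregressive_decode([0], 128, step)
_, os_calls = one_step_decode([0], 128, block)
print(ar_calls, os_calls)  # 128 1
```

With per-call latency roughly constant, the sequential decoder's wall-clock time scales with report length, which is the gap one-step generation targets.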
Envisioning the Future, One Step at a Time
arXiv:2604.09527v1 Announce Type: cross Abstract: Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, simulate along extended interaction chains, and efficiently explore many plausible futures. Yet most existing approaches rely on dense video…
Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection
arXiv:2604.08800v1 Announce Type: cross Abstract: Stepping-stone intrusions (SSIs) are a prevalent network evasion technique in which attackers route sessions through chains of compromised intermediate hosts to obscure their origin. Effective SSI detection requires correlating the incoming and outgoing…
Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym
arXiv:2604.09338v1 Announce Type: new Abstract: Spatial reasoning is central to navigation and robotics, yet measuring model capabilities on these tasks remains difficult. Existing benchmarks evaluate models in a one-shot setting, requiring full solution generation in a single response, unlike human…
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
arXiv:2503.13551v5 Announce Type: replace Abstract: Recent studies show that Large Language Models (LLMs) achieve strong reasoning capabilities through supervised fine-tuning or reinforcement learning. However, a key approach, the Process Reward Model (PRM), suffers from reward hacking, making it un…