Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
arXiv:2503.13551v5 Announce Type: replace Abstract: Recent studies show that Large Language Models (LLMs) achieve strong reasoning capabilities through supervised fine-tuning or reinforcement learning. However, a key approach, the Process Reward Model (PRM), suffers from reward hacking, making it un…
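The distinction the abstract draws on can be sketched in a few lines: a process reward model scores every intermediate reasoning step, while an outcome reward model only checks the final answer. The step scorer below is a toy heuristic stand-in for a learned PRM; all names are illustrative, not the paper's method.

```python
# Toy contrast between process-level and outcome-level reward.
# A real PRM is a trained model; this heuristic scorer is a placeholder.

def toy_step_scorer(step: str) -> float:
    # Hypothetical heuristic: reward steps that state an equation.
    return 1.0 if "=" in step else 0.2

def process_reward(steps):
    """PRM-style: score every intermediate step, then average."""
    scores = [toy_step_scorer(s) for s in steps]
    return sum(scores) / len(scores)

def outcome_reward(final_answer, target):
    """ORM-style: only the final answer is checked."""
    return 1.0 if final_answer == target else 0.0

steps = ["Let x be the unknown.", "2x + 3 = 7", "x = 2"]
print(process_reward(steps))        # average of per-step scores
print(outcome_reward("2", "2"))     # 1.0
```

Reward hacking in this setting means the policy learns to maximize the step scorer (e.g. emitting many "="-laden steps) without actually reaching correct answers.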
SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking
arXiv:2604.07922v1 Announce Type: cross Abstract: Large Reasoning Models (LRMs) have revolutionized complex problem-solving, yet they exhibit a pervasive "overthinking", generating unnecessarily long reasoning chains. While current solutions improve token efficiency, they often sacrifice fine-graine…
TraceSafe: A Systematic Assessment of LLM Guardrails on Multi-Step Tool-Calling Trajectories
arXiv:2604.07223v1 Announce Type: cross Abstract: As large language models (LLMs) evolve from static chatbots into autonomous agents, the primary vulnerability surface shifts from final outputs to intermediate execution traces. While safety guardrails are well-benchmarked for natural language respon…
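The shift the abstract describes, from screening final outputs to screening intermediate execution traces, amounts to running a check over every tool call in the trajectory rather than over the last message alone. The sketch below illustrates that idea with a hypothetical blocklist-based check; the names are not the TraceSafe benchmark's API.

```python
# Hedged sketch: a trajectory-level guardrail inspects each tool call in
# an agent's execution trace, not just the final response. Illustrative only.

BLOCKED_TOOLS = {"shell.exec", "email.send"}  # hypothetical policy

def check_trajectory(trace):
    """Return (ok, reason) for a list of {"tool": ..., "args": ...} calls."""
    for i, call in enumerate(trace):
        if call["tool"] in BLOCKED_TOOLS:
            return False, f"step {i}: blocked tool {call['tool']}"
    return True, "all steps passed"

trace = [
    {"tool": "search.web", "args": {"q": "weather"}},
    {"tool": "shell.exec", "args": {"cmd": "rm -rf /tmp/x"}},
]
print(check_trajectory(trace))  # flags step 1 before it would execute
```

An output-only guardrail would see the agent's final summary and miss the dangerous intermediate call entirely, which is the gap trajectory-level assessment targets.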
On the Step Length Confounding in LLM Reasoning Data Selection
arXiv:2604.06834v1 Announce Type: cross Abstract: Large reasoning models have recently demonstrated strong performance on complex tasks that require long chain-of-thought reasoning, through supervised fine-tuning on large-scale and high-quality datasets. To construct such datasets, existing pipeline…
Reasoning Fails Where Step Flow Breaks
arXiv:2604.06695v1 Announce Type: new Abstract: Large reasoning models (LRMs) that generate long chains of thought now perform well on multi-step math, science, and coding tasks. However, their behavior is still unstable and hard to interpret, and existing analysis tools struggle with such long, str…
The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs?
arXiv:2604.06192v1 Announce Type: cross Abstract: Recent work uses entropy-based signals at multiple representation levels to study reasoning in large language models, but the field remains largely empirical. A central unresolved puzzle is why internal entropy dynamics, defined under the predictive…
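The quantity such entropy-based analyses start from is the Shannon entropy of the model's next-token predictive distribution, H(p) = -Σᵢ pᵢ log pᵢ. The toy distributions below stand in for real model logits; they only illustrate how confident versus uncertain predictions separate under this measure.

```python
# Shannon entropy of a next-token predictive distribution (toy values,
# not real model outputs).
import math

def entropy(probs):
    """H(p) = -sum_i p_i * log(p_i), in nats; zero-probability terms drop out."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

confident = [0.97, 0.01, 0.01, 0.01]   # model is near-certain of one token
uncertain = [0.25, 0.25, 0.25, 0.25]   # model is maximally unsure over 4 tokens

print(entropy(confident))   # low entropy
print(entropy(uncertain))   # log(4) ≈ 1.386, the maximum for 4 outcomes
```

Tracking this value token by token over a generated chain of thought yields the "entropy dynamics" whose correlation with reasoning quality the abstract asks about.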