Model Detail
Step-3.7-Flash-NVFP4
Step-3.7-Flash-NVFP4 is a code generation model with 51.9B parameters released by stepfun-ai. It is registered under the image-text-to-text pipeline tag on Hugging Face and distributed under the permissive apache-2.0 license.
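The listed total weight footprint is approximately 103.8 GB, which works out to about 2 bytes for each of the 51.9B parameters and is the starting figure when planning local-inference VRAM. Because the NVFP4 suffix suggests 4-bit quantized weights, the files actually downloaded may be smaller, so check the repository's file listing before sizing hardware. The apache-2.0 license allows commercial deployment and derivative work without per-seat fees, though attribution and notice requirements still apply.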
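The footprint arithmetic above can be sketched in a few lines. This is a rough estimate only, assuming decimal gigabytes (1 GB = 1e9 bytes) and counting weights alone; the helper name `weight_footprint_gb` and the 8-bit and 4-bit rows are illustrative assumptions, not figures published for this model. The estimate excludes quantization scale factors, the KV cache, and activations, all of which add to real VRAM use.

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper for illustration: counts raw weights only and
    ignores quantization scales, KV cache, and activation memory.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Parameter count taken from the model description above.
PARAMS = 51.9e9

# Bit widths below are assumptions chosen to show how the estimate scales.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_footprint_gb(PARAMS, bits):.1f} GB")
```

The 16-bit row reproduces the roughly 103.8 GB figure quoted above; the lower-precision rows show only how the same parameter count would scale, not the measured size of this checkpoint.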
Step-3.7-Flash-NVFP4 is best suited to code completion, repository-scale Q&A, and pair-programming integrations. It is a weaker fit for one-shot generation of security-critical code without review. Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.
Mira Murati steps back into the spotlight, carefully
In the current environment, remaining heads down has diminishing returns; at some point, you have to make some noise just to remind the market you exist.
REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak
arXiv:2605.20654v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) demonstrate remarkable capabilities, they remain susceptible to sophisticated, multi-step jailbreak attacks that circumvent conventional surface-level safety alignment by exploiting the internal generation p
Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway
arXiv:2606.05219v1 Announce Type: new Abstract: Recent analyses of multi-pathway Deep Linear Networks use Gradient Flow to predict a "winner-takes-all" specialization in which path symmetry breaks and each feature concentrates in a single pathway. In this work, we show that discrete Gradient Descent
Synthesize and Reward -- Reinforcement Learning for Multi-Step Tool Use in Live Environments
arXiv:2606.03892v2 Announce Type: replace-cross Abstract: Training LLMs to orchestrate multi-step tool calls is held back by three coupled obstacles: realistic stateful execution environments are costly to build, synthetic training queries are often detached from the server's actual state (so the ge
Let It Be Simple: One-Step Action Generation for Vision-Language-Action Models
arXiv:2606.05737v1 Announce Type: cross Abstract: Diffusion-based vision-language-action (VLA) models often inherit the image-generation view: actions are generated by iterative denoising. We argue that VLA action generation has a different condition-target structure: the policy is conditioned on ri
Fast and Robust Convergence Rate for TD(0) with Linear Function Approximation, Universal Learning Steps and I.I.D. Samples
arXiv:2606.05967v1 Announce Type: cross Abstract: In this paper, we study the finite-time behavior of the TD(0) temporal-difference method with linear function approximation (LFA). We consider on-policy independent and identically distributed (i.i.d.) samples, a constant learning step, and the Polya