DataBubble

Model Detail

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Provider: NVIDIA
Category: code
Pipeline: text-generation
Parameters: 30B

DB Score: 3.7
Downloads: 1.1M
Likes: 787
Day: +0.0%
Week: +0.0%
Month: +0.0%

Overview

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a 30B-parameter code generation model released by NVIDIA. It is registered under the text-generation pipeline tag on Hugging Face, takes text input and produces text output (text->text), and is distributed under a license listed as "other".

Pricing & Throughput

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is priced at $0.05/M input tokens and $0.20/M output tokens. It offers a 256K-token context window, which matters when sizing it for prompt-heavy workloads. At this input rate the model sits in the commodity tier and suits high-volume workloads where per-call cost dominates the decision.
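As a rough illustration of how the listed rates translate into a budget, the sketch below estimates cost from token counts. The function names and the batch sizes are hypothetical, the rates are copied from this listing and may change, and the reading of "256K" as 256 * 1024 tokens shared between prompt and completion is an assumption rather than something the listing states.

```python
# Rates copied from the listing above; they may change, so treat them as inputs.
INPUT_USD_PER_M_TOKENS = 0.05
OUTPUT_USD_PER_M_TOKENS = 0.20

# Assumption: "256K" means 256 * 1024 tokens; the listing does not spell this out.
CONTEXT_WINDOW_TOKENS = 256 * 1024


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M_TOKENS
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M_TOKENS
    return input_cost + output_cost


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check that prompt plus completion stay within the context window.

    Assumes the window is shared by input and output tokens.
    """
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


# Hypothetical batch: 10,000 requests, 2,000 input and 500 output tokens each.
requests = 10_000
batch_cost = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"Estimated batch cost: ${batch_cost:.2f}")
```

For the hypothetical batch above this prints an estimate of $2.00: 20M input tokens and 5M output tokens each contribute about $1.00. Real bills depend on the provider's tokenizer and any minimums or discounts, which this sketch ignores.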

Technical

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 ships with 30B parameters. The listed total weight footprint is approximately 31.6 GB, which is the relevant figure when planning local-inference VRAM. The license is listed as "other", meaning a non-standard license; review the exact terms before commercial deployment.

Use Cases

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is best suited to three kinds of work: code completion, repository-scale Q&A, and pair-programming integrations; high-volume batch jobs where per-call cost dominates the budget; and long-context tasks such as full-codebase analysis or book-length summarization (up to 256K tokens). It is a less obvious choice for one-shot generation of security-critical code without review. Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.

Pricing

Input ($/M tokens): $0.05
Output ($/M tokens): $0.2
Context Window: 256K

Research Paper

arXiv: 2512.20848

Model Info

License: other
Modality: text->text
Citations: 56 (7 influential)
