DataBubble

Model Detail

Gemma-4-31B-IT-NVFP4

▼ 0.3%

Provider: NVIDIA · Category: llm · Pipeline: text-generation · Parameters: 31B

DB Score: 1.8

Downloads: 2.3M

Likes: 505

Day: -0.3%

Week: +0.0%

Month: +66.7%

Overview

Gemma-4-31B-IT-NVFP4 is a large language model with 31B parameters released by NVIDIA. The model is registered under the text-generation pipeline tag on Hugging Face and is distributed under a license tagged "other".

Technical

Gemma-4-31B-IT-NVFP4 ships with 31B parameters. The total weight footprint is approximately 20.9 GB, which is the baseline figure when planning local-inference VRAM; activations and the KV cache need additional memory on top of it. Distribution is governed by a license tagged "other", so review the exact terms before commercial deployment.
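As a rough planning aid, the two figures quoted here (31B parameters, about 20.9 GB of weights) can be turned into an average bits-per-parameter number and a padded VRAM estimate. The sketch below is illustrative only: the function names and the 20% runtime overhead are assumptions, not values published for this model, and real headroom depends on context length, batch size, and the serving stack.

```python
# Rough sizing sketch. The 20% overhead default is an assumed placeholder,
# not a measured value for this model.

def bits_per_parameter(weight_gb: float, params_billion: float) -> float:
    """Average stored bits per parameter, treating 1 GB as 10**9 bytes."""
    total_bits = weight_gb * 1e9 * 8
    return total_bits / (params_billion * 1e9)


def estimated_vram_gb(weight_gb: float, overhead_fraction: float = 0.2) -> float:
    """Weights plus an assumed fractional allowance for KV cache and activations."""
    return weight_gb * (1.0 + overhead_fraction)


if __name__ == "__main__":
    avg_bits = bits_per_parameter(20.9, 31)
    print(f"average bits per parameter: {avg_bits:.2f}")
    print(f"VRAM with assumed 20% overhead: {estimated_vram_gb(20.9):.1f} GB")
```

With these inputs the average comes out a little above 5 bits per parameter rather than exactly 4. That would be consistent with quantization scale factors and some tensors being stored at higher precision, though the page does not break the footprint down.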

Trending Signal

Downloads of Gemma-4-31B-IT-NVFP4 moved -0.3% over the past 24 hours, +0.0% over the past week, and +66.7% over the trailing thirty days. That is a marginal day-over-day dip against a strong monthly rise, so the short-term softness looks more like noise than sustained cooling. These numbers are a signal, not a guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
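One way to read the day, week, and month figures side by side is to label each window separately instead of collapsing them into a single verdict. The sketch below is a hypothetical helper, not the site's scoring method: `pct_change`, `describe_trend`, and the 1-point "flat" band are all assumptions chosen for illustration.

```python
# Illustrative only: the flat_band threshold is an assumed value,
# not how DataBubble computes its trend labels.

def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


def describe_trend(day_pct: float, week_pct: float, month_pct: float,
                   flat_band: float = 1.0) -> dict:
    """Label each window as 'up', 'down', or 'flat' within +/- flat_band points."""
    def label(pct: float) -> str:
        if abs(pct) <= flat_band:
            return "flat"
        return "up" if pct > 0 else "down"

    return {"day": label(day_pct), "week": label(week_pct), "month": label(month_pct)}


if __name__ == "__main__":
    # Figures as shown on this page.
    print(describe_trend(-0.3, 0.0, 66.7))
```

Under that assumed band, the daily and weekly moves both fall within noise, and only the monthly window registers as a clear change.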

Use Cases

Gemma-4-31B-IT-NVFP4 is best suited to general-purpose chat and instruction-following workloads. Treat this as a starting point rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.

Research Paper

arXiv: 2403.08295

Model Info

License: other

Citations: 1,088 (132 influential)

Recent news

Trivial Prompt Reframing Bypasses Safety Guardrails in Google's MedGemma-4B

arXiv:2607.09804v1 Announce Type: cross Abstract: Open-weight medical language models are increasingly used as the base of patient-facing and clinician-support applications. Their model cards prohibit specific behaviors -- recommending exact drug dosages, issuing definitive diagnoses, prescribing tr

huggingface · 20d ago

Hugging Face and Cerebras bring Gemma 4 to real-time voice AI

arxiv · 21d ago

DistilledGemma: Balanced Efficiency-Accuracy for Person-Place Relation Extraction from Multilingual Historical Articles

arXiv:2606.29130v1 Announce Type: new Abstract: We present DistilledGemma, an efficient and accurate system for the HIPE-2026 shared task on person-place relation extraction from multilingual historical newspaper articles in English, German, and French. Our approach adopts a three-stage knowledge di

arxiv · neutral · 31d ago

How Transparent is DiffusionGemma?

arXiv:2606.20560v1 Announce Type: cross Abstract: LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a continuous

arxiv · 36d ago

Neither Parallel Nor Sequential: How DiffusionGemma Actually Commits Tokens

arXiv:2606.14620v1 Announce Type: new Abstract: Open diffusion language models are marketed as parallel, non-autoregressive decoders, yet the order in which a shipped checkpoint actually commits its tokens is almost never measured. We instrument DiffusionGemma 26B, a masked discrete-diffusion mixtur

arxiv · 48d ago

Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines

arXiv:2605.25645v2 Announce Type: replace-cross Abstract: We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical comparison of TPU and GPU platforms for large language model adaptation. Using LoRA on a Google TPU v5

Related Models

Qwen3.6-35B-A3B-NVFP4

NVIDIA · 8.7M downloads

Gemma-4-26B-A4B-NVFP4

NVIDIA · 2.2M downloads

bert-base-uncased

google-bert · 69.6M downloads

paraphrase-multilingual-MiniLM-L12-v2

SBERT · 48.6M downloads