Model Detail
Gemma-4-31B-IT-NVFP4
▼ 0.3%Gemma-4-31B-IT-NVFP4 is a large language model with 31B parameters released by NVIDIA. The model is registered under the text-generation pipeline tag on Hugging Face, distributed under a other license.
Gemma-4-31B-IT-NVFP4 ships with 31B parameters. Total weight footprint is approximately 20.9 GB, which is the relevant figure when planning local-inference VRAM. Distribution is governed by the other license — review the exact terms before commercial deployment.
Downloads of Gemma-4-31B-IT-NVFP4 have moved -0.3% over the past 24 hours, +66.7% over the trailing thirty days. That is a slight downtrend, consistent with normal cooling as newer models compete for the same workloads. These numbers are signal, not guarantee — week-over-week download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
Gemma-4-31B-IT-NVFP4 is best fit for general-purpose chat and instruction-following workloads. Treat this as a starting matrix rather than a benchmark verdict — the right deployment usually depends on the specific evaluation suite that mirrors your workload.
Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines
arXiv:2605.25645v2 Announce Type: replace-cross Abstract: We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical comparison of TPU and GPU platforms for large language model adaptation. Using LoRA on a Google TPU v5
Borrowed Geometry: Cross-Distribution Head-Importance Fingerprints of Frozen Pretrained Gemma 4 31B
arXiv:2605.00333v2 Announce Type: replace-cross Abstract: Frozen Gemma 4 31B weights pretrained exclusively on text, unmodified, transfer through a thin trainable interface to non-text modalities the substrate has never processed. On the L24--L29 slice (192 attention heads), an English-text TxtCopy
PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation
arXiv:2605.05159v1 Announce Type: new Abstract: We present our system for SemEval-2026 Task 9: Multilingual Polarization Detection, a binary classification task spanning 22 languages. Our approach fine-tunes separate Gemma~3 models (12B and 27B parameters) per language using Low-Rank Adaptation (LoR
MedGemma 1.5 Technical Report
arXiv:2604.05081v2 Announce Type: replace Abstract: We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medical imaging (CT/MRI volumes and histopathology whole slide images), anatomica
Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B
arXiv:2604.24070v1 Announce Type: cross Abstract: Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tuning (CSFT) with self
A Large Language Model Based Pipeline for Review of Systems Entity Recognition from Clinical Notes
arXiv:2506.11067v3 Announce Type: replace Abstract: Objective: Develop a cost-effective, large language model (LLM)-based pipeline for automatically extracting Review of Systems (ROS) entities from clinical notes. Materials and Methods: The pipeline extracts ROS section from the clinical note using