DataBubble

Model Detail

gemma-4-31B-it-GGUF

Provider: unsloth · Category: multimodal · Pipeline: image-text-to-text · Parameters: 31B

DB Score: 38.2

Downloads: 528K

Likes: 548

Day: +0.0%

Week: +0.0%

Month: +24.8%

Overview

gemma-4-31B-it-GGUF is a 31B-parameter multimodal model released by unsloth. It is registered under the image-text-to-text pipeline tag on Hugging Face, accepts text, image, and video inputs and produces text output (text+image+video->text), and is distributed under the permissive apache-2.0 license.

Pricing & Throughput

gemma-4-31B-it-GGUF is priced at $0.38/M input tokens and $1.15/M output tokens. Operationally the model offers a 131K-token context window, which matters when sizing it for prompt-heavy or latency-sensitive workloads. At this input rate the model sits in the commodity tier and is suitable for high-volume workloads where per-call cost dominates the decision.
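The per-token rates above translate into a per-call cost with simple arithmetic. The following is a minimal sketch, not an official calculator: the function names are hypothetical, "131K" is read as 131,072 tokens, and it assumes that prompt and completion tokens share the same context window.

```python
# Rates quoted on this page, in US dollars per million tokens.
INPUT_USD_PER_M = 0.38
OUTPUT_USD_PER_M = 1.15
# Assumption: "131K" means 131,072 tokens; the page only states 131K.
CONTEXT_WINDOW_TOKENS = 131_072


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one call in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check whether prompt plus completion fit in the assumed window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical batch: 10,000 calls of 2,000 prompt and 500 output tokens.
    calls = 10_000
    per_call = estimate_cost_usd(2_000, 500)
    print(f"per call: ${per_call:.6f}")
    print(f"batch:    ${per_call * calls:.2f}")
    print(f"fits:     {fits_context(2_000, 500)}")
```

For the hypothetical batch in the example, each call costs about $0.001335, so 10,000 calls come to roughly $13.35. Output tokens cost about three times as much as input tokens at these rates, so long completions move the total faster than long prompts.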

Technical

gemma-4-31B-it-GGUF ships with 31B parameters, distributed as a quantized weight variant for lower-VRAM inference. The apache-2.0 license is permissive, allowing commercial deployment and derivative work without per-seat fees, though attribution requirements still apply.
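To see why a quantized variant lowers the memory requirement, the size of the weights can be approximated as parameters times bits per weight. This is a rough sketch under stated assumptions: the bit widths below are illustrative rather than a list of the files actually published, the function name is hypothetical, and the estimate covers weights only (no KV cache, activations, or file metadata). Real GGUF quantizations often mix bit widths across tensors, so actual file sizes will differ.

```python
PARAMETERS = 31e9  # 31B parameters, as listed on this page


def weight_size_gib(parameters: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    if parameters <= 0 or bits_per_weight <= 0:
        raise ValueError("parameters and bits_per_weight must be positive")
    size_bytes = parameters * bits_per_weight / 8
    return size_bytes / 2**30


if __name__ == "__main__":
    # Illustrative bit widths only; check the repository for the real files.
    for bits in (16, 8, 4):
        size = weight_size_gib(PARAMETERS, bits)
        print(f"{bits:>2} bits/weight: ~{size:.1f} GiB")
```

Halving the bits per weight halves the estimate, which is the basic reason a quantized file can fit on hardware that the full-precision weights would not.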

Trending Signal

Downloads of gemma-4-31B-it-GGUF have moved +24.8% over the trailing thirty days, while the day and week figures are flat at +0.0%. That is an uptrend over the month, not a cooling-off. These numbers are signal, not guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.

Use Cases

gemma-4-31B-it-GGUF is best suited to mixed text-and-image reasoning tasks such as document understanding, and to high-volume batch jobs where per-call cost dominates the budget. Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.

Pricing

Input ($/M tokens): $0.38

Output ($/M tokens): $1.15

Context Window: 131K

Model Info

License: apache-2.0

Modality: text+image+video->text

Citations: 1,149 (137 influential)

Recent news

Gemma 4 Technical Report

arXiv:2607.02770v2 Announce Type: replace Abstract: We introduce Gemma 4, a new generation of open-weight, natively multimodal language models in the Gemma model family. Designed to advance compute efficiency and reasoning, the Gemma 4 model suite features dense and Mixture-of-Experts architectures,

arxiv · neutral · 4d ago

Do Active SAE Feature Planes Carry More Holonomy? A Preregistered Reversal in Gemma

arXiv:2607.20522v1 Announce Type: new Abstract: This paper tests whether holonomy concentrates on active sparse-autoencoder (SAE) feature planes in Gemma 2 2B, a concrete operationalization of the broader semantic-concentration prediction. Holonomy is measured at the final-token layer-12 to layer-13

arxiv · 14d ago

Trivial Prompt Reframing Bypasses Safety Guardrails in Google's MedGemma-4B

arXiv:2607.09804v1 Announce Type: cross Abstract: Open-weight medical language models are increasingly used as the base of patient-facing and clinician-support applications. Their model cards prohibit specific behaviors -- recommending exact drug dosages, issuing definitive diagnoses, prescribing tr

huggingface · 27d ago

Hugging Face and Cerebras bring Gemma 4 to real-time voice AI

arxiv · 28d ago

DistilledGemma: Balanced Efficiency-Accuracy for Person-Place Relation Extraction from Multilingual Historical Articles

arXiv:2606.29130v1 Announce Type: new Abstract: We present DistilledGemma, an efficient and accurate system for the HIPE-2026 shared task on person-place relation extraction from multilingual historical newspaper articles in English, German, and French. Our approach adopts a three-stage knowledge di

arxiv · neutral · 38d ago

How Transparent is DiffusionGemma?

arXiv:2606.20560v1 Announce Type: cross Abstract: LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a continuous

Related Models

Qwen3.6-27B-MTP-GGUF

unsloth · 3.0M downloads

Qwen3.6-27B-NVFP4

unsloth · 2.7M downloads

Qwen3-VL-2B-Instruct

Qwen · 22.5M downloads

gemma-4-26B-A4B-it

Google · 13.1M downloads