DataBubble

Model Detail

gemma-4-26B-A4B-it

▼ 0.3%

Provider: Google
Category: multimodal
Pipeline: image-text-to-text
Parameters: 26B

DB Score

1.4

Downloads

13.2M

Likes

Day

-0.3%

Week

+3.1%

Month

+0.0%

Overview

gemma-4-26B-A4B-it is a multimodal model with 26B parameters released by Google. It is registered under the image-text-to-text pipeline tag on Hugging Face, accepts text, image, and video inputs and produces text output (text+image+video->text), and is distributed under the permissive apache-2.0 license.

Pricing & Throughput

gemma-4-26B-A4B-it is priced at $0.13 per million input tokens and $0.40 per million output tokens. It offers a 262K-token context window, which matters when sizing it for prompt-heavy or latency-sensitive workloads. At this input rate the model sits in the commodity tier and suits high-volume workloads where per-call cost dominates the decision.
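To make the per-call cost concrete, the listed rates can be turned into a small estimator. This is a minimal sketch, not a billing tool: the rates are the ones quoted on this page, while the function name `estimate_cost_usd` and the workload numbers (call count, tokens per call) are hypothetical values chosen for illustration.

```python
# Rates quoted on this page, in USD per million tokens.
INPUT_USD_PER_M = 0.13
OUTPUT_USD_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical workload: 1,000,000 calls, each with 2,000 input
# tokens and 300 output tokens.
per_call = estimate_cost_usd(2_000, 300)
batch_total = per_call * 1_000_000
print(f"per call: ${per_call:.6f}  batch: ${batch_total:,.2f}")
```

Under these assumed numbers the batch comes to roughly $380, with about two thirds of it from input tokens. Actual provider pricing may differ from the listed rates, so the figure is an estimate only.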

Technical

gemma-4-26B-A4B-it ships with 26B parameters. The total weight footprint is approximately 26.5 GB, which is the baseline figure when planning local-inference VRAM; the KV cache and activations add to it at run time. The apache-2.0 license is permissive, allowing commercial deployment and derivative work without license fees, though its attribution and notice requirements still apply.
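The two figures above can be cross-checked with simple arithmetic. The sketch below assumes that "26B" means 26e9 parameters and that "GB" means decimal gigabytes (1e9 bytes); neither is stated on the page. The helper names are hypothetical. The result, about one byte per parameter, would be consistent with 8-bit weights, but that is an inference from the arithmetic rather than a documented fact about the checkpoint.

```python
# Figures as listed on this page.
PARAMS = 26e9                # "26B" parameters (assumed to mean 26e9)
LISTED_FOOTPRINT_GB = 26.5   # assumed decimal gigabytes (1 GB = 1e9 bytes)


def bytes_per_param(footprint_gb: float, params: float) -> float:
    """Average storage per parameter implied by a weight footprint."""
    return footprint_gb * 1e9 / params


def weights_gb(params: float, bits_per_param: int) -> float:
    """Weight-only footprint in GB at a given precision."""
    return params * bits_per_param / 8 / 1e9


implied = bytes_per_param(LISTED_FOOTPRINT_GB, PARAMS)
print(f"implied storage: {implied:.2f} bytes/param")

# Weight-only estimates at common precisions, for comparison.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weights_gb(PARAMS, bits):.1f} GB")
```

These are weight-only estimates. A real deployment also needs memory for the KV cache, which grows with context length, so a GPU with exactly the weight footprint in VRAM will generally not be enough.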

Trending Signal

Downloads of gemma-4-26B-A4B-it moved -0.3% over the past 24 hours and +3.1% over the trailing seven days, and were flat (+0.0%) over the month. The trend is mildly positive, consistent with a model that is being picked up incrementally rather than going viral. These numbers are a signal, not a guarantee: week-over-week download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
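For readers who want to reproduce this kind of figure from their own download counts, a percent-change calculation can be sketched as follows. The page does not publish the raw counts or the exact formula it uses, so the function names and the sample counts here are hypothetical, chosen only so that the outputs match the format of the percentages shown above.

```python
def pct_change(previous: float, current: float) -> float:
    """Percent change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


def format_signed(pct: float) -> str:
    """Format with an explicit sign and one decimal, e.g. '+3.1%'."""
    return f"{pct:+.1f}%"


# Hypothetical counts, not taken from the page.
print(format_signed(pct_change(1000, 997)))    # -0.3%
print(format_signed(pct_change(1000, 1031)))   # +3.1%
print(format_signed(pct_change(1000, 1000)))   # +0.0%
```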

Use Cases

gemma-4-26B-A4B-it is best suited to three kinds of work: mixed text-and-image reasoning such as document understanding; high-volume batch jobs where per-call cost dominates the budget; and long-context tasks such as full-codebase analysis or book-length summarization (up to the 262K-token window). Treat this as a starting matrix rather than a benchmark verdict: the right deployment depends on an evaluation suite that mirrors your workload.

Pricing

Input ($/M tokens): $0.13

Output ($/M tokens): $0.40

Context Window: 262K

Research Paper

arXiv: 2607.02770

Model Info

License: apache-2.0

Modality: text+image+video->text

Citations: 1,149 (137 influential)

Recent news

Trivial Prompt Reframing Bypasses Safety Guardrails in Google's MedGemma-4B

arXiv:2607.09804v1 Announce Type: cross Abstract: Open-weight medical language models are increasingly used as the base of patient-facing and clinician-support applications. Their model cards prohibit specific behaviors -- recommending exact drug dosages, issuing definitive diagnoses, prescribing tr

huggingface · 20d ago

Hugging Face and Cerebras bring Gemma 4 to real-time voice AI

arxiv · 21d ago

DistilledGemma: Balanced Efficiency-Accuracy for Person-Place Relation Extraction from Multilingual Historical Articles

arXiv:2606.29130v1 Announce Type: new Abstract: We present DistilledGemma, an efficient and accurate system for the HIPE-2026 shared task on person-place relation extraction from multilingual historical newspaper articles in English, German, and French. Our approach adopts a three-stage knowledge di

arxiv · neutral · 31d ago

How Transparent is DiffusionGemma?

arXiv:2606.20560v1 Announce Type: cross Abstract: LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a continuous

arxiv · 36d ago

Neither Parallel Nor Sequential: How DiffusionGemma Actually Commits Tokens

arXiv:2606.14620v1 Announce Type: new Abstract: Open diffusion language models are marketed as parallel, non-autoregressive decoders, yet the order in which a shipped checkpoint actually commits its tokens is almost never measured. We instrument DiffusionGemma 26B, a masked discrete-diffusion mixtur

arxiv · 48d ago

Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines

arXiv:2605.25645v2 Announce Type: replace-cross Abstract: We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical comparison of TPU and GPU platforms for large language model adaptation. Using LoRA on a Google TPU v5

Related Models

gemma-4-31B-it

Google · 12.3M downloads

gemma-4-E4B-it

Google · 5.5M downloads

Qwen3-VL-2B-Instruct

Qwen · 22.5M downloads

gemma-4-26B-A4B-it

Google · 13.6M downloads