gemma-4-12B-it news

26 articles mentioning gemma-4-12B-it

arxiv · Jul 14

Trivial Prompt Reframing Bypasses Safety Guardrails in Google's MedGemma-4B

arXiv:2607.09804v1 Announce Type: cross Abstract: Open-weight medical language models are increasingly used as the base of patient-facing and clinician-support applications. Their model cards prohibit specific behaviors -- recommending exact drug dosages, issuing definitive diagnoses, prescribing tr

huggingface · Jul 1

Hugging Face and Cerebras bring Gemma 4 to real-time voice AI

arxiv · Jun 30

DistilledGemma: Balanced Efficiency-Accuracy for Person-Place Relation Extraction from Multilingual Historical Articles

arXiv:2606.29130v1 Announce Type: new Abstract: We present DistilledGemma, an efficient and accurate system for the HIPE-2026 shared task on person-place relation extraction from multilingual historical newspaper articles in English, German, and French. Our approach adopts a three-stage knowledge di

arxiv · Jun 20

How Transparent is DiffusionGemma?

arXiv:2606.20560v1 Announce Type: cross Abstract: LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a continuous

arxiv · Jun 15

Neither Parallel Nor Sequential: How DiffusionGemma Actually Commits Tokens

arXiv:2606.14620v1 Announce Type: new Abstract: Open diffusion language models are marketed as parallel, non-autoregressive decoders, yet the order in which a shipped checkpoint actually commits its tokens is almost never measured. We instrument DiffusionGemma 26B, a masked discrete-diffusion mixtur

arxiv · Jun 3

Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines

arXiv:2605.25645v2 Announce Type: replace-cross Abstract: We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical comparison of TPU and GPU platforms for large language model adaptation. Using LoRA on a Google TPU v5

arxiv · May 20

Borrowed Geometry: Cross-Distribution Head-Importance Fingerprints of Frozen Pretrained Gemma 4 31B

arXiv:2605.00333v2 Announce Type: replace-cross Abstract: Frozen Gemma 4 31B weights pretrained exclusively on text, unmodified, transfer through a thin trainable interface to non-text modalities the substrate has never processed. On the L24--L29 slice (192 attention heads), an English-text TxtCopy

arxiv · May 7

PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation

arXiv:2605.05159v1 Announce Type: new Abstract: We present our system for SemEval-2026 Task 9: Multilingual Polarization Detection, a binary classification task spanning 22 languages. Our approach fine-tunes separate Gemma 3 models (12B and 27B parameters) per language using Low-Rank Adaptation (LoR

arxiv · May 6

MedGemma 1.5 Technical Report

arXiv:2604.05081v2 Announce Type: replace Abstract: We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medical imaging (CT/MRI volumes and histopathology whole slide images), anatomica

arxiv · Apr 29

Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B

arXiv:2604.24070v1 Announce Type: cross Abstract: Small instruct-tuned LLMs produce degenerate verbal confidence under minimal elicitation: ceiling rates above 95%, near-chance Type-2 AUROC, and Invalid validity profiles. We test whether confidence-conditioned supervised fine-tuning (CSFT) with self

huggingface · Apr 22

Gemma 4 VLA Demo on Jetson Orin Nano Super

arxiv · Apr 11

Enabling Intrinsic Reasoning over Dense Geospatial Embeddings with DFR-Gemma

arXiv:2604.07490v1 Announce Type: new Abstract: Representation learning for geospatial and spatio-temporal data plays a critical role in enabling general-purpose geospatial intelligence. Recent geospatial foundation models, such as the Population Dynamics Foundation Model (PDFM), encode complex popu

arxiv · Apr 9

Gemma 4, Phi-4, and Qwen3: Accuracy-Efficiency Tradeoffs in Dense and MoE Reasoning Language Models

arXiv:2604.07035v1 Announce Type: new Abstract: Mixture-of-experts (MoE) language models are often expected to offer better quality-efficiency tradeoffs than dense models because only a subset of parameters is activated per token, but the practical value of that advantage depends on end-to-end behav

arxiv · Apr 8

MedGemma Technical Report

arXiv:2507.05201v4 Announce Type: replace Abstract: Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment faces challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy. Foundation models that perform we

huggingface · Apr 2
