DataBubble

Model Detail

Qwen3-8B

Provider: Qwen
Category: llm
Pipeline: text-generation
Parameters: 8B
DB Score: 20.9
Downloads: 12.8M
Likes: 1K
Day: -0.6%
Week: +0.0%
Month: +41.8%
Overview

Qwen3-8B is a large language model with 8B parameters released by Qwen. It is registered under the text-generation pipeline tag on Hugging Face, takes text input and produces text output (text->text), and is distributed under the permissive apache-2.0 license.

Pricing & Throughput

Qwen3-8B is priced at $0.04/M input tokens and $0.14/M output tokens. It offers a 33K-token context window, which matters when sizing it for prompt-heavy or latency-sensitive workloads. At this input rate the model sits in the commodity tier and suits high-volume workloads where per-call cost dominates the decision.
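To make the per-call cost concrete, here is a minimal sketch that applies the rates listed above. The function names, the example token counts, and the exact context size of 33,000 tokens (the page only says "33K") are assumptions for illustration, not part of any provider API.

```python
# Published rates from this page, in USD per million tokens.
INPUT_USD_PER_M = 0.04
OUTPUT_USD_PER_M = 0.14
# Assumption: the page lists "33K"; the exact token count is not stated.
CONTEXT_WINDOW_TOKENS = 33_000


def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of a single call in USD."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check whether prompt plus completion fit in the assumed context window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens and 500 output tokens per call.
    per_call = call_cost_usd(2_000, 500)
    print(f"per call: ${per_call:.6f}")
    print(f"per 1M calls: ${per_call * 1_000_000:,.2f}")
```

Because output tokens cost 3.5 times as much as input tokens at these rates, long completions can dominate the bill even for prompt-heavy workloads.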

Technical

Qwen3-8B ships with 8B parameters. The published knowledge cutoff is 2025-03-31, so newer events will not be reflected in zero-shot answers without retrieval. The listed weight footprint is approximately 8.2 GB, the baseline figure when planning local-inference VRAM; actual memory use also depends on the numeric precision of the weights and on the KV cache and activations at runtime. The apache-2.0 license is permissive: it allows commercial deployment and derivative work without per-seat fees, though its attribution and notice requirements still apply.
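As a rough planning aid, the sketch below estimates weight memory from parameter count and bytes per parameter. The precision table and function name are illustrative assumptions; the estimate covers weights only and ignores the KV cache, activations, and framework overhead. The 8.2 GB figure listed above is close to what one byte per parameter would give.

```python
# Assumed storage cost per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Estimate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision}")
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = 8e9  # 8B parameters, as listed on this page
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_footprint_gb(params, precision):.1f} GB")
```

When sizing a GPU, leave headroom beyond the weight estimate, since long prompts grow the KV cache.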

Trending Signal

Downloads of Qwen3-8B have moved -0.6% over the past 24 hours, +0.0% over the past week, and +41.8% over the trailing thirty days. The daily figure is a slight dip, consistent with normal short-term cooling as newer models compete for the same workloads, while the monthly figure still shows strong growth. These numbers are a signal, not a guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
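The percentages above are plain relative changes between two download counts. The sketch below shows that arithmetic and a simple per-window label; the 1.0-point "flat" band and the function names are assumptions for illustration and are not how DataBubble computes its signal.

```python
def pct_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    if previous <= 0:
        raise ValueError("previous count must be positive")
    return (current - previous) / previous * 100.0


def label_window(change_pct: float, flat_band: float = 1.0) -> str:
    """Label one window as 'up', 'down', or 'flat' (flat_band is an assumed threshold)."""
    if abs(change_pct) <= flat_band:
        return "flat"
    return "up" if change_pct > 0 else "down"


if __name__ == "__main__":
    # Percent changes as listed on this page.
    windows = {"day": -0.6, "week": 0.0, "month": 41.8}
    for name, change in windows.items():
        print(f"{name}: {change:+.1f}% -> {label_window(change)}")
```

Reading the windows separately avoids letting one noisy day override a month of growth.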

Read about databubble_score →
Use Cases

Qwen3-8B is best suited to general-purpose chat and instruction-following workloads, and to high-volume batch jobs where per-call cost dominates the budget. Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.

Pricing
Input ($/M tokens): $0.04
Output ($/M tokens): $0.14
Context Window: 33K
Research Paper
arXiv: 2309.00071
Model Info
License: apache-2.0
Modality: text->text
Knowledge Cutoff: 2025-03-31
Citations: 3,775 (409 influential)
Related News
arxiv · 3d ago

LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification

arXiv:2606.00647v1 Announce Type: cross Abstract: Detecting psychological defense mechanisms in conversational text remains a challenging clinical NLP problem. For the PsyDefDetect 2026 shared task (nine-class utterance classification evaluated via macro F1), our team LinguIUTics achieves a macro F1

arxiv · neutral · 49d ago

Benchmarking Linguistic Adaptation in Comparable-Sized LLMs: A Study of Llama-3.1-8B, Mistral-7B-v0.1, and Qwen3-8B on Romanized Nepali

arXiv:2604.14171v1 Announce Type: new Abstract: Romanized Nepali, the Nepali language written in the Latin alphabet, is the dominant medium for informal digital communication in Nepal, yet it remains critically underresourced in the landscape of Large Language Models (LLMs). This study presents a sy

huggingface · 249d ago

Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models

arxiv · neutral · 21d ago

Procedural-skill SFT across capacity tiers: A W-Shaped pre-SFT Trajectory and Regime-Asymmetric Mechanism on 0.8B-4B Qwen3.5 Models

arXiv:2605.11907v2 Announce Type: replace Abstract: We measure procedural-skill SFT contribution across three Qwen3.5 dense scales (0.8B, 2B, 4B) on a 200-task / 40-skill holdout, with Claude Haiku 4.5 as a frontier reference. The corpus is 353 rows of (task + procedural-skill block, Opus chain-of-t

arxiv · 25d ago

Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding

arXiv:2605.07141v1 Announce Type: cross Abstract: Open-world referring segmentation requires grounding unconstrained language expressions to precise pixel-level regions. Existing multimodal large language models (MLLMs) exhibit strong open-world visual grounding, but their outputs remain limited to

arxiv · 44d ago

Qwen3.5-Omni Technical Report

arXiv:2604.15804v2 Announce Type: replace Abstract: In this work, we present Qwen3.5-Omni, the latest advancement in the Qwen-Omni model family. Representing a significant evolution over its predecessor, Qwen3.5-Omni scales to hundreds of billions of parameters and supports a 256k context length. By

Related Models
Qwen3-VL-2B-Instruct
Qwen · 22.5M downloads
Qwen3-0.6B
Qwen · 20.6M downloads
bert-base-uncased
google-bert · 69.6M downloads
paraphrase-multilingual-MiniLM-L12-v2
SBERT · 49.8M downloads