DataBubble

Model Detail

Qwen3-1.7B

Provider: Qwen · Category: llm · Pipeline: text-generation · Parameters: 1.7B

DB Score: 2.8 · Downloads: 3.6M · Likes: 473

Trend: Day +0.0% · Week +0.9% · Month +0.0%

Overview

Qwen3-1.7B is a 1.7B-parameter large language model released by Qwen. It is registered under the text-generation pipeline tag on Hugging Face and distributed under the permissive apache-2.0 license.

Technical

Qwen3-1.7B ships with 1.7B parameters. The total weight footprint is approximately 2.0 GB, which is the baseline figure when planning VRAM for local inference; the KV cache and activations need additional memory on top of the weights. The apache-2.0 license is permissive: it allows commercial deployment and derivative work without per-seat fees, though attribution requirements (retaining the license text and copyright notices) still apply.
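The footprint figure above can be sanity-checked with simple arithmetic. The sketch below assumes weight memory is roughly parameter count times bytes per parameter and uses the rounded 1.7B count from this listing. The function names are hypothetical, and the estimate ignores KV cache, activations, and runtime overhead. Under those assumptions, a 2.0 GB footprint works out to about 1.2 bytes per parameter, so it is worth checking which precision and which files the listed figure counts.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Byte widths are the standard sizes for these formats; the parameter
# count is the rounded listing figure, so every result is approximate.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(num_params: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def implied_bytes_per_param(footprint_gb: float, num_params: float) -> float:
    """Bytes per parameter implied by a reported footprint."""
    return footprint_gb * 1e9 / num_params


NUM_PARAMS = 1.7e9  # rounded count from the listing (assumption)

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_footprint_gb(NUM_PARAMS, dtype):.2f} GB")

print(f"implied by 2.0 GB: {implied_bytes_per_param(2.0, NUM_PARAMS):.2f} bytes/param")
```

Whichever precision applies, leave headroom beyond the weight size when choosing a GPU, since context length drives KV-cache memory.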

Trending Signal

Downloads of Qwen3-1.7B rose 0.9% over the trailing seven days. The trend is mildly positive, consistent with a model that is being picked up incrementally rather than going viral. Treat these numbers as a signal, not a guarantee: week-over-week download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
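For reference, a week-over-week figure like the one above is a plain percent change between two period totals. This is a minimal sketch with made-up counts; it is not how the site necessarily computes its trend, and `pct_change` is a hypothetical helper name.

```python
def pct_change(previous: int, current: int) -> float:
    """Percent change from the previous period total to the current one."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous * 100.0


# Hypothetical weekly download counts, for illustration only.
last_week = 100_000
this_week = 100_900

print(f"{pct_change(last_week, this_week):+.1f}%")  # prints +0.9%
```

A change this small is easily produced by a single CI job or mirror sync, which is why the raw percentage should be read alongside the absolute counts.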

Use Cases

Qwen3-1.7B is best suited to general-purpose chat and instruction-following workloads. Treat this as a starting point rather than a benchmark verdict: the right deployment choice usually depends on an evaluation suite that mirrors your workload.
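One way to act on that advice is a tiny workload-specific check run against any candidate model. The sketch below is an assumption-laden illustration: `exact_match_rate`, `stub_model`, and the two test cases are all hypothetical, and the stub stands in for a real call to whatever inference backend you use. Exact match is only one possible metric; substitute the scoring rule that fits your task.

```python
from typing import Callable, Iterable, Tuple


def exact_match_rate(
    generate: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of prompts whose output matches the expected answer,
    ignoring case and surrounding whitespace."""
    case_list = list(cases)
    if not case_list:
        raise ValueError("need at least one test case")
    hits = sum(
        1
        for prompt, expected in case_list
        if generate(prompt).strip().lower() == expected.strip().lower()
    )
    return hits / len(case_list)


def stub_model(prompt: str) -> str:
    """Placeholder for a real model call; returns canned answers."""
    canned = {"Capital of France?": "Paris", "2 + 2 = ?": "5"}
    return canned.get(prompt, "")


# Hypothetical cases; replace with prompts drawn from your own workload.
cases = [("Capital of France?", "paris"), ("2 + 2 = ?", "4")]

print(exact_match_rate(stub_model, cases))
```

Because `generate` is just a callable, the same harness can compare several models or quantization levels on identical cases.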

Research Paper

arXiv: 2309.16609

Model Info

License: apache-2.0

Citations: 3,775 (409 influential)

Recent news

LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification

arXiv:2606.00647v1 Announce Type: cross Abstract: Detecting psychological defense mechanisms in conversational text remains a challenging clinical NLP problem. For the PsyDefDetect 2026 shared task (nine-class utterance classification evaluated via macro F1), our team LinguIUTics achieves a macro F1

arxiv · neutral · 21d ago

Procedural-skill SFT across capacity tiers: A W-Shaped pre-SFT Trajectory and Regime-Asymmetric Mechanism on 0.8B-4B Qwen3.5 Models

arXiv:2605.11907v2 Announce Type: replace Abstract: We measure procedural-skill SFT contribution across three Qwen3.5 dense scales (0.8B, 2B, 4B) on a 200-task / 40-skill holdout, with Claude Haiku 4.5 as a frontier reference. The corpus is 353 rows of (task + procedural-skill block, Opus chain-of-t

arxiv · 25d ago

Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding

arXiv:2605.07141v1 Announce Type: cross Abstract: Open-world referring segmentation requires grounding unconstrained language expressions to precise pixel-level regions. Existing multimodal large language models (MLLMs) exhibit strong open-world visual grounding, but their outputs remain limited to

arxiv · 44d ago

Qwen3.5-Omni Technical Report

arXiv:2604.15804v2 Announce Type: replace Abstract: In this work, we present Qwen3.5-Omni, the latest advancement in the Qwen-Omni model family. Representing a significant evolution over its predecessor, Qwen3.5-Omni scales to hundreds of billions of parameters and supports a 256k context length. By

arxiv · bullish · 11d ago

AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning

arXiv:2601.17261v4 Announce Type: replace Abstract: Zeroth-Order (ZO) optimization has emerged as a promising solution for fine-tuning LLMs under strict memory constraints, as it avoids the prohibitive memory cost of storing activations for backpropagation. However, existing ZO methods typically emp

Related Models

Qwen3-VL-2B-Instruct

Qwen · 22.5M downloads

Qwen3-0.6B

Qwen · 22.2M downloads

bert-base-uncased

google-bert · 69.6M downloads

paraphrase-multilingual-MiniLM-L12-v2

SBERT · 50.3M downloads