Model Detail
Qwen3.5-9B-DeepSeek-V4-Flash-GGUF
Qwen3.5-9B-DeepSeek-V4-Flash-GGUF is a multimodal model with 9B parameters released by Jackrong. It is registered under the image-text-to-text pipeline tag on Hugging Face and distributed under the permissive apache-2.0 license.
The model is distributed as a quantized weight variant for lower-VRAM inference. The apache-2.0 license allows commercial deployment and derivative work without per-seat fees, though its attribution and notice requirements still apply.
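To make the lower-VRAM point concrete, the following is a rough back-of-the-envelope sketch of weight memory at different precisions. The 9B parameter count comes from the description above; the bits-per-weight values are illustrative assumptions, not the quantization levels actually published for this model, and the helper name `estimate_weight_memory_gib` is hypothetical. The estimate covers weights only and ignores the KV cache, activations, and any vision components.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


PARAMS = 9e9  # 9B parameters, as stated in the model description

# Assumed precisions for illustration only; real quantized files often
# mix precisions, so their effective bits per weight will differ.
ASSUMED_PRECISIONS = [("16-bit", 16.0), ("8-bit", 8.0), ("4-bit", 4.0)]

for label, bits in ASSUMED_PRECISIONS:
    size = estimate_weight_memory_gib(PARAMS, bits)
    print(f"{label}: ~{size:.1f} GiB for weights")
```

Under these assumptions, halving the bits per weight halves the weight footprint, which is the main reason a quantized variant fits on smaller GPUs.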
Qwen3.5-9B-DeepSeek-V4-Flash-GGUF is best suited to mixed text-and-image reasoning tasks such as document understanding. Treat this as a starting point rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.
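One way to build a workload-specific check is a small exact-match harness over your own document questions. The sketch below is a minimal illustration, not a recommended benchmark: the file names, questions, expected answers, and the `stub_predict` function are all hypothetical placeholders, and `stub_predict` should be replaced with a call into whatever runtime you use to serve the model.

```python
from typing import Callable, List, Tuple

# Each case is (document image path, question, expected answer).
# All values here are hypothetical placeholders.
Case = Tuple[str, str, str]


def exact_match_accuracy(
    predict: Callable[[str, str], str], cases: List[Case]
) -> float:
    """Fraction of cases where the prediction matches the expected answer."""
    if not cases:
        raise ValueError("cases must not be empty")
    hits = sum(
        predict(image, question).strip().lower() == expected.strip().lower()
        for image, question, expected in cases
    )
    return hits / len(cases)


def stub_predict(image_path: str, question: str) -> str:
    """Stand-in for a real model call; swap in your own inference code."""
    return "42.00" if "total" in question.lower() else "unknown"


cases = [
    ("invoice_001.png", "What is the total amount?", "42.00"),
    ("invoice_002.png", "Who is the vendor?", "Acme Corp"),
]

score = exact_match_accuracy(stub_predict, cases)
print(f"exact-match accuracy: {score:.2f}")
```

Exact match is a deliberately strict metric; for free-form answers a looser comparison may reflect your workload better.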
Procedural-skill SFT across capacity tiers: A W-Shaped pre-SFT Trajectory and Regime-Asymmetric Mechanism on 0.8B-4B Qwen3.5 Models
arXiv:2605.11907v2. Abstract: We measure procedural-skill SFT contribution across three Qwen3.5 dense scales (0.8B, 2B, 4B) on a 200-task / 40-skill holdout, with Claude Haiku 4.5 as a frontier reference. The corpus is 353 rows of (task + procedural-skill block, Opus chain-of-t
Qwen3.5-Omni Technical Report
arXiv:2604.15804v2. Abstract: In this work, we present Qwen3.5-Omni, the latest advancement in the Qwen-Omni model family. Representing a significant evolution over its predecessor, Qwen3.5-Omni scales to hundreds of billions of parameters and supports a 256k context length. By