DataBubble
Model Detail

DeepSeek-V4-Pro

▼ 2.2%
Provider: DeepSeek
Category: llm
Pipeline: text-generation
DB Score: 25.2
Downloads: 3.4M
Likes: 5K
Day: -2.2%
Week: +0.0%
Month: +607.2%
Overview

DeepSeek-V4-Pro is a large language model with 430.8B parameters released by DeepSeek. It is registered under the text-generation pipeline tag on Hugging Face, takes text input and produces text output (text->text), and is distributed under the permissive MIT license.

Performance

DeepSeek-V4-Pro reports a Chatbot Arena Elo rating of 1,461 across 9,970 votes. Other benchmark slots are still empty in our dataset, so this single figure is a partial picture rather than a full evaluation.

Pricing & Throughput

DeepSeek-V4-Pro is priced at $1.74/M input tokens and $3.48/M output tokens, so an output token costs twice as much as an input token. The model offers a 1049K-token context window, which matters when sizing it for prompt-heavy or latency-sensitive workloads. Pricing in this range sits in the middle of the API market, neither the cheapest nor the most expensive option per token, so cost-fit depends mostly on how much output you generate.
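The per-token prices above can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name and the two example workloads are hypothetical, and it assumes billing is a straight multiple of the listed prices, with no caching discounts or minimum charges.

```python
# Prices quoted on this page, in US dollars per million tokens.
INPUT_PRICE_PER_M = 1.74
OUTPUT_PRICE_PER_M = 3.48


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in US dollars (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workloads, chosen only to show the effect of the 2:1 price ratio.
prompt_heavy = estimate_cost(input_tokens=1_000_000, output_tokens=100_000)
output_heavy = estimate_cost(input_tokens=100_000, output_tokens=1_000_000)

print(f"prompt-heavy workload: ${prompt_heavy:.2f}")
print(f"output-heavy workload: ${output_heavy:.2f}")
```

Both workloads move the same total number of tokens, yet the output-heavy one costs more, which is why output volume tends to dominate the bill at these prices.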

Technical

DeepSeek-V4-Pro ships with 430.8B parameters. The total weight footprint is approximately 861.6 GB, which works out to 2 bytes per parameter and is the baseline figure when planning local-inference VRAM. The MIT license is permissive, allowing commercial deployment and derivative work without per-seat fees, though the copyright and license notice must still be kept with copies of the software.
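The footprint figure follows from the parameter count times the storage size of each parameter. The sketch below reproduces that arithmetic. The helper name is hypothetical, and the 1-byte and 0.5-byte rows are assumed quantization levels shown for comparison, not figures reported on this page. The result covers weights only and does not include KV cache or activation memory.

```python
PARAMS_BILLIONS = 430.8  # parameter count quoted on this page


def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weights-only size in GB, using 1 GB = 10**9 bytes (hypothetical helper)."""
    # Billions of parameters times bytes per parameter gives billions of bytes.
    return params_billions * bytes_per_param


# 2 bytes per parameter matches the 861.6 GB quoted above.
# The smaller sizes are assumptions for illustration only.
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = weight_footprint_gb(PARAMS_BILLIONS, bytes_per_param)
    print(f"{label}: {size:.1f} GB")
```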

Trending Signal

Downloads of DeepSeek-V4-Pro have moved -2.2% over the past 24 hours, +0.0% over the past week, and +607.2% over the trailing thirty days. The one-day dip is slight and the week is flat while the month is up sharply, so the short-term figure reads as normal cooling after a surge rather than a sustained decline. These numbers are a signal, not a guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
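For readers who want to reproduce this kind of figure from raw counts, a period-over-period percentage change can be sketched as follows. This page does not state how its percentages are computed, so the formula is an assumption, and the download counts below are hypothetical values picked only to reproduce the quoted percentages.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from previous to current (hypothetical helper)."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical counts, not real download data.
day_change = pct_change(previous=1000, current=978)
month_change = pct_change(previous=1000, current=7072)

print(f"day: {day_change:+.1f}%")
print(f"month: {month_change:+.1f}%")
```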

Use Cases

DeepSeek-V4-Pro is best suited to general-purpose chat and instruction-following workloads, and to long-context tasks such as full-codebase analysis or book-length summarization (1049K tokens). Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.

Pricing
Input ($/M tokens): $1.74
Output ($/M tokens): $3.48
Context Window: 1049K
Arena & Community
Arena ELO: 1,461
Arena Votes: 9,970
Model Info
License: mit
Modality: text->text
Citations: 756 (67 influential)
Related News
arxiv · 3d ago

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

arXiv:2606.10392v1 Announce Type: new Abstract: Financial named-entity recognition (NER) is essential for translating unstructured financial reports and news into structured knowledge graphs. However, general-purpose large language models (LLMs) often misclassify financial entities or ignore domain-

arxiv · 3d ago

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

arXiv:2606.09079v2 Announce Type: replace-cross Abstract: Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In this report, we propose Lookahead Sparse Attention (LSA), a novel inference paradigm powered by a Neura

arxiv · 18d ago

SoK: A Comprehensive Security Analysis of Jailbreak Resilience in GPT and DeepSeek Models

arXiv:2506.18543v2 Announce Type: replace-cross Abstract: The rapid proliferation of Large Language Models (LLMs) has heightened concerns regarding their exposure to jailbreak attacks, which craft adversarial inputs designed to elicit unsafe content. Although proprietary models such as GPT-4 have be

arxiv · 18d ago

DeepSeekMath Meets Order Book: Group-Aware Policy Optimization for High-Frequency Directional Trading

arXiv:2605.25527v1 Announce Type: new Abstract: This paper studies reinforcement learning for high-frequency trading on limit order books by pairing an Order-Flow-based state model with policy-gradient methods. Instead of value-based RL techniques like tabular Q-learning, our approach deploys policy

arxiv · neutral · 22d ago

RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

arXiv:2605.00392v3 Announce Type: replace-cross Abstract: DeepSeek-OCR leverages visual-text compression to reduce long-text processing costs and accelerate inference, yet visual tokens remain prone to redundant textual and structural information. Moreover, current token pruning methods for conventi

arxiv · neutral · 23d ago

Refining and Reusing Annotation Guidelines for LLM Annotation

arXiv:2605.20809v1 Announce Type: new Abstract: While Large Language Models (LLMs) demonstrate remarkable performance on zero-shot annotation tasks, they often struggle with the specialized conventions of gold-standard benchmarks. We propose the systematic reuse and refinement of annotation guidelin

Related Models
DeepSeek-V3.2
DeepSeek · 11.2M downloads
DeepSeek-R1
DeepSeek · 5.4M downloads
bert-base-uncased
google-bert · 69.6M downloads
paraphrase-multilingual-MiniLM-L12-v2
SBERT · 46.1M downloads