DataBubble

DeepSeek-V4-Pro news

21 articles mentioning DeepSeek-V4-Pro

arxiv · 3d ago

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

arXiv:2606.10392v1 Announce Type: new Abstract: Financial named-entity recognition (NER) is essential for translating unstructured financial reports and news into structured knowledge graphs. However, general-purpose large language models (LLMs) often misclassify financial entities or ignore domain-

arxiv · 3d ago

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

arXiv:2606.09079v2 Announce Type: replace-cross Abstract: Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In this report, we propose Lookahead Sparse Attention (LSA), a novel inference paradigm powered by a Neura

arxiv · May 26

SoK: A Comprehensive Security Analysis of Jailbreak Resilience in GPT and DeepSeek Models

arXiv:2506.18543v2 Announce Type: replace-cross Abstract: The rapid proliferation of Large Language Models (LLMs) has heightened concerns regarding their exposure to jailbreak attacks, which craft adversarial inputs designed to elicit unsafe content. Although proprietary models such as GPT-4 have be

arxiv · May 26

DeepSeekMath Meets Order Book: Group-Aware Policy Optimization for High-Frequency Directional Trading

arXiv:2605.25527v1 Announce Type: new Abstract: This paper studies reinforcement learning for high-frequency trading on limit order books by pairing an Order-Flow-based state model with policy-gradient methods. Instead of value-based RL techniques like tabular Q-learning, our approach deploys policy

arxiv · May 22

RTPrune: Reading-Twice Inspired Token Pruning for Efficient DeepSeek-OCR Inference

arXiv:2605.00392v3 Announce Type: replace-cross Abstract: DeepSeek-OCR leverages visual-text compression to reduce long-text processing costs and accelerate inference, yet visual tokens remain prone to redundant textual and structural information. Moreover, current token pruning methods for conventi

techcrunch · May 6

DeepSeek could hit $45B valuation from its first investment round

The Chinese AI lab came to prominence in early 2025 after launching a large language model that trained on a fraction of the compute power and at a fraction of the cost of the big U.S. models like those from OpenAI and Anthropic.

mit-tech-review · Apr 24

Three reasons why DeepSeek’s new model matters

On April 24, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. The model can process much longer prompts than its last generation, thanks to a new design that helps it handle large amounts of text more efficiently. Like DeepSeek’s previous models, V4 is open sou

techcrunch · Apr 24

DeepSeek previews new AI model that ‘closes the gap’ with frontier models

DeepSeek says both models are more efficient and performant than DeepSeek V3.2 due to architectural improvements, and have almost "closed the gap" with current leading models, both open and closed, on reasoning benchmarks.

theverge · Apr 24

China’s DeepSeek previews new AI model a year after jolting US rivals

Chinese AI company DeepSeek released a preview of its hotly anticipated next-generation AI model V4 on Friday, saying that the open-source model can compete with leading closed-source systems from US rivals including Anthropic, Google, and OpenAI. DeepSeek says V4 marks a major improvement over prio

huggingface · Apr 24

DeepSeek-V4: a million-token context that agents can actually use

arxiv · Apr 22

Fine-tuning DeepSeek-OCR-2 for Molecular Structure Recognition

arXiv:2604.03476v2 Announce Type: replace-cross Abstract: Optical Chemical Structure Recognition (OCSR) is critical for converting 2D molecular diagrams from printed literature into machine-readable formats. While Vision-Language Models have shown promise in end-to-end OCR tasks, their direct applic

arxiv · Mar 31

Can AI be a Teaching Partner? Evaluating ChatGPT, Gemini, and DeepSeek across Three Teaching Strategies

arXiv:2603.26673v1 Announce Type: cross Abstract: There are growing promises that Large Language Models (LLMs) can support students' learning by providing explanations, feedback, and guidance. However, despite their rapid adoption and widespread attention, there is still limited empirical evidence r

huggingface · Feb 3

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

huggingface · Jan 27

Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

huggingface · Jan 20

One Year Since the “DeepSeek Moment”

huggingface · Jan 31

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

huggingface · Jan 30

How to deploy and fine-tune DeepSeek models on AWS

huggingface · Jan 28

Open-R1: a fully open reproduction of DeepSeek-R1

arxiv · May 21

Refining and Reusing Annotation Guidelines for LLM Annotation

arXiv:2605.20809v1 Announce Type: new Abstract: While Large Language Models (LLMs) demonstrate remarkable performance on zero-shot annotation tasks, they often struggle with the specialized conventions of gold-standard benchmarks. We propose the systematic reuse and refinement of annotation guidelin

#research #language models #benchmark

arxiv · May 19

Fidelity Probes for Specification--Code Alignment

arXiv:2605.17246v1 Announce Type: cross Abstract: We introduce fidelity probes: natural-language questions generated from a reference artifact with code-derived ground-truth answers, answered from a candidate specification. The fraction of agreeing probes, which we call the fidelity, decomposes into

#machine learning #artificial intelligence #benchmark

arxiv · Apr 6 · bearish

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments. Unlike chat systems, they maintain state across interactions and translate intermediate outputs into concrete actions. T

#safety #benchmark #autonomous agents