DataBubble
Tag

#nlp

11 articles tagged #nlp

arxiv · May 29

Slide Deck Q&A Quality Assurance App: A Multi-Stage Pipeline for Pedagogical Question Generation

arXiv:2605.26428v2 Announce Type: replace Abstract: Generating high-quality, pedagogically useful questions from lecture slide decks is difficult because important instructional content is distributed across both text and visual elements, and because useful questions must be scaffolded across the fl

#education #nlp #question-generation

arxiv · May 26

Exploring Profiles of Cognitive Distortions Associated with Mental Health Disorders

arXiv:2605.24996v1 Announce Type: new Abstract: Cognitive distortions, distorted patterns of thinking, have been increasingly studied in computational mental health research. Although they are related to many, if not all, mental health disorders, most existing studies focus primarily on depression.

TR · 1 model · #mental-health #research #nlp

arxiv · May 21

Assessing socio-economic climate impacts from text data

arXiv:2605.20793v1 Announce Type: new Abstract: Recent advances in natural language processing (NLP) and large language models (LLMs) have enabled the systematic use of large-scale textual data from news, social media, and reports to create datasets with socio-economic impacts of climate hazards suc

#nlp #climate #disaster-risk

arxiv · May 5

ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

arXiv:2605.00116v1 Announce Type: cross Abstract: In this article, we introduce ViLegalNLI, the first large-scale Vietnamese Natural Language Inference (NLI) dataset specifically constructed for the legal domain. The dataset consists of 42,012 premise-hypothesis pairs derived from official statutory

#nlp #dataset #legal

arxiv · May 5 · bearish

Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs

arXiv:2605.01224v1 Announce Type: new Abstract: This paper argues that contemporary multilingual NLP has converged on a fragile and misleading paradigm of incidental multilingualism. Today's LLMs appear multilingual largely because they are trained on massive, uneven web corpora, not because multili

LL · 1 model · #nlp #multilingualism #language-models

arxiv · May 1

Supercharging Agenda Setting Research: The ParlaCAP Dataset of 28 European Parliaments and a Scalable Multilingual LLM-Based Classification

arXiv:2602.16516v2 Announce Type: replace Abstract: This paper introduces ParlaCAP, a large-scale dataset for analyzing parliamentary agenda setting across Europe, and proposes a cost-effective method for building domain-specific policy topic classifiers. Applying the Comparative Agendas Project (CA

PA LA · 2 models · #dataset #nlp #policy

arxiv · Apr 28 · bullish

ComplianceNLP: Knowledge-Graph-Augmented RAG for Multi-Framework Regulatory Gap Detection

arXiv:2604.23585v1 Announce Type: new Abstract: Financial institutions must track over 60,000 regulatory events annually, overwhelming manual compliance teams; the industry has paid over USD 300 billion in fines and settlements since the 2008 financial crisis. We present ComplianceNLP, an end-to-end

CO OP LE · 4 models · +1 · #compliance #regulatory #nlp

arxiv · Apr 23

Structured Disagreement in Health-Literacy Annotation: Epistemic Stability, Conceptual Difficulty, and Agreement-Stratified Inference

arXiv:2604.19943v1 Announce Type: new Abstract: Annotation pipelines in Natural Language Processing (NLP) commonly assume a single latent ground truth per instance and resolve disagreement through label aggregation. Perspectivist approaches challenge this view by treating disagreement as potentially

#nlp #annotation #health-literacy

arxiv · Apr 22 · bullish

Model-Agnostic Meta Learning for Class Imbalance Adaptation

arXiv:2604.18759v1 Announce Type: new Abstract: Class imbalance is a widespread challenge in NLP tasks, significantly hindering robust performance across diverse domains and applications. We introduce Hardness-Aware Meta-Resample (HAMR), a unified framework that adaptively addresses both class imbal

HA · 1 model · #nlp #class-imbalance #resampling

arxiv · Apr 21

Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures

arXiv:2604.16042v2 Announce Type: cross Abstract: While Large Language Models (LLMs) have achieved strong performance across many NLP tasks, their opaque internal mechanisms hinder trustworthiness and safe deployment. Existing surveys in explainable AI largely focus on post-hoc explanation methods t

#explainability #nlp #research

arxiv · Apr 4

A Dynamic Atlas of Persian Poetic Symbolism: Families, Fields, and the Historical Rewiring of Meaning

arXiv:2604.01467v1 Announce Type: new Abstract: Persian poetry is often remembered through recurrent symbols before it is remembered through plot. Wine vessels, gardens, flames, sacred titles, bodily beauty, and courtly names return across centuries, yet computational work still tends to flatten thi

#poetry #nlp #corpus-analysis