·
DataBubble
  • Home
  • Models
  • News
  • Compare
  • Boards
  • Pricing
  • About
  • Newsletter
  • Methodology
  • Contact
Latest
Startup Battlefield 200 applications officially close in 3 days1h◆Google will pay SpaceX $920M per month for compute2h◆The most interesting startups right now want to get you off your phone4h◆This is your laptop… on AI5h◆New York lawmakers pass one-year ban on new data centers6h◆The token bill comes due: Inside the industry scramble to manage AI’s runaway costs6h◆The latest AI news we announced in May 20267h◆The ‘together tech’ wave might be the most intriguing startup bet of 20267h◆This AI startup says it can tell if a script will make a hit film7h◆AirTrunk commits $30B to build 5GW of AI data centers in India8h◆The Meta hack shows there’s more to AI security than Mythos12h◆Mira Murati steps back into the spotlight, carefully16h◆SFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning17h◆Optical-Guided Neural Collapse for SAR Few-Shot Class Incremental Learning17h◆Dynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models17h◆Temporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents17h◆Why Muon Outperforms Adam: A Curvature Perspective17h◆Vision Hopfield Memory Networks17h◆Provably Auditable and Safe LLM Agents from Human-Authored Ontologies17h◆FlexRank: Nested Low-Rank Knowledge Decomposition for Adaptive Model Deployment17h◆Startup Battlefield 200 applications officially close in 3 days1h◆Google will pay SpaceX $920M per month for compute2h◆The most interesting startups right now want to get you off your phone4h◆This is your laptop… on AI5h◆New York lawmakers pass one-year ban on new data centers6h◆The token bill comes due: Inside the industry scramble to manage AI’s runaway costs6h◆The latest AI news we announced in May 20267h◆The ‘together tech’ wave might be the most intriguing startup bet of 20267h◆This AI startup says it can tell if a script will make a hit film7h◆AirTrunk commits $30B to build 5GW of AI data centers in India8h◆The Meta hack shows there’s more to AI security than Mythos12h◆Mira Murati steps back into the spotlight, carefully16h◆SFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning17h◆Optical-Guided Neural Collapse for SAR Few-Shot Class Incremental Learning17h◆Dynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models17h◆Temporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents17h◆Why Muon Outperforms Adam: A Curvature Perspective17h◆Vision Hopfield Memory Networks17h◆Provably Auditable and Safe LLM Agents from Human-Authored Ontologies17h◆FlexRank: Nested Low-Rank Knowledge Decomposition for Adaptive Model Deployment17h◆
News/model/Lance

Lance news

48 articles mentioning Lance

arxiv2d ago

CoughSense: Five-Class Respiratory Disease Classification via Whisper Encoder Fine-Tuning and Dual-Encoder Cross-Attention Fusion with Balanced Contrastive Learning

arXiv:2606.02998v1 Announce Type: new Abstract: Automated cough analysis offers a path to low-cost respiratory screening, but most existing work stops at binary COVID-19 detection. A practical tool needs to tell apart several respiratory conditions from one cough recording on a consumer smartphone.

arxiv2d ago

AutoTail-BSFGM: Class-Balance-Aware Fine-Tuning for Chinese Scholarly Text Classification

arXiv:2606.03576v1 Announce Type: new Abstract: Scholarly text classification supports literature organization, subject indexing, and research intelligence, but Chinese scholarly corpora often contain imbalanced and semantically adjacent disciplinary labels. We propose AutoTail-BSFGM, a class-balanc

arxiv2d ago

Auditable Climate Risk Intelligence from Fragmented ESG Data: Deterministic Orchestration and Imbalance-Aware Learning for Scope 1-3 Validation

arXiv:2606.02604v1 Announce Type: cross Abstract: ESG and climate risk data remain fragmented across heterogeneous Scope 1, Scope 2, and Scope 3 reporting environments, while conventional validation pipelines lack provenance aware auditability, hidden drift detection, and reproducibility oriented go

arxiv2d ago

Towards Compact Autonomous Driving Perception with Balanced Learning and Multi-sensor Fusion

arXiv:2606.02979v1 Announce Type: cross Abstract: We present a novel compact deep multi-task learning model to handle various autonomous driving perception tasks in one forward pass. The model performs multiple views of semantic segmentation, depth estimation, light detection and ranging (LiDAR) seg

arxiv3d ago

Transferable Self-Harm Surveillance from Emergency Department Triage Notes Using an Evidence-Augmented Machine Learning Approach

arXiv:2606.02545v1 Announce Type: new Abstract: Self-harm is a major public health concern, but current surveillance relying on hospital presentations is inadequate due to the low sensitivity of diagnostic codes. Emergency Department (ED) triage notes, recorded at the initial point of contact, provi

arxiv3d ago

LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification

arXiv:2606.00647v1 Announce Type: cross Abstract: Detecting psychological defense mechanisms in conversational text remains a challenging clinical NLP problem. For the PsyDefDetect 2026 shared task (nine-class utterance classification evaluated via macro F1), our team LinguIUTics achieves a macro F1

arxiv3d ago

PATRA: Pattern-Aware Alignment and Balanced Reasoning for Time Series Question Answering

arXiv:2602.23161v4 Announce Type: replace Abstract: Time series reasoning demands both the perception of complex dynamics and logical depth. However, existing LLM-based approaches exhibit two limitations: they often treat time series merely as text or images, failing to capture the patterns like tre

arxiv3d ago

Conditioned free-energy density of proteins using unbalanced solutions to constraint satisfaction problems

arXiv:2606.01329v1 Announce Type: new Abstract: We show that computing the log-partition function (free-energy) of conditioned inhomogeneous Curie--Weiss spin Hamiltonians reduces to an unbalanced $2 \to 1$ norm computation, and design a polynomial-time SDP algorithm for this problem with a lower bo

arxiv3d ago

Balancing Knowledge Distillation for Imbalance Learning with Bilevel Optimization

arXiv:2605.17839v3 Announce Type: replace-cross Abstract: Knowledge distillation transfers knowledge from a high capacity teacher to a compact student using a mixture of hard and soft losses. On imbalanced data, a fixed weighting between hard and soft losses becomes brittle the learning process. Rec

arxiv3d ago

Symmetric Hermite quadrature-based balanced truncation for learning linear dynamical systems from derivative data

arXiv:2606.00298v1 Announce Type: cross Abstract: Data-driven reduced-order modeling is an essential component in the computer-aided design of control systems. In this work, we present a novel symmetric Hermite formulation of the quadrature-based balanced truncation algorithm that constructs linear

arxiv3d ago

Challenges in the calibration of tree-based models for imbalanced classification

arXiv:2412.16209v5 Announce Type: replace Abstract: When using machine learning for imbalanced binary classification problems, it is common to subsample the majority class to create a (more) balanced training dataset. This biases the model's predictions because the model learns from data that is not

arxiv3d ago

Beyond the Simplex: Balanced Prototype Geometry for Scorer-Agnostic Open-Set Recognition

arXiv:2606.01883v1 Announce Type: new Abstract: Open-set recognition (OSR) requires a classifier to reject inputs from unseen classes which is essential in safety-critical settings such as medical imaging. Simplex based methods, which fix class prototypes at the vertices of a regular simplex and the

arxiv3d ago

On Imbalanced Regression with Hoeffding Trees

arXiv:2602.22101v3 Announce Type: replace-cross Abstract: Many real-world applications generate continuous data streams for regression. Hoeffding trees and their variants have a long-standing tradition due to their effectiveness, either alone or as base models in broader ensembles. Recent batch-lear

arxiv3d ago

Hybrid Imbalanced Regression Through Unified Data-Level and Algorithm-Level Balancing

arXiv:2606.01221v1 Announce Type: cross Abstract: Imbalanced learning is a critical challenge in machine learning, where underrepresented target values can bias models and degrade prediction performance on rare but important cases. Although extensively studied in classification, imbalanced regressio

arxiv4d ago

Configurable Reward Model for Balanced Safety Alignment

arXiv:2605.30487v1 Announce Type: new Abstract: Aligning large language models (LLMs) to heterogeneous and rapidly evolving safety requirements remains a critical challenge. Existing instruction-tuned LLMs and standalone safety classifiers often fail to generalize to new safety configurations, motiv

arxiv4d ago

Balanced LoRA: Removing Parameter Invariance to Accelerate Convergence

arXiv:2605.31484v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) is the most widely adopted method for fine-tuning large language models. Notably, LoRA is inherently overparameterized: multiple pairs of low-rank factors can yield the same adapted weight matrix. We show--both theoretically

arxiv4d ago

Translation Analytics for Freelancers II: Benchmarking Local LLMs for Confidential Translation Workflows

arXiv:2605.31452v1 Announce Type: new Abstract: Building on our previous work, this paper develops practical, low-barrier methods for freelance translators and smaller language service providers to evaluate translation technologies using rigorous yet accessible analytic methods. Here we address a hi

arxivMay 29

DAMEL: Dual-Axis Multi-Expert Learning for Class-Imbalanced Learning

arXiv:2605.30135v1 Announce Type: cross Abstract: Various algorithms have been proposed to address the challenges posed by class-imbalanced learning from real-world data with long-tailed distributions. While these algorithms reduce prediction bias through rebalancing techniques, they often introduce

arxivMay 29

Rel-MOSS: Towards Imbalanced Relational Deep Learning on Relational Databases

arXiv:2603.07916v2 Announce Type: replace Abstract: In recent advances, to enable a fully data-driven learning paradigm on relational databases (RDB), relational deep learning (RDL) is proposed to structure the RDB as a heterogeneous entity graph and adopt the graph neural network (GNN) as the predi

arxivMay 29

A Minimal Bifurcation Model of Load Imbalance in a Softmax Mixture-of-Experts Router

arXiv:2605.29121v1 Announce Type: cross Abstract: We propose a minimal dynamical model of adaptive softmax routing for a two-expert Mixture-of-Experts (MoE) layer. The model is obtained as a mean-field limit of a discrete reinforcement rule: the selected expert receives a small score increment, whil

arxivMay 29

Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training

arXiv:2603.00454v2 Announce Type: replace-cross Abstract: Generative Flow Networks (GFlowNets) enable fine-tuning large language models to approximate reward-proportional posteriors, but they remain prone to mode collapse, manifesting as prefix collapse and length bias. We attribute this to two fact

arxivMay 29

Stable-GFlowNet: Toward Diverse and Robust LLM Red-Teaming via Contrastive Trajectory Balance

arXiv:2605.00553v2 Announce Type: replace Abstract: Large Language Model (LLM) Red-Teaming, which proactively identifies vulnerabilities of LLMs, is an essential process for ensuring safety. Finding effective and diverse attacks in red-teaming is important, but achieving both is challenging. Generat

arxivMay 28

Stance Detection in Prediction Markets: Addressing Imbalanced Trader Commentary via Counterfactual Augmentation and Market Context

arXiv:2605.28745v1 Announce Type: new Abstract: Prediction markets such as Polymarket aggregate crowd beliefs into real-time probability estimates, and the comments traders post beneath each market contain rich directional stance signals that prices alone cannot capture. This work introduces the fir

arxivMay 28

AdaDPO: Self-Adaptive Direct Preference Optimization with Balanced Gradient Updates

arXiv:2605.28440v1 Announce Type: new Abstract: DPO has become a widely adopted alternative to RLHF for aligning LLMs with human preferences, eliminating the need for a separate reward model or RL loop. Recent theoretical analysis uncovers an asymmetric gradient behavior in DPO: the loss suppresses

arxivMay 28

Long Live The Balance: Information Bottleneck Driven Tree-based Policy Optimization

arXiv:2605.28109v1 Announce Type: new Abstract: Recent advances in online reinforcement learning (RL) for large language models (LLMs) have demonstrated promising performance in complex reasoning tasks. However, they often exhibit an imbalanced exploration-exploitation trade-off, resulting in unstab

arxivMay 27

Enhancing Autonomous Online Intrusion Detection for IoT with Balanced Learning, Reliable Pseudo-Labels, and Lightweight Architectures

arXiv:2605.26166v1 Announce Type: cross Abstract: The rapid proliferation of Internet of Things (IoT) devices has created an urgent demand for adaptive, resource-efficient Intrusion Detection Systems (IDS) capable of handling dynamic and evolving cyber threats. This paper investigates AOC-IDS, a sta

arxivMay 27

The AI Cognitive Trojan Horse: How Large Language Models May Bypass Human Epistemic Vigilance

arXiv:2601.07085v2 Announce Type: replace-cross Abstract: Large language model (LLM)-based conversational AI systems present a challenge to human cognition that current frameworks for understanding misinformation and persuasion do not adequately address. This paper proposes that a significant episte

arxivMay 27

An Empirical Study of Machine Learning Robustness and Scalability for Imbalanced Tabular Clinical Data in Emergency and Critical Care

arXiv:2512.21602v3 Announce Type: replace Abstract: Every year, millions of patients pass through emergency departments and intensive care units, where clinicians must make high-stakes decisions under time pressure and uncertainty. Machine learning could support prediction of deterioration, triage,

arxivMay 27

Focal Reward: Balanced Reinforcement Learning under Rubric-Based Rewards

arXiv:2605.26579v1 Announce Type: new Abstract: The open-ended generation in LLMs usually requires multi-dimensional rubrics to adequately assess quality and guide the improvement of reinforcement learning. However, a critical dilemma inherent in this training paradigm is the imbalanced reward polar

arxivMay 27

Suicide Risk Assessment from AI-powered Video Surveillance: An Interpretable Framework for Prevention in Metro Stations

arXiv:2605.22904v2 Announce Type: replace-cross Abstract: Understanding and monitoring human behavior in metro stations play an important role in supporting suicide prevention efforts, where early identification of high-risk situations can enable timely intervention. This requires assessing suicide

arxivMay 26

Complement Submodular Information Measures for Balanced and Robust Data Selection

arXiv:2605.24779v1 Announce Type: cross Abstract: Submodular optimization has become a fundamental paradigm for data selection, retrieval, summarization, and representation learning due to its ability to model coverage, diversity, and representativeness. However, classical submodular objectives opti

arxivMay 26

On the Impact of Class Imbalance on the Learning Dynamics of Deep Neural Networks:An Intuitive Insight

arXiv:2605.24908v1 Announce Type: cross Abstract: Class imbalance in deep neural networks (DNNs) has witnessed a rapid increase in research attention in recent years. However, the varying accounts of the reasons behind the poor performance of DNN on imbalance data in pertinent literature shows that

arxivMay 26

Unbalanced Incomplete Multi-view Clustering via the Scheme of View Evolution: Weak Views are Meat; Strong Views do Eat

arXiv:2011.10254v3 Announce Type: replace-cross Abstract: Incomplete multi-view clustering is an important technique to deal with real-world incomplete multi-view data. Previous works assume that all views have the same incompleteness, i.e., balanced incompleteness. However, different views often ha

arxivMay 26

Feature Resemblance: Towards a Theoretical Understanding of Analogical Reasoning in Transformers

arXiv:2603.05143v3 Announce Type: replace Abstract: Understanding reasoning in large language models is complicated by evaluations that conflate multiple reasoning types. We isolate analogical reasoning, where a model transfers an attribute between entities that share known properties, and study whe

arxivMay 26

CopulaSMOTE: A Copula-Based Oversampling Approach for Imbalanced Classification in Diabetes Prediction

arXiv:2506.17326v3 Announce Type: replace Abstract: Class imbalance remains a practical obstacle in the development of clinical prediction models for conditions such as diabetes mellitus, where the number of confirmed cases is often much smaller than the number of controls. The Synthetic Minority Ov

arxivMay 26

RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

arXiv:2512.23995v2 Announce Type: replace-cross Abstract: Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To accommodate the growing number of experts in practice, modern inference systems commonly adopt expert p

arxivMay 26

PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification

arXiv:2602.19333v2 Announce Type: replace Abstract: This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. The dataset comprises 36,000 posts across nine cate

arxivMay 26

Multi-market value-stacking: Battery control for combined imbalance participation and non-uniform FCR bidding

arXiv:2605.23964v1 Announce Type: cross Abstract: The growing share of Renewable Energy Sources (RES) in modern power systems increases both grid imbalances and frequency deviations, reinforcing the need for ancillary services such as Frequency Containment Reserve (FCR) and passive balancing. Batter

arxivMay 26

Decoding Stimulus Reconstruction-Based Auditory Attention Robustly in Unbalanced EEG Datasets

arXiv:2605.25605v1 Announce Type: cross Abstract: In the past decade, numerous studies have applied deep neural networks (DNNs) to decode auditory attention (AAD) from Electroencephalogram (EEG) signals via stimulus reconstruction. However, the influence of dataset balance on the decoding performanc

arxivMay 25

Class-Dependent Hybrid Data Augmentation for Multiclass Migraine Classification under Severe Class Imbalance

arXiv:2605.23453v1 Announce Type: new Abstract: We conducted a reproducibility-oriented re-evaluation of prior migraine classification studies, correcting for data leakage and metric bias. We then introduced (i) a clinically motivated aggregation of two hemiplegic subtypes following ICHD-3 {\S}1.2.3

arxivMay 25

Selective Ambulance Dispatch Under Contextual Travel-Time Uncertainty

arXiv:2605.23378v1 Announce Type: cross Abstract: Ambulance response is time-critical in out-of-hospital cardiac arrest (OHCA), where dispatchers must balance timely arrivals with limited fleet capacity. Static territories and deterministic travel-time estimates are vulnerable to dynamic congestion,

arxivMay 25

Anatomy-Guided Vision-Language Learning with Angular Prototype Separation for Multi-Label Video Capsule Endoscopy Classification Under Class Imbalance

arXiv:2603.17879v2 Announce Type: replace-cross Abstract: This work presents a multi-label temporal event detection framework for video capsule endoscopy (VCE) that addresses the extreme class imbalance inherent in the Galar dataset by combining two principal contributions: an Angular Separation Los

arxivMay 22

Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition

arXiv:2605.21565v1 Announce Type: new Abstract: Multimodal Emotion Recognition in Conversations (MERC) is a crucial task for understanding human interactions, where multimodal approaches integrating language, facial expressions, and vocal tone have achieved significant progress. However, modality mi

arxivMay 22

SPECTRA: Spectral Domain-Aware Graph Generation for Imbalanced Molecular Property Regression

arXiv:2511.04838v2 Announce Type: replace Abstract: Molecular property regression struggles with cases in chemically relevant target ranges that are underrepresented in datasets. Standard average error minimization approaches underperform in these highly relevant cases, and oversampling approaches l

arxivMay 22

Visibility nowcasting in South Korea: a machine learning approach to class imbalance and distribution shift

arXiv:2605.21507v1 Announce Type: cross Abstract: Atmospheric visibility is a critical variable for transportation safety and air quality management, however, accurate prediction remains challenging due to the complex interactions between meteorological conditions and air pollutants, as well as the

arxivMay 22

Lance: Unified Multimodal Modeling by Multi-Task Synergy

arXiv:2605.18678v2 Announce Type: replace-cross Abstract: We present Lance, a lightweight native unified model supporting multimodal understanding, generation, and editing for both images and videos. Rather than relying on model capacity scaling or text-image-dominant designs, Lance explores a pract

arxivMay 22

Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation

arXiv:2605.20405v1 Announce Type: cross Abstract: Class imbalance is a fundamental challenge in medical image segmentation, where frequent classes typically dominate training at the expense of rare classes. Loss-based approaches mitigate imbalance by reweighting the per-pixel loss within the batch,

arxivMay 22

Correcting Class Imbalance in Prior-Data Fitted Networks for Tabular Classification

arXiv:2605.21742v1 Announce Type: new Abstract: Prior-data fitted networks (PFNs) have achieved exceptional performance on tabular classification tasks. However, like other classifiers, their performance can suffer under the effect of class imbalance, resulting in poor performance for rare classes.

HomeModelsNews