nomic-embed-text-v1.5 news

48 articles mentioning nomic-embed-text-v1.5

arxiv, May 21

Assessing socio-economic climate impacts from text data

arXiv:2605.20793v1 Announce Type: new Abstract: Recent advances in natural language processing (NLP) and large language models (LLMs) have enabled the systematic use of large-scale textual data from news, social media, and reports to create datasets with socio-economic impacts of climate hazards suc

#nlp #climate #disaster-risk

arxiv, May 21

Supervised Latent Restructuring for Small-Data Quantum Learning in Plant Phenomics

arXiv:2605.20413v1 Announce Type: new Abstract: High-dimensional biological data often exhibit a severe mismatch between feature dimensionality and sample size, making reliable classification difficult in extremely small-data regimes. In these settings, kernel methods can lose discriminative power w

arxiv, May 21

The Economics of Model Collapse: Equilibrium, Welfare, and Optimal Provenance Subsidies in Synthetic Data Markets

arXiv:2605.20279v1 Announce Type: cross Abstract: Generative artificial intelligence is rapidly transforming the supply side of training data: an increasing share of new tokens, images, and structured records is produced by previous-generation models rather than by human originators. Recursive train

arxiv, May 21

The Economics of AI Inference: Inflation Dynamics, Welfare Costs, and Optimal Monetary Policy under the Inference-Cost Phillips Curve

arXiv:2605.20281v1 Announce Type: cross Abstract: We develop a unified microeconomic and monetary theory of artificial intelligence inference costs and their pass-through to inflation, welfare, and optimal monetary policy. We introduce the Inference-Cost Phillips Curve (ICPC), an augmented New Keyne

arxiv, May 19

Computational Challenges in Token Economics: Bridging Economic Theory and AI System Design

arXiv:2605.17410v1 Announce Type: new Abstract: Token economics has emerged as a useful lens for understanding resource allocation, value creation, and pricing in large language model systems. While recent work has increasingly treated tokens as economic primitives, there remains a substantial gap b

arxiv, May 19

Agent Bazaar: Enabling Economic Alignment in Multi-Agent Marketplaces

arXiv:2605.17698v1 Announce Type: new Abstract: The deployment of Large Language Models (LLMs) as autonomous economic agents introduces systemic risks that extend beyond individual capability failures. As agents transition to directly interacting with marketplaces, their collective behavior can ampl

arxiv, May 18

Genome-Factory: A Library for Tuning, Deploying, and Interpreting Genomic Foundation Models

arXiv:2509.12266v2 Announce Type: replace-cross Abstract: We introduce Genome-Factory, the first integrated Python library for tuning, deploying, and interpreting genomic foundation models. Our core contribution is to simplify and unify the workflow for genomic model development: data collection, mo

arxiv, May 16

AttnGen: Attention-Guided Saliency Learning for Interpretable Genomic Sequence Classification

arXiv:2605.14073v1 Announce Type: cross Abstract: Deep neural networks have achieved strong performance in genomic sequence classification; however, relating their predictions to biologically meaningful sequence patterns remains challenging. In this work, we present AttnGen, an attention-guided trai

arxiv, May 13

Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM

arXiv:2605.10447v1 Announce Type: cross Abstract: Agent-based models (ABMs) are increasingly used in macroeconomics, but their analysis still often relies on ad hoc Monte Carlo campaigns with heterogeneous statistical effort across parameter settings. We show how statistical model checking (SMC), im

arxiv, May 13

Token Economics for LLM Agents: A Dual-View Study from Computing and Economics

arXiv:2605.09104v1 Announce Type: new Abstract: As LLM agents evolve, tokens have emerged as the core economic primitives of Agentic AI. However, their exponential consumption introduces severe computational, collaborative, and security bottlenecks. Current surveys remain fragmented across system op

arxiv, May 12

EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments

arXiv:2506.08136v3 Announce Type: replace Abstract: We introduce EconWebArena, a benchmark for evaluating autonomous agents on complex, multimodal economic tasks in realistic web environments. The benchmark comprises 360 curated tasks from 82 authoritative websites spanning domains such as macroecon

arxiv, May 8

AstroAlertBench: Evaluating the Accuracy, Reasoning, and Honesty of Multimodal LLMs in Astronomical Classification

arXiv:2605.05573v1 Announce Type: cross Abstract: Modern astronomical observatories generate a massive volume of multimodal data, creating a critical bottleneck for expert human review. While multimodal large language models (LLMs) have shown promise in interpreting complex visual and textual inputs

arxiv, May 6

DIPLI: Deep Image Prior Lucky Imaging for Blind Astronomical Image Restoration

arXiv:2503.15984v3 Announce Type: replace-cross Abstract: Modern image restoration and super-resolution methods utilize deep learning due to its superior performance compared to traditional algorithms. However, deep learning typically requires large labeled training datasets, which are rarely availa

arxiv, May 6

StreakMind: AI detection and analysis of satellite streaks in astronomical images with automated database integration

arXiv:2605.03429v1 Announce Type: cross Abstract: Artificial satellites and space debris increasingly contaminate astronomical images, affecting scientific surveys and producing large volumes of streaked exposures. Manual inspection is no longer feasible at scale, and reliable detection and characte

arxiv, May 5

What price to pay? Auto-tuning a building MPC controller for optimal economic cost

arXiv:2501.10859v2 Announce Type: replace-cross Abstract: Demand-side management (DSM) programs introduce complex pricing, requiring advanced control for cost minimization. Model Predictive Control (MPC) offers a solution but its performance hinges on appropriate hyperparameter tuning. We propose us

arxiv, May 5

CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift

arXiv:2605.00074v1 Announce Type: cross Abstract: DNA-synthesis providers screen incoming orders by searching the requested sequence against curated hazard lists. We show that this baseline collapses to a 100% false-flag rate when the hazardous sequence comes from a taxonomic family absent from the

arxiv, May 5

Deeper detection limits in astronomical imaging using self-supervised spatiotemporal denoising

arXiv:2602.17205v2 Announce Type: replace-cross Abstract: The detection limit of astronomical imaging observations is limited by several noise sources. Some of that noise is correlated between neighbouring image pixels and exposures, so in principle could be learned and corrected. We present an astr

arxiv, May 1

AgentEconomist: An End-to-end Agentic System Translating Economic Intuitions into Executable Computational Experiments

arXiv:2604.27725v1 Announce Type: cross Abstract: A long-standing challenge in economics lies not in the lack of intuition, but in the difficulty of translating intuitive insights into verifiable research. To address this challenge, we introduce AgentEconomist, an end-to-end interactive system desig

arxiv, Apr 30

Mining Negative Sequential Patterns to Improve Viral Genomic Feature Representation and Classification

arXiv:2604.25968v1 Announce Type: cross Abstract: Viruses represent the most abundant biological entities on Earth and play a pivotal role in microbial ecosystems, yet, as prominent human pathogens, they are closely linked to human morbidity and mortality. Accurate identification of viral sequences

arxiv, Apr 29

Intervention-Aware Multiscale Representation Learning from Imaging Phenomics and Perturbation Transcriptomics

arXiv:2604.22832v1 Announce Type: cross Abstract: Microscopy-based phenotypic profiling is scalable for drug discovery but lacks the mechanistic depth of transcriptomics, which remains costly and scarce. Existing multimodal approaches either use images to support other modalities or naively align re

arxiv, Apr 29

A systematic evaluation of vision-language models for observational astronomical reasoning tasks

arXiv:2604.24589v1 Announce Type: new Abstract: Vision-language models (VLMs) are increasingly proposed as general-purpose tools for scientific data interpretation, yet their reliability on real astronomical observations across diverse modalities remains untested. We present AstroVLBench, a comprehe

arxiv, Apr 28

VAMP-Net: An Interpretable Multi-Path Network of Genomic Permutation-Invariant Set Attention and Quality-Aware 1D-CNN for MTB Drug Resistance

arXiv:2512.21786v2 Announce Type: replace Abstract: Genomic prediction of drug resistance in Mycobacterium tuberculosis is often hindered by complex epistatic interactions and variable sequencing quality. We present the Interpretable Variant-Aware Multi-Path Network (VAMP-Net), a novel architecture

arxiv, Apr 24

The Economics of p(doom): Scenarios of Existential Risk and Economic Growth in the Age of Transformative AI

arXiv:2503.07341v2 Announce Type: replace-cross Abstract: Recent advances in artificial intelligence (AI) have led to a wide range of predictions about its long-term impact on humanity. A central focus is the potential emergence of transformative AI (TAI), eventually capable of outperforming humans

arxiv, Apr 24

Ideological Bias in LLMs' Economic Causal Reasoning

arXiv:2604.21334v1 Announce Type: new Abstract: Do large language models (LLMs) exhibit systematic ideological bias when reasoning about economic causal effects? As LLMs are increasingly used in policy analysis and economic reporting, where directionally correct causal judgments are essential, this

arxiv, Apr 24

Post-AGI Economies: Autonomy and the First Fundamental Theorem of Welfare Economics

arXiv:2604.21216v1 Announce Type: cross Abstract: The First Fundamental Theorem of Welfare Economics assumes that welfare-bearing agents are autonomous and implicitly relies on a binary distinction between autonomy and instrumentality. Welfare subjects are those who have autonomy and therefore the c

#economics #autonomy #artificial-intelligence

arxiv, Apr 23

Cross-Modal Taxonomic Generalization in (Vision-) Language Models

arXiv:2603.07474v2 Announce Type: replace-cross Abstract: What is the interplay between semantic representations learned by language models (LM) from surface form alone to those learned from more grounded evidence? We study this question for a scenario where part of the input comes from a different

arxiv, Apr 22

Are Large Language Models Economically Viable for Industry Deployment?

arXiv:2604.19342v1 Announce Type: new Abstract: Generative AI, powered by Large Language Models (LLMs), is increasingly deployed in industry across healthcare decision support, financial analytics, enterprise retrieval, and conversational automation, where reliability, efficiency, and cost control are

arxiv, Apr 21

Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

arXiv:2604.05523v2 Announce Type: replace Abstract: The ability of large language models (LLMs) to manage and acquire economic resources remains unclear. In this paper, we introduce Market-Bench, a comprehensive benchmark that evaluates the capabilities of LLMs in economically-relevant task

arxiv, Apr 21

Healthcare AI for Automation or Allocation? A Transaction Cost Economics Framework

arXiv:2604.16465v1 Announce Type: new Abstract: Healthcare productivity is shaped not only by clinical complexity but by the costs of coordinating work under uncertainty. Transaction-cost economics offers a theory of these coordination frictions, yet has rarely been operationalised at task level acr

arxiv, Apr 16

Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

arXiv:2604.12737v2 Announce Type: replace-cross Abstract: While Federated Learning (FL) mitigates direct data exposure, the resulting trained models remain susceptible to membership inference attacks (MIAs). This paper presents an empirical evaluation of Differential Privacy (DP) as a defense mechan

arxiv, Apr 16

Sandpile Economics: Theory, Identification, and Evidence

arXiv:2604.13890v1 Announce Type: cross Abstract: Why do capitalist economies recurrently generate crises whose severity is disproportionate to the size of the triggering shock? This paper proposes a structural answer grounded in the evolutionary geometry of production networks. As economies evolve

arxiv, Apr 14

From UAV Imagery to Agronomic Reasoning: A Multimodal LLM Benchmark for Plant Phenotyping

arXiv:2604.09907v1 Announce Type: cross Abstract: To improve crop genetics, high-throughput, effective and comprehensive phenotyping is a critical prerequisite. While such tasks were traditionally performed manually, recent advances in multimodal foundation models, especially in vision-language mode

arxiv, Apr 14

Self-Certifying Primal-Dual Optimization Proxies for Large-Scale Batch Economic Dispatch

arXiv:2510.15850v2 Announce Type: replace-cross Abstract: Recent research has shown that optimization proxies can be trained to high fidelity, achieving average optimality gaps under 1% for large-scale problems. However, worst-case analyses show that there exist in-distribution queries that result i

arxiv, Apr 13

Distilling Genomic Models for Efficient mRNA Representation Learning via Embedding Matching

arXiv:2604.08574v1 Announce Type: cross Abstract: Large Genomic Foundation Models have recently achieved remarkable results and in-vivo translation capabilities. However, these models quickly grow to over a few billion parameters and are expensive to run when compute is limited. To overcome this c

arxiv, Apr 13

dnaHNet: A Scalable and Hierarchical Foundation Model for Genomic Sequence Learning

arXiv:2602.10603v3 Announce Type: replace Abstract: Genomic foundation models have the potential to decode DNA syntax, yet face a fundamental tradeoff in their input representation. Standard fixed-vocabulary tokenizers fragment biologically meaningful motifs such as codons and regulatory elements, w

arxiv, Apr 10

What a Comfortable World: Ergonomic Principles Guided Apartment Layout Generation

arXiv:2604.08411v1 Announce Type: cross Abstract: Current data-driven floor plan generation methods often reproduce the ergonomic inefficiencies found in real-world training datasets. To address this, we propose a novel approach that integrates architectural design principles directly into a transfo

theverge, Apr 8

OpenAI made economic proposals — here’s what DC thinks of them

Happy ceasefire day and welcome to Regulator, a newsletter for Verge subscribers about Big Tech's rocky journey through the world of politics. If you're not a subscriber yet, you can do so here, but my only request is that you sign up before Donald Trump decides to revisit his previous threats towar

arxiv, Apr 7

Entropy, Disagreement, and the Limits of Foundation Models in Genomics

arXiv:2604.04287v1 Announce Type: new Abstract: Foundation models in genomics have shown mixed success compared to their counterparts in natural language processing. Yet, the reasons for their limited effectiveness remain poorly understood. In this work, we investigate the role of entropy as a funda

arxiv, Apr 7

The Ideation Bottleneck: Decomposing the Quality Gap Between AI-Generated and Human Economics Research

arXiv:2604.03338v1 Announce Type: cross Abstract: Autonomous AI systems can now generate complete economics research papers, but they substantially underperform human-authored publications in head-to-head comparisons. This paper decomposes the quality gap into two independent components: research id

arxiv, Apr 6

SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models

arXiv:2604.02660v1 Announce Type: new Abstract: As Large Language Models (LLMs) increasingly power decision-making systems across critical domains, understanding and mitigating their biases becomes essential for responsible AI deployment. Although bias assessment frameworks have proliferated for att

arxiv, Apr 6

Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics

arXiv:2604.02403v1 Announce Type: cross Abstract: This paper establishes the theoretical and practical foundations for using Large Language Models (LLMs) as measurement instruments for latent economic variables -- specifically variables that describe the cognitive content of occupational tasks at a

arxiv, Apr 3

Modeling Irregular Astronomical Time Series with Neural Stochastic Delay Differential Equations

arXiv:2508.17521v2 Announce Type: replace Abstract: Astronomical time series from large-scale surveys like LSST are often irregularly sampled and incomplete, posing challenges for classification and anomaly detection. We introduce a new framework based on Neural Stochastic Delay Differential Equatio

arxiv, Apr 2

Neural Ordinary Differential Equations for Modeling Socio-Economic Dynamics

arXiv:2604.00632v1 Announce Type: cross Abstract: Poverty is a complex dynamic challenge that cannot be adequately captured using predefined differential equations. Nowadays, artificial machine learning (ML) methods have demonstrated significant potential in modelling real-world dynamical systems. A

arxiv, Apr 1

Economics of Human and AI Collaboration: When is Partial Automation More Attractive than Full Automation?

arXiv:2603.29121v1 Announce Type: cross Abstract: This paper develops a unified framework for evaluating the optimal degree of task automation. Moving beyond binary automate-or-not assessments, we model automation intensity as a continuous choice in which firms minimize costs by selecting an AI accu

arxiv, Mar 31

On the Carbon Footprint of Economic Research in the Age of Generative AI

arXiv:2603.26712v1 Announce Type: cross Abstract: Generative artificial intelligence (AI) is increasingly used to write and refactor research code, expanding computational workflows. At the same time, Green AI research has largely measured the footprint of models rather than the downstream workflows

openai, Oct 23

AI in South Korea—OpenAI’s Economic Blueprint

OpenAI's Korea Economic Blueprint outlines how South Korea can scale trusted AI through sovereign capabilities and strategic partnerships to drive growth.

openai, Oct 22

AI in Japan—OpenAI’s Japan Economic Blueprint

OpenAI’s Japan Economic Blueprint outlines how Japan can harness AI to boost innovation, strengthen competitiveness, and enable sustainable, inclusive growth.

openai, Sep 4

Expanding economic opportunity with AI

OpenAI is launching a Jobs Platform and new Certifications to connect workers with jobs, training, and certifications. Learn how we’re expanding economic opportunity and making AI skills more accessible.