DataBubble

Model Detail

nomic-embed-text-v1.5

Provider: nomic-ai
Category: code
Pipeline: sentence-similarity

DB Score

1.2

Downloads

17.1M

Likes

833

Day

-0.6%

Week

+0.0%

Month

+0.0%

Overview

nomic-embed-text-v1.5 is a text embedding model released by nomic-ai, listed here with 68M parameters. It is registered under the sentence-similarity pipeline tag on Hugging Face and distributed under the permissive apache-2.0 license.

Technical

nomic-embed-text-v1.5 ships with 68M parameters. The apache-2.0 license is permissive, allowing commercial deployment and derivative work without per-seat fees, though attribution requirements still apply.

Trending Signal

Downloads of nomic-embed-text-v1.5 have moved -0.6% over the past 24 hours, while the weekly and monthly figures are flat at +0.0%. That is a slight downtrend, consistent with normal cooling as newer models compete for the same workloads. These numbers are a signal, not a guarantee: download counts on Hugging Face also reflect mirror traffic, CI scrapes, and one-off benchmarking runs.
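As a rough illustration of how a day-over-day figure like the one above can be derived, the sketch below computes a percent change from two download counts. The function name and the counts are hypothetical and are not real figures for this model; the page does not state how its own percentages are calculated.

```python
def pct_change(previous: float, current: float) -> float:
    """Relative change from `previous` to `current`, in percent."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical daily download counts (illustrative only).
yesterday, today = 100_000, 99_400
print(f"{pct_change(yesterday, today):+.1f}%")
```

A drop of 600 downloads on a base of 100,000 is enough to produce a movement of this size, which is why a single day's change is best read alongside the weekly and monthly figures.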

Use Cases

nomic-embed-text-v1.5 is best fit for semantic search, retrieval-backed Q&A over documents or repositories, clustering, and text classification. As an embedding model it does not generate text, so it is not a fit for code completion or other generative tasks on its own. Treat this as a starting matrix rather than a benchmark verdict: the right deployment usually depends on an evaluation suite that mirrors your workload.
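Models under the sentence-similarity pipeline tag are typically used by comparing embedding vectors, most often with cosine similarity. The sketch below shows that comparison step in plain Python. The three-dimensional vectors and document names are made-up stand-ins; real embeddings would come from the model and have far more dimensions, and the helper names here are hypothetical rather than part of any library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for model embeddings (illustrative only).
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [0.9, 0.1, 1.1],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}

ranking = rank_by_similarity(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real deployment the ranking step is usually delegated to a vector index, but the scoring idea is the same.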

Research Paper

arXiv: 2402.01613

Model Info

License: apache-2.0

Citations: 295 (25 influential)

Related Models

all-MiniLM-L6-v2

SBERT · 254.9M downloads

nomic-embed-text-v1.5

nomic-ai · 17.1M downloads