Tag

#artificial intelligence

9 articles tagged #artificial intelligence

arxiv · 3d ago · bullish

PhasorFlow: A Python Library for Unit Circle Based Computing

arXiv:2603.15886v4 Announce Type: replace-cross Abstract: We present PhasorFlow, an open-source Python library for computing on the $S^1$ unit circle. Inputs are encoded as complex phasors $z=e^{i\phi}$ on the $N$-torus ($\mathbb{T}^N$); as computation proceeds through unitary wave-interference gate

4 models · #open-source #machine learning #artificial intelligence

arxiv · 3d ago · bullish

Function-Aware Fill-in-the-Middle as Mid-Training for Coding Agent Foundation Models

arXiv:2607.12463v2 Announce Type: replace Abstract: Coding agents must integrate external tool returns into ongoing reasoning - a capability that standard left-to-right pretraining on code exposes only in its forward direction. We observe that the action-observation-continuation loop of a coding age

2 models · #pretraining #self-supervised #mid-training

arxiv · 6d ago

Sensitivity to Subjective Expected Utility Maximization: A Methodological Study, with an Illustrative Application to LLM Decision-Making

arXiv:2607.11920v1 Announce Type: cross Abstract: Evaluating decisions made under uncertainty is hard when labeled outcomes are scarce, costly, or confounded with luck. We treat subjective expected utility (SEU) maximization as a stated standard and define a graded measure -- SEU sensitivity -- of a

2 models · #econometrics #artificial intelligence #methodology

arxiv · Jun 18 · bearish

"Did you lie?" Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms

arXiv:2606.12618v2 Announce Type: replace Abstract: Robust lie detectors for language models could enable powerful techniques for auditing, monitoring, and post-hoc investigation of model behaviour, but evaluating them requires testbeds where models verifiably believe the opposite of what they say.

4 models · #lie detection #language models #model evaluation

arxiv · May 21 · bullish

From SGD to Muon: Adaptive Optimization via Schatten-p Norms

arXiv:2605.19781v1 Announce Type: new Abstract: Modern optimizers, like Muon, impose matrix-wise geometry constraints on their updates. These matrix-wise constraints can be unified under Linear Minimization Oracle (LMO) theory. However, all current methods impose fixed LMO geometries for the update

5 models · #optimization #deep learning #neural networks

arxiv · May 19

Fidelity Probes for Specification–Code Alignment

arXiv:2605.17246v1 Announce Type: cross Abstract: We introduce fidelity probes: natural-language questions generated from a reference artifact with code-derived ground-truth answers, answered from a candidate specification. The fraction of agreeing probes, which we call the fidelity, decomposes into

7 models · #machine learning #artificial intelligence #benchmark

arxiv · May 16 · bullish

Pelican-Unified 1.0: A Unified Embodied Intelligence Model for Understanding, Reasoning, Imagination and Action

arXiv:2605.15153v1 Announce Type: cross Abstract: We present Pelican-Unified 1.0, the first embodied foundation model trained according to the principle of unification. Pelican-Unified 1.0 uses a single VLM as a unified understanding module, mapping scenes, instructions, visual contexts, and action

1 model · #robotics #artificial intelligence #unified models

arxiv · May 11

The Single-File Test: A Longitudinal Public-Interface Evaluation of First-Output LLM Web Generation with Social Reach Tracking

arXiv:2605.06707v1 Announce Type: cross Abstract: This paper presents an eight-week observational comparison of 68 single-file HTML generations collected across 17 public experiments in the "HTML AI Battle" project between December 10, 2025 and February 4, 2026. Four reasoning model families, GPT, G

4 models · #software engineering #artificial intelligence #benchmark

arxiv · Apr 16 · bullish

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

arXiv:2604.11950v1 Announce Type: cross Abstract: While recent LLM-based agents can identify many candidate bugs in source code, their reports remain static hypotheses that require manual validation, limiting the practicality of automated bug detection. We frame this challenge as a test generation t

2 models · #software engineering #bug detection #test generation