Model Detail
bert-base-indonesian-NER
Machine learning and emoji prediction: How much accuracy can MARBERT achieve?
arXiv:2604.21108v2 Announce Type: replace Abstract: This study investigates the use of Machine Learning (ML) to predict emojis in Arabic tweets using the state-of-the-art MARBERT model. A corpus of 11379 CA tweets representing multiple Arabic colloquial dialects was collected from X.com via Pyt
Duluth at SemEval-2026 Task 6: DeBERTa with LLM-Augmented Data for Unmasking Political Question Evasions
arXiv:2604.20168v1 Announce Type: new Abstract: This paper presents the Duluth approach to SemEval-2026 Task 6 on CLARITY: Unmasking Political Question Evasions. We address Task 1 (clarity-level classification) and Task 2 (evasion-level classification), both of which involve classifying question--an
NameBERT: Scaling Name-Based Nationality Classification with LLM-Augmented Open Academic Data
arXiv:2604.10401v2 Announce Type: replace Abstract: Inferring nationality from personal names is a critical capability for equity and bias monitoring, personalization, and a valuable tool in biomedical and sociological research. However, existing name-based nationality classifiers are typically trai
Diagnosable ColBERT: Debugging Late-Interaction Retrieval Models Using a Learned Latent Space as Reference
arXiv:2604.19566v1 Announce Type: cross Abstract: Reliable biomedical and clinical retrieval requires more than strong ranking performance: it requires a practical way to find systematic model failures and curate the training evidence needed to correct them. Late-interaction models such as ColBERT p
WARBERT: A Hierarchical BERT-based Model for Web API Recommendation
arXiv:2509.23175v2 Announce Type: replace-cross Abstract: With the rise of Web 2.0 and microservices, the increasing availability of Web APIs has intensified the need for effective recommendation systems. Existing approaches are generally categorized into two methods: recommendation-type methods, wh
LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning
arXiv:2604.16058v1 Announce Type: cross Abstract: The rapid proliferation of Large Language Models (LLMs) in software development has made distinguishing AI-generated code from human-written code a critical challenge with implications for academic integrity, code quality assurance, and software secu