Generating Concept Lexicalizations via Dictionary-Based Cross-Lingual Sense Projection
Abstract: We study the task of automatically expanding WordNet-style lexical resources to new languages through sense generation. We generate senses by associating target-language lemmas with existing lexical concepts via semantic projection. Given a sense-tagged English corpus and its translation, our method projects English synsets onto aligned target-language tokens and assigns the corresponding lemmas to those synsets. To generate these alignments and ensure their quality, we augment a pre-trained base aligner with a bilingual dictionary, which is also used to filter out incorrect sense projections. We evaluate the method on multiple languages, comparing it to prior methods, as well as dictionary-based and large language model baselines. Results show that the proposed project-and-filter strategy improves precision while remaining interpretable and requiring few external resources. We plan to make our code, documentation, and generated sense inventories accessible.

Comments: To be published in the proceedings of Canadian AI 2026
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2604.14397 [cs.CL] (or arXiv:2604.14397v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.14397 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From: Ning Shi. [v1] Wed, 15 Apr 2026 20:27:26 UTC (290 KB)
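The project-and-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_senses`, the data layout, the placeholder synset identifier, and the exact filtering rule (keep a projection only when the bilingual dictionary lists the target lemma as a translation of the English lemma) are all assumptions made for illustration. The alignment is taken as given here, whereas the paper derives it from a dictionary-augmented pre-trained aligner.

```python
from collections import defaultdict


def project_senses(src_tokens, tgt_lemmas, alignment, dictionary):
    """Project synsets from sense-tagged English tokens onto aligned target lemmas.

    src_tokens: list of (english_lemma, synset_id or None) pairs
    tgt_lemmas: list of target-language lemmas, one per target token
    alignment:  iterable of (src_index, tgt_index) word-alignment links
    dictionary: dict mapping an English lemma to a set of target lemmas

    Hypothetical filter (an assumption, not confirmed by the source):
    a projection is kept only if the dictionary supports the lemma pair.
    """
    senses = defaultdict(set)
    for src_idx, tgt_idx in alignment:
        en_lemma, synset_id = src_tokens[src_idx]
        if synset_id is None:
            # Untagged source token: nothing to project.
            continue
        tgt_lemma = tgt_lemmas[tgt_idx]
        # Filter step: drop projections the dictionary does not support.
        if tgt_lemma in dictionary.get(en_lemma, set()):
            senses[synset_id].add(tgt_lemma)
    return dict(senses)


# Toy English-French example with made-up synset identifiers.
src = [("the", None), ("bank", "synset:river_bank"), ("flood", "synset:flood_verb")]
tgt = ["la", "rive", "déborder"]
links = [(0, 0), (1, 1), (2, 1)]  # the link (2, 1) is a deliberate alignment error
bilingual = {"bank": {"rive", "banque"}, "flood": {"inonder"}}

result = project_senses(src, tgt, links, bilingual)
print(result)
```

In this toy case the erroneous link from "flood" to "rive" is discarded because the dictionary does not list that pair, so only the "bank" to "rive" projection survives. A filter of this kind trades recall for precision: correct projections missing from the dictionary are dropped as well.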