Multilingual Training and Evaluation Resources for Vision-Language Models
arXiv:2604.18347v1 Announce Type: new Abstract: Vision Language Models (VLMs) achieved rapid progress in the recent years. However, despite their growth, VLMs development is heavily grounded on English, leading to two main limitations: (i) the lack of multilingual and multimodal datasets for trainin