arXiv, Apr 11
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale
arXiv:2604.04771v2
Announce Type: replace-cross
Abstract: Current document parsing methods advance primarily through model architecture innovation, while systematic engineering of training data remains underexplored. Yet state-of-the-art models spanning diverse architectures and parameter scales exh