Abstract: We benchmark 90 chunker–model configurations across seven arXiv domains (2,520 retrieval runs) and show that a sentence-based splitter with a 512-token window and 200-token overlap reaches ...