[QA] Pre-training Small Base LMs with Fewer Tokens | Arxiv Papers | Podwise