mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVR
Authors: Konstantin Dobler†‡, Simon Lehnerer‡, Federico Scozzafava, Jonathan Janke, Mohamed Ali
Reinforcement Learning with Verifiable Rewards (RLVR) has been successfully applied to significantly boost the capabilities of pretrained large language models, especially in the math and logic problem domains. However, current research and available training datasets remain English-centric. While multilingual training data and benchmarks have been created in the past, they were not created with RLVR and current model capabilities in mind, and their level of difficulty is often too low to provide an appropriate training signal for current models. To address this gap, we provide mAceReason-Math, a dataset of high-quality translations of challenging math problems sourced from a corpus specifically curated for RLVR (AceReason-Math). We further take specific care to clean and improve our translations, resulting in coverage of 14 languages with more than 10,000 samples per language. We release the dataset to facilitate multilingual RLVR research and benchmarking in the research community.
Multilingual Reasoning Gym: Multilingual Scaling of Procedural Reasoning Environments
March 13, 2026. Research area: Speech and Natural Language Processing.
We present the Multilingual Reasoning Gym, an extension of Reasoning Gym (Stojanovski et al., 2025) that procedurally generates verifiable reasoning problems across 14 languages. We translate templates for 94 tasks with native-speaker validation in 10 languages and targeted code or template adaptations to ensure linguistic naturalness. The Multilingual Reasoning Gym preserves the core benefits of the procedural generation approach used in the…
Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
May 16, 2025. Research area: Speech and Natural Language Processing. Conference: ACL.
Current Large Language Models (LLMs) are predominantly designed with English as the primary language, and even the few that are multilingual tend to exhibit strong English-centric biases. Much like speakers who might produce awkward expressions when learning a second language, LLMs often generate unnatural outputs in non-English languages, reflecting English-centric patterns in both vocabulary and grammar. Despite the importance of this issue,…