Apple Workshop on Natural Language and Interactive Systems 2025: Opening Up the Full Language Model Pipeline: OLMo and Tülu
Authors: Hanna Hajishirzi (University of Washington)
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
May 4, 2026 | Research areas: Speech and Natural Language Processing; Tools, Platforms, Frameworks
Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with calls to external tools. However, training such agents using outcome-only rewards suffers from credit-assignment ambiguity, obscuring which intermediate steps (or tool-use decisions) lead to success or failure. In this paper, we propose PORTool, an importance-aware policy-optimization algorithm that…
Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents
May 1, 2026 | Research areas: Methods and Algorithms; Tools, Platforms, Frameworks | Workshop at ACL
This paper was accepted at the Fifth Workshop on Natural Language Generation, Evaluation, and Metrics at ACL 2026.
Tool-calling agents are evaluated on tool selection, parameter accuracy, and scope recognition, yet LLM trajectory assessments remain inherently post-hoc. Disconnected from the active execution loop, such assessments identify errors that are usually addressed through prompt-tuning or retraining, and fundamentally cannot…