Over-Searching in Search-Augmented Large Language Models
Authors: Roy Xie†, Deepak Gopinath, David Qiu, Dong Lin, Haitian Sun, Saloni Potdar, Bhuwan Dhingra†
Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval. However, they often over-search: unnecessarily invoking the search tool even when it does not improve response quality, which leads to computational inefficiency and to hallucinations caused by incorporating irrelevant context. In this work, we conduct a systematic evaluation of over-searching across multiple dimensions, including query types, model categories, retrieval conditions, and multi-turn conversations. Our findings show: (i) search generally improves answer accuracy on answerable queries but harms abstention on unanswerable ones; (ii) over-searching is more pronounced in complex reasoning models and deep research systems, is exacerbated by noisy retrieval, and compounds across turns in multi-turn conversations; and (iii) the composition of retrieved evidence is crucial, as the presence of negative evidence improves abstention. To quantify over-searching, we introduce Tokens Per Correctness (TPC), an evaluation metric that captures the performance-cost trade-off for search-augmented LLMs. Lastly, we investigate mitigation approaches at both the query and retrieval levels and release the OverSearchQA benchmark to foster continued research into efficient search-augmented LLMs.
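The abstract names Tokens Per Correctness (TPC) but does not define it; a plausible reading of the name is total tokens consumed divided by the number of correct responses. The sketch below illustrates that reading in Python. The function name and the assumption that correctness is binary per query are ours, not the paper's.

```python
# Hypothetical sketch of Tokens Per Correctness (TPC), assuming the metric
# is total tokens consumed divided by the number of correct responses;
# the paper's exact definition may differ.

def tokens_per_correctness(token_counts: list[int], correct: list[bool]) -> float:
    """Lower TPC means the system spends fewer tokens per correct answer."""
    assert len(token_counts) == len(correct)
    num_correct = sum(correct)
    if num_correct == 0:
        return float("inf")  # no correct answers: cost per correctness is unbounded
    return sum(token_counts) / num_correct

# Example: a model that searches on every query may spend far more tokens
# without answering more questions correctly, yielding a higher (worse) TPC.
baseline = tokens_per_correctness([120, 95, 110, 130], [True, True, False, True])
over_searcher = tokens_per_correctness([480, 510, 450, 530], [True, True, False, True])
print(f"baseline TPC={baseline:.1f}, over-searching TPC={over_searcher:.1f}")
```

Under this reading, TPC directly rewards systems that skip unnecessary search calls: equal accuracy at lower token cost yields a lower score.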
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
January 12, 2026 · Research areas: Computer Vision, Speech and Natural Language Processing
Multimodal Large Language Models (MLLMs) in real-world applications require access to external knowledge sources and must remain responsive to dynamic, ever-changing real-world information in order to address information-seeking and knowledge-intensive user queries. Existing approaches, such as retrieval-augmented generation (RAG) methods, search agents, and search-equipped MLLMs, often suffer from rigid pipelines, excessive search calls,…
Context Tuning for Retrieval Augmented Generation
December 18, 2023 · Research areas: Knowledge Bases and Search, Speech and Natural Language Processing · Workshop at EACL
This paper was accepted at the UncertaiNLP workshop at EACL 2024.
Large language models (LLMs) have the remarkable ability to solve new tasks with just a few examples, but they need access to the right tools. Retrieval Augmented Generation (RAG) addresses this problem by retrieving a list of relevant tools for a given task. However, RAG’s tool retrieval step requires all of the necessary information to be explicitly present in the query. This is a…
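The abstract describes RAG's tool retrieval step, in which tools relevant to a task are retrieved by matching them against the query alone. Below is a minimal sketch of that idea, using a toy bag-of-words similarity in place of a real embedding model; `embed`, `retrieve_tools`, and the tool catalog are illustrative assumptions, not the paper's implementation. It also shows the limitation the paper raises: a tool is only retrieved if the query explicitly mentions the information that links to it.

```python
# Minimal sketch of RAG-style tool retrieval by query-tool similarity.
# A toy bag-of-words "embedding" is used so the sketch runs without
# external models; all names here are hypothetical.
import math

def embed(text: str) -> dict[str, float]:
    words = text.lower().split()
    return {w: words.count(w) / len(words) for w in set(words)}

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

TOOL_DESCRIPTIONS = {
    "weather_api": "get current weather forecast for a city",
    "calendar_api": "create and list calendar events and reminders",
    "search_api": "search the web for documents",
}

def retrieve_tools(query: str, k: int = 2) -> list[str]:
    """Return the k tools whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(TOOL_DESCRIPTIONS,
                    key=lambda t: cosine(q, embed(TOOL_DESCRIPTIONS[t])),
                    reverse=True)
    return ranked[:k]

# Works when the query names the needed information explicitly:
print(retrieve_tools("what is the weather forecast in Paris"))
# Fails when it does not: "am I free tomorrow?" shares no terms with
# the calendar tool's description, so the right tool is not surfaced.
print(retrieve_tools("am I free tomorrow?"))
```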