AMES: Approximate Multi-modal Enterprise Search via Late Interaction Retrieval
Authors: Tony Joseph, Carlos Pareja, David Lopes Pegna, Abhishek Singh
We present AMES (Approximate Multimodal Enterprise Search), a unified, backend-agnostic architecture for multimodal late interaction retrieval. AMES demonstrates that fine-grained multimodal late interaction retrieval can be deployed within a production-grade enterprise search engine without architectural redesign. Text tokens, image patches, and video frames are embedded into a shared representation space using multi-vector encoders, enabling cross-modal retrieval without modality-specific retrieval logic. AMES employs a two-stage pipeline: parallel token-level ANN search with per-document Top-M MaxSim approximation, followed by accelerator-optimized Exact MaxSim re-ranking. Experiments on the ViDoRe V3 benchmark show that AMES achieves competitive ranking performance within a scalable, production-ready, Solr-based system.
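The two-stage pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the AMES implementation: a brute-force dot-product scan stands in for the ANN index, nothing here touches Solr or an accelerator, and the names `exact_maxsim`, `approximate_candidates`, `search`, `k_hits`, and `n_candidates` are hypothetical. The abstract does not define its Top-M parameter, so the sketch does not try to model it exactly; it only assumes that each query vector retrieves a few nearest document vectors, that the best hit per document is summed into an approximate MaxSim score, and that the surviving candidates are re-ranked with the exact score.

```python
import numpy as np


def exact_maxsim(query, doc):
    """Late interaction score: for each query vector, take the best
    dot product against any document vector, then sum over the query."""
    sim = query @ doc.T  # shape: (n_query_vectors, n_doc_vectors)
    return float(sim.max(axis=1).sum())


def approximate_candidates(query, docs, k_hits, n_candidates):
    """Stage 1 (assumed form): per-query-vector nearest-neighbour search
    over all document vectors, aggregated into per-document scores."""
    all_vecs = np.vstack(docs)
    # owner[j] is the index of the document that vector j belongs to.
    owner = np.concatenate([np.full(len(d), i) for i, d in enumerate(docs)])
    sim = query @ all_vecs.T  # brute force; a real system would use ANN

    approx = np.zeros(len(docs))
    for q in range(sim.shape[0]):
        top = np.argsort(-sim[q])[:k_hits]  # k_hits best vectors for this query vector
        best = {}
        for j in top:
            d = int(owner[j])
            best[d] = max(best.get(d, -np.inf), sim[q, j])
        # Documents with no hit for this query vector contribute nothing.
        for d, s in best.items():
            approx[d] += s

    order = np.argsort(-approx)[:n_candidates]
    return [int(i) for i in order]


def search(query, docs, k_hits=8, n_candidates=3):
    """Stage 2: re-rank the stage-1 candidates with exact MaxSim."""
    candidates = approximate_candidates(query, docs, k_hits, n_candidates)
    return sorted(candidates, key=lambda i: exact_maxsim(query, docs[i]), reverse=True)
```

In this sketch `query` is an array of shape (number of query vectors, dimension) and `docs` is a list of such arrays, one per document, regardless of whether the vectors came from text tokens, image patches, or video frames. The approximate score is a lower bound on exact MaxSim only when similarities are non-negative, which is one reason an exact re-ranking pass over the candidates is useful.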
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
January 12, 2026. Research areas: Computer Vision, Speech and Natural Language Processing
Multimodal Large Language Models (MLLMs) in real-world applications require access to external knowledge sources and must remain responsive to the dynamic and ever-changing real-world information in order to address information-seeking and knowledge-intensive user queries. Existing approaches, such as retrieval augmented generation (RAG) methods, search agents, and search equipped MLLMs, often suffer from rigid pipelines, excessive search calls,…
Context Tuning for Retrieval Augmented Generation
December 18, 2023. Research areas: Knowledge Bases and Search, Speech and Natural Language Processing. Workshop at EACL
This paper was accepted at the UncertaiNLP workshop at EACL 2024.
Large language models (LLMs) have the remarkable ability to solve new tasks with just a few examples, but they need access to the right tools. Retrieval Augmented Generation (RAG) addresses this problem by retrieving a list of relevant tools for a given task. However, RAG’s tool retrieval step requires all the required information to be explicitly present in the query. This is a…