The rapid growth of Large Language Models (LLMs) has increased the demand for effective retrieval mechanisms in Retrieval-Augmented Generation (RAG). Current RAG models rely primarily on vector similarity matching, which limits their ability to uncover latent semantic relationships between queries and documents. To enhance the retrieval phase of RAG, we propose a framework that incorporates topic modeling into the RAG pipeline to semantically rerank the retrieved results. This approach, which we refer to as Topic Enhanced Reranking (TER), enables the retrieval system to use latent topics within queries and documents, thereby improving the overall precision and semantic relevance of the RAG results. We present a detailed overview of the proposed method, followed by an experimental evaluation of both the retrieval stage and the end-to-end performance of the RAG pipeline. We compare our results against baseline RAG and a state-of-the-art reranker to demonstrate the effectiveness of TER in improving both retrieval performance and the quality of the text produced by the LLM.
Title, author list and abstract as seen in the Camera-Ready version of the paper that was provided to the Conference Committee. Small changes that may have occurred during processing by Springer may not appear in this window.