Archimedes NLP Theme: Invited Lecture, Tuesday 9 June, 18:00-19:30 (Greek time)
Title: "Retrieval Augmented Large Language Models (RAG-LLMs)"
Dial-in information is not available for this meeting.
Abstract:
Deep Learning researchers have been impressed by how powerful general-purpose sequence-to-sequence models have become: they capture world knowledge in their parameters, achieve strong results on a wide range of tasks, and are applicable to almost everything. However, they still often hallucinate, frequently struggle to access and apply knowledge, and are difficult to update. On the other hand, modern Information Retrieval (IR) is also highly effective, as externally reviewed knowledge can be useful for a huge variety of NLP tasks. Modern IR provides a precise and accurate knowledge access mechanism that is trivial to update; by “modern” IR we refer to dense retrieval, which is starting to outperform traditional IR. On the negative side, though, it still needs retrieval supervision or heuristics such as BM25, as well as some (usually task-specific) way to integrate into downstream tasks.
The main idea behind Retrieval Augmented Large Language Models (RAG-LLMs) is to combine the massive success of parametric sequence-to-sequence models with the strengths of neural retrievers by coupling Large Language Models (LLMs) to an external memory mechanism based on sparse retrievers, dense retrievers, or a combination of the two. The “semi-parametric” design of these models, in which the LLM generator acts as a parametric memory and the retriever as a non-parametric memory, provides better customization, addresses the issue of staleness, and enables grounding that may reduce hallucination through attribution. In this talk, I will present several examples of RAG-LLMs, discuss the training techniques used in each of them, and propose directions for future work.
Stay tuned for future events:
If you are an AI researcher or practitioner, please consider becoming a member of the Hellenic Artificial Intelligence Society (EETN,
http://www.eetn.gr/en/).
________________________________________________________________________________
Meeting ID: 332 590 795 076 84