Archimedes Talk on "From Approximate Membership Filters to LLM Hallucination: A Rate-Distortion View" by Jingwei Li (IEOR, Columbia University)
Abstract:
Large language models often hallucinate with high confidence on “random facts” that lack inferable patterns. This work formalizes the memorization of such facts as a membership testing problem, connecting the discrete error metrics of Bloom-type filters with the continuous confidence scores of LLMs. In the sparse regime, the optimal memory-error tradeoff is characterized by a rate-distortion theorem: the memory required per stored fact is determined by the minimum KL divergence between score distributions on facts and non-facts. This framework gives a distinctive explanation for hallucination under an idealized setting. Even with optimal training, perfect data, and a closed-world assumption, the information-theoretically optimal strategy under limited capacity is not simply to abstain, forget, or remain uncertain, but to assign high confidence to some non-facts. Thus, hallucination emerges as a natural consequence of lossy compression. The same theorem also recovers and sharpens classical space lower bounds for Bloom-type and two-sided filters, highlighting a fundamental frontier between hallucination, over-refusal, and memory.