Title: A proposal for the mathematical structure computed by large language models
Presenter: Dr. Yiannis Vlassopoulos (Institute for Language and Speech Processing at "Athena" Research Center)

Abstract: Large Language Models are transformer neural networks trained to produce a probability distribution over the possible next words of given texts in a corpus, in such a way that the most likely predicted word is the actual next word in the training text.
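As a hedged illustration of this training setup (the standard next-word objective, written here for orientation rather than taken from the talk), the model assigns a conditional probability to each possible next word and is trained by minimizing the cross-entropy with the actual continuation:

% Sketch of the standard next-word (cross-entropy) objective; notation assumed.
% p_\theta(w_{t+1} | w_1 ... w_t) is the model's distribution over the next word.
\[
  \operatorname{loss}(\theta) \;=\; -\sum_{t} \log p_\theta\!\left(w_{t+1} \mid w_1, \dots, w_t\right)
\]
% Minimizing this loss pushes the model's most likely predicted word
% towards the actual next word in the training text.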
We will explain the mathematical structure defined by such conditional probability distributions of text extensions. Changing the viewpoint from probabilities to log probabilities, we observe that the data of text extensions are encoded in a directed (non-symmetric) metric structure defined on the space of texts L. We then construct a directed metric polyhedron P(L), in which L is isometrically embedded as the generators of certain special extremal rays. Each such generator encodes the extensions of a text along with the corresponding probabilities.
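To make the directed metric concrete, here is one plausible form (an assumption for illustration, not necessarily the talk's exact definition): the distance from a text x to a text y is the negative log probability that x extends to y, which is generally not symmetric.

% Hedged sketch of a directed (non-symmetric) distance on the space of texts L.
% Prob(y | x) denotes the probability that text x extends to text y.
\[
  d(x, y) \;=\;
  \begin{cases}
    -\log \operatorname{Prob}(y \mid x), & \text{if } y \text{ is an extension of } x,\\[2pt]
    +\infty, & \text{otherwise.}
  \end{cases}
\]
% In general d(x, y) \neq d(y, x), and the triangle inequality
% d(x, z) \le d(x, y) + d(y, z) reflects the multiplication of
% extension probabilities along chained extensions.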
Moreover, P(L) is (min, +) (i.e. tropically) generated by the text extremal rays. This leads to a duality theorem relating the polyhedron P(L) defined by text extensions to one defined by text restrictions. We also explain that the generator of the extremal ray corresponding to a text is approximated by a Boltzmann weighted linear combination of the generators of the extremal rays corresponding to the words making up that text. We note that these constructions generalise the familiar view of language as a monoid or as a poset with the subtext order.
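For readers less familiar with the tropical setting, the following is a minimal sketch (with assumed notation) of the (min, +) operations and of what it means for a point of P(L) to be tropically generated by the text extremal rays; the Boltzmann weighted combination mentioned above can then be read as a "soft-min" (log-sum-exp) relaxation of the same expression.

% Tropical (min, +) semiring: "addition" is min and "multiplication" is +.
% g_x denotes the generator of the extremal ray attached to a text x,
% and \lambda_x is a tropical coefficient (a real number).
\[
  a \oplus b = \min(a, b), \qquad a \odot b = a + b,
  \qquad
  f \;=\; \bigoplus_{x} \lambda_x \odot g_x \;=\; \min_{x}\bigl(\lambda_x + g_x\bigr).
\]
% Replacing the hard min by a log-sum-exp at inverse temperature \beta,
%   \min_i a_i \approx -\tfrac{1}{\beta}\log \sum_i e^{-\beta a_i},
% turns such a tropical combination into a Boltzmann weighted one.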
This is joint work with Stephane Gaubert.
Bio: Yiannis Vlassopoulos earned a degree in Mathematics from the University of Athens in 1992 and a Ph.D. from Duke University in 1998.
His thesis was on Algebraic Geometry related to String Theory (specifically the so-called Mirror Symmetry, a duality between Symplectic and Algebraic Geometry). He obtained a Marie Curie Individual Fellowship with Prof. Maxim Kontsevich at the Institut des Hautes Etudes Scientifiques (IHES) in Paris in 2002.
He subsequently worked as a researcher at IHES for an extended period, until 2019, on non-Commutative Derived Algebraic Geometry and Topological Quantum Field Theories. In particular, in collaboration with Maxim Kontsevich, he introduced the notion of a pre-Calabi-Yau algebra, which is a non-commutative analogue of a Poisson structure.
He has also worked as a visiting professor at the University of Vienna (Austria) and has been a visiting fellow at the University of Miami, Aarhus University (Denmark), the Max Planck Institute for Mathematics in Bonn, and the Simons Center for Geometry and Physics at Stony Brook University in NY.
He obtained an ENTER fellowship at the University of Athens for the period 2006-2008.
Since 2015 he has focused on Natural Language modelling, initially using Tensor Networks and applying techniques from algebra and physics. He co-founded a company in NY in 2017 in order to develop this technology. Currently he is using Category Theory and Tropical Geometry to model the structure that Transformer Neural Networks (like GPT) learn when they are trained to guess the next word in a text. One of the main goals is to understand how semantics is encoded and how it could potentially be controlled, in particular with respect to logical implications.
________________________________________________________________________________
Microsoft Teams: Join the meeting now
Meeting ID: 347 633 128 071
Passcode: Z76hKX
________________________________________________________________________________