[Archimedes Talks & Tutorial Series] Tensor Decompositions in Large-Scale Deep Learning
Talk (12:00-13:00, Mihalis Nicolaou)
Despite the initial success of deep learning in discriminative tasks, large-scale, often generative models (i.e., foundation models) have recently emerged as a dominant paradigm. By pre-training on broad data at scale, such models have proven more robust and generalizable than alternatives, albeit considered more opaque due to their sheer scale. In this talk, we will discuss recent works that leverage tensor methods to make large-scale deep networks more interpretable, controllable, fair, and efficient – for example, by enabling unsupervised local editing in pre-trained networks, making fine-tuning of large models to new tasks efficient, scaling expert sub-computations to achieve specialization, and grounding visual variability to concepts in vision-language models.
Tutorial (13:15-15:15, James Oldfield)
Modern deep learning architectures, such as transformers and convolutional neural networks (CNNs), leverage multi-dimensional representations (tensors) to process input data effectively. Consequently, many standard operations in deep neural networks can be understood as repeated multiplications and summations over various intermediate tensors and weights. In this tutorial, we explore how to unify these common operations in PyTorch through multilinear operations, and how this paradigm provides a flexible framework for designing and implementing novel deep learning architectures and techniques. Concrete examples from our recent work will be presented, including the use of factorized computation with einsum to efficiently scale the expert count in mixture-of-experts models (the μMoE layer).
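To make the einsum paradigm concrete, below is a minimal, self-contained PyTorch sketch (with hypothetical dimensions and randomly initialized weights; an illustration in the spirit of the tutorial, not its actual implementation). It first writes a dense mixture-of-experts computation as a single multilinear contraction, then shows a CP-style factorized variant that avoids materializing one weight matrix per expert:

import torch

# Hypothetical sizes for illustration only.
batch, d_in, d_out, n_experts = 8, 64, 32, 16

x = torch.randn(batch, d_in)                                  # input features
W = torch.randn(n_experts, d_in, d_out)                       # one weight matrix per expert
gate = torch.softmax(torch.randn(batch, n_experts), dim=-1)   # per-token expert coefficients

# Dense MoE as one multilinear contraction:
#   y[b, o] = sum_{n, i} gate[b, n] * x[b, i] * W[n, i, o]
y_dense = torch.einsum('bn,bi,nio->bo', gate, x, W)

# Factorized variant: replace the full (n_experts, d_in, d_out) weight
# tensor with rank-r CP factors, W[n, i, o] ≈ sum_r A[n, r] B[i, r] C[o, r],
# and fold the reconstruction into the same contraction.
rank = 8
A = torch.randn(n_experts, rank)
B = torch.randn(d_in, rank)
C = torch.randn(d_out, rank)
y_fact = torch.einsum('bn,nr,bi,ir,or->bo', gate, A, x, B, C)

Under this factorization, the parameter count grows as rank * (n_experts + d_in + d_out) rather than n_experts * d_in * d_out, which is one way the expert count can be scaled efficiently.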
Bio: Mihalis A. Nicolaou is an Associate Professor at the Computation-based Science and Technology Research Center at The Cyprus Institute. Previously, he held positions at Imperial College London and the University of London. He received the B.Sc. degree from the University of Athens, Greece, and the M.Sc. and Ph.D. degrees from the Department of Computing, Imperial College London, U.K.
Bio: James Oldfield is a PhD student at Queen Mary University of London. Previously, he was a research intern at The Cyprus Institute and Huawei Noah's Ark Lab. His recent research focuses on interpretable and controllable deep learning models.