[Archimedes Talks & Tutorial Series] Tensor Decompositions in Large Scale Deep Learning

Dates: 2024-06-13, 12:00 - 15:30
Venue: Artemidos 1 - Amphitheater

 

Prof. Mihalis A. Nicolaou (Associate Professor at the Computation-based Science and Technology Research Center at The Cyprus Institute) & James Oldfield (PhD Student at Queen Mary University of London)

Talk (12:00-13:00, Mihalis Nicolaou)

Despite the initial success of deep learning in discriminative tasks, large-scale, often generative models (i.e., foundation models) have recently emerged as a dominant paradigm. Pre-trained on broad data at scale, such models have proven more robust and generalizable than the alternatives, although they are considered more opaque because of their sheer scale. In this talk, we will discuss recent works that leverage tensor methods to make large-scale deep networks more interpretable, controllable, fair, and efficient. Examples include enabling unsupervised local editing in pre-trained networks, making the fine-tuning of large models to new tasks efficient, scaling sub-computations to achieve specialization, and grounding visual variability to concepts in vision-language models.


Tutorial (13:15-15:15, James Oldfield)

Modern deep learning architectures, such as transformers and convolutional neural networks (CNNs), leverage multi-dimensional representations (tensors) to process input data effectively. Consequently, many standard operations in deep neural networks can be understood as repeated multiplications and summations over various intermediate tensors and weights. In this tutorial, we explore how to unify these common operations in PyTorch through multilinear operations, and how this paradigm provides a flexible framework for designing and implementing novel deep learning architectures and techniques. Concrete examples from our recent work will be presented, including the use of factorized computation with einsum to scale the expert count efficiently in mixture-of-experts layers (the μMoE layer).
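As a rough illustration of the idea in the tutorial abstract, the sketch below writes a linear layer and attention scores as torch.einsum contractions, then evaluates a soft mixture of linear experts whose weight tensor is CP-factorized. It is a minimal sketch under stated assumptions and not the presenters' μMoE implementation: the tensor sizes, the factor names A, B, C, the random gate, and the choice of a CP factorization are all hypothetical and chosen only to keep the example small.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, seq, d_in, d_out, n_experts, rank = 2, 5, 8, 4, 16, 3

x = torch.randn(batch, seq, d_in)

# 1) A linear layer is a contraction over the input-feature index i.
W = torch.randn(d_out, d_in)
y_einsum = torch.einsum("bsi,oi->bso", x, W)
y_matmul = x @ W.T

# 2) Attention scores are a contraction over the feature index d
#    shared by queries and keys.
q = torch.randn(batch, seq, d_in)
k = torch.randn(batch, seq, d_in)
scores_einsum = torch.einsum("bqd,bkd->bqk", q, k)
scores_bmm = torch.bmm(q, k.transpose(1, 2))

# 3) Soft mixture of linear experts. The expert weights form a 3-way
#    tensor W_full[n, o, i]; here it is assumed to have CP form
#    W_full[n, o, i] = sum_r A[n, r] * B[o, r] * C[i, r].
A = torch.randn(n_experts, rank)  # expert-mode factor (assumed name)
B = torch.randn(d_out, rank)      # output-mode factor (assumed name)
C = torch.randn(d_in, rank)       # input-mode factor (assumed name)
gate = torch.softmax(torch.randn(batch, seq, n_experts), dim=-1)

# Dense reference: materialize every expert's weight matrix.
W_full = torch.einsum("nr,or,ir->noi", A, B, C)
out_dense = torch.einsum("bsn,noi,bsi->bso", gate, W_full, x)

# Factorized evaluation: contract each mode with its own factor,
# so the (n_experts, d_out, d_in) tensor is never formed.
g = torch.einsum("bsn,nr->bsr", gate, A)  # mix experts in rank space
z = torch.einsum("bsi,ir->bsr", x, C)     # project input in rank space
out_fact = torch.einsum("bsr,or->bso", g * z, B)

dense_params = n_experts * d_out * d_in
factorized_params = (n_experts + d_out + d_in) * rank
print(dense_params, factorized_params)
```

In this sketch the factorized path stores (n_experts + d_out + d_in) * rank parameters instead of n_experts * d_out * d_in, so the parameter count grows additively rather than multiplicatively with the number of experts.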



Bio: Mihalis A. Nicolaou is Associate Professor at the Computation-based Science and Technology Research Center at The Cyprus Institute. Previously, he held positions at Imperial College London and the University of London. He received the B.Sc. degree from the University of Athens, Greece, and the M.Sc. and Ph.D. degrees from the Department of Computing, Imperial College London, U.K.


Bio: James Oldfield is currently a PhD student at Queen Mary University of London. Previously, he was a research intern at The Cyprus Institute and Huawei Noah's Ark. His recent research focuses on interpretable and controllable deep learning models.


 

________________________________________________________________________________
Microsoft Teams
Meeting ID: 362 050 420 962
Passcode: 5HMeLm

 
 
________________________________________________________________________________
 
 
 
 

The project “ARCHIMEDES Unit: Research in Artificial Intelligence, Data Science and Algorithms” with code OPS 5154714 is implemented by the National Recovery and Resilience Plan “Greece 2.0” and is funded by the European Union – NextGenerationEU.

 

Stay connected! Subscribe to our mailing list by emailing sympa@lists.athenarc.gr
with the subject "subscribe archimedes-news Firstname LastName"
(replace with your details)