Programme

All talks are held in person in MS.01 in the Zeeman Building. Registration and coffee breaks are held outside MS.01.

To contact the organisers, you may email probai-scaling-26@googlegroups.com.

Tentative schedule

More details will be made available closer to the workshop.

Day Time Activity
22 Jun (Mon)

13:00 - 13:45 Registration and coffee
14:00 - 15:30 Tutorial: Tensor Programs to Derive Infinite Width Limits (Leena C Vankadara)
15:30 - 16:00 Coffee break
16:00 - 17:30 Tutorial: The Proportional Depth-Width Scaling Limit of Neural Networks (Mufan Li)

Abstract: We study the scaling limit of neural networks without skip connections, where the depth d and width n approach infinity at a constant ratio d/n. In this limiting regime, we can view each layer of the neural network as a time discretization and derive a limiting SDE for the feature covariance matrix.

23 Jun (Tue)

08:50 - 09:20 Coffee
09:20 - 11:00 Tutorial: Dynamical Mean Field Theory, Random Matrices and Learning in High Dimensions (Blake Bordelon)
11:00 - 12:00 Tutorial: Infinite-size Limit for ResNets, Part I (Louis-Pierre Chaintron)
12:00 - 13:30 Lunch provided at venue
13:30 - 15:00 Research talk: Infinite-size Limit for ResNets, Part II (Louis-Pierre Chaintron)
15:00 - 15:30 Coffee break
15:30 - 16:40 Research talk: How to train an LLM (Sam Smith)

Abstract: Drawing on the experience of designing and scaling Griffin (https://arxiv.org/abs/2402.19427) and RecurrentGemma, I will introduce some of the key practical concepts behind training large language models. Topics are likely to include: a brief introduction to Transformers, including why MLPs, not attention, usually dominate computation; a simple mental model of the computational bottlenecks on TPUs and GPUs; how to train models too large to fit in memory on a single device; scaling laws and hyper-parameter tuning; and a detailed discussion of LLM inference. If time permits, I will discuss how to design recurrent models that are competitive with Transformers, along with their advantages and drawbacks.

16:40 - 17:50 Research talk
17:50 - 18:30 Break
18:30 - 20:30 On-campus dinner for attendees (registration required)

24 Jun (Wed)

08:30 - 09:00 Coffee
09:00 - 10:10 Research talk
10:10 - 11:20 Research talk
11:20 - 11:40 Coffee break
11:40 - 12:50 Research talk
12:50 - 14:20 Workshop closure & lunch provided at venue


