We had an amazing talk by Dr. Guru Guruganesh from the Google Research team on their recent NeurIPS paper, "Big Bird: Transformers for Longer Sequences".

It was a great opportunity to learn about the novel optimization strategies and techniques they are exploring to train and scale up transformer models, making them bigger, better, and more efficient.