Friday, June 13, 2025 10am to 12pm
About this Event
Training Large Language Models (LLMs) requires a large neural network, a large dataset, and large amounts of compute, and we will discuss the challenges each of these poses. We'll look at the Transformer architecture in detail to develop a quantitative understanding of how it works, and specifically how tools like ChatGPT, DeepSeek, and Llama work. We will then use a pre-trained SentenceTransformer model to perform classification tasks on real-world data.