Dion Optimizer: Transforming Distributed AI Training Efficiency
Optimizers such as Adam and AdamW have been essential to training large-scale neural networks. However, as model sizes soar into the trillions of parameters, the need for more efficient training metho...
Tags: AI optimization, deep learning, distributed training, large language models, open source, orthonormal updates, PyTorch, scalability
Democratizing Scalable Mixture-of-Experts Training in PyTorch with NVIDIA NeMo Automodel
Training state-of-the-art Mixture-of-Experts (MoE) models has traditionally required specialists with deep distributed systems knowledge and access to high-end infrastructure. Now, NVIDIA’s NeMo Automo...
Tags: distributed training, LLMs, MoE, NVIDIA, open source, performance optimization, PyTorch
How Monarch and Lightning AI Are Transforming Distributed PyTorch Training in Notebooks
Scaling AI experiments across massive GPU clusters is often a logistical challenge, especially for teams who want to maintain the interactive, iterative workflow of notebook development. The new integ...
Tags: AI development, debugging, distributed training, GPU clusters, Lightning AI, Monarch, notebooks, PyTorch