## 🚀 Feature

Add an argument to the trainer that turns on `detect_anomaly`, so that NaN gradients are detected automatically during training.

### Motivation

Easier debugging.
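
To illustrate the request, here is a minimal sketch of the behavior such a flag could switch on. It assumes the flag would simply wrap PyTorch's existing `torch.autograd.set_detect_anomaly`. The function `training_step` and its `detect_anomaly` parameter are hypothetical names used only for this example, not an existing trainer API.

```python
import torch


def training_step(detect_anomaly: bool = False) -> bool:
    """Run one backward pass; return True if anomaly detection raised.

    `training_step` and `detect_anomaly` are hypothetical names for this sketch.
    """
    x = torch.tensor([0.0], requires_grad=True)
    # set_detect_anomaly can be used as a context manager; with mode=True,
    # autograd raises a RuntimeError when a backward function returns NaN.
    with torch.autograd.set_detect_anomaly(detect_anomaly):
        # d/dx sqrt(x) at x = 0 divides by zero; with an upstream
        # gradient of 0 the backward pass computes 0 / 0 = NaN.
        y = torch.sqrt(x) * 0.0
        try:
            y.sum().backward()
        except RuntimeError:
            return True
    # Without anomaly detection the NaN lands silently in x.grad.
    return False
```

With the flag off, the NaN gradient goes unnoticed; with it on, the backward pass fails at the offending operation and PyTorch also emits a warning with the traceback of the forward call that created it. Anomaly detection slows training down, so it would probably make sense for the flag to default to off.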