Activation Functions & Optimization
Aim
To study and compare activation functions and optimization algorithms by training the same multilayer perceptron (MLP) on the Fashion-MNIST dataset. The experiment analyzes the performance of ReLU, Sigmoid, and Tanh activations, each paired with the SGD and Adam optimizers, and evaluates their impact on training dynamics using overlaid loss/accuracy curves and gradient-flow visualizations.
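As a rough illustration of what is being compared, the sketch below defines the three activations with their derivatives and a single scalar update step for SGD and for Adam, using only the Python standard library. It is not the experiment's training code. All function names are hypothetical, `gradient_scale` is a deliberately crude proxy that ignores the weights, and the Adam hyperparameters are the commonly used defaults, assumed here rather than taken from this report.

```python
import math

# Activations and their derivatives with respect to the pre-activation x.
def relu(x):
    return max(0.0, x)

def d_relu(x):
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x == 0

def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2  # peaks at 1.0 when x == 0

ACTIVATIONS = {
    "relu": (relu, d_relu),
    "sigmoid": (sigmoid, d_sigmoid),
    "tanh": (math.tanh, d_tanh),
}

def gradient_scale(name, pre_activation, depth):
    """Crude gradient-flow proxy (an assumption for illustration):
    the local derivative multiplied through `depth` identical layers,
    ignoring the weight matrices entirely."""
    _, derivative = ACTIVATIONS[name]
    return derivative(pre_activation) ** depth

def sgd_step(w, grad, lr=0.1):
    """Plain SGD: move against the gradient by a fixed learning rate."""
    return w - lr * grad

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight. `state` holds the step count
    and the running first/second moment estimates and is updated in place."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad * grad
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

if __name__ == "__main__":
    for name in ACTIVATIONS:
        print(name, gradient_scale(name, 0.5, depth=5))
    state = {"t": 0, "m": 0.0, "v": 0.0}
    print("sgd :", sgd_step(1.0, 4.0))
    print("adam:", adam_step(1.0, 4.0, state))
```

Even at its best case (pre-activation 0) the sigmoid derivative is only 0.25, so the proxy shrinks geometrically with depth, while ReLU passes a derivative of 1 for positive inputs. On its first step Adam moves by roughly the learning rate regardless of the gradient's magnitude, whereas the SGD step scales directly with the gradient.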