Attention is all you need (Transformer) - Model explanation (including math), Inference and Training | Umar Jamil | Podwise