Why are Transformers so effective in Large Language Models like ChatGPT | Jay Shah | Podwise