3 Trillion Tokens Unveiled: Navigating the Landscape of the Largest Open-Source LLM Data Set | Midjourney | Podwise