Logo

EleutherAI releases massive AI training dataset of licensed and open domain text - TechCrunch

08.06.2025 12:57

EleutherAI releases massive AI training dataset of licensed and open domain text - TechCrunch

EleutherAI, an AI research organization, has released what it claims is one of the largest collections of licensed and open-domain text for training AI models.

The dataset, called the Common Pile v0… [+3510 chars]