Introducing SlimPajama-627B: the largest extensively deduplicated, multi-corpora, open-source dataset for training large language models #1176