LocalLLaMA noneabove1182

HUGE dataset released for open source use

together.ai

30T tokens, 20.5T of them in English, and allegedly high quality. Can't wait to see people start putting it to use!

Related github: https://github.com/togethercomputer/RedPajama-Data
