
I think you are likely wrong, but given that neither of us is going to spend millions of dollars training two versions of GPT-3, we'll have to agree to disagree. Meta seems to agree with me: when they trained LLaMA, they used one token per digit.
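For anyone who wants to check the tokenization difference directly, here's a minimal sketch using Hugging Face transformers. The "gpt2" checkpoint is public; the LLaMA tokenizer id below is an assumption, so substitute whatever checkpoint you actually have access to:

    # Compare how GPT-2's BPE and LLaMA's SentencePiece split a number.
    # "huggyllama/llama-7b" is an assumed id -- swap in your own LLaMA tokenizer.
    from transformers import AutoTokenizer

    gpt2 = AutoTokenizer.from_pretrained("gpt2")
    llama = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

    print(gpt2.tokenize("12345"))   # BPE merges digits, e.g. ['123', '45']
    print(llama.tokenize("12345"))  # one token per digit, e.g. ['▁', '1', '2', '3', '4', '5']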


> neither of us is going to spend millions of dollars training two versions of GPT-3, we'll have to agree to disagree

I encourage you to apply to the TPU Research Cloud (TRC): https://sites.research.google/trc/about/

You'll be able to access millions of dollars' worth of compute. It's how I got my start.

I love it when I'm mistaken, since that's how science is pushed forward – we can't be certain we're right, only that we're not wrong yet. So it would be delightful if you formulated this into a testable hypothesis and falsified it yourself.
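To make that concrete, one testable form of the hypothesis: at equal scale and data, a model trained with one-token-per-digit tokenization scores higher on arithmetic than one trained with merged digit tokens. Here's a minimal sketch of the evaluation half, assuming a Hugging Face causal LM; `model` and `tokenizer` are placeholders for whichever pair you trained:

    # Sketch: exact-match accuracy on random 3-digit addition problems.
    import random

    def arithmetic_accuracy(model, tokenizer, n_problems=100, max_new_tokens=8):
        correct = 0
        for _ in range(n_problems):
            a, b = random.randint(100, 999), random.randint(100, 999)
            inputs = tokenizer(f"{a} + {b} = ", return_tensors="pt")
            output = model.generate(**inputs, max_new_tokens=max_new_tokens)
            # Decode only the newly generated tokens, then check the answer.
            completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                                          skip_special_tokens=True)
            correct += completion.strip().startswith(str(a + b))
        return correct / n_problems

Run it on both models and you have a number to argue about instead of intuitions.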

You can use The Pile to train your GPT: https://pile.eleuther.ai/
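A minimal sketch of streaming it with the Hugging Face datasets library; the "EleutherAI/pile" dataset id is an assumption and mirrors move around, so check the link above for the current download location:

    from datasets import load_dataset

    # Stream so you don't have to download the full ~800GB up front.
    pile = load_dataset("EleutherAI/pile", split="train", streaming=True)
    for i, example in enumerate(pile):
        print(example["text"][:200])  # each record has a "text" field plus metadata
        if i >= 2:
            break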



