
>> As an AI researcher in academia, it is frustrating to be blocked from doing a lot of research in this space due to computational constraints and a lack of the required data.

Computational constraints aside, the data used to train GPT-3 was mainly Common Crawl, which is freely available from a non-profit org:

https://commoncrawl.org/big-picture/frequently-asked-questio...

>> What is Common Crawl?

>> Common Crawl is a 501(c)(3) non-profit organization dedicated to providing a copy of the internet to internet researchers, companies and individuals at no cost for the purpose of research and analysis.
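To get a feel for the data, here's a minimal sketch of reading one record from a Common Crawl WET file (WET files hold WARC "conversion" records: header lines, a blank line, then the extracted page text). The sample record below is made up for illustration:

```python
# Hypothetical sketch: pulling the target URI and plain text out of one
# record from a Common Crawl WET file. The record below is a fabricated
# example in the WARC conversion-record shape, not real crawl data.
sample_record = """\
WARC/1.0
WARC-Type: conversion
WARC-Target-URI: http://example.com/
Content-Type: text/plain
Content-Length: 12

Hello crawl!
"""

def parse_wet_record(record: str):
    """Split a single WET record into its header dict and text payload."""
    header_blob, _, body = record.partition("\n\n")
    headers = {}
    for line in header_blob.splitlines()[1:]:  # skip the "WARC/1.0" line
        key, _, value = line.partition(": ")
        headers[key] = value
    return headers, body.strip()

headers, text = parse_wet_record(sample_record)
print(headers["WARC-Target-URI"], "->", text)
```

In practice you'd stream the gzipped WET files listed in the crawl's wet.paths index rather than parse strings by hand, but the record layout is the same.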

So you just need to find the compute. If you have a class of ~30, it should only take about $150 to $450 million.
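The arithmetic behind that figure seems to be one GPT-3-scale training run per student, at widely cited cost estimates of very roughly $5M to $15M per run (an assumption inferred from the numbers, not stated in the comment):

```python
# Back-of-envelope for the figure above. Assumptions: one GPT-3-scale
# run per student, at rough public estimates of $5M-$15M per run.
class_size = 30
cost_per_run_low, cost_per_run_high = 5e6, 15e6  # USD, rough estimates

low = class_size * cost_per_run_low
high = class_size * cost_per_run_high
print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M")  # → $150M to $450M
```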

Or, you could switch your research and teaching to less compute- and data-intensive approaches? Just because OpenAI, DeepMind, et al. are championing extremely expensive approaches that only they can realistically use, that's no reason for everyone else to run after them willy-nilly.


