Another thread on HN (https://news.ycombinator.com/item?id=34653075) discusses a model with fewer than 1B parameters that outperforms GPT-3.5: https://arxiv.org/abs/2302.00923
These models will get smaller and use their available parameters more efficiently.