From the GPT-3 paper, it looks like there are many variants, including:
- GPT-3-350M
- GPT-3-1.3B
- GPT-3-2.7B
- GPT-3-6.7B
- GPT-3-13B
- GPT-3-175B
Ada, Babbage, Curie, and Davinci line up closely with 350M, 1.3B, 6.7B, and 175B respectively. The names are pretty suggestive: the alphabetical order (A, B, C, D) tracks increasing model size.
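
If you want that correspondence in code, a minimal sketch might look like the following. The dictionary just restates the approximate match-up described above and is not an official mapping; the names `APPROX_PARAMS` and `approx_size` are made up for illustration.

```python
# Approximate parameter counts per engine name, as suggested above.
# This is an assumed correspondence, not an officially published one.
APPROX_PARAMS = {
    "ada": 350e6,
    "babbage": 1.3e9,
    "curie": 6.7e9,
    "davinci": 175e9,
}


def approx_size(engine: str) -> str:
    """Return a human-readable approximate size for an engine name."""
    n = APPROX_PARAMS[engine.lower()]
    if n >= 1e9:
        return f"{n / 1e9:g}B"
    return f"{n / 1e6:g}M"


if __name__ == "__main__":
    for name in APPROX_PARAMS:
        print(name, approx_size(name))
```

An unknown engine name raises `KeyError`, which seems preferable to guessing a size.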