>We performed a blind pairwise comparison between text-davinci-003 and Alpaca 7B, and we found that these two models have very similar performance: Alpaca wins 90 versus 89 comparisons against text-davinci-003.
Obviously not a completely foolproof comparison, but it is at least clear that Alpaca isn't much worse than text-davinci-003 on the types of prompts they were testing.
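For a rough sense of how much a 90 vs 89 split can tell you, here is a quick sketch in Python. It assumes the 179 comparisons were independent and had no ties, which the quote doesn't actually state, and the function names are just mine for illustration. It runs a two-sided sign test against a 50/50 null and computes a 95% Wilson interval on Alpaca's win rate.

```python
import math


def sign_test_p_value(wins: int, losses: int) -> float:
    """Two-sided exact sign test against the null of a 50/50 win rate."""
    n = wins + losses
    k = max(wins, losses)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2**n
    # Double for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)


def wilson_interval(wins: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives about 95%)."""
    p = wins / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Counts from the quoted comparison; independence and "no ties" are assumptions.
alpaca_wins, davinci_wins = 90, 89
p_value = sign_test_p_value(alpaca_wins, davinci_wins)
low, high = wilson_interval(alpaca_wins, alpaca_wins + davinci_wins)
print(f"p-value: {p_value:.3f}, 95% interval on Alpaca win rate: {low:.3f} to {high:.3f}")
```

Under those assumptions the test can't separate the two models at all, and the interval on Alpaca's win rate runs from roughly 43% to 58%. So the data is consistent with "about the same", but 179 comparisons can't rule out a modest gap in either direction.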