Useless for what? Are you comparing the base model with chat-tuned models?
Chat-tuned derivatives of LLaMA 2 are already appearing. Given that the base LLaMA 2 model is more efficient than LLaMA 1, it is reasonable to expect that these chat-tuned versions, once refined further, will outperform the ones you mention.
Try these prompts with different models. LLaMA 2 output is pure garbage:
----1----
On a map sized (256,256), Karen is currently located at position (33,33). Her mission is to defeat the ogre positioned at (77,17). However, Karen only has a 1/2 chance of succeeding in her task. To increase her odds, she can:
1. Collect the nightshades at position (122,133), which will improve her chances by 25%.
2. Obtain a blessing from the elven priest in the elven village at (230,23) in exchange for a fox fur, further increasing her chances by additional 25%
Foxes can be found in the forest located between positions (55,33) and (230,90).
Find the optimal route for Karen's quest which maximizes her chances of defeating the ogre to 100%.
----2----
Write a python code using imageio.v3 to create a PNG image representing the map way-points and the route of Karen in her quest, each way-point must be of a different color and her path must be a gradient of the colors between the waypoints.
------------
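For reference, here is a rough sketch of what a sensible answer to prompt 1 has to work out. It is only one reading of the prompt, not the "official" solution. My assumptions: distances are straight-line (Euclidean), the two 25% bonuses are added as percentage points to the base 1/2, the forest is the axis-aligned rectangle between the two given corners, and a fox can be picked up at any integer position inside it. All names in the code are made up for illustration.

```python
from itertools import permutations, product
from math import dist

# Coordinates taken from the prompt.
START = (33, 33)
OGRE = (77, 17)
NIGHTSHADE = (122, 133)
VILLAGE = (230, 23)
FOREST_MIN, FOREST_MAX = (55, 33), (230, 90)

# Assumption: bonuses are additive percentage points on top of the base chance.
BASE_CHANCE = 0.5
BONUS = 0.25


def route_length(points):
    """Total straight-line length of a path through the given points."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def best_route():
    """Brute-force the shortest route that collects both bonuses.

    Tries every integer fox position in the forest rectangle and every
    visiting order in which the fox fur is obtained before the village.
    Returns (length, order, points).
    """
    forest_cells = product(
        range(FOREST_MIN[0], FOREST_MAX[0] + 1),
        range(FOREST_MIN[1], FOREST_MAX[1] + 1),
    )
    best = None
    for fox in forest_cells:
        stops = {"nightshade": NIGHTSHADE, "fox": fox, "village": VILLAGE}
        for order in permutations(stops):
            # The priest wants a fox fur, so the fox must come first.
            if order.index("fox") > order.index("village"):
                continue
            points = [START, *(stops[name] for name in order), OGRE]
            length = route_length(points)
            if best is None or length < best[0]:
                best = (length, order, points)
    return best


if __name__ == "__main__":
    length, order, points = best_route()
    print("final chance:", BASE_CHANCE + 2 * BONUS)
    print("visit order:", " -> ".join(order))
    print("waypoints:", points)
    print("route length:", round(length, 1))
```

The point is that the problem is small enough to brute-force: a model only has to notice the fox-before-village constraint and that both bonuses are needed to reach 100%.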
I have a lot of cases that I test against different models ...
GPT-4 has really degraded over the past week, GPT-3.5 has gotten a little better, and LLaMA 2 is garbage.