Tried the free version on OpenRouter with pi.dev and it's competent at tool call...

admiralrohan · 2026-04-01T21:09:44 1775077784

What kind of creative writing are you doing? Fiction or non-fiction like blog posts?

sunaookami · 2026-04-02T08:38:18 1775119098

Fiction. One of my "benchmarks" is giving the model a bunch of (self-made) text and having it simulate a 4chan thread about it. This tests tool use (calling the APIs), some skills, censorship and general creativity. Some models refuse every new turn after reading real 4chan threads ;) Surprisingly, Claude is especially good at this, while GPT fails spectacularly and Gemini is just lazy (and barely usable since it's constantly overloaded). Qwen (the coder model from Qwen CLI, so Qwen 3.5) is also very good but sadly not usable in Pi (they detect and block calls outside their CLI).

admiralrohan · 2026-04-02T18:05:36 1775153136

Interesting. Are you running something like an Autoresearch loop for writing fiction? How would the agent determine whether the output is good, given that this is subjective?

sunaookami · 2026-04-03T08:00:15 1775203215

I don't have any advanced setup; creative writing is always subjective. I just one-shot most of the time.

skysniper · 2026-04-01T19:13:02 1775070782

It's actually pretty good at OpenClaw-type tasks for non-technical users: lots of tool calls, some simple programming.

sunaookami · 2026-04-01T20:24:36 1775075076

Yeah, this kind of stuff. I have no experience with OpenClaw though.