Hacker News

To quote myself from a comment on Sora:

Iterations are the missing link. With ChatGPT, you can iteratively improve text (e.g., "make it shorter," "mention xyz"). However, for pictures (and video), this functionality is not yet available. If you could prompt iteratively (e.g., "generate a red car in the sunset," "make it a muscle car," "place it on a hill," "show it from the side so the sun shines through the windshield"), the tools would become exponentially more useful.

I'm looking forward to trying this out and seeing if I was right. Unfortunately, it's not yet available to me.



You can do that with Gemini's image model, Gemini 2.0 Flash (image generation) experimental.[1] It's not perfect, but it does mostly maintain likeness between generations.

[1] https://aistudio.google.com/prompts/new_chat


Whisk, I think, is possibly the best at it. No idea what it uses under the hood, though.

https://labs.google/fx/tools/whisk


DALL-E 3 with ChatGPT has been able to approximate this for a while now by internally locking the seed down as you make adjustments. It's not perfect by any means, but it can be more convenient than manual inpainting.

Ditto Instruct Pix2Pix https://www.timothybrooks.com/instruct-pix2pix


Reading comments in other threads on HN has left me with the impression that iterative improvement within a single chat is not a good idea.

For example, https://news.ycombinator.com/item?id=43388114


You're right. I actually do this quite often when coding: I start with a few iterative prompts to get a general outline of what I want, and once that's OK, I copy the outline into a new chat and flesh out the details. That's still iterative work; I'm just throwing away the intermediate results that I think sometimes confuse the LLM.



