
You can easily build an RLAIF loop:

- Take a list of n animals * m vehicles

- Ask an LLM to generate an SVG for each of these n*m combinations

- Render a PNG from each SVG

- Ask a vision model to grade each result

- Update the LLM's weights accordingly

No need for a human to draw the dataset, and no need for a human to evaluate the results.
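
The steps above can be sketched roughly as follows. This is only a skeleton under stated assumptions: `generate_svg`, `render_png`, `grade` and `update` are hypothetical callables you would supply yourself (the LLM being trained, an SVG rasterizer, a vision model, and whatever training step you use), and the prompt wording and example animals and vehicles are made up for illustration.

```python
from itertools import product


def build_prompts(animals, vehicles):
    """Return one text prompt per (animal, vehicle) pair: n*m prompts in total."""
    # The prompt wording is an assumption, not something from the comment.
    return [
        f"Generate an SVG of a {animal} riding a {vehicle}"
        for animal, vehicle in product(animals, vehicles)
    ]


def rlaif_step(prompts, generate_svg, render_png, grade, update):
    """One pass of the loop: generate, render, grade, then update once.

    All four callables are hypothetical placeholders:
      generate_svg(prompt) -> SVG text from the LLM being trained
      render_png(svg)      -> PNG bytes from some rasterizer
      grade(prompt, png)   -> numeric score from a vision model
      update(batch)        -> training step using the collected rewards
    """
    batch = []
    for prompt in prompts:
        svg = generate_svg(prompt)
        png = render_png(svg)
        reward = grade(prompt, png)
        batch.append((prompt, svg, reward))
    update(batch)  # a single update over the whole batch
    return batch


if __name__ == "__main__":
    # Stand-in stubs so the sketch runs without any model or renderer.
    updates = []
    batch = rlaif_step(
        build_prompts(["cat", "dog"], ["car", "boat", "bike"]),
        generate_svg=lambda prompt: "<svg xmlns='http://www.w3.org/2000/svg'/>",
        render_png=lambda svg: svg.encode(),  # placeholder, not a real PNG
        grade=lambda prompt, png: 0.0,        # placeholder score
        update=updates.append,
    )
    print(len(batch))  # 2 animals * 3 vehicles = 6
```

Passing the four stages in as callables keeps the loop itself independent of any particular model, renderer, or training method.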


