
Generative models feel like the wrong abstraction here. I would try extracting keyframes and running them through CLIP or SigLIP to get embeddings. Then you can just do vector search to match the segments. Much lighter on compute.
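A rough sketch of the matching step, assuming the keyframe embeddings have already been computed (by CLIP, SigLIP, or anything else). The function names and the 0.9 threshold are made up for illustration, and a real setup would likely use a vector index rather than this brute-force loop.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def match_segments(query_frames, reference_frames, threshold=0.9):
    """Brute-force nearest-neighbour match between two lists of embeddings.

    Returns one (query_index, reference_index_or_None, score) tuple per
    query keyframe. The reference index is None when the best cosine
    similarity falls below `threshold` (a hypothetical cutoff; tune it).
    """
    if not reference_frames:
        raise ValueError("need at least one reference embedding")
    refs = [normalize(r) for r in reference_frames]
    matches = []
    for qi, q in enumerate(query_frames):
        qn = normalize(q)
        scores = [sum(a * b for a, b in zip(qn, r)) for r in refs]
        best = max(range(len(scores)), key=scores.__getitem__)
        hit = best if scores[best] >= threshold else None
        matches.append((qi, hit, scores[best]))
    return matches


# Toy 3-dimensional "embeddings" standing in for real model output.
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [[0.1, 0.9, 0.0], [1.0, 1.0, 1.0]]
matches = match_segments(query, reference)
```

The first query frame sits close to the second reference frame, so it matches; the second is equally far from all three and stays unmatched.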


I was talking about getting LLMs to write the code or come up with an approach. I agree that the resulting solution does not need an LLM, or even any ML.



