Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The harder problem isn't finding vulnerabilities — it's preventing AI from violating constraints in the first place. Prompt-level safety is probabilistic. Filesystem-level constraints (mkdir 禁/behavior) are deterministic. The AI can't violate a rule that's physically encoded as a folder path in its system prompt.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: