Hacker News
pxc | 32 days ago | on: Exploiting the most prominent AI agent benchmarks
There's a difference between a reliable hunch and really knowing something. What is obvious is not always (or even usually) easy to prove. And the process of proving the obvious sometimes turns up useful little surprises.