To be clear, they didn't compare this to the naive solution, which is to just run recaptcha.render and, boom, it issues you a token.

The problem they're solving with RL isn't the "click the tiles with the stop sign" challenge; it's the "click the checkbox to prove you're a human" step. The token score is mainly derived from your env (medium impact on score), Google cookies (high impact on score), and IP quality (high impact on score). Mouse movement is barely factored in at all, and can be ignored for botting purposes.

So for now, there's no added value here over the status-quo real-world solution. That said, for future systems which use more behavioral analysis, this research might be helpful.



Recaptcha v3 doesn't have any user interaction. It just returns a score and leaves the subsequent decision to the site designer.
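To make "leaves the subsequent decision to the site designer" concrete, here is a minimal sketch of what that site-side decision might look like once the score is in hand. reCAPTCHA v3 scores range from 0.0 to 1.0; the function name `decideAction`, the three outcomes, and the threshold values are all hypothetical and would need tuning against a site's own traffic.

```javascript
// Hypothetical thresholds (assumption): a real site would tune these
// against the score distribution it sees for its own sitekey.
const DEFAULT_THRESHOLDS = { allow: 0.7, challenge: 0.3 };

// Maps a v3 score (0.0 to 1.0) to an action chosen by the site,
// since v3 itself only reports the score.
function decideAction(score, thresholds = DEFAULT_THRESHOLDS) {
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError("score must be a number between 0.0 and 1.0");
  }
  if (score >= thresholds.allow) return "allow";         // treat as likely human
  if (score >= thresholds.challenge) return "challenge"; // e.g. ask for extra verification
  return "block";                                        // treat as likely automated
}
```

For example, `decideAction(0.9)` returns "allow" under these assumed thresholds, while `decideAction(0.1)` returns "block".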

What is this about recaptcha.render?


If I'm being honest, I was being charitable to the paper in the spirit of HN guidelines. Figure 1 and Figure 2 clearly show interaction with reCAPTCHA v2. The language of the paper also evokes reCAPTCHA v2 and suggests that those figures were not just for reader enrichment:

> Abstract: We present a Reinforcement Learning (RL) methodology to bypass Google reCAPTCHA v3. We formulate the problem as a grid world where the agent learns how to move the mouse and click on the reCAPTCHA button to receive a high score.

> 2.2 Settings: To pass the reCAPTCHA test, a human user will move his mouse starting from an initial position, perform a sequence of steps until reaching the reCAPTCHA check-box and clicking on it.
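For readers unfamiliar with the term, the "grid world" in those quotes can be pictured with a toy environment like the one below. This is only a sketch under stated assumptions: the class name `GridWorld`, the four-direction action set, and the placeholder reward of 1 for reaching the checkbox cell are all hypothetical and are not taken from the paper, which describes the agent as aiming for a high reCAPTCHA score.

```javascript
// Toy grid world (assumption: not the paper's actual environment).
// The "pointer" moves one cell at a time toward a goal cell that
// stands in for the checkbox.
const ACTIONS = { up: [0, -1], down: [0, 1], left: [-1, 0], right: [1, 0] };

class GridWorld {
  constructor(width, height, start, goal) {
    this.width = width;
    this.height = height;
    this.start = start;
    this.goal = goal;
    this.reset();
  }

  // Puts the pointer back at the start cell and returns a copy of the state.
  reset() {
    this.pos = { ...this.start };
    return { ...this.pos };
  }

  // Applies one move; returns the new state, a placeholder reward, and a done flag.
  step(action) {
    if (!(action in ACTIONS)) throw new Error(`unknown action: ${action}`);
    const [dx, dy] = ACTIONS[action];
    // Clamp so the pointer never leaves the grid.
    this.pos.x = Math.min(this.width - 1, Math.max(0, this.pos.x + dx));
    this.pos.y = Math.min(this.height - 1, Math.max(0, this.pos.y + dy));
    const done = this.pos.x === this.goal.x && this.pos.y === this.goal.y;
    // Placeholder reward (assumption): 1 on reaching the goal cell, else 0.
    return { state: { ...this.pos }, reward: done ? 1 : 0, done };
  }
}
```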

So I was responding about reCAPTCHA v2, where you'd call:

grecaptcha.render('recaptcha-container', { 'sitekey': 'your_site_key', 'theme': 'light' });

But honestly, there are so many methods not discussed in this paper that it sort of invalidates the conclusion. They don't "contradict" it, they just don't validate it. One critical one is how they chose their sitekey(s).

When a sitekey is first created, it very often gives 0.9/0.7 scores. Then, over time, it adapts to the "normal" traffic for that sitekey. If they had used a sitekey from an actual site with real traffic (bots and humans), they would have needed cooperation for that site's reCAPTCHA admin panel. They didn't document any, so they probably made a fresh sitekey.


> Mouse movement is barely factored in at all, and can be ignored for botting purposes.

Partially because they also have to do something about low-cost human clickers being hired to complete captchas in India etc. So, besides checking Google reputation and the other forms of reputation you've mentioned, Google gets free mturk if the click farms manage to bypass these reputation checks.


Jeez, doing captchas all day as a job. That'll be a special kind of hell.


I skimmed through the article twice, trying to figure out how they are using RL to detect fire hydrants, crossroads, bikes, etc. Thanks for explaining it. It's been a while since clicking "I am not a robot" resulted in a pass for me and not a captcha challenge, so I forgot it's actually an option.



