Hacker News

I'll bite harder. That's how the public Internet works. If you don't trust clients at all, serve them a login page instead of content.


In fairness this appears to be the direction we are headed anyway


This is how it's going. Half the websites I go to have Cloudflare captchas guarding them at this point. Every time I visit StackOverflow I get a 5 second wait while Cloudflare decides I'm kosher.


Are you using Tor or a VPN, spoofing your User-Agent to something uncommon, or doing anything else that adds extra privacy?

That kind of user experience is one that I've seen a lot on HN, and every time, without fail, it's because they're doing something that makes them look like a bot, and then being all Surprised Pikachu when they get treated like a bot by websites.


I started having similar experiences when I switched to Brave, which blocks a lot of tracking. Many websites that never used to show me captchas or Cloudflare protection layers now pop them up regularly.


I use regular Edge with uBlock and get cloudflare crapchas all the time.


I've found Vivaldi to be a much better experience: Chromium-based, and it supports uBlock in all its glory.


I tried it and didn't like it. Currently migrating to Firefox though, their vertical tab implementation is really decent.


I get this and I assume it's because I clear cookies pretty frequently. It used to be the case that that didn't matter, but nowadays everyone shields their websites using JS.


It sucks that we're living in a landscape where bad actors take advantage of that way of doing things.


The really bad actors are going to ignore robots.txt entirely. You might as well be nice to the crawlers that respect robots.txt.
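Respecting robots.txt from the crawler side takes only a few lines. A deliberately simplified sketch in JavaScript (it only handles `User-agent: *` groups and prefix `Disallow` rules; a real parser also handles `Allow`, wildcards, and per-agent sections):

```javascript
// Minimal robots.txt check: collect Disallow rules under "User-agent: *"
// and test a path against them. Simplified on purpose -- real parsers
// also handle Allow, wildcards, and specific per-agent groups.
function parseRobots(robotsTxt) {
  const disallows = [];
  let inStarGroup = false;
  for (const raw of robotsTxt.split("\n")) {
    const line = raw.split("#")[0].trim(); // strip comments
    if (!line) continue;
    const [field, ...rest] = line.split(":");
    const value = rest.join(":").trim();
    switch (field.trim().toLowerCase()) {
      case "user-agent":
        inStarGroup = value === "*";
        break;
      case "disallow":
        if (inStarGroup && value) disallows.push(value);
        break;
    }
  }
  return disallows;
}

function allowed(disallows, path) {
  return !disallows.some((prefix) => path.startsWith(prefix));
}

const rules = parseRobots(`
User-agent: *
Disallow: /admin/
Disallow: /search
`);
console.log(allowed(rules, "/admin/users")); // false
console.log(allowed(rules, "/posts/42"));    // true
```

A polite crawler would call `allowed` before every fetch and simply skip disallowed paths.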


Even if you want to play nice, robots.txt is a catch-22: accessing it is taken as a bot signal by misconfigured anti-bot 'solutions'.


It sucks more that Cloudflare/similar have responded to this with "if your handshake fingerprints more like curl than like Chrome/Firefox, no access for you".


I now write all of my bots in JavaScript and run them from the Chrome console with CORS turned off. It seems to defeat even Google's anti-bot stuff. Of course, I need to restart Chrome every few hours because of memory leaks, and the last time I got banned from their ecosystem it wasn't a fun three days, with my kids asking why they couldn't watch YouTube.
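A console bot along those lines is essentially a fetch loop with throttling. A hypothetical skeleton (the fetch function is injectable so the loop can be dry-run without touching the network; in the Chrome console you would pass the real `fetch`, with CORS relaxed by launching Chrome with `--disable-web-security`):

```javascript
// Hypothetical console-bot skeleton: visit URLs with a pause between
// requests so the traffic looks less bursty. fetchFn is injectable so
// the scheduling logic can be exercised without a network.
async function crawl(urls, fetchFn, delayMs = 2000) {
  const results = [];
  for (const url of urls) {
    results.push(await fetchFn(url));
    // Naive fixed throttle; real bots add jitter to avoid a steady cadence.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return results;
}

// Dry run with a fake fetch -- no requests leave the machine.
const fakeFetch = async (url) => `fetched ${url}`;
crawl(["https://example.com/a", "https://example.com/b"], fakeFetch, 10)
  .then((r) => console.log(r));
```

In the console you'd swap `fakeFetch` for `fetch` and parse each `Response` as needed.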


Where can I learn more about custom bots in JS and Chrome?


Or getting a CAPTCHA from Chrome when visiting a site you've been to dozens of times (Stack Overflow). Now I just skip that content, probably in my LLM already anyway.


Keep in mind that those LLMs are one of the bigger reasons why we see more and more anti-bot behaviour on sites like SO.

The aggressive crawling to train them on everything is insane.


It's the same as anti-piracy ads: you only annoy legitimate customers. This aggressive captcha campaign just makes Stack Overflow decline even faster than it otherwise would, by making it lower quality.


There are tools like curl-impersonate (https://github.com/lwthiker/curl-impersonate) that let you pretend to be any browser you like. It might take some trial and error, but this mechanism can be bypassed with some persistence in identifying what the resource is actually trying to block.


Bad actors will always exploit whatever systems are available to them. Always have, always will.


Because if they played by the rules, they wouldn't be bad actors.



