This is pretty much progress on dead internet theory. The only thing I think can stop this and ensure genuine interaction is strong, trusted identity that carries consequences if abused or misused.
This trusted identity should be something governments need to implement. So far big tech companies still haven't fixed it and I question if it is in their interests to fix it. For example, what happens if Google cracks down hard on this and suddenly 60-80% of YouTube traffic (or even ad-traffic) evaporates because it was done by bots? It would wipe out their revenue.
Disagree. YouTube's revenue comes from large advertisers who can measure real impact of ads. If you wiped out all of the bots, the actual user actions ("sign up" / "buy") would remain about the same. Advertisers will happily pay the same amount of money to get 20% of the traffic and 100% of the sales. In fact, they'd likely pay more because then they could reduce investment in detecting bots.
Bots don't generate revenue, and the marketplace is somewhat efficient.
> YouTube's revenue comes from large advertisers who can measure real impact of ads.
Not necessarily. First, attribution is not a solved problem. Second, not all advertisement spend is on direct merchandising, but rather for branding/positioning where "sign up" / "buy" metrics are meaningless to them.
Making advertising more efficient would also open up opportunities for smaller players. Right now only the big guys have the chops to carpet-bomb the market regardless of bots. Noise benefits those who can afford to stand above it.
Pay-per-impression is a metric ad exchanges use, so it is not in the best interest of those companies to remove bots. Bots create "economic activity" when such metrics are used.
Yes... but maybe also no. Well-measured advertising budgets are definitely part of the game. But so are poorly measured campaigns. Type B often cargo-cults type A. It's far from a perfect market.
In any case, AdWords is at this point a very established product... very much an incumbent. Disruption generally does not play in their favor by default.
Advertisers have spent decades pressing Google to do something about fraudulent ad clicks/views, and occasionally tried to sue Google over being billed for the fraud.
And yet Google is still the dominant player here and raking in billions, so it seems that until some other actual competitor shows up, they have no reason to change their behavior.
Advertisers would have to pay far less, because of fewer fake impressions. Furthermore, advertising would seem to be more effective, since bots don't buy the product. The publishers, however, would hate it.
The problem is, the bots seem like a scam perpetrated by publishers to inflate their revenue.
What on Earth has given so many people in this thread the confidence to assert that marketing departments actually have any real way to gauge the effectiveness of a given ad campaign? It's effectively impossible to adjust for all the confounding variables in such a chaotic system, so ad spend is instead determined by internal politicking, pseudoscientific voodoo, and the deftness of the marketing department's ability to kiss executive ass. This ain't science, it's perversely-incentivized emotion.
> This trusted identity should be something governments need to implement.
Granting the premise for argument's sake, why should governments do this? Why can't private companies do it?
That said, I've long thought that the U.S. Postal Service (and similarly outside the U.S.) is the perfect entity for providing useful user certificates and attribute certificates (to get some anonymity, at least relative to peers, if not relative to the government).
The USPS has:
- lots of brick and mortar locations
- staffed with human beings
- who are trained and able to validate various forms of identity documents for passport applications
UPS and FedEx are also similarly situated. So are grocery stores (which used to, and maybe still do have bill payment services).
Now back to the premise. I want for anonymity to be possible to some degree. Perhaps AI bots make it impossible, or perhaps anonymous commenters have to be segregated / marked as anonymous so as to help everyone who wants to filter out bots.
I used to think that, but recently had a really bad experience with a lot of runaround with them when we had to have our mail held for a few weeks while we sorted out a mailbox break-in. We would go to one post office that was supposed to have our mail and be told to go to another post office, then get redirected back to the first post office multiple times. And they kept talking about how they had to work out the logistics and everything was changing over and over. Some of the managers seemed to give my wife the wrong information to get rid of her.
There were a few managers who tried to help and eventually we got our mail, but the way everything worked out was absurd. I think they could handle national digital identity, except that if you ever have a problem or need special treatment to address an issue, buckle up, because you're in for a really awful experience.
The onboarding and day-to-day would probably be pretty good given the way they handle passport-related stuff though.
> why should governments do this? Why can't private companies do it?
A private company will inevitably be looking to maximize their profit. There will always be the risk of them enshittifying the service to wring more money out of citizens and/or shutting it down abruptly if it's not profitable.
There's also the accountability problem. A national ID system would only be useful if one system was widely used, but free markets only function well with competition and choice. It could work similar to other critical services like power companies, but those are very heavily regulated for these same reasons. A private system would only work if it was stringently regulated, which I don't think would be much different from having the government run it internally.
> A national ID system would only be useful if one system was widely used, but free markets only function well with competition and choice.
Isn't this also a problem with having the government do it? E.g. it's supposed to prevent you from correlating a certification that the user is over 18 with their full identity, but it's insecure and fails to do so, meanwhile the government won't fix it because the administrative bureaucracy is a monopoly with limited accountability or the corporations abusing it for mass surveillance lobby them to keep the vulnerability.
Don't they? If they promise privacy and then don't deliver it, there are a lot of government agencies and politicians that would be championing the new tool for rooting out crimethink.
It could be done similar to how car inspections are done in Texas: price is set statewide, all oil change places do the service, and you redeem a code after.
The problem with this though is the implications of someone at whatever the private entity is falsely registering people under the table - this would need to be considered a felony in order for it to work.
I think the main argument for having the government do it as opposed to the private sector is that the gov has a lot more restrictions and we, the people, have a say. At least theoretically.
Imagine if Walmart implemented an identity service and it really took off and everyone used it. Then, imagine they ban you because you tweeted that Walmart sucks. Now you can't get a rental car, can't watch TV, maybe can't even get a job. A violation of the first amendment in practice, but no such amendment exists for Walmart.
The government violates or works around the constitution routinely and at will. They can't spy on you? They get other allied countries to do it and report back, or they just buy the info from companies that have collected it. And so on.
This is cynical. There's some truth to this, but the constitution provides a much stronger guarantee of rights as opposed to the Free Market, which guarantees nothing. There's real risk of allowing this to be a fully private-sector endeavor.
There are only two real ways to implement that. One is the "attribute certificate" is still tied to your full identity and then people won't be willing to associate them. The other is that the attribute certificates are fully generic (e.g. everyone over 18 gets the same one) and then someone will post it on the internet and, because there is no way to tie it back to a specific person, there is no way to stop them and it makes the system pointless.
Correct. In practice the latter isn't really possible because the issuer can always record the subject public key info, or the serial number, or a hash of the certificate, and they can then use that to identify the real subject. However for low-value things I might use them.
No, you can do the latter. You literally have a secret that implies the bearer meets the particular characteristic (e.g. is over 18). They don't each get their own certificate, they all get the exact same one down to the last byte, so you can't correlate it with anything other than the group of people who are over 18.
But then there's nothing stopping any of them from sharing the secret with people outside the group.
That's going to make it less economical, but it still doesn't even fix it. Even implausibly assuming the cards are perfectly secure so nobody could extract the shared private key from any one of them, somebody who wants to share their authorization could just plug their card into an internet-connected machine and have it sign for anyone else at will. If you give them the ability to sign you might as well give them the private key.
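A rough sketch of that relay problem, under loud assumptions: `Card`, `relay_service`, and `verify` are hypothetical names, and HMAC from the Python standard library stands in for a real signature scheme (so the verifier here holds the shared key too, which a real deployment would avoid). The card never reveals its key, yet an outsider still passes the check.

```python
import hashlib
import hmac
import secrets

# Assumption: every over-18 credential holds the exact same key.
GROUP_KEY = secrets.token_bytes(32)


class Card:
    """Hypothetical tamper-proof card: signs challenges, never exports the key."""

    def __init__(self, key: bytes) -> None:
        self._key = key

    def sign(self, challenge: bytes) -> bytes:
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


def verify(challenge: bytes, tag: bytes) -> bool:
    """Verifier side (simplified: it checks against the same shared key)."""
    expected = hmac.new(GROUP_KEY, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def relay_service(card: Card, challenge: bytes) -> bytes:
    """A holder who plugs their card into a networked machine for anyone to use."""
    return card.sign(challenge)


alice_card = Card(GROUP_KEY)
bob_card = Card(GROUP_KEY)
challenge = secrets.token_bytes(16)

# Two different holders give byte-identical answers, so nobody can be unmasked.
holders_indistinguishable = alice_card.sign(challenge) == bob_card.sign(challenge)

# An outsider with no card passes verification by asking the relay.
outsider_accepted = verify(challenge, relay_service(alice_card, challenge))

print(holders_indistinguishable, outsider_accepted)
```

The same property that protects privacy (identical responses from every holder) is what makes the relay untraceable.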
The basic problem is that there are people who will have the credential but want to thwart the operation of the system. If you can't unmask them then your system is thwarted. If you can, your system is an invasion of privacy that would have chilling effects because you're demanding for people to tie their most sensitive activities to their government ID.
I think along the same lines. Digital identity is the hardest problem we’ve been procrastinating on solving since forever, because it has the most controversial trade-offs, which no two people can agree on. Despite the well-known risks, it’s something only a State can do.
Well you're about to find out, because YouTube is doing a massive bot/unofficial client crackdown right now. YTDL, Invidious etc are all being banned. Perhaps Google got tired of AI competitors scraping YouTube.
In reality, as others have pointed out, Google has always fought bots on their ad networks. I did a bit of it when I worked there. Advertisers aren't stupid, if they pay money for no results they stop spending.
I've found the opposite. These days I absolutely can't play a video in YouTube, it always comes back as not available. But I can use JDownloader to grab it unless there are language issues. (It unfortunately might not get the right language for the audio.)
but it seems like YT have various rules for when they do and don't trigger bans. Also this is a new change which they usually roll out experimentally, and per client at that. So the question is only how aggressive do they want to be. They can definitely detect JDownloader as a bot, and do.
Probably the latter. yt-dlp can be detected and it yields account / IP bans, it seems. They've been going back and forth around the blocks for weeks, but only by claiming to be different devices; each time they do, checks are added for the new client and they have to move on to the next. There's a finite number of those.
You would assume that advertising companies with quality ad space would be able to show higher click-through rates and higher impression-to-purchase rates -- overall cost per conversion -- by removing bots that won't have a business outcome from the top of the funnel.
But attribution is hard, so showing larger numbers of impressions looks more impressive.
Attribution is extremely easy, it is a solved problem.
Companies keep throwing away money on advertising for bots and other non-customers because they either:
A) Are small businesses where the owner doesn't care about what he's doing and enjoys the casino-like experience of buying ads online and seeing whether he gets a return, or
B) Are big businesses where the sales people working with online ads are interested in not solving the problem, because they want to keep their salaries and budget.
> This trusted identity should be something governments need to implement.
I have been thinking about this as well. It's exactly the kind of infrastructure that governments should invest in to enable new opportunities for commerce. Imagine all the things you could build if you could verify that someone is a real human somehow with good accuracy (without necessarily verifying their identity).
I think that's also part of Facebook's strategy of being as open with Llama as possible – they can carve out the niche as the "okay if we're going to dive head first into the dead internet timeline, advertisers will be comforted by the fact that we're a big contributor to the conversation on the harms of AI – by openly providing models for study."
I think Nvidia should publicly declare that they will continue to build open LLMs (even if Meta decides to stop) so that their hardware is sold. Give away the software so that hardware gets sold. Similar to how Google gives away Android to OEMs.
> For example, what happens if Google cracks down hard on this and suddenly 60-80% of YouTube traffic (or even ad-traffic) evaporates because it was done by bots? It would wipe out their revenue.
Nonsense. Advertisers measure results. CPM rates would simply increase to match the increased value of a click.
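To put rough numbers on that claim, here is a back-of-the-envelope sketch. Every figure is invented for illustration (the spend, the impression count, the 80% bot share, the conversion count), and it assumes bots never convert and the advertiser keeps spending the same amount.

```python
def cpm(spend: float, impressions: int) -> float:
    """Cost per thousand impressions."""
    return spend * 1000 / impressions


def cost_per_conversion(spend: float, conversions: int) -> float:
    return spend / conversions


# Hypothetical campaign, numbers made up for the example.
spend = 10_000.0
impressions_with_bots = 1_000_000
bot_percent = 80
conversions = 500  # assumption: only humans convert

# Remove the bot traffic; the human impressions and the sales stay the same.
human_impressions = impressions_with_bots * (100 - bot_percent) // 100

cpm_before = cpm(spend, impressions_with_bots)
cpm_after = cpm(spend, human_impressions)
cpc_before = cost_per_conversion(spend, conversions)
cpc_after = cost_per_conversion(spend, conversions)

print(f"CPM with bots:    ${cpm_before:.2f}")
print(f"CPM without bots: ${cpm_after:.2f}")
print(f"Cost per conversion, before and after: ${cpc_before:.2f} / ${cpc_after:.2f}")
```

Under those assumptions the CPM goes up fivefold while the cost per sale does not move, which is the scenario this comment describes. Whether real advertisers measure conversions well enough to behave this way is exactly what the rest of the thread disputes.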
I've been thinking about how AI will affect ad-supported "content" platforms like YouTube, Facebook, Twitter, porn sites, etc. My prediction is that as AI-generated content improves in quality, or at least believability, they will not prohibit AI-generated content, they will embrace it whole-heartedly. Maybe not at first. But definitely gradually and definitely eventually.
We know that these sites' growth and stability depend on attracting human eyeballs to their property and KEEPING them there. Today, that manifests as algorithms that analyze each person's individual behavior and level of engagement and use that data to tweak that user's experience to keep them latched (some might say addicted, via dopamine) to their app on the user's device for as long as possible.
Dating sites have already had this down to a science for a long time. There, bots are just part of the business model and have been for two decades. It's really easy: you promise users that you will match them with real people, but instead show them only bots and ads. The bots are programmed to interact with the users realistically over the site and say/do everything short of actually letting two real people meet up. Because whenever a dating site successfully matches up real people, they lose customers.
I hope I'm wrong, but I feel that social content sites will head down the same path. The sites will determine that, for users who enjoy watching Reels of women in swimsuits jumping on trampolines, they can simply generate as many as they need, and tweak the parameters of the generated video based on the user's (perceived) preferences: age, size, swimsuit color, height of bounce, etc. But they will still provide JUST enough variety to keep the user from getting bored enough to go somewhere else.
It won't just be passive content that is generated, all those political flamewars and outrage threads (the meat and potatoes of social media) could VERY well ALREADY be LLM-generated for the sole purpose of inciting people to reply. Imagine happily scrolling along and then reading the most ill-informed, brain-dead comment you've ever seen. You know well enough that they're just an idiot and you'll never change their mind, but you feel driven to reply anyway, so that you can at LEAST point out to OTHERS that this line of thinking is dangerous, then maybe you can save a soul. Or whatever. So you click Reply but before you can type in your comment, you first have to watch a 13-second ad for a European car.
Of course, the comment was never real, but you, the car, and your money definitely are.
The real problem is how to prove identity while also guaranteeing anonymity.
Because Neo couldn't have done what he did by revealing his real name, and if we aren't delivering tech that can break out of the Matrix, what's the point?
The solution will probably involve stuff like Zero-Knowledge Proofs (ZKPs), which are hard to reason about. We can imagine a future where all user data is end-to-end encrypted, circles of trust are encrypted, everything runs through onion routers, etc. Our code will cross-compile to some kind of ZKP VM running at some high multiple of computing power needed to process math transactions, like cryptocurrency.
One bonus of that is that it will likely be parallelized and distributed as well. Then we'll reimplement unencrypted algorithms on top of it. So ZKP will be a choice, kind of like HTTPS.
But when AI reaches AGI in the 2040s, it will be able to spoof any personality. Loosely that means it will have an IQ of 1000 and beat all un-augmented humans in any intellectual contest. So then most humans will want to be augmented, and the arms race will quickly escalate, with humanity living in a continuous AR simulation by 2100.
If that's all true, then it's basically a proof of what you're saying, that neither identity nor anonymity can be guaranteed (at least not simultaneously) and the internet is dead or dying.
So this is the golden age of the free and open web, like the wild west. I read a sci fi book where nobody wore clothes because with housefly-size webcams everywhere, there was no point. I think we're rapidly headed towards realtime doxxing and all of the socioeconomic eventualities of that, where we'll have to choose to forgive amoral behavior and embrace a culture of love, or else everyone gets cancelled.
>where we'll have to choose to forgive amoral behavior and embrace a culture of love, or else everyone gets cancelled.
I think it's much more likely that humans would fall into a religious cult like behavior of punishing each other with more byzantine rules and monitoring each other for compliance. Humans are great at creating systems of Moloch.
There is no such thing as a "non-VoIP phone number". All phone numbers are phone numbers. Some people try to ban blocks assigned to small phone providers, but some actual humans use those. Meanwhile major carriers are leasing numbers to anyone who pays from the same blocks they issue to cellular customers. Also, number portability means even blocks don't mean anything anymore.
Large companies sometimes claim to do this "to fight spam" because it's an excuse to collect phone numbers, but that's because most humans only have one or two and it serves as a tracking ID, not because spammers don't have access to a million. Be suspicious of anyone who demands this.
If it’s really important to you then use Apple / Google / GitHub login.
Obviously this has many downsides, especially from a privacy perspective, but it quickly allows you to stop all but the most sophisticated bots from registering.
Personally I just stick my sites behind Cloudflare until they’re big enough to warrant more effort. It prevents most bots without too much burden on users. Also relatively simple to move away from.
Does that really work? I'm trying to build a site with upvotes--wouldn't it be really easy for someone with 100 bought Google accounts to make 100 accounts on my site?
Google is working hard to make it so you shouldn't be able to easily make new accounts. New accounts basically require a phone number and you can only use one phone number so many times before they won't let you use that phone number any more times. Grandfathered accounts don't have this problem yet so this is why Google is trying to crack down on long-unused accounts.
And, annoyingly, being a little too aggressive about it.
Google apparently decided my wife's Gmail account was unused. The mail part was, other than some forwarding rules (she lives on WeChat, not email). She's been consistently logged in to YouTube and Translate, though--and now the only way I can get Translate to work is by logging her out.
Services need the ability to obtain an identifier that:
- Belongs to exactly one real person.
- That a person cannot own more than one of.
- That is unique per-service.
- That cannot be tied to a real-world identity.
- That can be used by the person to optionally disclose attributes like whether they are an adult or not.
Services generally don’t care about knowing your exact identity but being able to ban a person and not have them simply register a new account, and being able to stop people from registering thousands of accounts would go a long way towards wiping out inauthentic and abusive behaviour.
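One way to picture the "unique per-service" part of that list is a pairwise pseudonym. This is a minimal sketch, not a description of DID or any deployed system: `Issuer`, `enroll`, and `pseudonym` are hypothetical names, and it assumes some trusted issuer has already checked that each `person_ref` is one real person.

```python
import hashlib
import hmac
import secrets


class Issuer:
    """Hypothetical issuer that has already verified each person exactly once."""

    def __init__(self) -> None:
        self._person_secrets: dict[str, bytes] = {}

    def enroll(self, person_ref: str) -> None:
        # One secret per person; enrolling again does not mint a second identity.
        self._person_secrets.setdefault(person_ref, secrets.token_bytes(32))

    def pseudonym(self, person_ref: str, service_id: str) -> str:
        # Stable for (person, service), different for every other service.
        key = self._person_secrets[person_ref]
        return hmac.new(key, service_id.encode(), hashlib.sha256).hexdigest()


issuer = Issuer()
issuer.enroll("person-1")

forum_id = issuer.pseudonym("person-1", "forum.example")
shop_id = issuer.pseudonym("person-1", "shop.example")

issuer.enroll("person-1")  # a second attempt changes nothing
forum_id_again = issuer.pseudonym("person-1", "forum.example")

print(forum_id == forum_id_again)  # a banned user gets the same ID back
print(forum_id != shop_id)         # two services cannot join their records
```

Note what the sketch does not give you: the issuer can still map every pseudonym back to a person, so "cannot be tied to a real-world identity" only holds against the services, not against whoever runs the issuer.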
I think DID is one effort to solve this problem, but I haven’t looked into it enough to know whether it’s any good:
Agreed that offering an identifier like this would be ideal. We should be fighting for this. But in the meantime, using a passport ticks most of the boxes in your list.
I’m currently working on a social network that utilises passports to ensure account uniqueness. I’m aware that folks can have multiple passports, but it will be good enough to ensure that abuse is minimal and real humans are behind the accounts.
I hope that enough are willing to if the benefits and security are explained plainly enough. For example, I don’t intend to store any passport info, just hashes. So there should be no risk, even if the DB leaks.
First, not everyone has passports - there are roughly half as many US passports as Americans.
Second, how much of the passport information do you hash that it's not reversible? If you know some facts about your target (imagine a public figure), could an attacker feasibly enumerate the remaining info to check to see if their passport was registered in your database? For example, there are only 2.6 billion possible American passport numbers, so if you knew the rest of Taylor Swift's info, you could conceivably use brute-force to see if she's in your database. As a side effect, you'd now know her passport number, as well.
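A toy version of that attack, assuming the site stores an unsalted SHA-256 of a `name|dob|number` string. The record format, the names, and the tiny 100,000-number search range are all made up so the example runs quickly; the real keyspace would be the full set of passport numbers.

```python
import hashlib
from typing import Iterable, Optional


def record_hash(name: str, dob: str, passport_number: str) -> str:
    """Hypothetical storage format: unsalted SHA-256 over the joined fields."""
    return hashlib.sha256(f"{name}|{dob}|{passport_number}".encode()).hexdigest()


def brute_force(
    leaked_hashes: set[str], name: str, dob: str, candidates: Iterable[str]
) -> Optional[str]:
    """Try every candidate number for a target whose other fields are known."""
    for number in candidates:
        if record_hash(name, dob, number) in leaked_hashes:
            return number
    return None


# Pretend this set came from a leaked database.
leaked = {record_hash("Jane Example", "1990-01-01", "000042137")}

# The attacker knows the name and birth date and enumerates the number.
candidates = (f"{n:09d}" for n in range(100_000))
found = brute_force(leaked, "Jane Example", "1990-01-01", candidates)

print(found)  # membership confirmed, and the passport number recovered
```

A salted, deliberately slow hash would raise the cost per guess, but with a keyspace this small it only slows the search down.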
> Second, how much of the passport information do you hash that it's not reversible?
That doesn't even matter. You could hash the whole passport and the passport could contain a UUID and the hash db would still be usable to correlate identities with accounts, because the attacker could separately have the victim's complete passport info. Which is increasingly likely the more sites try to use passports like this, because some won't hash them or will get breached sufficiently that the attackers can capture passport info before it gets hashed and then there will be public databases with everybody's complete passport info.
Less than half of Americans have passports, and of the remaining half, a significant fraction do not have the necessary documents to obtain one. Many of these people are poor, people of color, or marginalized in other ways. Government ID is needed, but you generally find the GOP against actually building a robust, free, ubiquitous system because it would largely help Americans who vote Democratic. This is also why the GOP pushes Voter ID, but without providing any resources to ensure that Americans can get said ID.
To be fair, you generally don't see Dems pushing for such a free and ubiquitous system, either - "voter ID is bad" is so entrenched on that side of the aisle that any talk about such a system gets instant pushback, details be damned.
> you generally don't see Dems pushing for such a free and ubiquitous system, either
Yes, and this seems like a huge missed opportunity for Dems. I would strongly support such a system, and I would be willing to temper my opposition to Voter ID laws if they were introduced after such a system was implemented fully.
A passport might be a bit onerous - it's an expensive and painful process, and many don't need one.
But it's a hilarious sign of worldwide government incompetence that social insurance or other citizen identification cards are not standard, free, and uniquely identifiable and usable for online ID purposes (presumably via some sort of verification service / PGP).
Government = people and laws. Government cannot even reliably ID people online. You had one job...
When it comes to government-issued IDs, "standard" and "free" is a solved problem in almost every country out there. US is a glaring exception in this regard, particularly so among developed countries. And it is strictly a failure of policy - US already has all the pieces in place for this, they just need to be put together with official blessing. But the whole issue is so politicized that both major parties view it as unacceptable deviation from their respective dogmas on the subject.
> But it's a hilarious sign of worldwide government incompetence that social insurance or other citizen identification cards are not standard, free, and uniquely identifiable and usable for online ID purposes (presumably via some sort of verification service / PGP).
Singapore does this. Everybody who is resident in Singapore gets an identity card and a login for Singpass – an OpenID Connect identity provider that services can use to obtain information like address and visa status (with user permission). There’s a barcode on the physical cards that can be scanned by a mobile app in person to verify that it’s valid too.
In the United States, the lack of citizen identification cards is largely due to Republican opposition. People who lack ID are more likely to be Democratic voters, so there is an incentive to oppose getting them ID. There's also a religious element for some people, connected to Christian myths about the end of the world.
It's kind of half true - there is an association between not having an ID and being blue. Because people without IDs are more likely to be people of color or of other marginalized groups, which then are more likely to be blue.
In addition, there's a strong conservative history of using voter ID as a means of voter suppression and discrimination. This, in turn, has made the blue side immediately skeptical of identification laws - even if they would be useful.
So, now the anti-ID stuff is coming from everywhere.
It's absolutely not true. People have to supply IDs for tons of activities. They have IDs. We know who they are. They are registered to vote -- how did that happen w/o ID? Of course they have IDs.
The statistics just don't back this up. Plenty of, predominantly poor, people don't have driver's licenses. And that's typically the only ID people have. Also, poorer people may work under the table or deal in cash.
Link the stats please. There are ID types other than driver's licenses. In fact, the DMVs around the country issue non-driver IDs that are every bit as good as driver licenses as IDs.
Many Americans do not have ID. I don't know why that's so controversial to say.
You don't need an ID to get a job, or rent, or do much of anything. Typically, a bill + address suffices.
You're correct that SOME states offer ID that IS NOT a driver's license. However, there's no reason to get this - why would you? Again, you don't need it for anything, so why bother?
Thank you for providing the data to back up my original unsourced claim.
America is a very diverse nation, and people live very different lives across the country. Yet all of them have a right to vote. I would expect that 99+% of people on this site have government-issued IDs, but we are in the 1% of technical expertise here.
Listen to the stories of people who were affected by the hurricane in western North Carolina last week and you can start to understand how different some people's lives are.
Where do you get this idea that you need to have an ID card in order to register to vote? It's certainly not a federal requirement.
In NY, you can register with ID, last 4 digits of your social, or leave it blank. If you leave it blank, you will need to provide some sort of identification when voting, but a utility bill in your name and address will suffice.
On the other hand, I think the best social media out there today is 4chan. Entirely anonymous. Also, the crass humor and NSFW boards act as a great filter to keep advertising bot networks from polluting the site like they did with Reddit. No one wants to advertise on 4chan or have their brand associated with it, which is great for quality discussion on technical topics and niche interests.
4chan is actually one of the worst social media out there. They are responsible for a hell of a lot of hate campaigns out there. Anonymity breeds toxicity.
Anonymity breeds veracity. As soon as you force people to identify themselves they start lying to you whenever the truth would be controversial. They refuse to concede when someone proves them wrong because now they're under pressure to save face. It's why Facebook's real name policy causes the place to be so toxic.
Obviously, the impunity also allows them to say false things they would otherwise be deterred from saying. Why would you assume the impunity leads to more truth and not more lies?
There are three relevant types of statements: Logical arguments, independently verifiable factual claims and unverifiable factual claims.
Logical arguments stand on their own merits. Whether they're convincing or not depends on whether you can find holes in them, not on who offers them. Presenting weak arguments is low value because they're not convincing. But anonymity allows people to present strong arguments that they would otherwise be punished for presenting, not because they're untrue but because they're inconvenient.
Independently verifiable factual claims are the same. You don't have to believe the author because all they're telling you is that you can find something relevant in a particular document or clip and then you can see for yourself if it's there or not. But anonymity protects them from being punished for telling people about it.
Unverifiable factual claims are an appeal to authority, which requires you to be an authority -- it's a mechanism authorities use to lie to people -- which is incompatible with anonymity. If you anonymously claim something nobody can check then you have no credibility.
So anonymity enables people to say verifiably true things they would otherwise be punished for bringing to public attention, but is less effective for lying than saying the lies under an official identity because there is no authority from which to lend credibility to unverifiable claims.
Your typology of statements and reasons for stating them is lacking and unconvincing. Plenty of reason to spread verifiable lies under conditions of anonymity.
People do exactly that under their real names. If anything they do it more as a form of virtue signaling because they have to be seen supporting their tribe's causes.