On-device models are the future. Users prefer them. No privacy issues. No dealing with connectivity, tokens, or changes to vendors' implementations. I have an app using the Foundation Models framework, and it works great. I only wish I could backport it to pre-macOS 26 versions.
Users don’t care about “privacy”. If they did, Meta and Alphabet wouldn’t be worth $1T+.
Users really don't matter at all. The revenue for AI companies will be B2B, where the user is not the customer - including coding agents. Most people don't even use computers as their primary "computing device", and most people are buying crappy low-end Android phones - no, I'm not saying all Android phones are crappy, but that's what most people are buying, with the average selling price of an Android phone being $300.
I worked for a research focused AI startup that had a strict "no external LLM" policy for code touching our core research.
You're right that the average consumer doesn't care about privacy, but there are many, many users who do. The average consumer also doesn't have a desktop with a GPU or a high-end Mac Studio, but that doesn't mean there aren't many people working with AI who do have these things.
If we continue to see improvements in running local models, and RAM prices continue to fall as they have in the last month, then suddenly you don't have to worry about token counts any more and can be much more trusting of your agents since they are fully under your control.
Those users are addressed by being able to rent their own exclusive machines to run the model on. There will be some compromise that will be made to get access to the best intelligence available.
Different users. Many people care about privacy and aren’t using Meta products. And many businesses care about it too and have information policies to protect their IP.
> Different users. Many people care about privacy and aren’t using Meta products.
Yeah but if they can rake in 100x as much by making products for people who don't care about privacy, then why spend time developing stuff for people who care?
There is still a small market left, of course, but that market will not have the billions of R&D behind it.
It's largely out of Meta's hands now anyway. The risk here is not so much to privacy (it's Apple), but they'll turn the model space into a walled garden somehow, for sure.
70% of the world’s population use at least one Meta property at least once per day. How many of the other 30% are too poor/young/computer illiterate to be part of an addressable market?
Every company has dozens of SaaS products that store their business-critical information. Amazon installs Office and Slack on each computer (they were moving away from Chime when I left), and the sales department uses Salesforce, as do SAs and Professional Services (I'm a former employee).
The addressable market of even the companies that care about privacy is not large. How long will it be before computers that can run even GPT-4-level LLMs become cheap enough that companies will give them to all of their developers?
The banking industry absolutely does care about privacy of their business data btw.
We do use tools like Confluence but they're all hosted in our own data centers.
These are all great statistics, but how do you explain the ClawdBot explosion, even in lower-income countries like China? So much demand that Apple can't keep up production of Mac Minis. Why aren't these folks going towards cloud solutions? Is it cost, or is there some consideration for having more control over their data?
ClawBot doesn't generally run the model locally; it just talks to remote APIs. No different than any other agentic harness. You could run a local model on the same Mac Mini as your agent, but it wouldn't be very smart, and many agentic tasks around computer GUI/browser use, etc. would be out of reach.
> Why aren’t these folks going towards cloud solutions?
They are. The majority aren't doing inference on a Mac Mini, but instead using it as a local host for cloud-based inference. You could have the same general experience on a $200 Chromebook or $300 Windows box.
They are running cloud models in almost all cases. It's like saying it isn't cloud when you use the Facebook app on your phone (the app is ON your phone and running there).
I see it as a long-term tradeoff on user freedom.
You pay upfront for capable hardware and you get your services running locally (you don't pay subscriptions).
Or you buy cheap hardware and still need the same services "running in some cloud" for $X monthly, where X goes up depending on the corporate bottom line.
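To put rough numbers on it, here's a back-of-the-envelope sketch; every figure below is an assumption I picked purely for illustration, not a real price quote:

    # Rough break-even: how many months of paying for cloud AI services
    # would cover the upfront cost of capable local hardware.
    # All numbers are assumed/hypothetical for illustration.
    hardware_cost = 2000.0      # hypothetical capable local machine
    monthly_cloud_cost = 100.0  # hypothetical heavy hosted-AI usage per month
    monthly_power_cost = 10.0   # hypothetical extra electricity for local use

    months = hardware_cost / (monthly_cloud_cost - monthly_power_cost)
    print(f"Local hardware breaks even after ~{months:.0f} months")  # ~22 months

Whether that pencils out obviously depends on how heavy your usage is and where subscription prices go.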
In the history of cloud computing, prices have mostly only come down, especially as inference becomes a commodity. Realistically, just looking at Mac prices, the cost of a computer with decent local inference would be around $6000 per person.
> Realistically, just looking at Mac prices, the cost of a computer with decent local inference would be around $6000 per person.
As someone who has hardware in that price range and plays with local LLMs: The gap between Opus or GPT and the local models is still very large for work beyond simple queries.
Self-hosted also starts making my office hot due to all of the power consumption when I use it for anything more than short queries. If you haven't heard your Mac's fans spin up much yet, running local LLMs will get you acquainted with the sound of their cooling systems at full blast.
Your customers are an anecdote; now compare that to the publicly reported numbers from AWS, GCP, and Azure, where they all say the only thing keeping them from growing more is the chip shortage.
Oh I'm sure they'll continue to have some cloud services, no doubt. But look at VMware for example, even after the insane price increases. Nutanix also seems to be doing quite well. I'm seeing a fair amount of on-prem bare metal k8s too.
Again - anecdotes are not data. We have data. That would be about as silly as me citing my own experience as proof that "everyone is moving to AWS" when I work for a company that is exclusively an AWS partner consulting company.
You have data showing growth in cloud, which I expect and don't disagree with. The data I come across shows this too!
Where I disagree, based on my own experience and all the data I can seem to find online, is on repatriation: its growth rate is MUCH higher than the growth in cloud.
It has flipped over the last 3yr.
US enterprises, the Fortune 100 especially. Also a lot of public entities (government).
"In 2025, repatriation is still generally an upward trend. Data from the end of 2024 showed that 86% of CIOs planned to move some public cloud workloads back to private cloud or on-premises — the highest on record for the Barclays CIO Survey."
"Real examples of cloud repatriation include Dropbox, Adobe, and GEICO. All three companies moved a significant portion of their infrastructure onto public cloud before moving it to a combination of on-premises and hybrid cloud providers."
Noted: SaaS accounts for 46.10% of market revenue, while PaaS is the fastest-growing segment at 21.35% CAGR
Again, anecdotes. I have public-company quarterly statements - you have unsourced quotes. You can quote GEICO - I can quote Netflix. If on-prem were really growing, I wouldn't expect Intel to be in the shitter, and I would expect capex to be focused on colo centers, not cloud.
Also, when I searched for your quotation, the very next paragraph was:
“This trend does not represent a rejection of cloud computing. Organizations continue investing heavily in cloud services, with Gartner forecasting that global cloud spending will reach approximately $723 billion by the end of 2025.”
Have you done A/B tests to see if consumers prefer Facebook with or without privacy?
No? What? Oh, you can't?
Neither can consumers. Most consumers are very aware of the lack of privacy and the manipulation, and have very cynical feelings about Facebook and similar companies. But it's where their friends and family are.
For most people the web is a minefield where the basic things they want are compromised everywhere. And they are routinely creeped out by ads that reveal the advertisers know them far too personally.
You are mistaking network capture for preference.
Another telling example. Lots of privacy valuing technical people, who would never have a Facebook account, send unencrypted text emails.
Consumers proactively tell Facebook their age, sexual preference, race, relationship status, likes and dislikes; they check in to where they are and who they are there with…
Yes, they do. That's exactly the phenomenon my comment addressed.
But the way you wrote that implies an improbable motivation or choice framing.
Perhaps their real motive/choice is to share with other people on the site.
It is called a network effect.
If (1) Facebook had been the surveillance/manipulation capital of the world from inception, (2) an equally inviting privacy protecting site took off at the same time, and (3) everyone chose Facebook over E2EE anyway, then sure, we could throw up our hands! Those silly users!
The term I have for when people discuss choices involving many-dimensional criteria, as if the choice involved just one or two selected dimensions, is "dimension blindness". It happens in a lot of heated discussions about phone choices too.
If people cared about privacy and still wanted to use FB, wouldn't the most obvious way to protect it be not to proactively give them information? You don't have to share everything I mentioned just to be involved in a group.
They are explicitly adding their information to FB, so why do they need a button to not share the information? Would the button disable them from checking in and updating their profile?
An E2EE system (e.g. as offered by Apple iCloud). Or a terms of service guarantee. (e.g. Dropbox, Anthropic and 1000 other companies that partition sharable user content from non-support divisions.)
> Would the button disable them from checking in and updating their profile?
When you post a check-in, your relationship status, or your pictures without setting your sharing preferences, and when you update your profile, you are specifically doing so with the intention to share. WhatsApp is E2E encrypted.
You’re arguing that people care about their privacy when they are explicitly sharing private information above what is needed to participate in FB.
You are completely wrong and your argument is illogical. People may not know that FB is building a profile of them based on their behavior. But logically, if I add to my profile that my favorite site is “grandma-midget-porn.com” [1], it can't be that I care about people not knowing I like senior-citizen midgets.
Because it is irrelevant to whether people are purposefully and explicitly sharing their likes, dislikes, and other information to let FB know more about them.
"Users" is a large set of people. Many don't care about privacy, but some do. There's also a difference between where you post random social media stuff vs what you run with something like OpenClaw and give access to your machine.
Users care about privacy when they understand the threat and impact. The issue is most users don't understand this, especially when it comes to products like Meta's, where on the surface everything appears harmless.
It's not all or nothing; there are trade-offs. The fact that Apple still bothers to expend marketing effort on its privacy chops suggests significant numbers of people still do care.
“Users” here probably means corporations. I still don't see much use of LLMs in my personal life, other than one thing: Googling stuff in a foreign language.
You are missing a 'given a choice' disclaimer. Meta is pretty much a monopoly in the social space. So is Android. Given a choice, people will absolutely gravitate towards a not-always-snooping device - most people with resources, anyway, who matter for AI adoption.
Oh, and wait till ad companies start selling your healthcare data, and you will see how fast things turn 'given a choice'.
People don't have a choice between Facebook and not-Facebook-but-still-has-all-of-your-friends-and-family. Abstinence isn't a choice here any more than shutting off your cell phone service is a choice; true in the literal sense, but only if you don't mind being unreachable to everyone who still has a phone.
There’s been some success training models on top of differential privacy.
I imagine that with live requests it would be quite challenging but not impossible, assuming you could somehow sanitize all sorts of private data that people throw at these prompts.
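For readers unfamiliar with the technique: the usual recipe is DP-SGD style training, where each example's gradient is clipped to a fixed norm and Gaussian noise calibrated to that bound is added before the update. A minimal sketch (the function name and parameter values are mine, purely illustrative, not any particular library's API):

    import numpy as np

    # Sketch of one DP-SGD style update: clip per-example gradients,
    # add Gaussian noise scaled to the clip bound, then average and step.
    def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
        clipped = []
        for g in per_example_grads:
            norm = np.linalg.norm(g)
            clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
        summed = np.sum(clipped, axis=0)
        noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
        noisy_mean = (summed + noise) / len(per_example_grads)
        return params - lr * noisy_mean

    # Toy usage with random gradients standing in for a real batch.
    params = np.zeros(4)
    grads = [np.random.randn(4) for _ in range(8)]
    params = dp_sgd_step(params, grads)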
I think two recent advances make your statement more true. The new Qwen 3.5 series has shown a relatively high intelligence density, and Google's new turboquant could result in dramatically smaller and more efficient models without the normal quantization accuracy tradeoff.
I would expect consumer inference ASIC chips will emerge when model developments start plateauing, and "baking" a highly capable and dense model to a chip makes economic sense.
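To make the size argument concrete, here's a weights-only back-of-the-envelope calculation (the model size is hypothetical, and real quantized formats carry some extra overhead for scales, plus KV cache and activations on top):

    # Rough memory footprint of model weights at different precisions.
    # 30B parameters is a hypothetical size chosen for illustration.
    params_billion = 30
    for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit")]:
        gib = params_billion * 1e9 * bits / 8 / 2**30
        print(f"{label}: ~{gib:.0f} GiB")
    # fp16: ~56 GiB, int8: ~28 GiB, 4-bit: ~14 GiB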
Who will be funding state of the art local models going forward? AI models are never done or good enough. They will have to be trained on new data and eventually with new model architectures. It will remain an expensive exercise.
I could be wrong because I'm not following this too closely, but the open weights future of both Llama and Qwen looks tenuous to me. Yes, there are others, but I don't understand the business model.
Good points. What local models have you found work best for your use cases? I feel like if we get to opus 4.6 level intelligence running on local hardware, we’re in the clear for a lot of day to day use cases.
Most of the LLM tooling can handle different models. Ollama makes it easy to install and run different models locally. So you can configure aider or VS Code or whatever you're using to connect to ChatGPT to point to your local models instead.
None of them are as good as the big hosted models, but you might be surprised at how capable they are. I like running things locally when I can, and I also like not worrying about accidentally burning through tokens.
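As a concrete example of the pointing-at-local-models setup described above: Ollama exposes an OpenAI-compatible endpoint, so any client that speaks that API can be redirected at it (the model tag below is just an example; use whatever you have pulled locally):

    from openai import OpenAI

    # Point an OpenAI-compatible client at a local Ollama server instead of a
    # hosted API. Assumes Ollama is running on its default port and the model
    # has already been pulled; the API key just needs to be non-empty.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    resp = client.chat.completions.create(
        model="qwen2.5-coder:14b",  # example local model tag
        messages=[{"role": "user", "content": "Summarize what a LoRA adapter is."}],
    )
    print(resp.choices[0].message.content)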
I think the future is multiple locally run models that call out to hosted models when necessary. I can imagine every device coming with a base model and using LoRAs to learn about the user's needs, with companies and maybe even households having their own shared models that do heavier lifting, while companies like OpenAI and Anthropic continue to host the most powerful and expensive options.
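A minimal sketch of that local-first, escalate-when-needed routing; the heuristic and the stub model functions are placeholders for illustration, not any vendor's actual API:

    from typing import Callable

    # Route prompts to a local model by default, escalating to a hosted model
    # when a naive heuristic says the request is too big or explicitly "hard".
    def make_router(local: Callable[[str], str],
                    hosted: Callable[[str], str],
                    max_local_len: int = 2000) -> Callable[[str], str]:
        def route(prompt: str) -> str:
            # A real system might classify the prompt, check context length,
            # or let the local model signal low confidence instead.
            if len(prompt) > max_local_len or "[escalate]" in prompt:
                return hosted(prompt)
            return local(prompt)
        return route

    # Toy usage with stubs standing in for a local and a hosted LLM call.
    router = make_router(local=lambda p: f"(local) {p[:40]}",
                         hosted=lambda p: f"(hosted) {p[:40]}")
    print(router("What's the capital of France?"))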
What models have you found capable? I was recently recommended Qwen3 Coder Next and I did not find it very successful. I have a good amount of VRAM/RAM so would love to run something locally.
Qwen3.5 is like an old version of ChatGPT and I can use it the same way I used GPT4 — writing emails, reading documentation and answering questions about it, reviewing code, answering trivia, etc.
Yes, but so far do we have a working practice for this: for a given local model, what infra could we use, and what setup lets us leverage it well for local tasks?
Yes, but you don't always want the power/expense of these models for the task at hand. A hammer is good enough to drive a nail into a wall. Save the nail gun for when you are building a house.
They’re not far behind, unless you mean for “vibe coding”. And for probably 85% of queries that people use LLMs for, you can’t even really perceive the difference between frontier and local.