I cannot express how dirt cheap that price point is for what's on offer, especially when you compare it to rackmount servers. By the time you've shoehorned in an Nvidia GPU and all that RAM, you're easily looking at 5x the MSRP; sure, you get proper redundancy and expandable storage for that added cost, but now you also need redundant UPSes and have local storage to manage instead of centralized SANs or NASes.
For SMBs or edge deployments where redundancy isn't as critical or budgets aren't as large, this is an incredibly compelling offering... if Apple actually had a competent server OS to layer on top of that hardware, which it does not.
If they did, though...whew, I'd be quaking in my boots if I were the usual Enterprise hardware vendors. That's a damn frightening piece of competition.
> By the time you've shoehorned in an Nvidia GPU and all that RAM, you're easily looking at 5x the MSRP
That Nvidia GPU setup will actually have the compute grunt to make use of the RAM, though, which this M3 Ultra realistically probably doesn't. After all, if the only thing that mattered were RAM, then the 2TB you can shove into an Epyc or Xeon would already be dominating the AI industry. But they aren't, because it isn't. It certainly hits a unique combination of things, but whether that's maximally useful for the money is a completely different story.
You're forgetting what Apple's been baking into their silicon for (nearly? over?) a decade: the Neural Processing Unit (NPU), which they brand the "Neural Engine". That's the secret sauce that makes their kit more competitive for endpoint and edge inference than standard x86 CPUs. It's why I can get similarly satisfying performance on my old M1 Pro MacBook Pro with a scant 16GB of memory as I can on my 10900K with 64GB of RAM and an RTX 3090 under the hood. To put the two into context, I ran the latest version of LM Studio with the deepseek-r1-distill-llama-8b model @ Q8_0 on both machines, with the exact same prompt, everything maximally offloaded onto hardware acceleration and memory, and an entirely empty context window:
Write me an AWS CloudFormation file that does the following:
* Deploys an Amazon Kubernetes Cluster
* Deploys Busybox in the namespace "Test1", including creating that Namespace
* Deploys a second Busybox in the namespace "Test3", including creating that Namespace
* Creates a PVC for 60GB of storage
The M1 Pro laptop with 16GB of Unified Memory:
* 21.28 seconds for "thinking"
* 0.22s to the first token
* 18.65 tokens/second over 1484 tokens in its responses
* 1m:23s from sending the input to completion of the output
The 10900K desktop, with 64GB of RAM and a full-fat RTX 3090 GPU in it:
* 10.88 seconds for "thinking"
* 0.04s to first token
* 58.02 tokens/second over 1905 tokens in its responses
* 0m:34s from sending the input to completion of the output
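If anyone wants to run the same comparison on their own hardware, here's roughly how I'd script it. This is a minimal sketch, assuming LM Studio's OpenAI-compatible local server is running on its default port 1234 with the model already loaded; your endpoint and model identifier may differ, and the token count here is a crude estimate (LM Studio's UI reports the exact figures I quoted above):

```python
import json
import time

import requests

# Assumes LM Studio's OpenAI-compatible server is running locally;
# the default port is 1234.
URL = "http://localhost:1234/v1/chat/completions"
MODEL = "deepseek-r1-distill-llama-8b"  # must match the loaded model's identifier

PROMPT = """Write me an AWS CloudFormation file that does the following:
* Deploys an Amazon Kubernetes Cluster
* Deploys Busybox in the namespace "Test1", including creating that Namespace
* Deploys a second Busybox in the namespace "Test3", including creating that Namespace
* Creates a PVC for 60GB of storage"""

start = time.monotonic()
first_token_at = None
chunks = []

# Stream the response so time-to-first-token can be measured separately.
with requests.post(
    URL,
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": PROMPT}],
        "stream": True,
    },
    stream=True,
    timeout=600,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        text = delta.get("content") or ""
        if text and first_token_at is None:
            first_token_at = time.monotonic()
        chunks.append(text)

elapsed = time.monotonic() - start
# Whitespace-split word count, a rough stand-in for real token counts.
approx_tokens = len("".join(chunks).split())
print(f"time to first token: {first_token_at - start:.2f}s")
print(f"total wall time:     {elapsed:.2f}s")
print(f"~{approx_tokens / elapsed:.1f} 'tokens'/s (whitespace-split estimate)")
```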
Same model, same loader, different architectures and resources. This is why a lot of the AI crowd are on Macs: their chip designs, especially the Neural Engine and GPUs, allow quite competent edge inference while sipping comparative thimbles of energy. It's why, if I were all-in on LLMs or leveraged them for work more often (which I intend to do, given how I'm currently selling my generalist expertise to potential employers), I'd be seriously eyeballing these little Mac Studios for their local inference capabilities.
Uh... I must be missing something here, because you're hyping up Apple's NPU only to show it getting absolutely obliterated by the equally old 3090? Your 10900K having 64GB of RAM is also irrelevant here...
You're missing the bigger picture by getting bogged down in technical details. To an end user, the difference between thirty seconds and ninety seconds is often irrelevant for things like AI, where they expect a delay while it "thinks". Taken in that context, you're now comparing a 14" laptop running off its battery to a desktop rig gulping down ~500W according to my UPS, for a roughly 60% reduction in runtime on a single query at the expense of 5x the power draw.
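To put rough numbers on that, here's the back-of-envelope arithmetic. The ~500W desktop figure is from my UPS; the laptop's wattage is not measured but inferred from the 5x power-draw claim (~100W), so treat both as approximations:

```python
# Back-of-envelope energy per query, using the timings above.
# 500W desktop draw is measured at the UPS; ~100W for the laptop is
# inferred from the "5x the power draw" claim, not measured.
desktop_watts, desktop_seconds = 500, 34
laptop_watts, laptop_seconds = 100, 83

desktop_joules = desktop_watts * desktop_seconds  # 17,000 J per query
laptop_joules = laptop_watts * laptop_seconds     #  8,300 J per query

print(f"desktop: {desktop_joules} J/query")
print(f"laptop:  {laptop_joules} J/query")
print(f"laptop uses ~{desktop_joules / laptop_joules:.1f}x less energy per query")
```

Even granting the desktop its speed win, the laptop finishes the same query on roughly half the energy.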
Sure, the desktop machine performs better, as would a datacenter server jam-packed full of Blackwell GPUs, but that's not what's exciting about Apple's implementation. It's the efficiency of it all, being able to handle modern models on comparatively "weaker" hardware most folks would dismiss outright. That's the point I was trying to make.
We're talking about the M3 Ultra here, which is also wall-powered and also expensive. Nobody is interested in dropping upwards of $10,000 on a Mac Studio to get "okay" performance just because an unrelated product is battery-powered. Similarly, saving a few bucks on electricity by tripling the much, much more expensive engineer time spent waiting on results is foolish.
Also Apple isn't unique in having an NPU in a laptop. Fucking everyone does at this point.
It almost feels like you're deliberately missing the forest for the trees, in order to fit some argument that I'm not quite able to suss out here.
The point is that, in terms of practical usage, the M3 Ultra is uniquely competent and highly affordable in a sea of enterprise technology that is decidedly not. I tried to demonstrate why I'm excited about it by pointing out the similar performance of a battery-powered, four-year-old laptop and a quite gargantuan gaming PC pulling over 500W from the wall, as an example of what several years of additional refinements and improvements to the architecture could be expected to bring.
The point is that it's affordable, more flexible in deployment, and more efficient than similarly-specced datacenter servers specifically designed for inference. For the cost of a single decked-out Dell or HP rackmount server, I can have five of these Mac Studios with M3 Ultra chips - and without the need for substantial cooling, noise isolation, or other datacenter necessities. If the marketing copy is even in the same ballpark as actual performance, that's easily enough inference to serve an office of fifty to a hundred people or more, depending on latency tolerances; if you don't mind "queuing" work (like CurrentCo does with their internal Agents), one of those is likely enough for a hundred users.
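To make the "queuing" idea concrete, here's a toy sketch of that pattern. Everything in it is hypothetical: the hostnames, the five-node pool, and the model identifier are made up, and it assumes each Studio exposes an OpenAI-compatible server (e.g. via LM Studio or llama.cpp's server):

```python
import asyncio

import httpx

# Hypothetical pool of Mac Studios, each running an OpenAI-compatible
# inference server on port 1234. Hostnames are invented for illustration.
BACKENDS = [f"http://studio-{i}.office.lan:1234/v1/chat/completions" for i in range(5)]

queue: asyncio.Queue[tuple[str, asyncio.Future]] = asyncio.Queue()

async def worker(backend_url: str) -> None:
    """Each Studio drains the shared queue one request at a time."""
    async with httpx.AsyncClient(timeout=600) as client:
        while True:
            prompt, future = await queue.get()
            try:
                resp = await client.post(backend_url, json={
                    "model": "local-model",  # placeholder identifier
                    "messages": [{"role": "user", "content": prompt}],
                })
                resp.raise_for_status()
                future.set_result(resp.json()["choices"][0]["message"]["content"])
            except Exception as exc:
                future.set_exception(exc)
            finally:
                queue.task_done()

async def submit(prompt: str) -> str:
    """Callers just enqueue work; whichever Studio frees up first takes it."""
    future: asyncio.Future = asyncio.get_running_loop().create_future()
    await queue.put((prompt, future))
    return await future

async def main() -> None:
    workers = [asyncio.create_task(worker(url)) for url in BACKENDS]
    answers = await asyncio.gather(*(submit(f"Question {n}") for n in range(20)))
    print(f"served {len(answers)} queued requests across {len(BACKENDS)} Studios")
    for w in workers:
        w.cancel()

asyncio.run(main())
```

Per-query latency stays whatever a single Studio delivers; the pool only buys throughput, which is exactly the trade-off of queuing work instead of chasing raw speed.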
That's the excitement. That's the point. It's not the fastest, it's not the cheapest, it's just the most balanced.
Apple defenders have some special sauce reasoning that makes no sense to anyone but them.
Are you a boomer?
I have Apple hardware, but it sucks for anything AI; buying it for that purpose is just extremely dumb, just like buying Macs for engineering CAD work or things of the sort.
If you are buying Macs and it's not for media-production-related reasons, you are doing something wrong.
> Apple defenders have some special sauce reasoning that makes no sense to anyone but them. Are you a boomer?
I continue to be in awe of the lengths some people will go just to fling insults and shake out some salt. We're, what, ten layers deep? With all the context above, the best you have to contribute to the discussion are baseless accusations and ageist insults?
Your finite time would have been better spent on literally anything else than actively seeking out a comment just to throw subjective, unsubstantiated shade around. C'mon, be better.
Make no mistake, it's not an insult. I'm saying that precisely because I have been there.
Apple is the master at creating desire and building a narrative in their customers' minds about the many things their devices would allow them to do. It's very aspirational, and in practice most Macs get used for things that could have been done with a much cheaper option.
It may not be obvious to you, but it's somewhat funny seeing you rationalise all kinds of dreams about what this machine could potentially be, when in practice the people who actually work on the kind of stuff you're talking about don't even consider them viable, for many good reasons.
It's not that those machines cannot potentially do it, it's just that they don't really fit the goal very well.
A lot like people buying a Cybertruck to "haul" stuff when there are a lot of options that are just plain better and make a lot more economic/practical sense.
It's OK to desire the thing and be excited about it, but it really doesn't serve anyone to rationalise it so hard; you are lying to yourself as much as to everyone else, and it's not healthy.
If that was not clear: people working on AI professionally really don't have to make do with a Mac Studio; they have access to better stuff. If you want to get one personally to experiment/toy around, that's OK, but it's not going to be this amazing thing for AI.
Had the M3 GPU been much wider, it would be constrained by memory bandwidth. It might still hold an advantage over Nvidia competitors in that it has 512GB directly accessible and needs to push less data across socket boundaries.
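A quick roofline estimate shows why width alone wouldn't help. This is a sketch assuming the commonly cited ~819GB/s memory bandwidth for the M3 Ultra, and the rule of thumb that decode speed is capped by bandwidth divided by the bytes of weights read per token:

```python
# Rough decode-speed ceiling: every generated token streams the active
# weights through memory once, so tokens/s <= bandwidth / bytes per token.
bandwidth_gb_s = 819  # commonly cited M3 Ultra figure; treat as approximate

for name, gb_per_token in [
    ("8B @ Q8_0 (~8GB)", 8),
    ("70B @ Q8_0 (~70GB)", 70),
    ("671B MoE, ~37B active @ Q4 (~19GB)", 19),
]:
    ceiling = bandwidth_gb_s / gb_per_token
    print(f"{name}: <= ~{ceiling:.0f} tokens/s")
```

The ceiling is set by bandwidth, not ALU count, which is why a much wider GPU fed by the same memory system wouldn't decode meaningfully faster.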
From my outsider perspective, it's pretty straightforward why they don't.
In Intel's case, there's ample coverage of the company's lack of direction and complacency on existing hardware, even as their competitors ate away at their moat, year after year. AMD with their EPYC chips taking datacenter share, Apple moving to in-house silicon for their entire product line, Qualcomm and Microsoft partnering with ongoing exploration of ARM solutions. A lack of competency in leadership over that time period has annihilated their lead in an industry they used to single-handedly dictate, and it's unlikely they'll recover that anytime soon. So in a sense, Intel cannot make a similar product, in a timely manner, that competes in this segment.
As for AMD, it's a bit more complicated. They're seeing pleasant success in their CPU lineup and have all but thrown in the towel on higher-end GPUs. The industry has broadly rallied around CUDA instead of OpenCL or other alternatives, especially in the datacenter, and AMD realizes it's a fool's errand to try and compete directly there when it's a monopoly in practice. Instead of squandering capital competing head-on, they can keep building their own moat in the areas they specialize in - mid-range GPUs for work and gaming, CPUs targeting consumers and datacenters, and APUs finding their way into game consoles, handhelds, and other consumer or edge compute devices.
And that's just getting into the specifics of those two companies. The reality is that any vendor who hasn't already unveiled their own chips or accelerators is coming in at what's perceived to be the "top" of the bubble or market. They'd lack the capital or moat to really build themselves up as a proper competitor, and are more likely to just be acquired in the current regulatory environment (or lack thereof) for a quick payout to shareholders. There's a reason why the persistent rumor of Qualcomm purchasing part or all of Intel just won't die: the x86 market is rather stagnant, churning out mediocre improvements YoY at growing price points, while ARM and RISC-V chips continue to innovate on modern manufacturing processes and chip designs. The growth is not in x86, but a juggernaut like Qualcomm would be an ideal buyer for a "dying" or "completed" business like Intel's, where the only thing left to do is constantly iterate for diminishing returns.
The bargain is the lower price in the UK compared to the US, once US sales tax is added. It's not like the pound is strong; it's just cheaper in the UK. And you're right, all Apple products are better value in the UK. I'm not used to any electronics being good value in the UK.