I find the levels bizarre. Chromebooks are highly exposed to physical attack. Keys in the cloud are not nearly as exposed. Yet people seem okay with level 1 for Chromebooks but apparently want level 3 in the cloud?
I’d rather see a level 1 or level 2 auditable cloud solution, with at least source available.
This is so weird. The idea of an adversary covertly walking off with an IBM Mainframe or covertly bringing an electronics lab, a microscope, logic analyzers, glitching hardware, etc to the aforementioned mainframe is rather strange. Whereas someone doing that to a phone or a laptop or a game console is very likely.
If I wanted to store an important long term key in a secure facility, I would worry, first and foremost, about software attacks, attacks doable over a network, malicious firmware attacks, and maybe passively observed side channel attacks. Physical attacks would be a rather distant second.
The adversary will show up and badge in just like everyone else. They might have worked there for 20 years, or they might be an outside repair person or external consultant.
They will definitely fit in. They're supposed to be there.
It will be the most normal thing in the world. And you may never know their real purpose.
Sure. But the attacker needs to actually get in, which is considerably harder than getting into a hotel room. More relevantly, the kinds of countermeasures that get you from level 1 to a higher level don’t seem likely to help at all: if someone evil-maids or otherwise fully compromises a machine hosting a FIPS 140-2 level 4 HSM, they likely get the unrestricted ability to perform cryptographic operations using keys protected by that HSM, but they get this by using the HSM’s normal API. If they can convince the HSM to export its keys to another HSM (oops) or to otherwise leak the key material, they get the key material. But this doesn’t seem like it has much to do with physical attacks against the HSM.
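To make the "normal API" point concrete, here is a rough toy sketch. ToyHSM is a hypothetical stand-in I made up, not any vendor's interface, and HMAC is used only because it ships with Python. The point is that code running on a compromised host can get valid tags over arbitrary messages without ever seeing the key, and no amount of tamper resistance in the device changes that.

```python
import hashlib
import hmac
import secrets


class ToyHSM:
    """Hypothetical stand-in for an HSM: the API never returns the key.

    Python cannot truly hide the attribute; this is only a model of the
    trust boundary, not a security mechanism.
    """

    def __init__(self) -> None:
        self.__key = secrets.token_bytes(32)  # generated "inside" the device

    def sign(self, message: bytes) -> bytes:
        # The only way callers can use the key: ask the device to use it.
        return hmac.new(self.__key, message, hashlib.sha256).digest()

    def verify(self, message: bytes, tag: bytes) -> bool:
        return hmac.compare_digest(self.sign(message), tag)


hsm = ToyHSM()

# An attacker who controls the host simply calls the normal API.
attacker_message = b"transfer everything to the attacker"
attacker_tag = hsm.sign(attacker_message)

# The tag is indistinguishable from one requested by a legitimate caller.
tag_is_valid = hsm.verify(attacker_message, attacker_tag)
```

The attacker never touches the key material here, which is exactly why physical hardening of the HSM does not address this scenario.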
Now if someone evil-maid attacks the HSM itself, that’s a different story. Any good HSM should resist this, especially one found in a portable device. And this is because you can steal an entire important corporate laptop or other portable device without necessarily raising a quick alarm, whereas I have trouble imagining someone walking off with the HSM out of an IBM mainframe or with an AWS HSM without the loss being noticed immediately.
(To be fair, in the mainframe case, some crusty corporations seem to have a remarkable ability to fail to notice obvious crypto problems like their public facing certificates expiring. But a loss of an entire HSM from a secure large cloud datacenter will, at the very least, immediately trigger “elevated failure rates” or whatever they like to call it…)
Wiping for no reason: that could well be a difference between the firmware's view of the world and your view, and I guess they just decided to err on the side of caution?
And low power alarms may well be a variation on that theme. Glitching the power supply has been a tool in the arsenal of reverse engineers for a long time, so that sort of sensitivity may well make sense. Voltage spikes and drops can be very short, short enough that you won't see them on a DVM, but on a memory scope with a trigger value set much lower than you might expect, they'd show up with alarming regularity in some hardware that I've worked on. And that explained some pretty weird instability issues. Good power is rare enough that really sensitive hardware usually has power conditioning circuitry right up close to the consumer.
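A rough sketch of why the DVM misses what the scope catches, with entirely made-up numbers (a 3.3 V rail, a 3-sample dip to 2.0 V, a 3.0 V trigger; none of these come from real hardware). An averaged reading barely moves, while a simple falling-edge threshold trigger flags the dip. The function names are just illustrative.

```python
def mean_reading(samples):
    """Roughly what a slow averaging meter would display."""
    return sum(samples) / len(samples)


def find_dips(samples, threshold):
    """Return the index where each excursion below `threshold` begins."""
    starts = []
    below = False
    for i, volts in enumerate(samples):
        if volts < threshold and not below:
            starts.append(i)  # falling edge: trigger fires here
        below = volts < threshold
    return starts


# Hypothetical capture: 10,000 samples of a nominal 3.3 V rail
# with one brief 3-sample dip to 2.0 V in the middle.
samples = [3.3] * 10_000
for i in range(5_000, 5_003):
    samples[i] = 2.0

average = mean_reading(samples)          # stays within a millivolt of 3.3
dips = find_dips(samples, threshold=3.0)  # one trigger, at the start of the dip
```

With these assumed numbers the average shifts by well under a millivolt, so nothing looks wrong on a meter, yet the trigger fires once at sample 5000.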
> Wiping for no reason: that could well be a difference between the firmware's view of the world and your view, and I guess they just decided to err on the side of caution?
No. I said I've been in touch with technical support, and the manuals, docs, and their support are clear: it should not be wiping. It has a backup battery too.
We've spent hours and hours testing, to validate the issue, and cause.
They likely have a firmware bug, or bad board design. And we've seen this from cards from different batches, bought years apart.
Their support is incompetent, and I say that with 30+ years of dealing with, and providing, tech support. They fail to read tickets, and even spend (supposedly) weeks running tests while ignoring vital data provided in tickets and conveyed in support calls.
They. Are. Incompetent.
In terms of "issues with power", no. Not over dozens of servers, in different datacentres, and even just with the card at rest, out of server, on battery.
Understand, their job is to provide a stable product. HSM cards are useless if they randomly wipe while in use and under power, "just cause".
I find it weird that you're playing devil's advocate here, describing how hard this is. This is an enterprise-grade card, and people have been making reliable, safe HSMs for decades.
Hehe, ok! Clear case of faulty product then. Thanks for the extra context.
I'm not playing devil's advocate so much as pointing out that making such devices is hard, and that 'user error' and 'incompetent staff/faulty product' can be hard to tell apart in a comment.
Also, the Cavium one was the fastest one on the market the last time I looked at this. Thales, SafeNet and IBM also had them.