The reason volatile accesses aren't widened to 64 bits is that in embedded and systems programming, volatile is often used to access memory-mapped I/O registers - so any sign extension or truncation might put a wrong value on the bus. GCC also has the register keyword as a hint that the variable should be sized to fit in a register (IIRC, there was even a way to ask the compiler to use a specific register for a variable, for when inline assembly is involved).
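A minimal sketch of the property in question (the register name is hypothetical; in real MMIO code the pointer would be a fixed physical address rather than a plain variable):

```c
#include <stdint.h>

/* Hypothetical status register. volatile tells the compiler that every
 * access must happen exactly as written, at exactly the declared width:
 * it may not widen a 16-bit load to 32 bits, cache the value, or merge
 * adjacent accesses. In real MMIO code this would be a fixed address,
 * e.g. (volatile uint16_t *)0x40001000. */
volatile uint16_t *status_reg;

uint16_t read_status(void) {
    return *status_reg;  /* exactly one 16-bit load per call */
}
```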
I'm new to this area of work, but has anyone stepped back and thought: "this is not the right way to build eBPF programs"? Would it be better to create a new high-level language and toolset? All this voodoo hackery to try to trick a C compiler into making eBPF-compatible code feels unsustainable.
I believe he is saying that eBPF will disallow certain code from running if it doesn't meet the standards listed by the parent comment. The article also mentions that eBPF will check whether code is "worthy" of running, or rearrange it so that it is.
The idea is to compile to assembly which both fixes the bug, and is runnable by eBPF.
Apologies. Indeed, not all generated clang eBPF is loadable eBPF. So we can sum up:
- Clang can compile C
- eBPF backend can generate the eBPF bytecode for a subset of IR (AFAIU not all IR; not all "C" is okay for eBPF)
- verifier can accept or reject the generated eBPF.
The goal is to write C in such a way that it is generally sensible C but also consistently yields loadable eBPF programs. This is not easy! Just recently I was debugging code where the clang optimizer decided to merge two loads of 2 bytes into one load of 4 bytes. This made the verifier unhappy - the specific loads are rewritten to helper calls by the verifier. The verifier knows what to do with a 2-byte load at offset X and a 2-byte load at offset X+2. But it has no idea what to do with a 4-byte load at offset X!
Writing C programs that will compile to valid, loadable eBPF requires a fair amount of voodoo.
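For illustration, a sketch of the kind of code that can trigger the load-merging problem above (the struct and field names are made up); a volatile-qualified pointer is one common trick to keep the two loads separate:

```c
#include <stdint.h>

/* Sketch of the load-merging issue: clang may legally fuse the two
 * adjacent 2-byte loads below into a single 4-byte load at offset 0,
 * which the verifier's context-access rewriting cannot handle.
 * Struct and field names are hypothetical. */
struct ctx {
    uint16_t lo;  /* offset 0 */
    uint16_t hi;  /* offset 2 */
};

/* Workaround: a volatile-qualified pointer forces each load to be
 * emitted separately, at its declared 2-byte width. */
uint32_t read_pair(volatile struct ctx *c) {
    uint16_t lo = c->lo;  /* 2-byte load at offset 0 */
    uint16_t hi = c->hi;  /* 2-byte load at offset 2 */
    return ((uint32_t)hi << 16) | lo;
}
```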
You end up with more eBPF instructions if you don't use the frontend optimizer, but the code doesn't need any hacks. Perhaps it can be tweaked further by applying frontend optimizations selectively?
So thanks for the suggestion. It's another working solution.
The verifier rewrites pointer arithmetic to be safe. There's a bug where it misclassifies 64-bit integer arithmetic as pointer arithmetic. AIUI the bug doesn't affect 32-bit arithmetic (i.e. it should still detect and rewrite 32-bit pointer arithmetic, so the Spectre mitigation isn't being bypassed).
Ha! Thanks for asking. We have a number of tests in place to test this eBPF. We are proud of them since, well, testing eBPF is hard.
Sadly, the tests aren't worth much here - the bug was in the _kernel_, not the eBPF code! Once again - we'd need to run our tests on the affected kernels to see the problem. We're not there yet; our automated testing infra runs on a different set of kernels than production does. Oops on our side.
But the fact is - testing new eBPF features is hard. You need to run rich eBPF bytecode on a modern kernel. I think the proper approach is to run User-mode Linux and run the tests _inside_ of it. Not sure how hard it would be to pull off, though.
I am pretty sure that if you rewrite code, you should test the rewritten version for equivalence. GCC has about 300K tests for a reason. I would not say the kernel is less important.
Also, I'm not talking about theoretical proof of equivalence, I'm talking about simple test like
assert func(params) == rewrite(func)(params)
for some reasonable set of func and params. Totally enough to catch problems like this. I'd say [unit-]testing that the subtraction operator works is not that hard.
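A sketch of that idea in plain C (both functions are stand-ins; in the real setting the "rewritten" side would be the verifier's output, executed in an eBPF VM):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the original 64-bit operation. */
uint64_t orig_sub(uint64_t a, uint64_t b) { return a - b; }

/* Stand-in for the rewritten version; in reality this would be the
 * verifier-rewritten program, run under an eBPF interpreter. */
uint64_t rewritten_sub(uint64_t a, uint64_t b) { return a - b; }

/* Differential test: assert equivalence over a small set of
 * edge-case inputs (zero, small values, boundary values). */
void check_equivalence(void) {
    uint64_t cases[] = {0, 1, 2, UINT64_MAX, UINT64_MAX - 1,
                        0x8000000000000000ull, 0x7fffffffffffffffull};
    int n = sizeof cases / sizeof cases[0];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(orig_sub(cases[i], cases[j]) ==
                   rewritten_sub(cases[i], cases[j]));
}
```

Using unsigned arithmetic keeps the boundary cases well-defined (wrap-around instead of signed overflow UB), which matters when the test set includes extreme values.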
That argument applies to pretty much every single unit test ever written. A function taking a single long has 2^64 possible input values - impossible to test, by your logic. Yet such functions are tested without issues constantly.
What you do is put together a long list of sample functions and sample arguments that covers the expected edge cases and then test those for equivalence. Hardly impossible. Just takes time. Not bulletproof but better than nothing.
We're talking about eBPF programs here. Those halt by design, as you cannot loop (jump backwards) and there is a limit on the number of instructions per program. So the halting problem isn't really an issue in this particular case.
You can just run the original and rewritten code in an eBPF VM with mock inputs and compare.