eBPF Can't Count?

zeusk · on May 4, 2019

The reason why volatile variables aren't extended to 64 bit is because in embedded and systems programming, it is often used to access memory mapped IO registers - so any sign extension or truncation might lead to a wrong value being sent over the bus. GCC also has a keyword (register) to make sure that the variable is sized exactly right and fits in a register (iirc, there was a way to even ask the compiler for a specific register to be used for the variable in case of inline assembly being used).

jkbs · on May 6, 2019

The MMIO restrictions make sense. Thanks for the explanation!

justinclift · on May 4, 2019

There's a part which seems strange, where clang is used with "-O2" to generate code:

  $ clang -O2 -target bpf -Xclang -target-feature -Xclang +alu32 -c sub64.c -o - | llvm-objdump -S -

> Apparently the compiler decided it was better to operate on 64-bit registers and discard the upper 32 bits.

The workaround was to use the `volatile` keyword.

The problem kind of sounds like one of the LLVM optimisation passes made the change.

http://releases.llvm.org/8.0.0/docs/Passes.html#transform-pa...

Wonder if disabling optimisations ("-O0") would have also worked?

praseodym · on May 4, 2019

Looks like it:

  $ clang -O2 -target bpf -Xclang -target-feature -Xclang +alu32 -c sub32.c
  $ llvm-objdump -S -no-show-raw-insn sub32.o
  
  sub32.o: file format ELF64-BPF
  
  Disassembly of section .text:
  0000000000000000 sub32:
         0: w0 = w1
         1: w0 -= w2
         2: exit
  
  $ clang -O0 -target bpf -Xclang -target-feature -Xclang +alu32 -c sub32.c
  $ llvm-objdump -S -no-show-raw-insn sub32.o
  
  sub32.o: file format ELF64-BPF
  
  Disassembly of section .text:
  0000000000000000 sub32:
         0: w3 = w2
         1: w4 = w1
         2: *(u32 *)(r10 - 4) = r1
         3: *(u32 *)(r10 - 8) = r2
         4: r1 = *(u32 *)(r10 - 4)
         5: r2 = *(u32 *)(r10 - 8)
         6: w1 -= w2
         7: w0 = w1
         8: *(u32 *)(r10 - 12) = r3
         9: *(u32 *)(r10 - 16) = r4
        10: exit

majke · on May 4, 2019

Compiling to eBPF is hard! The compiler must: - avoid loops (backwards jumps) - unroll everything - inline everything - no function calls

Basically, it's impossible to use clang to generate eBPF without -O2. Sorry.

_ivvf · on May 4, 2019

I'm new to this area of work, but has anyone stepped back and thought: "this is not the right way to build eBPF programs"? Would it be better to create a new high-level language and toolset? All this voodoo hackery to try to trick a C compiler into making eBPF-compatible code feels unsustainable.

asgeir · on May 4, 2019

bcc https://github.com/iovisor/bcc if you need low-level abstractions or bpftrace https://github.com/iovisor/bpftrace if you need something simpler (similar to dtrace)

justinclift · on May 4, 2019

>Basically, it's impossible to use clang to generate eBPF without -O2. Sorry.

That's a bit confusing, as the comment by praseodym shows the output from doing just that.

???

Dangeranger · on May 4, 2019

I believe he is saying that eBPF will disallow certain code to run if it doesn’t meet the standards listed by the parent comment. The article also mentions that eBPF will check whether code is “worthy” or running or rearrange it so that it is worthy.

The idea is to compile to assembly which both fixes the bug, and is runnable by eBPF.

majke · on May 4, 2019

Apologies. Indeed, not all generated clang eBPF is loadable eBPF. So we can sum up:

- Clang can compile C

- eBPF backend can generate the eBPF bytecode for subset of IR (AFAIU not all IR, not all "C" is okay for eBPF)

- verifier can accept or reject the generated eBPF.

The goal is to write C in such a way that will be generally sensible C but that will consistently yield loadable eBPF programs. This is not easy! Just recently I was debugging code where clang optmizer decided to merge two loads of 2 bytes into one load of 4 bytes. This made the verifier unhappy - the specific loads are rewritten to helper calls by verifier. The verifier knows what to do on 2-byte load at offset X, and 2-byte load at offset X+2. But it has no idea what to do with 4-byte load at offset X!

Writing C programs that will compile to valid, loadable eBPF requires a fair amount of voodoo.

justinclift · on May 4, 2019

Thanks, that makes sense. :)

Kind of sounds like having a specific eBPF transformation pass available in LLVM would be useful.

eg having it generate known-to-work-with-eBPF output

lukego · on May 4, 2019

Interestingly, you are basically describing the way that LuaJIT compiles Lua code :)

jkbs · on May 5, 2019

Cool idea. I played around with it.

As majke mentioned we need optimizations or the backend compiler will complain:

  $ clang -O0 -target bpf filter_alu32.c -emit-llvm -S -c -o filter_alu32.ll
  $ llc -march=bpf -mattr=+alu32 -filetype=obj filter_alu32.ll
  LLVM ERROR: Unsupported relocation: try to compile with -O2 or above, or check your static variable usage

The code has to be non-trivial to hit the error. I think its the use of eBPF maps that triggers it.

What if we turned off the optimizations in the frontend and ran just the backend optimizer, though? I had some success with this approach:

  $ clang -O0 -target bpf filter_alu32.c -emit-llvm -S -c -o filter_alu32.ll
  $ opt -O2 filter_alu32.ll | llc -march=bpf -mattr=+alu32 -filetype=obj -o filter_alu32.o
  $ llvm-objdump -S filter_alu32.o | grep -e -=
        45:       1c 21 00 00 00 00 00 00         w1 -= w2
        53:       1c 21 00 00 00 00 00 00         w1 -= w2
        65:       1c 21 00 00 00 00 00 00         w1 -= w2

You end up with more eBPF instructions if you don't use the frontend optimizer but the code doesn't need any hacks. Perhaps it can be tweaked further by applying frontend optimizations selectively?

So thanks for the suggestion. It's another working solution.

You can find the code for this experiment here: https://github.com/jsitnicki/cloudflare-blog/commit/acf28b69...

setheron · on May 4, 2019

Great post. Detailed enough to keep it informative but light enough to read along.

I've been doing some eBPF work at the moment but the CloudFlare guys are at another level.

I'd love to take it up a notch like them. eBPF is very exciting and I'm hoping to remain in the space.

yoodenvranx · on May 4, 2019

I love stories like this, thx for posting!

antpls · on May 4, 2019

If you made it works, does it mean the Spectre mitigation is basically useless, as it can be bypassed ?

andreareina · on May 4, 2019

The verifier rewrites pointer arithmetic to be safe. There's a bug where it misclassifies 64-bit integer arithmetic to be pointer arithmetic, AIUI the bug doesn't affect 32-bit arithmetic (i.e. it should still detect and rewrite 32-bit pointer arithmetic, so the Spectre mitigation isn't being bypassed)

adontz · on May 4, 2019

Looks like there are no unit tests in place? Sad.

majke · on May 4, 2019

Ha! Thanks for asking. We have a number of tests in place to test this eBPF. We are proud of them since, well testing eBPF is hard.

Sadly, the tests aren't really worth much here - the bug was in the _kernel_ not the eBPF code! Once again - we'd need to run our tests on the affected kernels to see the problem. We're not there yet, our automatic testing infra runs on another set of kernels than production. Oops on our side.

But the fact is - testing new eBPF features is hard. You need to run a rich eBPF bytecode, on a modern kernel. I think the proper approach is to run usermode linux, and run tests _inside_ of it. Not sure how hard would that be to pull it off though.

Approach similar to this project: https://source.android.com/devices/architecture/kernel/netwo...

jkbs · on May 5, 2019

If you have the Linux kernel eBPF verifier in mind, it actually has a quite comprehensive test suite:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

Tests for the corner case that has caused the problem have been added shortly after it was fixed:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

lixtra · on May 4, 2019

Depending on the numbers in the test case it wouldn’t have triggered. Depending on the user, it wouldn’t have triggered.

The other problem is that eBPF is a far way from KISS. But then that’s what’s needed for power and performance.

EDIT after RTFA: and security against spectre and friends.

adontz · on May 4, 2019

I am pretty sure that if you rewrite code, you should test rewritten version for equivalence. GCC has about 300K tests for a reason. I would not say kernel is less important.

lixtra · on May 4, 2019

Testing for equivalence? Don’t you need to solve the halting problem for that?

EDIT: I’m curious if a down voter can provide a piece of code that tests whether two compilates are equivalent.

The root cause of the bug in the article was introduced by a patch to automatically modify an arbitrary program to harden against spectre.

adontz · on May 4, 2019

Also, I'm not talking about theoretical proof of equivalence, I'm talking about simple test like

    assert func(params) == rewrite(func)(params)

for some reasonable set of func and params. Totally enough to catch problems like this. I'd say [unit-]testing that subtraction operator works is not that hard.

lixtra · on May 4, 2019

Except, that in this case params is a string containing an arbitrary program.

It would help the discussion if you read the TFA (edit: more carefully).

Rewrite() in this case modified the params program to harden it against Spectre.

marcinzm · on May 4, 2019

That argument applies to pretty much every single unit test ever written. A function running on a single long can take 2^64 possible values. Impossible to test by your logic. Yet they're tested without issues constantly.

What you do is put together a long list of sample functions and sample arguments that covers the expected edge cases and then test those for equivalence. Hardly impossible. Just takes time. Not bulletproof but better than nothing.

Quekid5 · on May 4, 2019

I think the poster is talking about something along the lines of bisimulation[1]. No need to solve the halting problem for that.

[1] https://en.wikipedia.org/wiki/Bisimulation

rndgermandude · on May 4, 2019

We're talking about eBPF programs here. Those halt by design as you cannot loop (jump backwards) and have a limited number of instructions you can use per program. So the halting problem isn't really and issue here in this particular case.

You can just run the original and rewritten code in an eBPF VM with mock inputs and compare.

adontz · on May 4, 2019

If we are talking about arithmetic operations, then I believe no.