LibOS – A library operating system for Linux (lwn.net)
160 points by conductor on March 24, 2015 | hide | past | favorite | 58 comments


Now this is userspace networking I can get behind!

Rather than re-implement a new network stack and solve all the same problems again, this builds on the years of good work already done and leverages the existing time investment within the existing stack.

"Librarising" existing stacks to userspace, great idea, this is the way of the future.


The NetBSD rump kernel (rumpkernel.org) does this for every driver in the kernel, and it is part of the OS, so the drivers stay maintained (there are lots of stacks that have been pulled out of kernels but then left unmaintained).

That's why it is important that this project becomes part of upstream, so it can be maintained as the stack changes.


Reminded me of that quote:

"An operating system is a collection of things that don't fit into a language. There shouldn't be one." -- Dan Ingalls


There are environments that do not have a strong distinction between OS and language. While adored by their users, for some reason they are never very popular. Oberon, Smalltalk and the old Lisp Machines come to mind. (Maybe Forth too, though most "Forth OSes" never tried to do multitasking or even file systems.) I suspect there are anti-network effects holding them back. They support one language and one language only. You can't easily bolt them into whatever OS you already use. None of your familiar tools run on them either. Adoption rates are poor with no way to dip your toe into the environment.

What would a "no OS" system that equally supports multiple languages look like?

edit: Multiple languages that have nothing in common. Think about the insane variety of languages you have available on *nix.


What would a "no OS" system that equally supports multiple languages look like?

Racket?


We are getting there with the rump kernel (rumpkernel.org). It now runs PHP, Lua, LuaJIT, C, C++. This is very much a work in progress and it is not particularly user friendly, but it is improving rapidly. I plan to look at Go soon, which has built-in syscall traps that need replacing with library calls. There are a whole bunch of environment assumptions, like the existence of dynamic linkers, build systems and so on, that need to be worked around.

OSv is another option, it makes life easier by supporting Linux binaries modified to be dynamic libraries, and runs several languages too, although it was initially mainly targeted at JVM languages.


Erlang on Xen[1], mayhap?

[1]: http://erlangonxen.org


Java comes to mind. It's almost an OS, and easily could be with some additional parts bolted into the JVM. Very popular.


Using a standalone Forth is more or less a rejection of the utility of an operating system. My years of using a stand alone Forth for day to day computing instilled in me a deep cynicism when it comes to the complexity of operating systems.


I've been reading the kragen mailing lists, where they bootstrap a lot of 'utilities' on bare metal with asm (using alt keycodes as an 'editor', funky). It's very inspiring and relaxing to see how few things you need to start using a computer. It flattens the whole abstraction stack into a thin core.


IIRC the Lisp Machines supported C in addition to the system's dialect of Lisp.


> IIRC the Lisp Machines supported C in addition to the system's dialect of Lisp.

IIRC the Lisp Machines (well, CADR at least) compiled code down to some pretty high-level machine code implemented in microcode. How many standard C idioms could actually be expressed in that kind of code?

I'm also not seeing how it would give a speed-up, which is the usual reason to code in C.

Was the C compiled to microcode?


No reason you can't compile C down to that. It's just that there will be less of a clear performance advantage for C as some of the semantics will be slightly awkward to map to that instruction set.

You might still get a speedup by writing C in some areas, but I think the real reason for supporting it is that you have a bunch of legacy software and drivers written in C, and it's easier to get a C compiler working on that platform than to port all that software.


The reason for C, Pascal etc. on the Lisp Machine was not speed. There were two reasons:

1) using the Genera development environment, where you could interactively/incrementally develop/debug C

2) using some C/Pascal software on a Lisp Machine. Examples were the MIT X11 server and TeX.


I think pjmlp showed some Lisp Machine bytecode and it was indeed tailored to Lisp, and impractical for C semantics.


https://en.wikipedia.org/wiki/Genera_%28operating_system%29#...

So that's several variants of Lisp, as well as Fortran, C, Pascal, Prolog and Ada. That's pretty much most of the languages in use in the 80s.


A CLR on metal?

I know all these names, unfortunately only from reading and not from experience. Some lispers are also keen on bypassing the OS interface and emitting binary code on the fly.


Microsoft has already put CIL on metal, using Bartok¹. They used that to build a whole OS, called Singularity².

¹http://en.wikipedia.org/wiki/Bartok_%28compiler%29

² http://en.wikipedia.org/wiki/Singularity_%28operating_system...


A native JVM system?



This is excellent.

Anyone who's ever faced the daunting task of bolting TCP on top of a custom datagram protocol can certainly confirm how useful a library like this is.


Why isn't that "just" either using lwIP, or using your host's TCP running against a TUN/TAP device and then encapsulating the data you read from /dev/net/tun into your custom datagram protocol?

(This is not meant to be dismissive, just inquisitive. I'm sure it can't be anywhere near as easy as what I said, I'd just like to learn why.)
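To make the TUN/TAP route concrete, here's a minimal sketch of what I had in mind, assuming Linux's /dev/net/tun interface. The 4-byte framing header and the MAGIC value are made up for illustration; a real custom protocol would define its own:

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001
IFF_NO_PI = 0x1000

MAGIC = 0xC0DE  # hypothetical magic number for the custom protocol

def open_tun(name: str = "tun0") -> int:
    """Open a TUN device (requires root / CAP_NET_ADMIN)."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    # struct ifreq: 16-byte name, 2-byte flags, padding to 40 bytes
    ifr = struct.pack("16sH22x", name.encode(), IFF_TUN | IFF_NO_PI)
    fcntl.ioctl(fd, TUNSETIFF, ifr)
    return fd

def encapsulate(ip_packet: bytes) -> bytes:
    """Wrap an IP packet read from the TUN fd in a custom datagram header."""
    return struct.pack("!HH", MAGIC, len(ip_packet)) + ip_packet

def decapsulate(datagram: bytes) -> bytes:
    """Unwrap a received datagram and hand the IP packet back to the TUN fd."""
    magic, length = struct.unpack("!HH", datagram[:4])
    assert magic == MAGIC
    return datagram[4 : 4 + length]
```

The main loop would then just shuttle packets: read from the TUN fd, encapsulate, send over your custom transport, and the reverse on receive. The kernel's own TCP does all the hard work.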


As someone who's currently facing that daunting task myself, may I ask what you ended up doing? Did you have to roll your own reliable stream-oriented protocol?


Stupid question: why does the Linux Kernel handle the networking in the first place? Why not implement everything in userspace from the start?


For a number of reasons:

- It used to be impossible to implement securely. Until very recently, there was no hardware support for virtualizing the network buffers, which would mean emulating the network hardware in userspace. This would be very slow.

- Because even today, many devices don't have the required hardware virtualization support. For example, many (most?) ARM devices. If you give direct DMA access, you might be allowing anyone to splatter whatever code they want across whatever memory they want.

- Without kernel arbitration of some sort, there's no way to do load balancing across services, throttle, or firewall effectively.

- The kernel is designed to provide a uniform interface for all programs to the hardware. Putting networking in userspace gets rid of this abstraction, and means that every program has to be aware of the network hardware; software that works with file descriptors directly can't use the same abstraction for files and network.

- Because there are no hardware limitations on the amount of multiplexing. I don't remember how much muxing various hardware supports, but hardware has limits for this sort of thing. If you want more than N processes using the virtual network, you might be SOL with a virtualized userspace network stack.
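The "uniform interface" point is the everything-is-a-file-descriptor model: the same read()/write() syscalls work on regular files, pipes, and sockets, which is exactly the abstraction an in-process network stack gives up. A small illustration (standard library only):

```python
import os
import socket
import tempfile

# Write and read a regular file through the generic fd interface...
tmp = tempfile.TemporaryFile()
os.write(tmp.fileno(), b"hello")
os.lseek(tmp.fileno(), 0, os.SEEK_SET)
file_data = os.read(tmp.fileno(), 5)

# ...and do exactly the same on a connected socket pair. Same syscalls,
# completely different kernel machinery behind the fd.
a, b = socket.socketpair()
os.write(a.fileno(), b"hello")
sock_data = os.read(b.fileno(), 5)

print(file_data, sock_data)  # b'hello' b'hello'
```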


These are valid objections to in-process access to the networking hardware. Not to running the networking stack in user space, in a privileged process.

For the latter, performance concerns would be a bigger issue.


Not a stupid question at all. It's kind of like asking why we had to program computers in assembler (which was great fun, BTW) and not in high-level languages: it wasn't feasible. Assembly coding did continue for quite some time in the interest of performance, but once the proverbial turn is taken, there is no looking back. That's where networking is now. What we still call (rigid) "protocols" should, and indeed can, become freestanding application code. See my paper "Tearing Down the Protocol Wall" at https://www.linkedin.com/in/yitzhakbg#background-publication...


I like the following interview.

"One could start arguing about it from the historical viewpoint where network packet creation was a holy operation."

https://fosdem.org/2015/interviews/2015-antti-kantee/


To multiplex multiple users and applications. That is less necessary in these days of virtual network devices: e.g. SR-IOV can make one PCI device appear as, say, 64 network devices, so every application can have its own.


It will be interesting to see how this evolves; hopefully it will make it easier to build a userland TCP/IP stack, and maybe even to make it easier to ship a new transport protocol (this might make it easier to push for HTTP over SCTP instead of TCP, which makes far more sense, really). Also this will be nice if it allows userland applications to utilize ICMP (traceroute/nmap would no longer be nerfed without root access). NUSE alone should be incredibly handy.


You still need root access to connect to an outside network interface from the userland stack, eg a raw socket...

Linux has unprivileged ICMP sockets (IPPROTO_ICMP), but apparently no one has ever used them; e.g. see [1]

[1] http://stackoverflow.com/questions/14018584/python-with-unpr...
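For reference, those unprivileged ICMP sockets are SOCK_DGRAM sockets with IPPROTO_ICMP, gated by the net.ipv4.ping_group_range sysctl (which defaults to empty on most distros, probably part of why nobody uses them). A sketch of building an echo request; the checksum is the standard RFC 1071 Internet checksum, and the send call at the bottom is left commented since it needs the sysctl configured:

```python
import struct

def icmp_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total >> 16) + (total & 0xFFFF)
    return (~total) & 0xFFFF

def build_echo_request(ident: int, seq: int, payload: bytes = b"ping") -> bytes:
    """ICMP echo request: type 8, code 0, checksum over header + payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload

# With net.ipv4.ping_group_range covering your gid, no root needed:
#   import socket
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP)
#   s.sendto(build_echo_request(0, 1), ("192.0.2.1", 0))
```

(I believe the kernel rewrites the identifier field on ping sockets to the socket's own "port", so the ident you pass is mostly cosmetic there.)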


Uh, not sure what's up with lkml.org -- I see some G+ +1 buttons but no actual text, in both Firefox and Chrome. Presumably the article is this:

https://lwn.net/Articles/637658/

in which case it sounds awesome!


Thanks, we updated the link.


My first thought was that this sounds like an idea taken from microkernels. Is this a correct analogy? Would it be sufficient to use libos and fuse to call the Linux kernel a microkernel? What would be missing?


For one, the microkernel would be missing ;)

More seriously though, you're absolutely right in that it's a step in the direction of an "optional" microkernel architecture. That's actually how rump kernels on NetBSD started: running the kernel file system driver as a library in userspace on top of a FUSE-like subsystem. It's pretty useful functionality, since it allows you to handle untrusted file system images safely in userspace, while not imposing performance penalty on the trusted images which can be handled by the same driver running in the kernel. Unlike with FUSE-specific drivers, you don't run into issues with unsymmetric driver support in userspace vs. kernel.


My first thought on hearing the name "libOS" was that someone had gotten Linux running on an exokernel, which is in some ways a variation on the microkernel theme.

Linux on a microkernel has been done before. The MkLinux project is one example, though it also doubled as a port to PowerPC. I haven't heard of any exokernel attempts, though.


How might this relate to rump kernels and unikernels?


The design is quite heavily influenced by the rump kernel design, although Linux has its own issues that make it a bit different. Obviously it's only the network stack.


The other closest thing is probably an exokernel, which is essentially a rump kernel with all of the stuff that's optionally in userspace removed from the kernel. MIT did a lot of experimentation there.


Or microkernels.


I have to credit this blog for much of the material I've absorbed over the years, it's been fantastic how much I've learned here.

I've always managed to learn something new from articles, all this to say that today is different... I really do not understand this library at all.

Could someone explain to me the different possible uses for this library? What would the uses of this be out of the box?


1. Calling LWN a "blog" is vaguely like calling Ars or even the NYT a "blog" because they have textual online content in article format.

2. This isn't even really LWN content, it's a mailing list post. You can click "Archive-link" to see it on Gmane.


You have a very peculiar comment history - https://news.ycombinator.com/threads?id=agashka


thanks for bringing that up! Adobe air, bitcoins, and signals. I still don't really get signals, but I'd like to understand this library lol


Why are you thanking me?


alright got it, thanks everyone, i'll stay quiet the next time


I'm aware of Freescale's USDPAA that runs network stack in user space to achieve wire-speed.


The author forgot to add a license for this project. What shall we assume?


It was submitted as a patch series to the Linux kernel mailing list, with proper "Signed-off-by" headers in the commit messages. So presumably it's usable under GPLv2.

EDIT: I was referring to the kernel support, but it looks like you might have been talking about the code on Github which uses the LibOS API. That does indeed seem to be missing a license.



I was talking about the repository linked from the original mailing list post: https://github.com/libos-nuse/linux-libos-tools


If you're in the US, lack of a license generally means standard copyright: http://choosealicense.com/no-license/


It's a patch to apply to the linux kernel. I guess you can assume GPLv2.


Does this mean we get to link the kernel as a shared library?


If you want that, there's always OSv:

https://github.com/cloudius-systems/osv


Lennart Poettering would do well to pay attention to people like this, continuing to honor the POSIX way.


Sorry but that comment is just BS.


why?



