
The simple "everything is a file" model of Plan 9 is what makes the namespaces API as clear and as general as it is. All the objects export the same file API. Every interaction with the OS objects is done through file open, create, read, write, etc. But it has its drawbacks.

First, it's not always easy to map every object operation into either an open read or write. Over time, we would have seen a lot of ugly interfaces resulting from this limitation.

Second, hardware progress, the web, etc., introduced a lot of heterogeneity and complexity. People could no longer keep up with simple general designs. And to squeeze out every bit of performance, everyone was doing things differently based on the hardware and the workloads. They used whatever made their software, drivers, and OS objects work as fast as they could.

And this is how we ended up with the extreme fragmentation and heterogeneity we have in Linux, which explains the complex and less general implementation of its namespaces and its other features.

Edit: fix typos



> First, it's not always easy to map every object operation into either an open read or write.

Heck, even for basic files on disk, and even more so for sockets, the traditional open/read/write/close is starting to feel not so great. There is a reason why io_uring is hailed as the second coming, and it solves just part of the problems; stuff like the fsync apocalypse comes to mind.

And ioctls are, imho, a completely disgusting hack.

Ultimately, IO is an intrinsically complex topic, and trying to paper over that complexity with simple interfaces is disingenuous and falls flat on edge cases.


io_uring is a more modern API to file access. Actually I think it would be great if everything was a file and the communication with the kernel was only with io_uring.


> it would be great if everything was a file

This is a bad design. Process is not a file because you cannot send signals to a file, or cannot debug a file. Network socket is not a file because you cannot get file's peer address. Shared memory is not a file. And so on.


Why would (pseudocode, nonexisting but possible example) write(SIGKILL, "/proc/12345/signals") not be possible? For the other direction, there is signalfd in Linux. And of course you can get a peer address from /dev/tcp (https://andreafortuna.org/2021/03/06/some-useful-tips-about-... ). Yes, /dev/tcp is not an OS primitive but a bash builtin, but there isn't really a reason you cannot do this in the OS. Shared memory can be a file, you just have to mmap() it. mmap is always left out of the open/close/read/write-enumeration of the traditional file API, but I think it is actually extremely useful and should be the fifth alongside those.
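To make the "shared memory can be a file, you just have to mmap() it" point concrete, here is a minimal Python sketch. It assumes a temporary file as a stand-in for a named shared-memory object, and `share_through_file` is a hypothetical helper name, not an existing API. The bytes are stored through the mapping and then read back with an ordinary read().

```python
import mmap
import tempfile


def share_through_file(payload: bytes) -> bytes:
    """Store payload through a memory mapping, then read it back with read().

    Hypothetical helper for illustration; payload must be non-empty because
    an empty file cannot be mapped.
    """
    with tempfile.TemporaryFile() as f:
        f.truncate(len(payload))              # size the backing file first
        with mmap.mmap(f.fileno(), len(payload)) as view:
            view[:] = payload                 # plain memory store, no write() call
            view.flush()                      # push the dirty pages to the file
        f.seek(0)
        return f.read()                       # ordinary file read sees the same bytes
```

Two processes mapping the same file would see each other's stores in the same way; the file API is only used to name and size the region.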

Debugging is hard to imagine, yes. One could look at /proc/12345/mem, write breakpoints in there, but I'm not sure about how to do the more exotic things.


This means layering additional protocols on top of the file APIs. Of course you can do that, but eventually there will be a similar explosion such as on top of `ioctl`, and it's doubtful whether the resulting interfaces will be any easier to use than the existing ones.


On the other hand, layering tons of stuff on unsuitable interfaces has been all the rage in the last 25 years. Just think of everything-over-http, stuff-it-in-xml/json, program-it-in-yaml and similar industry trends. ;)


I can’t wait till we’re configuring our OS using CSS-in-JS and booting using webpack.

npm install -g styled-linux


As always, Fabrice Bellard is ahead of his time: https://bellard.org/jslinux/ :)


Plan 9 does this. The resulting interfaces are indeed easier to use, forward over the network, and redirect or emulate for testing.


Exactly? The alternative is ioctl whereas Plan 9 does everything over serialized streams to files.

The serialized streams make it easier to think of these ioctl-alikes as something you can easily access over a network (9p).

And that’s how resource sharing is done in plan 9.
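A rough Python sketch of what "serialized streams to files" looks like from the serving side. The message format is only loosely modeled on Plan 9 style ctl messages, a pipe stands in for the ctl file, and `parse_ctl`, `serve_ctl` and `ctl_demo` are hypothetical names.

```python
import os


def parse_ctl(line: str) -> tuple[str, list[str]]:
    """Split one text control message into a verb and its arguments."""
    verb, *args = line.split()
    return verb, args


def serve_ctl(read_fd: int) -> list[tuple[str, list[str]]]:
    """Read newline-separated control messages from a descriptor until EOF."""
    with os.fdopen(read_fd, "r") as ctl:
        return [parse_ctl(line) for line in ctl if line.strip()]


def ctl_demo() -> list[tuple[str, list[str]]]:
    """A pipe stands in for the ctl file (an assumption of this sketch)."""
    r, w = os.pipe()
    # The client needs nothing beyond write(); echo could do the same from a shell.
    os.write(w, b"connect 192.0.2.1!80\nhangup\n")
    os.close(w)
    return serve_ctl(r)
```

Because the whole conversation is bytes written to something that looks like a file, forwarding it over a network protocol such as 9P needs no per-operation special cases.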


Let me introduce you to the Linux Audio Stack. Layering protocols on top of each other, when everything is just files, is no different from layering protocols on top of each other when some things are files and some things aren't.

The question isn't "will it be just as bad" but "would some things become easier, and if so, how many things would become easier?".

Because if the latter, then it's a worthwhile topic to think about even if we never use it in any commercial sense of the word.


The main problem with Linux audio is the bloom of APIs that appeared over the years. While a file-based interface would be cool, it would be just yet another API that competes with all the others for its place in the ecosystem. Fortunately, ALSA and PulseAudio seem to dominate right now.


mmap() is very different in its semantics from read() and write(). The latter have a clear, precise definition: a byte stream received by your file driver. mmap() is a different beast: it can be a ring buffer, it can be a spinlock, it can be anything. The moment you have mmap(), the "everything is a file" model is already broken and the namespace API can no longer work reliably over the network. How would you transport an mmap() over the network? If /dev/tcp in Plan 9 uses mmap to asynchronously write the packets, you can no longer rely on the 9P filesystem protocol to implement a proxy by mounting the remote computer's /dev/tcp, because there is no clear and reliable way to map remote memory. It's no longer about transporting a random sequence of bytes to another computer. You need more than that to make it work. It's just one example where this "everything is a file" model is overly simple for the harsh real world.


> write(SIGKILL, "/proc/12345/signals")

Outside of lots of non-obvious problems with synchronization, this is your example that best fits the idea. This interface is probably a good one.

> And of course you can get a peer address from /dev/tcp

You will have lots and lots of problems with access controls if this is your only interface.

> Shared memory can be a file

Coercing random access memory into a serial file just to go and emulate a random access over that file is... not a great way to deal with a high-performance primitive.


> Coercing random access memory into a serial file just to go and emulate a random access over that file is... not a great way to deal with a high-performance primitive.

The file doesn't need to have an on-disk representation. I don't see why an mmap-ed file should behave any different from a SHM segment. They are basically the same thing, the SHM segment even has a file descriptor. It just doesn't have a name somewhere in the filesystem hierarchy. https://man7.org/linux/man-pages/man7/shm_overview.7.html


To reverse that question, what do you gain by defining a `read` and `write` API over that memory segment?

Because if you just add some high-level interface without any concern for performance, yeah, you get what Linux does today. But if you make them a core concern of your shared memory interface, you will certainly lose performance on the cases it's mapped as memory. And "everything is a file, but this one here is actually all about random access" doesn't give you much abstraction.

As somebody already said in the comments, the nice (maybe IMO, I'm not sure) thing about Plan 9 is that every resource is named somewhere in a tree. The fact that those things are "files" only detracts from the value and makes the system less fit for modern usage.


Yeah, "everything has a filename" and "everything is a file" are very similar concepts but they aren't quite the same, and it might be that most of the value comes from the former.


Having a cursor to access files is mostly legacy, no? The difference between memory and storage is getting very small, e.g. SSDs and Optane.


Not exactly legacy, as SSDs are not completely random access, and disks still exist. But yes, the stream abstraction is losing relevance for files.

But well, if the proposal is to unify everything, you will have to unstream network connections too.


It can have a name in the file system: /dev/shm


> This interface is probably a good one.

This interface is a bad one because if you want to filter syscalls then it will be difficult to distinguish write to a file from sending a signal.


You will filter syscalls by filename anyway.

Try `strace|grep open` on any program, and you will be spammed with a number of shared libraries. You need to filter them out anyway.


For one, encoding and decoding text is slower than binary calls to a function with solid parameters that don't need to be converted.

It's far too easy to pretend reality is not complex and that an "elegant" solution will somehow fit everything.


Netlink sockets in Linux input and output packed C structures. Doesn't get more efficient than that. Such an interface definitely doesn't have to be strings-only. But of course, you cannot use it with just 'echo' in that case.
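A small Python sketch of the packed-structure style of interface, sent over a socket with plain send/recv. The layout below is a made-up example and not a real netlink header; `MSG` and `send_packed` are hypothetical names.

```python
import socket
import struct

# Hypothetical message layout (NOT a real netlink header):
# unsigned 32-bit pid, unsigned 16-bit signal number, 2 padding bytes.
MSG = struct.Struct("=IHxx")


def send_packed(pid: int, signo: int) -> tuple[int, int]:
    """Send one fixed-size binary record over a socket pair and decode it."""
    a, b = socket.socketpair()
    with a, b:
        a.sendall(MSG.pack(pid, signo))
        # MSG_WAITALL asks for the full record rather than a partial read.
        data = b.recv(MSG.size, socket.MSG_WAITALL)
        return MSG.unpack(data)
```

The receiver does no text parsing at all, which is the efficiency argument; the cost is that both ends must agree on the layout ahead of time, so `echo` is no longer a usable client.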


Passing data in registers is more efficient, that's what you lose with serialization. Serialization gets you generality at the cost of some performance.


Right, registers are even faster.


Why not make real syscalls rather than emulate them using a socket-like interface? Doesn't make much sense to me. For example, if you want to filter syscalls then it becomes more difficult (need to remember which type of socket it is, need to parse the structures and so on).


The original idea for netlink sockets (where the name comes from) is to be able to do some network packet processing in userspace. E.g. to do stuff like virus-scanning on TCP connections.

The situation there is exactly the opposite of a syscall, it is rather that the kernel calls into userspace to perform a helper function.

Communication with the socket is still read()/write()/..., so there are still syscalls. The userspace program will do a read() to get the next struct+packet out of the socket.

The modern, syscall-less interface for stuff would be io_uring. There, you do not need to read(), you can just get your data written into a userspace buffer that you can mwait or poll on.


/proc pseudofiles could have a mode (perhaps on open) that determines whether the protocol is ascii or packed/binary. there could even be a side-band interface that provides the packed schema (assuming not everything is exploded to atomic type-evident items).


How is that "simple and elegant" design going for you?


When everything is a file, what is a "file" becomes flexible.

In Plan 9, you send signals and messages to a file by writing to it. And read messages by reading from it.

It's in essence no different than OOP or actor model or what have you.


An attempt is being made to reduce everything to:

    interface UniversalInterface {
        fun open(…)
        fun read(…)
        fun write(…)
        fun close(…)
    }

If you went to a software engineering design review meeting, proposing that several different kinds of objects representing everything from files, network sockets, to arbitrary devices should all use the interface above, you’d be laughed out of the room.

Ultimately, when someone does end up implementing something like the above as the sole interface for some major component, it’ll result in libraries that return a wrapping object with a more useable and pleasing interface that abstracts away the ugliness of using UniversalInterface to interact with said component.

I see two ways forward with this. Either: (a) present the option of using UniversalInterface to interact with X object, in addition to interface that’s much more idiomatic and closer to how X object actually behaves.

Or, (b) come up with an alternative universal (or flexible) object interaction interface that’s much more flexible than UniversalInterface.


> If you went to a software engineering design review meeting, proposing that several different kinds of objects representing everything from files, network sockets, to arbitrary devices should all use the interface above, you’d be laughed out of the room.

You're describing a meeting in which people seem unaware of the purpose of uniform interfaces. Not sure we could call such a meeting an "engineering design" meeting, because engineers tend to know better.

If I went up and proposed the same interface for all cables, displays, mice, keyboards, speakers, hard drives, phones... and called it USB, would I also be laughed out of the room?

There's no benefit to reinventing open/close/read/write in 50 different ways. The goal is to do it once, and then build more complex interfaces on top of it. That is, unless you're paid by the number of lines of code you write.


> The goal is to do it once

There is a universal interface, it is called "system call". Rather than build an unnecessary layer on top of it, improve the syscalls if you don't like them, and get rid of ioctls, /proc, /sys and other pseudo-syscall abstractions.


A file descriptor is equivalent to the Object universal base class in many OO languages [1] and represents a handle to an OS resource. Now in UNIX you can have anonymous [2] resources, but in the Plan 9 model most resources have a name and you can get a handle to it via open(<resource-name>).

Hence open is not part of your UniversalInterface, but it is a way to obtain a reference to it. It seems reasonable to have a generic way to dispose of a UniversalInterface (hence close). read/write are simply generic ways to send and receive messages from UniversalInterface, not unlike a dynamically typed object. Ideally you would do a checked down cast to your actual interface [3], but this was designed to work with C so you have to make do.

[1] I don't subscribe to the Everything is an Object in the OO sense, but having a common base class for most OS resources seems a reasonable solution.

[2] or at the very least there isn't always a cross-resource-type namespace.

[3] Not unlike COM QueryInterface, and in fact UniversalInterface is equivalent to IUnknown
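For what it's worth, a crude "what are you really?" check already exists for descriptors: fstat() reports the object type. A minimal Python sketch of that idea follows; `fd_kind` and `kinds_demo` are hypothetical names, and this is only a type query, not a real checked downcast.

```python
import os
import socket
import stat
import tempfile

# Ordered (predicate, label) pairs; the labels are this sketch's own wording.
_KINDS = [
    (stat.S_ISREG, "regular file"),
    (stat.S_ISFIFO, "pipe"),
    (stat.S_ISSOCK, "socket"),
    (stat.S_ISCHR, "character device"),
    (stat.S_ISDIR, "directory"),
]


def fd_kind(fd: int) -> str:
    """Ask the kernel what kind of object sits behind a descriptor."""
    mode = os.fstat(fd).st_mode
    for predicate, name in _KINDS:
        if predicate(mode):
            return name
    return "other"


def kinds_demo() -> list[str]:
    """Open three different resources and report the kind of each."""
    r, w = os.pipe()
    a, b = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f, a, b:
            return [fd_kind(f.fileno()), fd_kind(r), fd_kind(a.fileno())]
    finally:
        os.close(r)
        os.close(w)
```

All three handles share close() and fstat(), but the answer from fstat() decides which further operations make sense, much like querying a base-class reference for its concrete interface.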


That is quite literally how Kubernetes works at large; Plan 9 is to Unix as Kubernetes is to Linux, sort of. When the system is built from the ground up to be distributed across heterogeneous systems, you basically must build higher order protocols on top of really really simple ones. It's not that opening a byte-stream socket is the best interface for everything, it's that it's the lowest common denominator that everything can agree on no matter what.

This is very obviously an acceptable solution because the whole world runs on TCP.


> Either: (a) present the option of using UniversalInterface to interact with X object, in addition to interface that’s much more idiomatic and closer to how X object actually behaves.

> Or, (b) come up with an alternative universal (or flexible) object interaction interface that’s much more flexible than UniversalInterface.

That’s, basically, what COM/Corba/… are. There, UniversalInterface has an additional call “If you are a Foo, give me your Foo interface”, with Foo as an argument to that call.


And dbus as well on Linux. I think that that style is much better than everything is a file. Everything is a typed object.


> Everything is a typed object.

That's really good. The fact that the file interface just gives you a string of bytes, with no concept of structure or type safety in the interface itself, is a major flaw of it. Type safety is immensely valuable.


That's not far from the CRUD of a database, or POST/GET/PUT/DELETE of HTTP. It's also not far from what you'd expect at the transport layer of an RPC / IPC mechanism.

The metadata around files - ownership, permissions, creation time - apply to a lot more than files.

I don't really see it being laughed out of the room.


>Ultimately, when someone does end up implementing something like the above as the sole interface for some major component, it’ll result in libraries that return a wrapping object with a more useable and pleasing interface that abstracts away the ugliness of using UniversalInterface to interact with said component.

Yes? All interfaces do that, especially system interfaces. It's the whole purpose of interface - to provide implementation logic in a usable form. Try to call the kernel by hand and see the difference.


> you’d be laughed out of the room

Doubtful. What are the main paradigms?

Everything is an object

Everything is a function

Everything is a resource (REST)

Uniform interfaces are common for good reason: you gain a lot of flexibility for redirection, introspection and policy control.


> When everything is a file, what is a "file" becomes flexible.

It’s really “Everything has a File Descriptor”, nothing about the concept of a file has changed.

> It's in essence no different than OOP or actor model or what have you.

I’m sure there’s an isomorphism that could be drawn but it does nothing to show that it’s an equally good paradigm to write software in. IMO, it’s a tortured abstraction.


I wouldn't sit here and claim Plan 9 was perfect, because if it was, we'd be using it. In particular, the issue is that you're reading raw stream of content on the way in and out of those file descriptors.

This is akin to how shell piping in Unix is also just... text and bytes. This is limiting and produces many ad-hoc protocols.

But take what Plan 9 was trying to do, and add to it what Microsoft's PowerShell tried to do, where you stream objects, structured information, instead of just bytes, on the way in and out of commands and files...

And suddenly... we got ourselves an Erlang.


>> I wouldn't sit here and claim Plan 9 was perfect, because if it was, we'd be using it.

Many perfect and awesome systems have been built that never saw significant adoption.

I agree with ESR's observation: "Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor. Compared to Plan 9, Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position. There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that is just good enough." Source: https://www.catb.org/~esr/writings/taoup/html/plan9.html

Another amazing OS that never caught on was BeOS: https://en.wikipedia.org/wiki/BeOS Neal Stephenson's description of BeOS as "fully operational Batmobiles" is accurate and it is sad that BeOS did not become more mainstream. See https://people.cs.georgetown.edu/~clay/classes/spring2010/os...


To elaborate on those points a bit further, past what I already said about pidfd being introduced exactly to treat processes as files:

1. ioctls can make any "syscall" on a file.

2. a process does not have to be a singular file. All processes have most if not all their attributes exposed as files under /proc/$PID/, and this can be arbitrarily extended. In Plan 9, passing a signal (technically a note) is done by writing to /proc/$PID/note, and there is no technical reason for not allowing the same on Linux.

3. the entire concept of memory mapping is based around files, with anonymous memory - i.e., non-disk-backed memory - just being a subset of this. POSIX shared memory (shm_open) is provided through /dev/shm, which is a tmpfs folder and is indeed just files.

4. sockets are file descriptors, and a file descriptor is what makes a file, and as such you can get a peer's address from a file descriptor when one is present. Ways to expose creating sockets in the filesystem also exist, and not just for Plan 9. The special socket-bits could easily be made less special, with the only justification for the current BSD socket API being that it became dominant and so everyone copied it.


This is an absurd interpretation of "everything is a file." The kind that makes you go "arghwhat?!"

A file descriptor is exactly nothing like a file. You cannot write to a pidfd. You cannot waitid an eventfd. You cannot getsockopt on a regular file. The only operations that all file descriptors have in common are close, dup, poll and some other basic operations.

So "file descriptor" basically just means "kernel interface object" and the available operations depend on the type of object.

A file is a container for arbitrary data that I can read from, write to, and reposition the read/write cursor in. If you call anything else a "file" then you haven't made everything a file, you've just redefined "file" to mean "thing."
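The "reposition the read/write cursor" criterion is easy to probe from userspace. A minimal Python sketch, assuming a POSIX system; `can_seek` and `seekability_demo` are hypothetical names.

```python
import errno
import os
import tempfile


def can_seek(fd: int) -> bool:
    """Return True if the descriptor has a repositionable cursor."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)   # query the current offset, move nothing
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:  # "Illegal seek": pipes, FIFOs, sockets
            return False
        raise


def seekability_demo() -> tuple[bool, bool]:
    """Compare a regular temporary file with the read end of a pipe."""
    r, w = os.pipe()
    try:
        with tempfile.TemporaryFile() as f:
            return can_seek(f.fileno()), can_seek(r)
    finally:
        os.close(r)
        os.close(w)
```

Both objects are reached through a descriptor, yet only one of them supports the full container-with-a-cursor behaviour described above.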

What business does a socket have in a physical, on-disk filesystem? (Let alone a clunky hack to invoke the much simpler `signal` syscall in a roundabout way?) The socket "file" is completely meaningless unless the process that opened it is currently alive and still listening on it. So why the fuck should it get written to a persistent storage device?

How do I specify the socket type, which is a meaningless concept for an actual file, when I open a socket "file"? Oh that's right, I don't. Because I don't open a socket. I bind or connect. I don't use "file" APIs because they're not applicable. I use a dedicated socket API that's fit for the purpose.

Pidfd was not introduced to treat processes as "files," it was introduced so they could share the operations that they do meaningfully share with other kernel objects (e.g. poll).

The overloaded ioctl syscall is bad design. The proc "filesystem" is bad design. /dev/shm is ridiculous design. So I have to mock a fake filesystem in memory so I can create a fake file in that "filesystem" just so I can get the same memory pages mapped into my virtual address space as some other process, all of which has absolutely nothing to do with files or a filesystem (and is much lower level than that). lolwat?


A file descriptor is a handle to a file, and anything you have an fd to is a file. This file carries a vfs implementation, such as that of pidfd, a device driver, or disk storage. The kernel does not distinguish between these.

If not being able to write makes it not a file, then files stop existing when a disk is full, and means that /dev/zero and /dev/null are not files - despite being at the heart of the whole "everything is a file" paradigm.

Pidfd was not made to make processes behave like files - that is what /proc is - but to solve problems with process related syscalls and PIDs, which are flawed and racy. The solution to that was to make APIs that treat processes as files, which gives you the ability to poll it like a file.

A streaming socket is exactly like a normal file. You read, write and poll. The only thing that is special is how to create it, but that is a design decision, not a technical limitation - see the plan9 file based API for making TCP sockets, which is trivially implementable in Linux.
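A short Python sketch of the "you read, write and poll" claim: one loop, written only against descriptors, drains a pipe and a stream socket alike. It assumes a POSIX system where os.read()/os.write() accept socket descriptors; `drain` and `pipe_and_socket_demo` are hypothetical names.

```python
import os
import select
import socket


def drain(fd: int, timeout: float = 1.0) -> bytes:
    """Collect bytes from any readable descriptor until EOF, polling before each read."""
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break                      # nothing arrived in time
        chunk = os.read(fd, 4096)
        if not chunk:
            break                      # EOF
        chunks.append(chunk)
    return b"".join(chunks)


def pipe_and_socket_demo() -> tuple[bytes, bytes]:
    """Feed the same loop from a pipe and from a stream socket."""
    r, w = os.pipe()
    os.write(w, b"via pipe")
    os.close(w)
    from_pipe = drain(r)
    os.close(r)

    a, b = socket.socketpair()
    with a, b:
        os.write(a.fileno(), b"via socket")   # plain write() on a socket descriptor
        a.shutdown(socket.SHUT_WR)            # signal EOF to the reader
        from_socket = drain(b.fileno())
    return from_pipe, from_socket
```

The loop never learns which kind of object it is reading from; only creation and teardown differ between the two cases.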

Domain sockets are a bit different because of their side channel and would require more ctl files, but Linux's API is 99% magic files and ioctls so this is not that weird.

ioctls are not themselves bad design. In fact, scoping kernel functionality onto file handles is a great design and why that's almost the entirety of the kernel (device driver calls dwarf syscalls). The problem is not the design itself, but the fact that ioctl was not originally meant for it and got overloaded through several design iterations. This is what happens when you do organic design through more than 3 decades.

If you start out by defining a way to do file-scoped syscalls - and file does and always will mean "an fd" to a kernel - then you wouldn't have that awkwardness. That is what plan9 did: Take the learnings, and implement them clean instead of on legacy.


> A file descriptor is a handle to a file, and anything you have an fd to is a file. This file carries a vfs implementation, such as that of pidfd, a device driver, or disk storage. The kernel does not distinguish between these.

Which is what I said. It's a "file" in name only.

> If not being able to write makes it not a file, then files stop existing when a disk is full, and means that /dev/zero and /dev/null are not files - despite being at the heart of the whole "everything is a file" paradigm.

Great example that showcases the idiocy of Everything Is A File. /dev/null and /dev/zero are basic parts of the Unix API, so the basic OS API is broken-by-default at boot until one mounts a file system that had these dummy "device" nodes at a specific path that's hardcoded everywhere.

Instead of providing a sensible API like memfd_create or timerfd_create, for example.

> A streaming socket is exactly like a normal file. You read, write and poll.

That doesn't make it a file, that makes it an object that shares common traits with file objects. Datagram sockets do not read/write because they're not bound to a fixed remote address. And that's perfectly fine. They're not files.

> The only thing that is special is how to create it, but that is a design decision, not a technical limitation

And it's a good design decision. The socket API is pretty decent except for the dumb "file" nodes it creates when listening on standard unix sockets.

> see the plan9 file based API for making TCP sockets, which is trivially implementable in Linux.

But thank God it's not implemented in Linux.

> ioctls are not themselves bad design. In fact, scoping kernel functionality onto file handles is a great design

Agreed, the only bad part is that too much was shoehorned into the same syscall. It's certainly better than magic "files" that pretend to be "files" by having you read and write structs from, but you're only allowed to read and write whole structs per syscall, which has no resemblance whatsoever to how reading from and writing to a file work. (Maybe Plan9 doesn't have this limitation and tries harder to keep up the charade, I wouldn't know.)


> Which is what I said. It's a "file" in name only.

No, it is the very definition of a file from the OS perspective. There is no other applicable definition to the OS. You seem to conflate files with disk storage, in which case not just plan9, not just Linux but the entirety of UNIX history seems to have flown past you.

That files are nothing but abstract handles that implement the VFS interface to serve every conceivable function - where a regular file is treated no differently than a device driver - is the entire point of modern UNIX. If this is the part you are stuck on it is not a surprise that both the existing Linux kernel APIs and trivial (and quite frankly, perfectly ergonomic and efficient) alternatives like the plan9 API seem so foreign to you.

"read and write structs" is the most normal thing for an application to do, whether you are reading JSON from disk, communicating over UNIX domain sockets with raw C structs, or sending protobuf over the network.

"But thank God it's not implemented in Linux" - Linux has many of these APIs already and it constantly grows, see for example all of /dev, /sys and /proc, not to mention FUSE and support for the 9P protocol to use all of plan9's services as-is.


I couldn't care less about the weird nomenclature, but it seems to have confused a lot of people since they insist that just because every object is called a "file," everything needs to be shoehorned into the concept of file nodes in a virtual filesystem tree.

You don't get one object == one file node, you get one object == a subdirectory with lots of file nodes that you have to open and close independently. That should tell you that your pattern doesn't work. (With enough effort, you can of course always shoehorn everything into an ill-conceived concept (and you did), but if you have to bend over backwards to make it fit, you should just admit it doesn't fit.)

It works for /dev, but definitely not for the mess that is /sys and /proc.

"Reading and writing structs" is the most natural thing to do when you're forced to serialize your data. You then design wrappers around that serialization that expose a sane, type-safe API on each end.


Also, I remembered, there is /dev/pts, a weird file-based API that you need to use to create pseudo-terminals.


I am probably looking at it from a too high-level perspective, but the RESTful paradigm was widely adopted in every domain and proved that you can model pretty much any concept as resources, with CRUD primitives and hyperlinks between them.

Wouldn't the same be applicable to files, processes, devices and so on?


It is. I don't think people are well familiar with what Plan 9 calls "files". They're objects or resources, which also happen to be files. REST/OOP/Actors/Plan9 are very similar systems.

And like it or not OOP shows that it's possible for one idiom to describe all the things when it's flexible enough.


Yeah, it's not such a laughable idea when you consider just how much of the dynamic behavior of a software system can be modeled as a series of messages between independent resources. This is part of why UML was so pervasive. People had spent intense amounts of energy modeling systems using messaging and interconnection, and UML gave us a uniform way of doing this.

I've got a lot of experience working in systems that model everything as a series of resources passing messages and it works very well. The entire QNX operating system, right down to its POSIX support, uses this underlying primitive and, while you would never see it in your own code, their system-wide profiler leans on this design to make it easy to see how control flow moves between isolated threads and processes within the software system.


These are called "special files" in Linux. "Device files" are just a special case.

Both "special" and "device" files are still "files".


>Process is not a file because you cannot send signals to a file

https://man7.org/linux/man-pages/man2/pidfd_send_signal.2.ht...

Also sockets have the concept of ancillary channels used to deliver out of band data (see recvmsg). If every system resource had such channel, it could be used for control similar to how sockets do it.
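As an illustration of that ancillary channel, here is a minimal Python sketch that hands a descriptor to a peer next to one byte of ordinary data. It assumes Python 3.9+ on a Unix system (for `socket.send_fds` / `socket.recv_fds`); `pass_descriptor` is a hypothetical helper, and both ends live in one process only to keep the example short.

```python
import os
import socket


def pass_descriptor(message: bytes) -> bytes:
    """Send a pipe's write end over a Unix socket, then write through the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    with sender, receiver:
        # The descriptor travels as ancillary data next to one ordinary byte.
        socket.send_fds(sender, [b"x"], [w])
        _, fds, _, _ = socket.recv_fds(receiver, 1, 1)
    os.close(w)                        # the original can go; the copy stays valid
    received = fds[0]
    os.write(received, message)
    os.close(received)
    data = os.read(r, len(message))
    os.close(r)
    return data
```

The byte stream and the control information share one connection but stay separate, which is the property a general per-resource control channel would need.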


Maybe instead of saying everything is a file, perhaps it should be "all resources are organized in a tree structure". The nodes of the tree can be different things where different operations apply.


We now have pidfd, because we actually need process handles and treating processes as files is the best way to do so.


> io_uring is a more modern API to file access.

With a remarkable similarity to certain mainframe systems from half a century ago.


it's a more modern API to system calls in general, though most of the focus is on IO.


> Ultimately IO is [an] intrinsically complex topic, and trying to paper over that complexity with simple interfaces is disingenuous and falls flat on edge cases.

I’m not against pointing out the edge cases in the Plan 9 file model[1,2], but the thing is, I haven’t seen complex I/O interfaces that aren’t a horror show, either. Granted, I haven’t seen that many of those at all, but I’ve had a thorough look at the ones in OS/2, Win32, and NT, and none of them seem particularly inspiring.

I’d very much like to see some nice alternatives, to be clear!

The reference to io_uring also doesn’t seem all that strong of an argument, honestly. I’d like to say there are three layers to the idea of “traditional” “Unix” “files” as an OS (not storage) interface:

- System and user resources you have access to are identified by unforgeable references (called “fds”; the merits of allowing userspace to control their naming as opposed to having the kernel assign the names are debatable). You can feed bytes into these, (ask to) get bytes out of them, and perhaps have an out-of-band call to e.g. transmit one of the other references you hold to a peer. You can of course also delete a reference.

So far this is just dynamically typed object-capabilities by another name. It’s going to require higher-level protocols on top, but so does basically everything else on this level of generality.

Plan 9 mostly (if not completely) eliminated ioctls here by using separate control files instead.

(The part where the OS merges the payloads of write calls into a byte stream then cuts it back up into reads is more opinionated, but I don’t think even Bell Labs systems ever adhered to that principle strictly[1].)

- You obtain (most of) these references by navigating a stringy hierarchical namespace. There are additional calls to do so and to modify that namespace.

This is less of an obviously correct least common denominator, and as history shows the consensus is less strong here as well. Mountpoints, symlinks, namespaces per TFA, even the *at() calls all change how this part functions. On the other hand, I don’t think anybody longs for version numbers or nesting limits (or even drive letters) of other systems that have used naming approaches similar enough for a comparison.

- You access these services through synchronous system calls write(), read(), ioctl(), close(), open(), etc.

This is the part that is changed by the introduction of io_uring... But I don’t feel it’s all that important for the conceptual model, unlike the preceding points.
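
To make the first layer concrete, here is a small sketch of the "transmit a reference to a peer" call, assuming a Unix system and Python 3.9+ (`socket.send_fds` / `socket.recv_fds`). Both ends live in one process only to keep it short; normally the receiver would be a different process. The function name is made up.

```python
import os
import socket


def pass_pipe_to_peer():
    """Send the read end of a pipe across a Unix socket and read through the copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # Out-of-band transfer: the fd travels as ancillary data (SCM_RIGHTS).
        socket.send_fds(left, [b"fd"], [read_end])
        _msg, fds, _flags, _addr = socket.recv_fds(right, 16, 1)
        received = fds[0]  # a new number that refers to the same pipe
        try:
            os.write(write_end, b"via capability")
            return os.read(received, 64)
        finally:
            os.close(received)
    finally:
        os.close(read_end)
        os.close(write_end)
        left.close()
        right.close()
```

The receiver never learns a name for the pipe; it only gets the reference, which is the capability-like part.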

[1] Cutting a bytestream into packets is as always a tedious slog of buffering so some of the protocol implementations use write() / read() boundaries thus actually (depend on being able to) use the API in a datagram-like fashion (which 9P enables IIUC).

[2] Auth is based on a /proc/self-like hack wherein the kernel-side implementation of the “file” inspects the opening process through kernel-side knowledge you can’t access nor proxy from userspace.
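
The boundary issue from [1] is easy to see with Unix sockets standing in for the transport (this is not 9P, just an assumption-laden illustration): a datagram socket hands each write back as one read, while a stream socket only promises the bytes in order.

```python
import socket


def datagram_reads():
    """Two writes on a datagram socket come back as two separate reads."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        a.send(b"ab")
        a.send(b"cd")
        return [b.recv(1024), b.recv(1024)]


def stream_reads():
    """On a stream socket only the byte order is guaranteed, not the chunking."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        a.send(b"ab")
        a.send(b"cd")
        a.shutdown(socket.SHUT_WR)  # signal end of data to the reader
        chunks = []
        while True:
            chunk = b.recv(1024)
            if not chunk:
                return chunks
            chunks.append(chunk)
```

A protocol that relies on the first behaviour gets framing for free; one built on the second has to do its own buffering and splitting.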


BeOS was designed for multimedia and audio from the start - I had many friends excited about its approaches to I/O.


I have never found the Be Book to be particularly engaging reading, but this might finally give me a reason to work through some parts. Thanks!

I’m not sure how much stock to put into the multimedia claims, however—a lot has changed since then, both in the state of human knowledge about low-latency multimedia, A/V sync, network streaming, etc., and in what we can and can’t afford on machines we perform multimedia processing on. How relevant and how commonly known are the insights that BeOS incorporated today?

(You can see that I expect the answers to be “not very” and “extremely”, but sometimes life surprises us. For example—did you know that Microsoft shipped a renderer for 2D animation based on FRP ideas, designed with the direct participation of Conal Elliott himself, in 1998? It was called DirectAnimation and released as part of DirectX 5.)


Wait until you realize you need asynchronous I/O operations, and then you also need to manage caching on I/O operations, and you also need to distinguish appends from over-writes, and that you also need to care about who allocates memory for the buffer being written to / read from...
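
Two of those distinctions are visible even from a scripting language. A minimal sketch, assuming a POSIX system (`os.pwrite`, `O_APPEND`); the function name is made up for the example:

```python
import os
import tempfile


def append_vs_overwrite():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.txt")
        with open(path, "wb") as f:
            f.write(b"AAAA")

        # Overwrite: the caller picks the offset explicitly.
        fd = os.open(path, os.O_WRONLY)
        os.pwrite(fd, b"bb", 0)
        os.close(fd)

        # Append: O_APPEND makes the kernel place every write at end of file.
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
        os.write(fd, b"cc")
        os.close(fd)

        # Caller-allocated buffer: readinto() fills memory we own and
        # reports how many bytes it stored.
        buf = bytearray(16)
        with open(path, "rb") as f:
            n = f.readinto(buf)
        return bytes(buf[:n])
```

Same `write` call in both cases, but the semantics hang on a flag passed at open time, and the read side differs depending on who owns the buffer.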

And then come concurrency and exceptions... UNIX "design" anticipated none of the above. And it's not like these things were somehow unknown at the time. The "designers" thought they were making something remarkable by cutting corners and making a "simple" (but really a half-baked) OS.

It's such a shame that UNIX became the vector for the spread of the Internet and eventually infected virtually every computer system on Earth. It's even a greater disappointment that its "design" decisions are still revered as the holy Bible in academia and in the industry.


What's an existing example of a non-'half-baked' OS?

I personally liked VMS for concurrency, but for dealing with code and coding there was no contest that Unix was far better. I think the reality is they're all half-baked since none can satisfy every need.


The Windows NT kernel is surprisingly well-designed. Everything is an 'object' to the kernel (or file descriptor). Concepts like ACLs apply to all kernel objects. I summarized some of its capabilities previously: https://news.ycombinator.com/item?id=34914776

One concrete example of its design: exercising administrator (sudo) permissions causes a User Account Control (UAC) dialog to pop up, which takes over the screen from the current WindowStation -- making it impossible for any software to mess with approval or password entry [1].

IO Completion Ports (IOCP) also stand out as a powerful way to perform asynchronous IO that NT has had since ... I'm not sure how long, but I believe probably since the 90s. As one HN commenter wrote in 2016, "IOCP is the top item on my (short) list of things Windows simply does better. The performance boost you see from designing a server for IOCP from the ground up is jaw-dropping." [2] And it works for a variety of different IO tasks (including disk IO).

Windows gets a lot of hate, but most of the criticism you read about it online is not usually a well-considered critique of how its kernel APIs operate.

[1] https://learn.microsoft.com/en-us/windows/security/applicati...

[2] https://news.ycombinator.com/item?id=11867345


Isn't the kernel API banned from usage by userspace applications that aren't kernel32.dll (plus it deliberately breaks ABI all the time)? I can appreciate the strengths of the NT kernel, but if nobody is allowed to use it... (I suppose kernel drivers can, but that's still a very narrow range of applications)

Also, while IOCP has the advantage of being older w.r.t. availability, I wonder how io_uring competes - I haven't used IOCP myself, but from what I've seen io_uring seems to fill its role quite well and I've seen several people complain about various problems with IOCP (in particular poor documentation)


There's a quote that goes something like "the last piece of software ever written for a novel OS is the UNIX compatibility layer". kernel32.dll is essentially that, except it isn't even UNIX compatible!


These are all things exposed through the application API.


Windows NT was designed by Dave Cutler who also designed VMS. And, I agree, for many things it is a very nicely designed system and doesn't deserve a lot of hate.

But what I said about developing software stands: I'd much rather be using a Unix.


I used "half-baked" in the context of UNIX history, where the legend goes that UNIX was the stop-gap solution for AT&T failing to come up with a better designed system on schedule.

Unfortunately, today, UNIX is by and large all we have. Other things are either some kind of UNIX but with a twist, or very underdeveloped.

Looking into the future, I'm very enthusiastic about OS-as-a-library approach, but I don't think we have a solid contender there yet.


I think one approach in (re)designing an OS should be to go back to fundamentals: what makes an Operating System?

I think fundamentally, there are only a few things it has to do: Allow applications to run (on the CPU/other hardware), Allow applications to communicate between themselves (establish communication standards), Allow access to disk, Manage the time given to each application well/fairly, Define permissions around resources (who can talk to whom and who can access what). Depending on who you ask, shells/desktop environments should be part of the OS too.

When I think of the minimum that can accomplish that, I think something like a simple communication standard (with authentication mechanisms) would be interesting (with possibility to support more specialized standards). A standard for defining app. share of resources (CPU/disk). Maybe hardware resources like disks, devices like cameras, etc.. could be treated as applications via a driver (or application <-> disk application <-> disk driver). Then instead of 'opening' files, you're just communicating your intentions with a disk driver, and you can do essentially anything. Maybe then instead of a Filesystem Hierarchy standard, there could simply be "OS (standard) applications" that for example list what users are in the OS, what applications are available, and so on (without a filesystem hierarchy at all, if you wish!). An FHS could be provided for legacy reasons.

I also think permissions should take a more fine grained approach than Unix/Linux does (closer to Android permission systems), I think by default applications should have minimum permissions and should be whitelisted as needed.


Theseus OS is a Safe-Language OS which solves almost all problems of today’s OSes by leveraging Rust’s compiler guarantees.

https://www.theseus-os.com/Theseus/book/index.html


It was not Unix that became the platform for the Internet, but Posix.

Linux is not derived from Unix, but supports Unix (Posix) APIs. As does modern MS Windows, and Mac OS X.

This also means that Google could experiment with OSes like Haiku or Fuchsia that support Posix, but are built in a different way.


You should fill us in on the right operating system and its architecture.


> First, it's not always easy to map every object operation into either an open read or write.

It doesn't seem like it. Linux has a habit of multiplexing alternate functions through a single handle with additional and somewhat scary methods like ioctl. Plan9 manages this with servers, directories, and more than one path available for a single resource depending on what you're trying to access. This is far more sane.

> With time, we should have seen a lot of ugly interfaces resulting from this limitation.

They don't seem any more complicated than they need to be. Compare implementing a fuse server vs implementing a plan9 server. Yet I don't see where all of this complication adds anything or enables implementation of technologies that couldn't be implemented on plan9 with a few additional paths.

> People could no more keep up with simple general designs. And to squeeze every bit of performance,

These are contrary goals, and I'm not sure what you mean people can't "keep up" with "general designs." What is there to "keep up" with? And in exchange for that performance we got one of the most insane /class/ of unfixable CPU bugs ever imagined.

> And this is how we ended with the extreme fragmentation and heterogeneity we have in Linux,

And yet.. many of these systems are now being unified into generalized file descriptor based systems, that have wacky open methods, but boil down to allowing simpler interfaces through read(2) and write(2).


> And in exchange for that performance we got one of the most insane /class/ of unfixable CPU bugs ever imagined.

What do Spectre and Meltdown have to do with “everything as a file” system architecture?


The implication was that everything as a file _had_ to be abandoned to make way for performance "improvements." Which ended up just being a new class of CPU bugs. So.. was the trade worth it?


For example, most high-performance devices expose functionality as memory locations mapped into the CPU address space. In many cases it's necessary to allow userspace direct access to (part of) the memory the device exposes, such as with GPUs, RDMA NICs and so forth. I'm not sure you could get high performance with a read/write stream-based interface, as conceptually elegant as such an interface is.


Devices have been doing that since forever, so Plan9 has an mmap equivalent with segment/segattach, where you use file I/O (only) to define a memory mapping and attach it to the current process. Everything from that point on is regular memory I/O.
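
The shape of that is the same as POSIX mmap, so here is a sketch using Python's `mmap` module as a stand-in (this is not Plan 9's segattach, and the function name is made up): the file API is touched once to set up the mapping, and everything after is plain memory access.

```python
import mmap
import tempfile


def map_then_poke():
    with tempfile.TemporaryFile() as f:
        f.write(b"hello world")
        f.flush()
        # The file API is used once, to set up the mapping...
        with mmap.mmap(f.fileno(), 0) as mem:
            # ...after which access is ordinary memory reads and writes.
            mem[0:5] = b"HELLO"
            mem.flush()
        f.seek(0)
        return f.read()
```

No read or write call sits on the hot path, which is the property the high-performance device drivers are after.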


> alternate functions through a single handle with additional and somewhat scary methods like ioctl. Plan9 manages this with servers, directories, and more than one path available for a single resource depending on what you're trying to access. This is far more sane.

Given how horribly asynchronous filesystems are, I'd take one handle owned by the current process over a dozen free-hanging files that might be reused for different resources at any time. Hell, I am quite sure Linux had to kill suid on scripts because the kernel could not guarantee that the script file wasn't swapped out before the interpreter would load the path.
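
A small sketch of why the handle is the safer thing to hold, assuming POSIX rename semantics (file names and the function name are invented for the example): once the name is swapped, the path and the already-open handle disagree about what "the file" is.

```python
import os
import tempfile


def handle_vs_path():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script")
        other = os.path.join(tmp, "replacement")
        with open(path, "wb") as f:
            f.write(b"original")
        with open(other, "wb") as f:
            f.write(b"swapped")

        fd = os.open(path, os.O_RDONLY)  # handle taken before the swap
        os.replace(other, path)          # the name now points elsewhere
        try:
            via_handle = os.read(fd, 64)
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            via_path = f.read()
        return via_handle, via_path
```

Anything that checks a file by path and then reopens it by path is exposed to exactly this window.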


Storage Combinators manage to unify a lot of this with a somewhat nicer, REST-based and object-oriented interface (they came out of something I called in-Process REST). Byte-oriented interfaces are much easier to fit on top of that than the other way around.

Almost more importantly, they don't claim to be universal. Instead, you can still send messages where that makes sense. The approaches are much simpler and powerful when combined than when each tries to be everything. Yes, you are allowed to say "synergies".

With the corresponding object streams (that can be specialised to byte-streams as well), we have something I like to call Plan A from Userspace.

https://2019.splashcon.org/details/splash-2019-Onward-papers...

https://objective.st


I wrote a while ago about how Go is more UNIX than UNIX, which in context really meant is more Plan 9 than Plan 9:

https://www.jerf.org/iri/post/2931/

The idea there is that the particular way interfaces work in Go is possibly the way that Plan 9 should have worked. In reality, trying to fit everything into a file is still non-functional, because not everything is a file. But if you instead have a hierarchy of interfaces, starting at the very bottom with "this is a stream", working up to "this is a stream you can close", and so on and so forth up through "this is a seekable, sparse, appendable chunk of bytes that can have ACLs set and has the following ioctls", you can get what you're looking for out of common interfaces, while at the same time not having to run all around the system putting "do nothing" methods on things just to conform to interfaces. ("Do nothing" methods are a valid tool for a bit of fitting into an interface, and actually quite important, but only when the methods are themselves something that can be fulfilled by a do-nothing implementation. "Set this ACL" shouldn't be satisfied by a do-nothing method.)

For many things even a file is overkill; what you care about is that you can stream bytes in or out once you have it, not whether you can change the ownership of the thing you are working with. And with ioctls you see cases where files aren't anywhere near good enough.
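
A rough sketch of the idea, written with Python's `typing.Protocol` rather than Go so it stays short; the protocol and class names are invented here, and Python's structural check is a looser analogue of Go's implicit interface satisfaction, not the same thing.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class Reader(Protocol):
    def read(self, size: int = -1) -> bytes: ...


@runtime_checkable
class Writer(Protocol):
    def write(self, data: bytes) -> int: ...


@runtime_checkable
class ReadCloser(Reader, Protocol):
    def close(self) -> None: ...


class ByteCounter:
    """Only implements write(); it never declares that it is a Writer."""

    def __init__(self):
        self.total = 0

    def write(self, data: bytes) -> int:
        self.total += len(data)
        return len(data)


def copy(src: Reader, dst: Writer) -> int:
    """Needs nothing beyond 'can read' and 'can write'."""
    copied = 0
    while True:
        chunk = src.read(4096)
        if not chunk:
            return copied
        copied += dst.write(chunk)
```

`copy(io.BytesIO(b"plan 10"), ByteCounter())` works even though `ByteCounter` has no close, seek, or ownership methods, and nobody had to write do-nothing stubs to make it fit.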

(Note this is not advocacy for Go as a language you might want to program in; it's more a suggestion for a Plan 10 by drawing on Go as a particular combination of features. It would take non-trivial work to figure out how to turn this into an OS feature, but it's not inconceivable amounts of work IMHO.)


Well, Go is a descendant of Plan9's main programming language.


so golang is a poor man's erlang?


Are you asking for six pages of text? I'm probably the worst person in the world to say that to.

That said, I have no idea how you get "poor man's Erlang" out of this specific post. As a practical matter, Erlang is a standard dynamically-typed language in this matter; as a theoretical matter it has some limited support for protocols as used by things like gen_server but I don't think I ever saw a single use outside of the standard library, and I'm about 90% sure there's no implicit satisfaction of them; you must declare what you are implementing. It is as irrelevant to my point as Python or Perl. We already know what this sort of dynamically-typed interface looks like in those systems, namely, "a lot less nice in practice than in theory but still useful enough most of the time". Nice for writing scripts, sufficient for reasonably-sized programs, not a sufficient foundation for an OS with Plan 9's level of aspirations.


> First, it's not always easy to map every object operation into either an open read or write.

Nowadays people are mapping everything to JSON, and it works. The Plan9 directory structure was basically the JSON of the 80s, a hierarchical data structure where the hierarchy is easily navigable in a standard way, and the leaves can then hold any non-standard stuff you may need.
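
The correspondence is close enough that the conversion is a few lines. A sketch, where the `net/tcp/0` layout is a made-up example only loosely inspired by Plan 9 and the function names are invented:

```python
import json
import os
import tempfile


def tree_to_dict(root):
    """Directories become objects; files become leaves holding their text."""
    node = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            node[name] = tree_to_dict(path)
        else:
            with open(path, "r", encoding="utf-8") as f:
                node[name] = f.read()
    return node


def tree_demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "net", "tcp", "0"))
        for rel, text in [("net/tcp/0/ctl", "0"), ("net/tcp/0/status", "Closed")]:
            with open(os.path.join(root, *rel.split("/")), "w", encoding="utf-8") as f:
                f.write(text)
        return json.dumps(tree_to_dict(root), sort_keys=True)
```

Navigation (the keys) is standardized; what sits in a leaf is up to whoever serves it.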


Mapping to JSON, "works." And usually not in an interop kind of way.


In plan9 this is dealt with through a similar workaround as we use in Unix: ctl files with a per-file protocol. Linux uses a mix of this (sys and proc files) and ioctls (which are also per-file protocols) for many things.

There is no real practical difference in capability between this and dedicated syscalls. For Linux, the distinction is usually just whether the functionality is "global" or isolated to a certain area like devices or drivers.
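
A toy version of the ctl-file pattern, for readers who haven't seen one. `ToySerialPort` and its message format are hypothetical, only loosely modelled on how serial port ctl files take short text commands; nothing here touches a real device.

```python
class ToySerialPort:
    """In-memory stand-in for a device exposed as a data file plus a ctl file."""

    def __init__(self):
        self.baud = 9600
        self.parity = "n"
        self.sent = bytearray()

    def write_data(self, payload: bytes) -> int:
        # The data "file": bytes written here go to the wire.
        self.sent += payload
        return len(payload)

    def write_ctl(self, command: str) -> None:
        # The ctl "file": short text commands take the place of ioctl requests.
        verb, arg = command[:1], command[1:]
        if verb == "b" and arg.isdigit():
            self.baud = int(arg)
        elif verb == "p" and arg in ("n", "e", "o"):
            self.parity = arg
        else:
            raise ValueError(f"bad ctl message: {command!r}")

    def read_ctl(self) -> str:
        # Reading the ctl file reports the current settings as text.
        return f"b{self.baud} p{self.parity}"
```

The parsing in `write_ctl` is the per-file protocol: it carries the same information an ioctl request number and argument would, just spelled as text.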


> And this is how we ended with the extreme fragmentation and heterogeneity we have in Linux

Tangentially, I've long wondered why the Linux "API" is such a mess, in particular why are there multiple tracing frameworks? Multiple security frameworks? Which to choose??

It turns out, this is a consequence of Linux's stand-alone development, along with their (good) "never break userspace" mantra. The two together mean they can never deprecate an API.

Contrast this with the *BSD's development, where the kernel is developed alongside the libc and (a core set of) user-space applications. This allows them to evolve and deprecate their APIs, because they can update the clients.

(From a runtime perspective, all is not lost in Linux, as I suppose they can move an API to a module or allow "users" (distros etc) to disable it at compile-time.)


Isn’t everything in Unix/Linux already a file? Isn’t that the power of Unix? There’s even a Wikipedia article about it: https://en.m.wikipedia.org/wiki/Everything_is_a_file#:~:text....


> it's not always easy to map every object operation into either an open read or write

A lot of what defines REST works well here. The "Uniform Interface" creates a simple API that can scale to a surprising amount of functionality.

Not saying it doesn't have its downsides, but you can accomplish a lot with minimal increase in the surface area of that "uniform interface".



