Don't know much about this project, but it's interesting to see the name they chose. "Sanakirja" means dictionary in Finnish, literally "a word book". Finns often give tech English-sounding names, since they worry that others will find a Finnish name too strange-sounding.
This seems to be a component of a version control system with a very interesting data model. Check it out at the main page, https://pijul.org, if like me you hadn't heard of it.
To make these transactional too, they are actually stored in a different B tree. Cycles are avoided by not storing reference counts less than or equal to 1.
How does that work? If you don't store a refcount when it equals 1, how do you know not to delete the thing? And if you never delete anything, why keep refcounts?
Good question. The structure is a B tree. When you traverse it and arrive at a block, but don't see it in the reference counting base, you know that it is referenced in the tree, so its reference count is at least 1.
Then, if you want to free it, you have to delete all references to it in the tree. Maybe that blog post could have been more verbose about another feature of Sanakirja, the list of free pages.
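To make the "absent means one reference" idea above concrete, here is a minimal sketch in Rust. It is not Sanakirja's code: the names `PageId` and `RefCounts` are made up for illustration, and a `HashMap` stands in for the separate B tree that holds the counts. The only point is that an entry exists only while a block has two or more references.

```rust
use std::collections::HashMap;

/// Hypothetical block identifier (e.g. an offset in the file).
type PageId = u64;

/// Sparse reference-count table: only counts of 2 or more are stored.
/// Assumption: a HashMap replaces the on-disk B tree for brevity.
#[derive(Default)]
struct RefCounts {
    counts: HashMap<PageId, u64>,
}

impl RefCounts {
    /// Count for a block we reached by traversing the tree.
    /// No entry means the block is referenced exactly once.
    fn get(&self, page: PageId) -> u64 {
        self.counts.get(&page).copied().unwrap_or(1)
    }

    /// Record one more reference to a block (it becomes shared).
    fn incr(&mut self, page: PageId) {
        let n = self.get(page) + 1;
        self.counts.insert(page, n);
    }

    /// Drop one reference. Returns true when that was the last one,
    /// meaning the caller may put the block on the free list.
    fn decr(&mut self, page: PageId) -> bool {
        match self.get(page) {
            // Implicit count of 1: nothing is stored, the block is now unreferenced.
            1 => true,
            // Going from 2 to 1: remove the entry instead of storing a 1.
            2 => {
                self.counts.remove(&page);
                false
            }
            n => {
                self.counts.insert(page, n - 1);
                false
            }
        }
    }
}
```

With this shape, a block that is never shared never touches the table at all, which is presumably why the counts can live in a B tree of the same kind without the table having to count its own blocks.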
Author here. So I thought when I started the project. But when you think more about it, Rust cannot allocate in an mmapped file, so you're basically on your own, and have to do plain old C-like memory management in the file, except that unlike in C, there is the extra constraint that everything is transactional: pulling the machine's plug needs to "unfree" all freed pages, and "unalloc" all allocated pages.
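A rough illustration of the "unfree"/"unalloc" constraint, as an in-memory toy in Rust. None of this is Sanakirja's actual API: `Allocator`, `Txn` and `PAGE_SIZE` are hypothetical names, and there is no mmap or file involved. The sketch assumes a simple design where a transaction works on a private copy of the allocator state, pages freed during the transaction are not reused until commit, and dropping the transaction without committing models pulling the plug.

```rust
/// Hypothetical page size, for illustration only.
const PAGE_SIZE: u64 = 4096;

/// Committed allocator state: the free list and the file length.
#[derive(Clone, Debug, Default, PartialEq)]
struct Allocator {
    free: Vec<u64>, // offsets of free pages
    len: u64,       // offset of the first never-used page
}

/// A transaction holding a private working copy of the state.
struct Txn<'a> {
    committed: &'a mut Allocator,
    work: Allocator,
    freed: Vec<u64>, // freed in this transaction, reusable only after commit
}

impl Allocator {
    fn begin(&mut self) -> Txn<'_> {
        let work = self.clone();
        Txn { committed: self, work, freed: Vec::new() }
    }
}

impl Txn<'_> {
    /// Take a page from the free list, or grow the file by one page.
    fn alloc(&mut self) -> u64 {
        if let Some(page) = self.work.free.pop() {
            page
        } else {
            let page = self.work.len;
            self.work.len += PAGE_SIZE;
            page
        }
    }

    /// Mark a page as freed. It is kept aside because the committed
    /// state may still point to it if the transaction never commits.
    fn free(&mut self, page: u64) {
        self.freed.push(page);
    }

    /// Publish the working copy; only now do freed pages become reusable.
    fn commit(mut self) {
        self.work.free.append(&mut self.freed);
        *self.committed = self.work;
    }
}
```

Dropping a `Txn` without calling `commit` leaves the `Allocator` exactly as it was, so every allocation is "unalloc'ed" and every free is "unfreed" for free. A real on-disk version would additionally need the commit step itself to be atomic, which this toy does not attempt to show.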
The more I see from this project, the more I like it. The Pijul team seems completely serious about building system software right. Also, while I could wish for more docs, what's there is a really good overview.
It would be useful to be able to browse the source code online, without actually cloning it.
It seems somewhat weird to use the VCS itself for the VCS being developed. Well, maybe that's just me. Until you offer the ability to browse the source code online, maybe mirror the repository to some online source hosting site and disable issue tracking there?