
By way of simple argument: would anyone advise using a source control system that only kept the last few hundred commits? Why do we treat our data differently?

All mainstream database products allow you to retain the entire history of every change ever made in the form of transaction logs. With those logs you can recover your database to any specific point in time, and unwind specific transactions.
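To make the point-in-time recovery idea concrete, here is a minimal sketch of replaying a log up to a chosen transaction. It is a toy model, not how any particular database product stores or applies its logs; `LogEntry`, `replay`, and the sample data are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One logged change (hypothetical shape, for illustration only)."""
    tx_id: int             # monotonically increasing transaction number
    key: str
    value: Optional[Any]   # None marks a delete


def replay(log: List[LogEntry], up_to_tx: int) -> Dict[str, Any]:
    """Rebuild state by applying every entry with tx_id <= up_to_tx, in order."""
    state: Dict[str, Any] = {}
    for entry in sorted(log, key=lambda e: e.tx_id):
        if entry.tx_id > up_to_tx:
            break
        if entry.value is None:
            state.pop(entry.key, None)  # a delete removes the key
        else:
            state[entry.key] = entry.value
    return state


log = [
    LogEntry(1, "alice", 100),
    LogEntry(2, "bob", 50),
    LogEntry(3, "alice", 70),
    LogEntry(4, "bob", None),
]

state_at_tx2 = replay(log, up_to_tx=2)  # state as of transaction 2
state_at_tx4 = replay(log, up_to_tx=4)  # latest state
```

Note that every recovery walks the log from the beginning, which is why real systems pair logs with periodic snapshots or backups.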

Of course, most discard transaction logs because the volumes tend to be huge in many mainstream, operational databases. I worked on one system where the main database barely pushed 10GB, but there were 100s of GBs of transaction logs generated daily.

However, the core reason that mainstream databases discard historical data is performance, not size -- if your current users overwhelmingly care only about the current state of the data, and you rarely need to go back into the past (which is what transaction logs and snapshots provide), paying an extremely high performance penalty to retain all of that historical data does not pay off. And there are no immutable products that offer similar performance to mainstream products under most usage scenarios; instead they force you into extremely confined uses.

It'd be nice to not suffer scornful surface criticisms from people who base their career around the status quo.

This is a garbage non sequitur, as an aside. It is a disgraceful attempt -- as so commonly happens in such discussion -- to try to attach agenda to an opinion that one doesn't like.



Well, I have worked in a few companies with big SQL databases. You say they only care about the current state, but in reality EVERY big database I have seen actually implements some kind of backup thingy. Mostly handcrafted stuff.

The difference between Datomic and just storing your transaction log is that Datomic gives you first-class access to your whole history; you can work with it as easily as with the current data. I have at least not seen that in any other database.

So for me at least it seems that a database that just stores everything and gives access to it is what most people should use for most problems. For me at least Datomic is the new standard and only if I have a special use case I would go to something else.


you can work with it as easily as with the current data. I have at least not seen that in any other database.

At the cost that working with your current data is as hard as working with the historical data. This does not come for free, and the only way to have such a versioned history is at a significant cost to generalized query performance -- sure certain things (like a single scalar lookup) can be fast, but generalized queries will be terrible. I'm sure there will be some map/reduce claimed solution.

For me at least Datomic is the new standard and only if I have a special use case I would go to something else.

This is an incredible and rather ridiculous statement.


"The only way to have such a versioned history is ..." is incorrect, and trivially so.

Datomic provides one existence proof of this: Datomic's history data is kept in distinct data structures, so it is "pay as you go" -- querying history is more expensive, because there is more stuff. Querying the present is cheaper.
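As a rough illustration of "pay as you go", here is a toy store that keeps the present and the history in separate structures. This is only a sketch of the general idea and makes no claim about Datomic's actual indexes or storage layout; `VersionedStore` and its methods are made-up names.

```python
from typing import Any, Dict, List, Optional, Tuple


class VersionedStore:
    """Toy key-value store with separate current and history structures."""

    def __init__(self) -> None:
        self.current: Dict[str, Any] = {}             # present state only
        self.history: List[Tuple[int, str, Any]] = []  # (tx, key, value), append-only
        self.tx = 0

    def put(self, key: str, value: Any) -> int:
        """Record a write in both structures and return its transaction number."""
        self.tx += 1
        self.current[key] = value
        self.history.append((self.tx, key, value))
        return self.tx

    def get(self, key: str) -> Optional[Any]:
        """Query the present: one dict lookup, never touches history."""
        return self.current.get(key)

    def get_as_of(self, key: str, tx: int) -> Optional[Any]:
        """Query the past: scans history, so cost grows with what was retained."""
        result = None
        for t, k, v in self.history:
            if t > tx:
                break  # history is in transaction order
            if k == key:
                result = v
        return result


store = VersionedStore()
store.put("balance", 100)   # tx 1
store.put("balance", 70)    # tx 2
now = store.get("balance")
then = store.get_as_of("balance", tx=1)
```

Reads of the present cost the same no matter how much history accumulates; only the as-of reads pay for it. A real system would index the history rather than scan it linearly.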

Datomic queries are datalog, and do not require writing map/reduce jobs.



