
Lately I've seen posts about DuckDB, which looks really cool, but SQLite can be compiled to WASM so it can be used in some kind of container. How do the two compare?


DuckDB is an OLAP (analytical) query engine; SQLite is an OLTP (transactional) database. Modern OLAP engines store and represent data in columnar formats, which makes them very fast at queries that touch many rows (particularly if only a few columns are needed) -- queries like "sum all sales for the past month by store."
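To make that concrete, here's a minimal sketch of the kind of query meant above, using Python's built-in sqlite3 module and a hypothetical `sales` table (table name and columns are invented for illustration):

```python
import sqlite3

# In-memory SQLite database with a made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "2024-05-01", 10.0), ("A", "2024-05-02", 5.0), ("B", "2024-05-01", 7.5)],
)

# The OLAP-style query from above: sum sales by store.
# A row store has to read every column of every row to answer this;
# a column store only touches the `store` and `amount` columns.
rows = conn.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
print(rows)  # [('A', 15.0), ('B', 7.5)]
```

SQLite answers this fine at small scale; the columnar advantage only shows up once the table has many columns and millions of rows.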

But they're slow at (or incapable of) inserts, updates, and deletes, because the columnar formats are typically immutable. They're also relatively slow at operations that need to look at all of the data for a particular row.

OLTP databases are much better for use cases where you're frequently inserting, updating, and accessing individual rows, such as the database backing a web application.

A common pattern is to use an OLTP database (like postgres) to back your application, then replicate the data to an OLAP store like Clickhouse or a data lake to run analytical queries that would overwhelm postgres.


DuckDB crushes SQLite on heavy analytical workloads: 915x faster according to ClickBench. (Link below since it's looong.)

DuckDB also has a WASM target: https://duckdb.org/docs/stable/clients/wasm/overview.html

I don't know enough about DuckDB to understand the tradeoffs it made compared to SQLite to achieve this performance.

https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQi...


As I understand it, DuckDB stores columns separately (column-major), whereas SQLite stores rows separately (row-major). DuckDB is like a struct of arrays and SQLite is like an array of structs.

So which is faster depends on your access pattern. There are dumb stupid terrible names for "access all of one row" (OLTP) and "access all of one column" (OLAP) type access patterns.
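A quick sketch of the two layouts in plain Python (the data is invented), showing why each access pattern favors one layout:

```python
# Array-of-structs (row-major, SQLite-like) vs struct-of-arrays
# (column-major, DuckDB-like) for the same two-row table.
aos = [
    {"store": "A", "day": "2024-05-01", "amount": 10.0},
    {"store": "B", "day": "2024-05-01", "amount": 7.5},
]
soa = {
    "store": ["A", "B"],
    "day": ["2024-05-01", "2024-05-01"],
    "amount": [10.0, 7.5],
}

# OLAP access pattern: aggregate one column across all rows.
# Row-major must touch every record; column-major scans one contiguous list.
total_aos = sum(r["amount"] for r in aos)
total_soa = sum(soa["amount"])

# OLTP access pattern: fetch one whole row.
row_aos = aos[0]                             # one lookup
row_soa = {k: v[0] for k, v in soa.items()}  # one lookup per column
```

At real scale the difference is cache lines and disk pages, not dict lookups, but the shape of the trade-off is the same.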


"By 915x" doesn't seem remotely plausible.

Maybe there's some edge case they've found where SQLite is badly optimized and DuckDB is totally optimized, but that's absolutely not the general case.

Databases are primarily limited by disk/IO speed. Yes there are plenty of optimizations but they result in 10% improvements, not 915x.


DuckDB is an in-memory columnar OLAP database. It is going to be much faster at analytical queries than a disk-based OLTP database. It is optimized for fast queries, not for heavy writes or very large data.


Oh, got it, thanks. So it's a totally different product, not an alternative. Yes, that kind of speedup can be explained by using memory instead of disk -- like I said, it's disk/IO speed. Thanks!


Not necessarily. If your table is very wide but you're only reading one column, you'll do massively less I/O with a columnar or hybrid structure. And that's even before other tricks like storing the min/max values of each column in the pages (so you can skip pages for range queries) or SIMD.


Why still use SQLite then?

But how does WASM DuckDB store files in IndexedDB? Any info on that?


I believe the locking models are different, making DuckDB less suitable for concurrent reads/writes, but you will have to look up the specifics. As always, for a server environment SQLite should be set to WAL mode, and comparisons should be made against that rather than the much older, less concurrent default.

As I recall, DuckDB's concurrency model did not sound viable for a web server, but I may be behind the times or outright wrong.
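For reference, switching SQLite to WAL mode is a one-line pragma; it requires a file-backed database (it's ignored for `:memory:`):

```python
import os
import sqlite3
import tempfile

# WAL mode only applies to file-backed databases.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# Switch from the default rollback journal to write-ahead logging,
# which lets readers proceed concurrently with a single writer.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal"
```

The pragma returns the journal mode actually in effect, so it's worth checking the result rather than assuming it took.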



