
In DuckDB you can do the same, but export to Parquet; that way the data is an order of magnitude smaller than with text-based SQL statements. It's faster to transfer and faster to load.

https://duckdb.org/docs/stable/sql/statements/export.html



you can do it with a command line like this:

   duckdb -c "ATTACH 'sqlite-database.db' AS db; COPY db.table_name TO 'table_name.parquet' (FORMAT parquet, COMPRESSION zstd)"


In my test database this is about 20% smaller than the gzipped text SQL statements.


That's not it. This only exports the table's data, not the database: you lose the indexes, comments, schemas, partitioning, etc. The whole point of OP's article is how to export the indices in an efficient way.

You'd want to do this:

     duckdb -c "ATTACH 'sqlite-database.db' (READ_ONLY); EXPORT DATABASE 'target_directory' (FORMAT parquet, COMPRESSION zstd)"
Also I wonder how big your test database is and what its schema looks like. For large tables, Parquet is far more efficient than a 20% reduction.

If there are UUIDs, they're 36 bytes each in text mode and 16 bytes as binary in Parquet. And then if they repeat, a dictionary in your Parquet stores each distinct 16-byte value only once.
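You can check those two sizes with Python's standard library (this just illustrates the text-vs-binary encoding of a UUID, not DuckDB or Parquet themselves):

```python
import uuid

u = uuid.uuid4()
print(len(str(u)))    # canonical text form, e.g. "550e8400-e29b-..." -> 36 characters
print(len(u.bytes))   # raw binary form -> 16 bytes
```

Dictionary encoding then pays off on top of that: each repeated 16-byte value is replaced by a small integer index into the dictionary page.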

It's also worth trying brotli instead of zstd if minimizing file size is your goal.
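A sketch of how you might compare the two, following the same CLI pattern as above (the database and table names are placeholders; DuckDB's Parquet writer accepts both zstd and brotli as COMPRESSION values):

```shell
# Write the same table twice with different codecs, then compare sizes.
duckdb -c "ATTACH 'sqlite-database.db' AS db; \
  COPY db.table_name TO 'table_zstd.parquet' (FORMAT parquet, COMPRESSION zstd); \
  COPY db.table_name TO 'table_brotli.parquet' (FORMAT parquet, COMPRESSION brotli)"
ls -l table_zstd.parquet table_brotli.parquet
```

Brotli tends to compress smaller at the cost of slower writes, so it suits archival exports more than hot paths.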



