Datasette Lite: a server-side Python web application running in a browser

simonw · on May 4, 2022

You can try it out here: https://simonw.github.io/datasette-lite/

Or if you have a SQLite database hosted online somewhere with open CORS headers you can feed it the URL using ?url=... - like this:

https://simonw.github.io/datasette-lite/?url=https%3A%2F%2Fc...
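
A minimal sketch of building such a link, assuming only that the database URL goes into the `url` query parameter percent-encoded, as the `%3A%2F%2F` in the link above suggests. The `datasette_lite_link` helper and the `https://example.com/data.db` address are hypothetical, for illustration only:

```python
from urllib.parse import quote

BASE = "https://simonw.github.io/datasette-lite/"


def datasette_lite_link(db_url: str) -> str:
    """Return a Datasette Lite link that loads the SQLite file at db_url."""
    # safe="" makes quote() percent-encode ":" and "/" as well,
    # which yields the %3A%2F%2F form seen in the example link.
    return f"{BASE}?url={quote(db_url, safe='')}"


# Hypothetical database location; the host still needs open CORS headers.
link = datasette_lite_link("https://example.com/data.db")
print(link)
# → https://simonw.github.io/datasette-lite/?url=https%3A%2F%2Fexample.com%2Fdata.db
```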

tomthe · on May 4, 2022

Hi, I work in an academic setting and I've always found Datasette interesting for publishing datasets, but it is hard to host a Python web application on institutional servers (I would have to convince several people, who would then have to maintain it). Hosting static files is very easy - time to try Datasette!

simonw · on May 4, 2022

Awesome! I'm really keen to see Datasette used more in academia, I hadn't thought about how much easier it would be to deploy static files in that context.

riyadparvez · on May 4, 2022

Hi Simon, thank you for the great work. Do you have any plan to support DuckDB in future?

simonw · on May 4, 2022

I am seriously considering adding a plugin hook to support alternative database backends - PostgreSQL and DuckDB are the two I'm most interested in for that.

thadk · on May 4, 2022

How close are Python SQLite and Datasette Lite to accessing a hosted SQL database using HTTP range requests, as can be done in sql.js with https://github.com/phiresky/sql.js-httpvfs?

I put together a Pyodide-based web app where users need a few indexed queries from a 600 MB SQLite database, but it isn't very practical for them to download the whole thing into the browser. I would datasettify it and access it by API if I weren't relying on a Pyodide library that needs direct SQLite access.

https://observablehq.com/@thadk/life
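
For readers unfamiliar with the range-request idea, here is a rough sketch of the addressing arithmetic only: read the page size from the SQLite file header, then compute the `Range` header that covers a single page. It is not how sql.js-httpvfs is implemented, the function names are made up for this example, and the header bytes are fabricated rather than fetched from a real server:

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"


def read_page_size(header: bytes) -> int:
    """Parse the page size from the start of a SQLite database file."""
    if header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database header")
    # The page size is a 2-byte big-endian integer at offset 16.
    (raw,) = struct.unpack(">H", header[16:18])
    # The stored value 1 stands for a page size of 65536 bytes.
    return 65536 if raw == 1 else raw


def range_header_for_page(page_number: int, page_size: int) -> str:
    """Build an HTTP Range header value covering one 1-indexed page."""
    if page_number < 1:
        raise ValueError("SQLite pages are numbered from 1")
    start = (page_number - 1) * page_size
    end = start + page_size - 1  # HTTP byte ranges are inclusive
    return f"bytes={start}-{end}"


# Fabricated 100-byte header declaring 4096-byte pages (illustration only).
fake_header = SQLITE_MAGIC + struct.pack(">H", 4096) + bytes(82)
page_size = read_page_size(fake_header)
print(range_header_for_page(1, page_size))  # → bytes=0-4095
print(range_header_for_page(3, page_size))  # → bytes=8192-12287
```

A client would send that value as the `Range` header of a GET request, which only helps if the host serving the file honours range requests.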

simonw · on May 4, 2022

I have an open issue for that here: https://github.com/simonw/datasette-lite/issues/28

My initial hunch is that this will be really difficult - it would probably require a fork of something like https://github.com/coleifer/pysqlite3, compiled for WebAssembly.

I'm confident it's feasible, but I don't have the skills to figure it out myself.

db65edfc7996 · on May 4, 2022

This was also my immediate thought. Being able to statically deploy larger databases without a server process would be an absolute dream. Approaching dangerously high levels of dark magic.

eatonphil · on May 4, 2022

My biggest issue with Pyodide (which is of course an awesome project/build on its own) is the long wait times. I haven't figured out a way around a ~5 second load time where the entire UI hangs every single time you load the page.

My app (similar to Simon's, a lite mode of a data IDE): https://app.datastation.multiprocess.io.

My code: https://github.com/multiprocessio/datastation/blob/main/shar....

StreakyCobra · on May 4, 2022

If you use it in a web worker the UI does not hang. It requires a bit more setup though:

https://pyodide.org/en/stable/usage/webworker.html

Edit: typo

kencausey · on May 4, 2022

Related: https://news.ycombinator.com/item?id=31259027

simonw · on May 4, 2022

Running in a Web Worker can help a lot here - you can at least avoid locking the main browser UI thread and show your own loading indicator.

gavinray · on May 4, 2022

Damn, I've had an idea to build a very similar thing since last week. There's nothing new under the sun.

Really impressive app though, great job.

pwdisswordfish9 · on May 6, 2022

> I haven't figured out a way around a ~5 second load time where the entire UI hangs every single time you load the page

Stop using compile-to-browser junk and be a browser "native" instead.

andrewmcwatters · on May 4, 2022

The datasette tools are really great. I use them for pulling GitHub statistics, such as the distribution of GitHub followers, from their REST API into an SQLite database with sqlite-utils(1).[1]

Simon is in the 99th percentile (PR=0.998, n>25,000) of GitHub users by followers.

Thanks for such great tools, Simon!

[1]: https://github.com/andrewmcwattersandco/github-statistics

simonw · on May 4, 2022

Wow, I didn't know SQLite databases could compress that well:

github-users.tzst - 96MB

Uncompressed:

github-users.db - 992MB

robbintt · on May 5, 2022

It seems like either the SQLite file format is already optimized for reads or writes at the cost of size, or there is an opportunity there.
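
A quick way to see the effect on a small scale, as a sketch under loud assumptions: the table below is synthetic (not the GitHub dataset), and it uses `zlib` from the standard library as a stand-in for Zstandard, so the ratio will differ from the figures quoted above.

```python
import os
import sqlite3
import tempfile
import zlib


def raw_and_compressed_size(db_path: str) -> tuple[int, int]:
    """Return (file size, zlib-compressed size) for a database file."""
    with open(db_path, "rb") as f:
        raw = f.read()
    return len(raw), len(zlib.compress(raw, 9))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "users.db")
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT, followers INTEGER)"
    )
    # Synthetic, highly repetitive rows (illustration only).
    conn.executemany(
        "INSERT INTO users (login, followers) VALUES (?, ?)",
        ((f"user{i}", i % 50) for i in range(20000)),
    )
    conn.commit()
    conn.close()
    raw_size, packed_size = raw_and_compressed_size(path)

print(f"raw: {raw_size} bytes, compressed: {packed_size} bytes")
```

SQLite stores rows uncompressed in fixed-size pages, which is one plausible reason a general-purpose compressor finds so much redundancy.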

gavinray · on May 4, 2022

It really blows my mind that you could (mostly) just up and shove the Datasette code into a browser. What a time to be alive.

simonw · on May 4, 2022

Yeah, ditto!

tomatowurst · on May 4, 2022

Absolutely insane. It seems like it's installing pip packages and running localhost in the browser? I don't 100% grasp what is happening under the hood. Usually you would launch the local server, then browse to http://localhost to view it in your browser. Is this running the web server inside WebAssembly inside your browser? Is it then possible to expose that local web server inside WebAssembly to the internet? Crazy stuff! It looks like it's possible to speed up the initial load time by using web workers; I'm very eager to see that in action.

    Loading...
    distutils already loaded from default channel
    Loading micropip, pyparsing, packaging
    Loaded micropip, pyparsing, packaging
    Loading ssl, openssl
    Loaded ssl, openssl
    distutils already loaded from default channel
    pyparsing already loaded from default channel
    Loading setuptools
    Loaded setuptools

simonw · on May 4, 2022

That's pretty much what it's doing!

It runs everything in a web worker at the moment; I'm not sure if there are optimizations I could make by running more of them in parallel.

It doesn't exactly run a localhost server. It uses a mechanism from Datasette's internals that lets you simulate an HTTP request through the Datasette application and get back the result. Then it sends that result back as a message from the worker to the parent page.

It's using this API here for that: https://docs.datasette.io/en/stable/internals.html#datasette...
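
To illustrate the general idea of simulating a request without a server, here is a minimal sketch that calls an ASGI application directly (Datasette is an ASGI application). This is not Datasette's internal API: `app` is a hypothetical stand-in and `simulate_get` is a helper written for this example.

```python
import asyncio


async def app(scope, receive, send):
    """A tiny stand-in ASGI app (hypothetical; not Datasette)."""
    body = f"You asked for {scope['path']}".encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


async def simulate_get(asgi_app, path):
    """Call an ASGI app in-process and collect its status and body."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "scheme": "http",
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "headers": [],
    }
    result = {"status": None, "body": b""}

    async def receive():
        # A GET request carries no body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        # Capture what the app would otherwise write to a socket.
        if message["type"] == "http.response.start":
            result["status"] = message["status"]
        elif message["type"] == "http.response.body":
            result["body"] += message.get("body", b"")

    await asgi_app(scope, receive, send)
    return result["status"], result["body"].decode()


status, text = asyncio.run(simulate_get(app, "/fixtures"))
print(status, text)  # → 200 You asked for /fixtures
```

No port is opened at any point; the response is just a Python value, which is the kind of thing a worker can hand back to the parent page as a message.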

detaro · on May 4, 2022

Please re-read the site guidelines and don't do this: https://twitter.com/simonw/status/1521878251848683520

simonw · on May 4, 2022

For anyone wondering, my tweet here (which I deleted just before I saw this comment) was asking for help with votes on Hacker News.

tomatowurst · on May 4, 2022

I don't see anything wrong with this. We regularly see YC companies getting upvoted.

pstuart · on May 4, 2022

SQLite is so beloved that it's a natural for upvotes here ;-)

pseudosavant · on May 4, 2022

Really cool idea and implementation! It is really making me wonder whether this could be done in a service worker, so that you could expose what would look like a standard (REST?) API.

It seems like a client-side non-JS server backend would be incredibly useful. Just drop the service worker in and point it at your data source.

simonw · on May 4, 2022

I tried building this with a Service Worker first and it didn't work, because Pyodide needs XMLHttpRequest.

I opened an issue about that here: https://github.com/pyodide/pyodide/issues/2432

rmnclmnt · on May 4, 2022

Pyodide is unlocking a new era for the Python data ecosystem! So awesome Datasette is jumping on the train!

nineteen999 · on May 4, 2022

I wondered for a moment why it was named after the Commodore Datasette:

https://en.wikipedia.org/wiki/Commodore_Datasette

I hadn't heard of datasette.io before. Looks like an interesting product.

almost · on May 4, 2022

That's really cool

iamjbn · on May 5, 2022

I might be dumb, but why?

simonw · on May 5, 2022

There's a section in my blog entry answering that question.

Short version: I want people to be able to use my software if they don't have the ability to spin up a Python web server.

pwdisswordfish9 · on May 6, 2022

Easy solution: start off by not writing your software in something as hobbled as Python.

robbintt · on May 5, 2022

Only serving static assets and letting users bring their own compute is world-changing.