> when doing IO calls in Python so GIL is usually released so the kernel can already schedule another thread while waiting for IO
This is true, but scheduling another thread through the kernel can have higher overhead since it requires context switches. Running multiple threads also has other potential issues with lock contention; how problematic they are will depend on the use case.
The potential advantage of scheduling another thread is, of course, that it can do CPU-bound work; but in Python, unfortunately, doing that means the GIL doesn't get released, so that thread will prevent any further network I/O while it's running, the same as would happen in an async framework if a worker did a lot of CPU work. So Python doesn't really let you realize the advantages of threads in this context.
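A minimal sketch of that limitation (assuming CPython; the loop size and thread count are arbitrary): two pure-Python CPU-bound threads take roughly as long as running the same work twice sequentially, because the GIL serializes them.

```python
import threading
import time

def cpu_bound(n):
    # Pure-Python arithmetic holds the GIL, so two of these threads
    # largely serialize instead of running on two cores.
    total = 0
    for i in range(n):
        total += i * i
    return total

for workers in (1, 2):
    start = time.perf_counter()
    threads = [threading.Thread(target=cpu_bound, args=(10_000_000,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"{workers} thread(s): {time.perf_counter() - start:.2f}s")
```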
> doing that means the GIL doesn't get released, so that thread will prevent any further network I/O while it's running, the same as would happen in an async framework if a worker did a lot of CPU work. So Python doesn't really let you realize the advantages of threads in this context.
> Computing intensive, no. Code that is doing a CPU intensive computation but makes no system calls will never release the GIL.
Any code that does not involve Python objects can release the GIL, whether or not it makes system calls.
For example, NumPy, the most popular scientific computation package in Python, on which many other popular packages like Pandas are built, releases the GIL when doing operations on arrays. This is documented at https://numpy.org/doc/stable/reference/internals.code-explan...:
> If NPY_ALLOW_THREADS is defined during compilation, then as long as no object arrays are involved, the Python Global Interpreter Lock (GIL) is released prior to calling the loops. It is re-acquired if necessary to handle error conditions.
And that does not involve running Python bytecode. Yes, NumPy and other packages that provide C extensions do this when they are doing computations that don't require running Python bytecode.
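A small sketch of what that buys you (array size and iteration count are arbitrary, and the exact speedup depends on the machine and the NumPy build): an elementwise ufunc on a non-object float array runs its inner loop in C with the GIL released, so two threads can actually overlap on separate cores, unlike the pure-Python case above.

```python
import threading
import time
import numpy as np

x = np.random.rand(20_000_000)

def work():
    # np.sqrt on a float (non-object) array: the inner loop runs in C
    # with the GIL released, so threads can run it concurrently.
    for _ in range(10):
        np.sqrt(x)

for workers in (1, 2):
    start = time.perf_counter()
    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"{workers} thread(s): {time.perf_counter() - start:.2f}s")
```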
There is an advantage to threads in the CPU-bound case, which is that the work of other threads will not be blocked by a CPU-intensive operation. With an IO-event-based scheduler, your CPU-bound task will not context switch, causing network logic elsewhere to simply time out. A particularly acute example is something like a network library logging into a MySQL database, which gives the client a ten-second window to respond to the initial security challenge. It was both an extremely difficult bug for me to diagnose and helpful for my role at work that I was able to track that one down in OpenStack :). A sketch of the failure mode and the usual workaround follows below.
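This is a minimal illustration, not the actual OpenStack code (the names are made up, and cpu_bound stands in for any pure-Python heavy computation): called directly inside a coroutine, the busy work never yields to the event loop, so the heartbeat task, standing in for protocol traffic with a deadline, stalls completely. Handing the work to a thread via run_in_executor keeps the loop ticking, because the OS preempts the thread and CPython drops the GIL at its switch interval, so other tasks run slower but are not starved.

```python
import asyncio
import time

def cpu_bound(seconds):
    # Pure-Python busy work: run directly in a coroutine, it never
    # yields to the event loop, so every other task stalls.
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass

async def heartbeat():
    # Stands in for protocol traffic with a deadline, like the
    # ten-second MySQL handshake window described above.
    while True:
        print("tick", time.monotonic())
        await asyncio.sleep(1)

async def main():
    hb = asyncio.create_task(heartbeat())
    # Calling cpu_bound(3) here would freeze heartbeat() for 3 seconds.
    # Running it in a worker thread keeps the loop responsive: the OS
    # preempts the thread, so the heartbeats slow down but don't stop.
    await asyncio.get_running_loop().run_in_executor(None, cpu_bound, 3)
    hb.cancel()

asyncio.run(main())
```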