Python parallel processing
I came across this function called parallel in fastai's fastcore library, and it seems very interesting.
from fastcore.all import parallel
from nbdev.showdoc import doc
doc(parallel)
As the documentation states, the parallel function can run any Python function f over items using multiple workers and collect the results.
Let's try a simple example:
import math
import time
def f(x):
    time.sleep(1)
    return x * 2
numbers = list(range(10))
%%time
list(map(f, numbers))
print()
%%time
list(parallel(f, numbers))
print()
The function f in this example is very simple: it sleeps for one second and then returns x*2. When executed serially, it takes 10 seconds, which is exactly what we expect. With multiple workers (8 by default), it takes only about 2 seconds.
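The two timings can be checked with a back-of-envelope estimate. This is only a rough sketch: ideal_runtime is a hypothetical helper, it assumes the 8 workers mentioned above, and it ignores process start-up and pickling overhead.

```python
import math

def ideal_runtime(n_tasks, n_workers, seconds_per_task):
    # Each worker handles one task at a time, so the tasks finish in
    # ceil(n_tasks / n_workers) rounds. Overhead is ignored.
    rounds = math.ceil(n_tasks / n_workers)
    return rounds * seconds_per_task

print(ideal_runtime(10, 1, 1))  # serial: 10
print(ideal_runtime(10, 8, 1))  # 8 workers: 2
```

Ten tasks do not fit into a single round of 8 workers, so a second round is needed for the remaining two, which is why the parallel run takes about 2 seconds rather than 1.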
Let's see how parallel is implemented:
parallel??
??ProcessPoolExecutor
As we can see in the source code, under the hood this uses the concurrent.futures.ProcessPoolExecutor class from the Python standard library.
Note that this class is fundamentally different from Python threads, which are subject to the Global Interpreter Lock.
The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.
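To make this concrete, here is a minimal sketch of the same idea using ProcessPoolExecutor directly. It is not fastcore's actual implementation; slow_double and run_in_processes are hypothetical names chosen for illustration, and the worker count of 4 is an arbitrary assumption.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def slow_double(x):
    # Stand-in for an expensive task. It is defined at module level
    # so that it can be pickled and sent to the worker processes.
    time.sleep(0.1)
    return x * 2

def run_in_processes(func, items, n_workers=4):
    # Executor.map returns results in the same order as the inputs.
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        return list(executor.map(func, items))

if __name__ == "__main__":
    print(run_in_processes(slow_double, range(10)))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module. The picklability requirement is also visible here: passing a lambda instead of slow_double would fail, because lambdas cannot be pickled by the standard pickle module.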
This function can be quite useful when you have long-running tasks and want to take advantage of multi-core CPUs to speed up your processing. For example, if you want to download a lot of images from the internet, you can use it to parallelize your download jobs.
If your function f is very fast, the results can be surprising. Here is an example:
import math
import time
def f(x):
    return x * 2
numbers = list(range(10000))
%%time
list(map(f, numbers))
print()
%%time
list(parallel(f, numbers))
print()
In the above example, f is very fast, and the overhead of creating a lot of tasks outweighs the advantage of multiprocessing. So use this with caution, and always profile.
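Outside a notebook, where %%time is not available, a comparison like this can be timed with time.perf_counter. The sketch below uses the standard library's ProcessPoolExecutor rather than fastcore; double, parallel_map, and timed are hypothetical helpers, and the chunksize value of 500 is an arbitrary assumption. The chunksize argument of Executor.map sends items to the workers in batches, which is one way to reduce per-task overhead. Actual timings depend on the machine, so none are shown.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def double(x):
    return x * 2

def parallel_map(func, items, n_workers=4, chunksize=1):
    # With chunksize > 1, items are shipped to workers in batches, so
    # the pickling and scheduling cost is paid once per batch.
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        return list(executor.map(func, items, chunksize=chunksize))

def timed(label, fn):
    # Run fn once and report the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

if __name__ == "__main__":
    numbers = list(range(10000))
    timed("serial map", lambda: list(map(double, numbers)))
    timed("processes, chunksize=1", lambda: parallel_map(double, numbers))
    timed("processes, chunksize=500",
          lambda: parallel_map(double, numbers, chunksize=500))
```

For a function as cheap as this one, the serial version may still come out ahead even with batching, which is the reason to measure before parallelizing.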