Multiprocessing

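For long-running tasks, running work in separate processes can yield significant speedups. In standard CPython builds, the global interpreter lock (GIL) prevents threads from executing Python bytecode in parallel, but each process has its own interpreter, so separate processes can run at the same time.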
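To make the "separate processes" point concrete, here is a minimal sketch in which each call reports the ID of the process it runs in. The function names report_pid and worker_pids are made up for this illustration and are not part of the multiprocessing module.

```python
import multiprocessing
import os

def report_pid(_):
    # Runs inside a worker process and returns that process's ID.
    return os.getpid()

def worker_pids(task_count):
    # Collect the process ID seen by each of task_count calls.
    with multiprocessing.Pool() as pool:
        return pool.map(report_pid, range(task_count))

if __name__ == "__main__":
    print("parent: ", os.getpid())
    print("workers:", sorted(set(worker_pids(4))))
```

The worker IDs should all differ from the parent's ID, since the pool runs the function in child processes. How many distinct worker IDs appear depends on the pool size and on scheduling.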

With the map method of multiprocessing.Pool, we can run a function in several worker processes with minimal setup code. Each element of a list (like a string) is passed as the argument to one call of the function.
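As a small sketch of how arguments flow through map, the function below receives one list element per call and returns a value. The names shout and shout_all are hypothetical, chosen only for this example.

```python
import multiprocessing

def shout(name):
    # Runs in a worker process; receives one list element as its argument.
    return name.upper() + "!"

def shout_all(names):
    # pool.map blocks until every call finishes and returns
    # the results in the same order as the input list.
    with multiprocessing.Pool() as pool:
        return pool.map(shout, names)

if __name__ == "__main__":
    print(shout_all(["a", "b", "c"]))  # ['A!', 'B!', 'C!']
```

The calls may finish in any order, but map still returns the results in input order, so the list lines up with the arguments.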

Example

To begin, we import the multiprocessing module at the top, along with time, which provides the time.sleep and time.time functions.

Step 1 We create a list of strings. Each string is the argument for one call of the function we want to run on multiple processes.

Step 2 With time.time() we record the current time. This lets us measure the elapsed time later and confirm that the calls ran in parallel.

Step 3 We use a with-statement to create a multiprocessing.Pool, and then call map on the pool to specify the function we want to run in parallel.

Step 4 In the function, we print a message indicating the call has started, including the string argument.

Step 5 After sleeping for 1 second, we print a message indicating the call is done.

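Step 6 We print the elapsed time for all 3 calls. The total is about 1 second rather than 3, which means the calls ran in parallel. This requires the pool to have at least 3 worker processes. By default, the pool size is based on the machine's CPU count.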

import multiprocessing, time

def example(name):
    # Step 4: print the start message.
    print(name + " started")

    # Step 5: sleep for a while and print a done message.
    time.sleep(1)
    print(name + " done")

# The guard keeps child processes from re-running this block when they
# import the module (needed with the "spawn" start method).
if __name__ == "__main__":
    # Step 1: arguments to the function we will call on multiple processes.
    all_names = ["A", "B", "C"]

    # Step 2: start measuring time.
    start = time.time()

    # Step 3: call pool.map on all the list elements with multiprocessing.Pool.
    with multiprocessing.Pool() as pool:
        pool.map(example, all_names)

    # Step 6: print elapsed time.
    print(time.time() - start)

A started
B started
C started
B done
A done
C done
1.0340125560760498
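The elapsed time above depends on how many workers the pool has. The following sketch fixes the pool size with the processes argument and times the same kind of sleeping task with 1 worker and with 3. The names nap and timed_map are invented for this illustration, and time.perf_counter is used here in place of time.time because it is intended for measuring intervals.

```python
import multiprocessing
import time

def nap(seconds):
    # Stand-in for real work; returns its argument so map has a result.
    time.sleep(seconds)
    return seconds

def timed_map(worker_count, delays):
    # Time one pool.map call using a fixed number of worker processes.
    start = time.perf_counter()
    with multiprocessing.Pool(processes=worker_count) as pool:
        pool.map(nap, delays)
    return time.perf_counter() - start

if __name__ == "__main__":
    delays = [1, 1, 1]
    print("1 worker: ", round(timed_map(1, delays), 1))
    print("3 workers:", round(timed_map(3, delays), 1))
```

With a single worker the three calls run one after another, so the first figure should be close to 3 seconds. With three workers it should be close to 1 second plus the cost of starting the processes. Exact numbers vary by machine.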

With multiprocessing.Pool, we can use the map() method to call a function (with differing arguments) on multiple processes. This is an effective way to introduce parallelism in Python programs.