Python: Multi-threaded programming


Multi-threaded programming is an important part of any programming platform. It lets specific functions in the code execute without blocking the rest of the program, which is also known as asynchronous execution. Threads are also used for parallel, or near-parallel (when the threads share the same core/CPU), execution of functions in the code. Threads essentially solve two problems:

1. Make the application and user interface responsive by executing I/O tasks in a separate thread
2. Make the application and user interface responsive by executing CPU-intensive algorithms in a separate thread

In Python, multi-threading is effectively useful only for #1. Threads give little or no benefit for #2 because a mutex, the Global Interpreter Lock (GIL), protects all Python objects and lets only one thread execute Python bytecode at a time. The GIL is necessary because memory management in Python is not thread-safe, and so many features of Python now depend on the GIL that removing it is no easy task.
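The effect of the GIL on CPU-bound work can be observed with a small timing experiment. The sketch below is only an illustration: the function names (count_down, run_serial, run_threaded) and the loop sizes are made up for this example. On a standard, GIL-enabled CPython build the threaded run is typically no faster than the serial one, but the exact numbers depend on the machine and the Python version, so no expected output is shown.

```python
import threading
import time


def count_down(n):
    # Pure-Python busy loop: CPU-bound, so the running thread needs the GIL
    while n > 0:
        n -= 1


def run_serial(n, workers):
    # Run the same workload 'workers' times, one after the other
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, workers):
    # Run the same workload in 'workers' threads and wait for all of them
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, workers = 1_000_000, 4  # arbitrary sizes chosen for the demo
    print(f"serial:   {run_serial(n, workers):.2f}s")
    print(f"threaded: {run_threaded(n, workers):.2f}s")
```

Replacing count_down with a call such as time.sleep, which releases the GIL while waiting, should show the opposite picture: the threaded run finishes in roughly the time of a single wait.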

There is some good news. The GIL is specific to the CPython implementation of Python; other implementations such as Jython and IronPython do not use it, so their threads can serve both cases #1 and #2. The bad news is that CPython is the most common implementation of Python and the one we are using in this series of articles.

Another consequence of the GIL in CPython is that threads cannot run Python code on several cores at once. The operating system may schedule the threads on any core, but only the thread holding the GIL executes Python bytecode at a given moment. So if you have an Intel i7 with 4 cores, a CPU-bound multi-threaded program uses roughly one core's worth of processing power while the rest of the capacity sits twiddling its thumbs. Jython and IronPython can run threads on multiple cores. However, all is not lost for CPython users: they can run separate processes, which the operating system can place on different cores, using the multiprocessing module. Unlike threads, processes have their own memory space, so you will have to resort to Inter-Process Communication (IPC) to share data between them. We discuss Python multi-processing in another blog article.
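As a rough illustration of the process-based alternative, the sketch below spreads a CPU-heavy function over a pool of worker processes with multiprocessing.Pool. The function count_primes, the list of limits, and the pool size of 4 are assumptions made for this example only. The pool takes care of the IPC here: arguments and return values are pickled and passed between the parent and the worker processes.

```python
import multiprocessing


def count_primes(limit):
    """Count the primes below 'limit' by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def main():
    limits = [20_000, 30_000, 40_000, 50_000]  # arbitrary workloads
    # Each worker is a separate process with its own interpreter and its own GIL
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)
    for limit, result in zip(limits, results):
        print(f"primes below {limit}: {result}")


# The guard matters: on platforms that start workers by re-importing this
# module, it keeps the workers from creating pools of their own.
if __name__ == "__main__":
    main()
```

The worker function lives at module level so that it can be pickled and found by the worker processes.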

Coming back to multi-threading, support for it is provided by the threading module in Python 3.x. The module provides a base class, threading.Thread, which our own custom class can inherit from; each thread is then a new object of that custom class. The class must define a method named run (lower case), which contains the code to be executed in the thread (generally I/O code). The constructor of the custom class initializes the base class and sets internal variables.
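One detail that is easy to miss is the difference between calling run directly and calling start. The minimal sketch below uses a hypothetical Worker class (not part of this article's example) that records the name of the thread its run method executed in. Calling run is an ordinary method call in the current thread; calling start creates a new thread, which then invokes run.

```python
import threading


class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        self.ran_in = None

    def run(self):
        # Record the name of the thread that is executing this method
        self.ran_in = threading.current_thread().name


direct = Worker()
direct.run()       # plain method call: executes in the calling thread

spawned = Worker()
spawned.start()    # creates a new thread, which then calls run()
spawned.join()     # wait until the new thread has finished

print("run() executed in:  ", direct.ran_in)
print("start() executed in:", spawned.ran_in)
```

The two recorded names differ: the first is the name of the calling thread and the second is the name of the newly created thread.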

The following code applies the Python threading module. myThreadClass inherits from threading.Thread, and demoThreadingClass has two methods: one to create and run the threads and one to display the thread count. Note that createAndRunThreads returns almost immediately, because start() only launches each thread and does not wait for it to finish; the threads then run asynchronously. Since printThreadCount is called immediately after createAndRunThreads, it executes while most of the threads are still running. The time delay in the run method ensures that the threads take some time to finish, giving printThreadCount time to complete before all the threads do.

import time
import threading

# myThreadClass inherits from threading.Thread
class myThreadClass(threading.Thread):
    def __init__(self, name, delay):
        # Initialize the base class and set the thread's name
        super().__init__(name=name)
        # Number of seconds run() sleeps before printing
        self.delay = delay

    # This is the function that is run by the thread
    def run(self):
        time.sleep(self.delay)
        print("I am in thread: ", self.name)

# NOTE: Due to the Global Interpreter Lock (GIL), CPU-bound tasks are executed serially.
# For CPU-bound operations Python threading is not useful and can actually
# cause more delays. CPython threads do not execute Python code on several cores
# at once (unlike Tasks in .NET); only one thread runs Python bytecode at a time.
# Use the multiprocessing module for process- and core-level parallelism.
# Only I/O operations can overlap across threads.
class demoThreadingClass():
    def createAndRunThreads(self):
        for value in range(6):
            thread = myThreadClass("Thread::"+str(value), value/2)
            thread.start()

    def printThreadCount(self):
        print("------------- printThreadCount")
        # active_count() replaces activeCount(), which is deprecated since Python 3.10
        print("Total number of threads: ", threading.active_count())
        print("This message will print while threads are running")
        print("---------END printThreadCount")

dtc = demoThreadingClass()
# This call does not block: start() launches each thread and returns immediately
dtc.createAndRunThreads()
# The following line will be executed while threads are still running
dtc.printThreadCount()

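The following is the output of the above code. Note that the output of printThreadCount appears in between the output from the threads (which are spawned from createAndRunThreads). This is a tell-tale sign of asynchronous execution, which in this case is the result of running threads. The reported total of 6 includes the main thread: Thread::0 has a delay of zero and had already finished in this run, leaving five worker threads plus the main thread. The exact interleaving can vary between runs.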

I am in thread:  Thread::0
------------- printThreadCount
Total number of threads:  6
This message will print while threads are running
---------END printThreadCount
I am in thread:  Thread::1
I am in thread:  Thread::2
I am in thread:  Thread::3
I am in thread:  Thread::4
I am in thread:  Thread::5

Process finished with exit code 0
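The example above never waits for its threads explicitly; the program simply ends once the last one finishes. When the main thread needs the results, it can block on each thread with join(). The following is a minimal sketch of that idea, not a rewrite of the code above: DelayedWorker, run_all, the shortened delays, and the lock-protected results list are all assumptions made for this illustration.

```python
import threading
import time


class DelayedWorker(threading.Thread):
    def __init__(self, name, delay, results, lock):
        super().__init__(name=name)
        self.delay = delay
        self.results = results
        self.lock = lock

    def run(self):
        time.sleep(self.delay)  # stand-in for an I/O wait
        # Guard the shared list so that only one thread appends at a time
        with self.lock:
            self.results.append(self.name)


def run_all(count):
    results = []
    lock = threading.Lock()
    workers = [DelayedWorker(f"Thread::{i}", i / 20, results, lock)
               for i in range(count)]
    for worker in workers:
        worker.start()
    # join() blocks until the given thread has finished
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_all(6))
```

After the join loop every worker has finished, so the returned list contains one entry per thread. The order of the entries follows completion order, which usually tracks the delays but is not guaranteed.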

