Futures: Concurrency in Python 3, Part 1

02 Jun 2012

This series of articles is on writing concurrent software in Python 3 using the concurrent.futures module.

In this article I describe the problem that I'm trying to solve and talk about the factors which lead me to a threaded, concurrent design. If you're not interested in background and reasoning, skip down to the last section of this article.

For readers who may not be familiar with the idea of concurrency in general, I recommend the opening slides for Rob Pike's excellent Concurrency is not Parallelism for a pure and simple (and, indeed, almost storybook) introduction. After slide 25 the talk moves into the context of the Go language -- which is great, but outside the scope of this article.

For readers who may not be familiar with threads, give the Wikipedia article a read.

The Problem

I'm working on a messaging system named Roundabout. The intention is for it to be multi-language (I plan to use this as the backbone for a lot of projects), with Python as the initial/reference implementation.

Roundabout is a spoke-and-hub network whose nodes are processes communicating via TCP. This means non-trivial Roundabout applications exist as coördinated flocks of processes. Additionally, the specification defines keepalives and their attendant inactivity timeouts. These are all factors of policy (they say what is to be done), but taken together they strongly inform the software's implementation (how the policy is to be realized) in the following ways:

The implementation should have a concurrent design, and the implementation should be multithreaded.

Why a concurrent design?

Networking and user interfaces are classic situations where experience has shown that concurrent designs are the proper and correct choice. (Indeed, until rather recently, these were probably the only areas where most programmers would ever delve into concurrency.)

It is extremely undesirable for networking software to ever stumble or pause. Data may arrive at any time, and may need to be sent out at any time. Getting some data, handing it to the code responsible for that kind of data, and then doing nothing while waiting for that code to process the data and return a result (i.e. "blocking") is simply unacceptable. The routines for handling I/O, the routines for data processing, and the routines for system housekeeping must be concurrent -- if you block, the delays will propagate across the network.
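To make the hand-off concrete, here is a minimal sketch (not Roundabout code; the names and the fake "socket reads" are illustrative) of an I/O loop that passes incoming data to a worker thread through a queue, so the loop itself never blocks on processing:

```python
import queue
import threading

# Hypothetical sketch: the I/O loop hands data to a worker thread via a
# queue and immediately returns to servicing the network. Slowness in
# the processing code no longer stalls the receive path.

work = queue.Queue()
results = []

def worker():
    while True:
        data = work.get()
        if data is None:              # sentinel: shut down the worker
            break
        results.append(data.upper())  # stand-in for real processing

t = threading.Thread(target=worker)
t.start()

for data in ("msg-1", "msg-2", "msg-3"):  # stand-in for socket reads
    work.put(data)  # hand off, then go straight back to the I/O loop

work.put(None)
t.join()
```

The queue decouples the two routines: the producer never waits on the consumer, which is the whole point of the argument above.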

Roundabout is a toolkit for building applications from discrete processes which communicate over the network, enabling high quality via low coupling. It's also designed to do this with very little input or oversight from the programmer, which means it maintains a lot of state and does a lot of housekeeping. It's also designed to be fault-tolerant, and will always back up and retry in whatever way is necessary to avoid loss of data. I mention all this to show that the design already pays for a lot of bookkeeping.

Given all those decisions, the implementation must not throw away performance advantages which can be found elsewhere. Concurrency is an enormous performance advantage.

Why threads?

Threading is not popular in the Python world. It might be closer to the truth that since David Beazley gave his 2009 talk Inside the Python GIL, conventional wisdom has been to avoid threads in Python at all costs.

However, Roundabout apps are already multiprocess, so I don't want the individual nodes of those apps to be composed of even more processes. This means the multiprocessing module is not an option in this case (though it is a very nice tool, and is currently the go-to answer when someone asks about threading in Python).

Also, since Roundabout has hard requirements for maintaining communication with the network, it would be extremely poor design to let nodes block while performing work or waiting on local I/O. There must be more than one thing happening at once within each node. And since I have decided not to use processes to achieve this, I must use threads.

The (Hypothesized) Solution

Python 3.2 introduced a redesigned (for the first time since its implementation in v1.4) GIL, written by Antoine Pitrou. These changes seem to, in the vast majority of cases, make threading in Python a viable approach, with no exceptional slowdown.

Also introduced in Python 3.2 was concurrent.futures. This module is essentially a task manager which can operate using pools of either threads or processes. Its design is modeled on Java's java.util.concurrent package.
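A minimal sketch of what the module looks like in use (the worker function here is a placeholder, not anything from Roundabout): submit() hands a callable to the pool and returns a Future, and result() blocks only on that one task.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    # Placeholder task; real work would be I/O or message handling.
    return n * n

# The executor manages a pool of threads; submit() schedules a task
# and returns a Future representing its eventual result.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(square, n) for n in range(5)]
    # as_completed() yields futures as they finish, in any order.
    results = sorted(f.result() for f in as_completed(futures))

print(results)  # [0, 1, 4, 9, 16]
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor changes the pool from threads to processes without touching the rest of the code, which is part of the module's appeal.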

The remainder of this series will investigate using futures as an effective tool for concurrent programming in Python. As of this writing, I have written only a tiny testbed program, which does nothing but prove that futures works as advertised in a trivial case, and which is the basis for the next article. Each successive article will explore another step toward building the concurrent core of a messaging system.

Continue to part 2 of this series.