CPython and the Global Interpreter Lock
I've been using Python extensively over the past two months. I've also been playing with the Spotify library via the PySpotify project.
In this post I'd like to share insight into a fix for the PySpotify library that was necessary to prevent a deadlock issue. The credit here goes to Steve Laverty, who helped break this down for me and make sense of CPython's Global Interpreter Lock.
First, let's lay the groundwork for the setup we're dealing with. PySpotify is a Python wrapper around libspotify, Spotify's C API package. If you're interested in integrating with Spotify using Python, you can grab the source from GitHub and run the standard python setup.py install to build the .egg (a zip archive containing the compiled source, similar in spirit to a .jar) and have it installed on your PYTHONPATH, so you can import spotify and get to coding.
The PySpotify project is essentially a C module with bindings from PySpotify to Spotify's C library. I won't go into how the bindings are created, but suffice it to say that there is C code that defines all of the libspotify objects (e.g. Track, Album, Playlist, Artist, Image, User, Session and so on) on top of the Spotify library interface, and the PySpotify Python library loads that compiled C code as an extension module. If you're coming from a Java background like me, I liken this to JNI (Java<-->C) in that a bridge is created between the two languages; in this case Python<-->C.
The issue cropped up when fetching album art images from Spotify. In order to provide a reasonable user experience when displaying multiple results, one strategy is to prefetch album art in batches. We pursued this and did retrieve the album art successfully, but sporadically our program would hang in a deadlock. The culprit was session.image_create in session.py, which delegates to Session_image_create in session.c and then to sp_image_create in libspotify, which is probably performing some blocking IO operation (network request, file IO, whatever). So, how do we debug this? What is the fix?
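To make the prefetching idea concrete, here is a minimal sketch of fetching in batches. The names batches, prefetch_album_art and fake_fetch are made up for illustration and are not part of PySpotify; fetch_image stands in for a call like session.image_create, and the sketch runs sequentially because I'm not trying to reproduce how our application actually scheduled its requests.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def prefetch_album_art(fetch_image, image_ids, batch_size=5):
    """Fetch images one batch at a time and return them keyed by image id.

    `fetch_image` is a hypothetical stand-in for session.image_create.
    """
    cache = {}
    for batch in batches(image_ids, batch_size):
        for image_id in batch:
            if image_id not in cache:
                # If this call never returns, the whole prefetch hangs here.
                cache[image_id] = fetch_image(image_id)
    return cache


def fake_fetch(image_id):
    # Placeholder result; a real call would return an image object.
    return "image-for-" + image_id


art = prefetch_album_art(fake_fetch, ["a", "b", "c"], batch_size=2)
print(sorted(art))
```

The point of the sketch is only that every fetch is a blocking call, so one call that never returns stalls everything queued behind it.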
With an understanding of the setup, it's time to take a detour through threaded programming in Python, the CPython interpreter and the Global Interpreter Lock (GIL).
Python is an interpreted language. In CPython, the bytecode interpreter that is the most widely used implementation of the language, threads within one process do not execute Python bytecode in parallel. CPython does support multiple threads, but only under a construct known as the Global Interpreter Lock. The GIL is a mutual exclusion lock that a thread must hold in order to run Python code, so only one thread is running in the interpreter at any given moment. The lock was designed to make operations like code execution and garbage collection thread safe, with the added benefit of better single-threaded performance. The GIL is specifically an implementation artifact of the CPython interpreter, and it's worth noting that other implementations like IronPython or Jython do not have this construct and can run threads in parallel. Moreover, the concept of a GIL also exists in other interpreted languages, for example in Ruby's reference implementation.
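A small experiment helps show what the GIL does and does not prevent. The sketch below is my own illustration (the function names are invented): each thread blocks in time.sleep, which releases the GIL while it waits, so the waits overlap even though only one thread can run Python code at a time.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep releases the GIL while waiting, much like blocking IO does.
    time.sleep(seconds)


def run_threads(count, seconds):
    """Run `count` threads that each block for `seconds`; return wall time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


elapsed = run_threads(4, 0.2)
# The total is close to a single wait, not four waits back to back.
print("4 threads blocking 0.2s each took %.2fs" % elapsed)
```

If a blocking call held on to the GIL instead, the other threads could not make progress until it returned.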
Understanding the role of the GIL in this deadlock is critical. The C code that delegates to libspotify (which in turn performs some blocking IO operation, and isn't a thread safe library) was holding the GIL for the entire duration of the call, so any other thread that needed the GIL while that call was in progress could not proceed, and that is what caused the deadlock. The fix for fetching the album art is to release the GIL before calling into libspotify and re-acquire it once libspotify returns. In a C extension, there are two helpful macros that accomplish this: Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. In the PySpotify fix, we simply wrap the blocking call with these macros.
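I don't know the exact lock ordering inside libspotify, so treat the following as a toy model of one way this kind of hang can arise rather than a description of the real library. Two plain threading.Lock objects stand in for the GIL and for a hypothetical lock inside the C library, and timeouts are used so the "deadlock" shows up as a failed call instead of a hung script.

```python
import threading

gil = threading.Lock()           # stand-in for the interpreter lock
library_lock = threading.Lock()  # hypothetical lock inside the C library


def library_callback(holding_library_lock):
    # Library thread: takes its own lock, then needs the "GIL" to call Python.
    with library_lock:
        holding_library_lock.set()
        if gil.acquire(timeout=1.0):
            gil.release()


def image_create(release_gil):
    """Return True if the blocking library call completed, False if it timed out."""
    gil.acquire()  # an extension function is entered with the GIL held
    holding = threading.Event()
    worker = threading.Thread(target=library_callback, args=(holding,))
    worker.start()
    holding.wait()  # the library thread now owns library_lock
    if release_gil:
        gil.release()  # plays the role of Py_BEGIN_ALLOW_THREADS
    # The "blocking call": it needs the library lock to finish.
    completed = library_lock.acquire(timeout=0.3)
    if completed:
        library_lock.release()
    if release_gil:
        gil.acquire()  # plays the role of Py_END_ALLOW_THREADS
    gil.release()
    worker.join()
    return completed


print(image_create(release_gil=False))  # False: each side waits on the other
print(image_create(release_gil=True))   # True: the callback gets the GIL and finishes
```

With the GIL held across the call, each thread is waiting for a lock the other one owns; letting go of the GIL first breaks the cycle.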
The following addition in session.c took care of the issues:
static PyObject *
Session_image_create(Session *self, PyObject *args)
{
    byte *image_id;
    size_t len;
    sp_image *image;

    /* Expect a single string argument holding the raw image id. */
    if (!PyArg_ParseTuple(args, "s#", &image_id, &len))
        return NULL;
    if (len != 20) {
        PyErr_SetString(SpotifyError, "Image id length != 20");
        return NULL;
    }

    /* Release the GIL around the blocking libspotify call, then re-acquire it. */
    Py_BEGIN_ALLOW_THREADS;
    image = sp_image_create(self->_session, image_id);
    Py_END_ALLOW_THREADS;

    return Image_FromSpotify(image);
}
If you're interested in an in-depth understanding of the GIL and the performance of concurrent programming in CPython, be sure to check out this article from Jesse Noller. Hopefully you found this useful.











