Python C-extensions with Cython

Cython is a package for developing C extensions that can be easily used in Python. The primary benefit is that code compiled with Cython can (optionally) be written in pure Python, just as it would be in a Python script. Significant improvements in speed can be achieved by modifying the code to behave more C-like (while still written in Python syntax) and by using specific C type definitions for certain variables. Some quickstart guides can be found here: basic or numpy-related.

Cython can be installed with “pip”:

csh
sudo pip install cython

Note that pip is run from within “csh” above because my STSCI_PYTHON installation has its environment variables set in the c-shell, referring to the python2.5 version installed in /usr/stsci.

The typical workflow seems to be as follows:

  1. Make a pyx file (e.g., mytest.pyx) that may or may not contain cython-specific code
  2. Create the c file translated from python: cython -a mytest.pyx
  3. Compile the c code, e.g., gcc -o mytest.o mytest.c
  4. Import the compiled module within python to use it: import mytest

Running Step #3 as above will generate something like the following error:

gcc -o mytest.o mytest.c
mytest.c:4:20: error: Python.h: No such file or directory
mytest.c:6:6: error: #error Python headers needed to compile C extensions, please install development version of Python.

This is because the compiler will likely require specific CFLAGS and linked libraries, especially if you want to link against other routines such as those in numpy. Assembling these by hand can be painful, but the Python-provided “distutils” package generates all of the necessary flags automatically, as helpfully described here by “Rob Wolfe”. To use it, put the following Python code in a file like “mytest_setup.py”:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext as build_pyx

setup(
    name='mytest',
    ext_modules=[Extension('mytest', ['mytest.pyx'])],
    cmdclass={'build_ext': build_pyx},
)
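To see part of what distutils adds on your behalf, the sketch below prints the include directory that holds Python.h as the -I flag a hand-written gcc command would need. It assumes a Python recent enough to ship the standard-library sysconfig module (2.7 or later, so newer than the 2.5 installation described here); the helper name python_include_flags is only illustrative.

```python
import sysconfig

def python_include_flags():
    """Return the -I flags for the directories that hold Python.h."""
    paths = sysconfig.get_paths()
    dirs = []
    # 'include' is the platform-independent header directory and
    # 'platinclude' the platform-specific one; they are often identical.
    for key in ('include', 'platinclude'):
        d = paths.get(key)
        if d and d not in dirs:
            dirs.append(d)
    return ['-I' + d for d in dirs]

print(' '.join(python_include_flags()))
```

The setup script above takes care of this flag, and of the matching linker options, without any manual bookkeeping.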

To compile, you would then run

csh
setenv CC gcc-4.0
python mytest_setup.py build_ext -i

Without the “setenv CC gcc-4.0” line above, I get the typical cc1: error: unrecognized command line option "-Wno-long-double" error mentioned in some of the other posts below. This is likely a result of my (outdated) STSCI_PYTHON distribution.

The first time I tried to include numpy in a pyx file, I got an error saying that arrayobject.h could not be found. This happened because the numpy include files do not live in the include paths provided by distutils, which I solved with:

cd /usr/stsci/pyssg/Python-2.5.4/include/
sudo ln -s /usr/stsci//pyssg/2.5.4/numpy/core/include/numpy .
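An alternative that avoids touching the installation, assuming a numpy version that provides numpy.get_include(), is to hand the header directory to distutils through the include_dirs argument of Extension. The helper below (numpy_include_dirs is a made-up name) only looks the directory up; how it would be passed on is shown in the trailing comment.

```python
def numpy_include_dirs():
    """Return a list with numpy's header directory, or [] if numpy is absent."""
    try:
        import numpy
    except ImportError:
        return []
    # numpy.get_include() returns the directory containing numpy/arrayobject.h
    return [numpy.get_include()]

print(numpy_include_dirs())

# In mytest_setup.py this would be used as:
#   Extension('mytest', ['mytest.pyx'], include_dirs=numpy_include_dirs())
```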

With the above setup, you should be able to compile and import both simple and numpy-specific Cython source code. A detailed description of some techniques for increasing the speed of Cython code is provided at the link at the top of the page. The basic lesson is that you can improve speed by providing C-like data types for variables used within a function and by making your loops as C-like as possible, without Python-specific (or numpy-specific) calls inside the loops.

An example is provided below, where we generate a large 2D array filled with random integers from 0 to N-1 and then simply count the number of times each integer appears in the array. While this example is a bit silly, looping through 2D arrays occurs frequently when processing astronomical images, and very simple Cython code can be much faster at doing this than even clever boolean operations in Python, which are already reasonably fast.

To start, set up your matrix filled with random numbers:

import numpy as np
NX, NY = 1000, 1000
N = 10
rand = np.cast[np.int32](np.random.random(size=(NY, NX))*N)

Since the random distribution is uniform over the integers 0 to 9, there should be approximately (NX*NY)/N instances of each number in the full array. To check this, here is simple Python code that loops through the test integers and sums the boolean array (rand == i), which is interpreted as zero where false and one where true:

def sum_boolean(rand, N):
    sums = np.arange(N)
    for i in range(N):
        sums[i] = np.sum(rand == i)
    return sums

print sum_boolean(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]
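As a rough sanity check on how close to (NX*NY)/N the counts should be, each count is approximately binomial, so its expected scatter follows from the standard formula sqrt(n*p*(1-p)) with p = 1/N. The snippet below evaluates this for the values used here and compares it with the counts printed above; it needs only the standard library.

```python
import math

NX, NY, N = 1000, 1000, 10
n = NX * NY                      # number of array elements
expected = n / float(N)          # mean count per integer
# binomial standard deviation sqrt(n*p*(1-p)) with p = 1/N
sigma = math.sqrt(n * (N - 1) / float(N * N))

# counts copied from the sum_boolean output above
counts = [100420, 99438, 100235, 100205, 100245,
          99812, 100321, 99729, 99897, 99698]
worst = max(abs(c - expected) for c in counts)

print('expected %.0f +/- %.0f, largest deviation %.0f' % (expected, sigma, worst))
```

The largest deviation (562) is less than two standard deviations (300 each), which is what you would expect from a uniform generator.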

You could make the code more “c-like” by actually looping through the indices of the 2D-matrix. This is normally a very bad idea in Python (or IDL for that matter):

def sum_clike(rand, N):
    sums = np.zeros(N, dtype=np.int32)
    NY, NX = np.shape(rand)
    for i in range(NX):
        for j in range(NY):
            ival = rand[j,i]
            sums[ival] += 1
    return sums

print sum_clike(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]

You could put the “sum_clike” function directly in a Cython pyx file and compile it as above, but you’ll notice that it takes about the same time to run as the version run directly in Python. To really see the speedup, you need to use the C type definitions with “cdef” within the Cython file (mytest.pyx):

import numpy as np
cimport numpy as np
INT32 = np.int32
ctypedef np.int32_t INT32_t

def sum_cython(np.ndarray[INT32_t, ndim=2] rand, unsigned int N):
    cdef np.ndarray[INT32_t, ndim=1] sums 
    cdef unsigned int NX, NY, i, j, ival
    
    sums = np.zeros(N, dtype=INT32)
    NY, NX = np.shape(rand)
    
    for i in range(NX):
        for j in range(NY):
            ival = rand[j,i]
            sums[ival] += 1
    
    return sums

Note that, other than the type declarations at the top of the file and the “cdef” lines within the function itself, the code syntax is identical to that of the pure-Python “sum_clike” function. Compile the pyx file with the “mytest_setup.py” script as above and then run it with:

import mytest
print mytest.sum_cython(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]

Finally, put a timer around the function calls to compare them:

import numpy as np
import time

import mytest

N = 10
NX, NY = 1000, 1000
rand = np.cast[np.int32](np.random.random(size=(NY, NX))*N)

t0 = time.time()
s_bool = sum_boolean(rand, N)
t1 = time.time()

s_clike = sum_clike(rand, N)
t2 = time.time()

s_cython = mytest.sum_cython(rand, N)
t3 = time.time()

print 'Bool  : %.3f \nC-like: %.3f  \nCython: %.3f\n' %(t1-t0, t2-t1, t3-t2)

===============

Bool  : 0.157 
C-like: 2.578  
Cython: 0.021

The Boolean operation is much faster than the Python loop, but the cythonized loop is nearly 8 times faster still (and more than 100 times faster than the identical code executed in pure Python). That is, the loop executes nearly as fast as it would in C, with the great benefit that the input and output data are passed and retrieved trivially (N-dimensional arrays in particular require significantly more code overhead in standalone C).
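Single time.time() measurements like those above can be noisy. A slightly more robust sketch, using the standard-library timeit module, repeats each call a few times and keeps the fastest run. The helper name best_time and the stand-in workload count_values are illustrative only; in practice you would pass sum_boolean, sum_clike, or mytest.sum_cython together with (rand, N).

```python
import timeit

def best_time(func, args, repeat=3):
    """Call func(*args) `repeat` times; return (fastest seconds, last result)."""
    best, result = None, None
    for _ in range(repeat):
        t0 = timeit.default_timer()
        result = func(*args)
        dt = timeit.default_timer() - t0
        if best is None or dt < best:
            best = dt
    return best, result

# Stand-in workload: count how often each of 0..N-1 appears in a flat list.
def count_values(values, N):
    sums = [0] * N
    for v in values:
        sums[v] += 1
    return sums

seconds, sums = best_time(count_values, ([0, 1, 1, 2] * 1000, 3), repeat=5)
print('%.6f s, counts %s' % (seconds, sums))
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to be running during one of the calls.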

There is a handy cython option that can help you to analyze the “c-ness” of your cython code by specifying the “-a” option at the command line when converting the pyx file:

cython -a mytest.pyx

This generates an HTML file with the more Pythonic and less efficient lines highlighted in increasing intensities of yellow. In the example below, you can see that there are still two Pythonic lines within the optimized “sum_cython” function (np.zeros and np.shape). These operations are fast compared to the “for” loops, so they don’t slow things down significantly. In the version of the function without the variable definitions, shown at the bottom, everything is highlighted and will be interpreted in Python, and the loop will run very slowly even though it has been compiled through Cython.

In summary, while there appear to be numerous ways to incorporate C-optimized extensions into Python code, Cython seems to provide a simple way to keep the readability and writability of Python syntax while obtaining significant increases in execution speed for particular algorithms.
