Using Numeric Arrays in C Extensions
Numeric Arrays are really nice on the Python side of things. They are faster and much more efficient than lists. However, they tend to cause some difficulty for people working with both SWIG'ed and hand-rolled Python/C extensions.
While Numeric Arrays can really improve the performance of some applications, that often isn't enough. If you need the best performance possible, it sometimes becomes necessary to rewrite parts of your Python program as C extensions. (Numeric is itself a C extension.) For example, perhaps you want to extend Numeric by adding a new operation to it. To do that, you need to write a routine that takes a Numeric array and manipulates the C array structure inside of it.
Into C
I apologize in advance to any Python programmers offended by seeing C code and to any C programmers offended by seeing my C code. I'm not going to cover all the nuts and bolts of building and writing this sort of thing, at least not right at the moment.
The Module
First, a quick overview of the files involved:
steder@Penzilla numpyc $ ls
myarray.py Makefile mymodule.c
Simple C Extension: mymodule.c
This is the code necessary to write a single extension function callable from Python that can access and manipulate the data inside of a Numeric array.
This code is standard for CPython extensions. It's also close to the minimum necessary code to yield a useful module.
#include <Python.h>
#include <Numeric/arrayobject.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>

static PyObject *pyError;

/*
static PyObject *identity(PyObject * self, PyObject * args);
static PyObject *create_array(PyObject * self, PyObject * args);
*/
#include "identity.h"
#include "create_array.h"

static PyMethodDef myMethods[] = {
    {"identity", identity, METH_VARARGS, IDENTITY_DOC},
    {"create_array", create_array, METH_VARARGS, "create_array"},
    {NULL, NULL, 0, NULL} /* Sentinel */
};

void init_myarray(void)
{
    PyObject *m;

    import_array(); /* must run before any Numeric C API call */
    m = Py_InitModule("_myarray", myMethods);
    pyError = PyErr_NewException("myarray.error", NULL, NULL);
    Py_INCREF(pyError);
    PyModule_AddObject(m, "error", pyError);
}
We'll look at the functions contained in identity.h and create_array.h in just a moment.
You really need to notice just a few things about the above code:
- the #include directives
- the functions (we'll cover them in a bit)
- the myMethods table
- the init function
The #include statements are pretty straightforward. I just wanted to mention that you need both the Python and the Numeric includes.
The methods table just provides a listing of all the functions defined by this module. The first part of each entry is the name of the function as it shows up when you use the module in Python or run "dir( modulename )" at the interpreter. The second part is the C name of the function. The third part determines what arguments the function takes (see the table below). The fourth part is the function's docstring.
| Type | Use |
|---|---|
| METH_VARARGS | Typical case: expects a tuple of arguments (parsed with PyArg_ParseTuple(...)) |
| METH_KEYWORDS | Like METH_VARARGS, but the function also accepts keyword arguments (parsed with PyArg_ParseTupleAndKeywords(...)); normally combined with METH_VARARGS using bitwise OR |
| METH_NOARGS | Optimized case for functions that take no arguments |
| METH_O | Optimized case for functions that take a single argument |
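To make the role of the sentinel entry concrete, here is a minimal sketch in plain C (no Python headers, so it compiles on its own). It is only a toy model: ToyMethodDef, TOY_VARARGS, toy_methods and find_method are hypothetical names made up for this illustration and are not part of the Python C API. The point is simply that a table like myMethods carries no length, so whoever reads it walks the entries until it reaches the one whose name is NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a method-table entry; NOT the real PyMethodDef. */
typedef int (*toy_func)(int);

#define TOY_VARARGS 1 /* made-up flag value, for illustration only */

typedef struct {
    const char *name;  /* name the caller looks up */
    toy_func    func;  /* C function that implements it */
    int         flags; /* calling-convention flag */
    const char *doc;   /* docstring */
} ToyMethodDef;

static int toy_identity(int x) { return x; }
static int toy_double(int x)   { return 2 * x; }

static ToyMethodDef toy_methods[] = {
    {"identity", toy_identity, TOY_VARARGS, "returns its argument"},
    {"double",   toy_double,   TOY_VARARGS, "returns twice its argument"},
    {NULL, NULL, 0, NULL} /* Sentinel: marks the end of the table */
};

/* Walk the table until the sentinel; return the matching entry or NULL. */
static const ToyMethodDef *find_method(const ToyMethodDef *table,
                                       const char *name)
{
    const ToyMethodDef *entry;

    for (entry = table; entry->name != NULL; entry++) {
        if (strcmp(entry->name, name) == 0)
            return entry;
    }
    return NULL;
}
```

Calling find_method(toy_methods, "double") returns the second entry, while a name that is not in the table returns NULL. Leaving out the sentinel row would let the loop run past the end of the array, which is why the real table needs its {NULL, NULL, 0, NULL} line too.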
The underscore in the function init_myarray is a common convention with extension modules. The C library itself is called _name while the Python interface to this library is usually called simply name. I define a simple myarray.py later on. Note that this convention is optional. You are free to come up with your own naming conventions. However, the name of your init function must match the name of the compiled C extension. (The module you build from init_myarray needs to be called _myarray.)
The Functions
Identity
This function doesn't do anything to the array. However, it shows you how to access the array data if you were interested in manipulating it.
#include <Python.h>
#include <Numeric/arrayobject.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>

#define IDENTITY_DOC "\
This method simply unpacks the Array (so that it could be modified)\
and then returns it unchanged."

static PyObject *identity(PyObject *self, PyObject *args)
{
    int i, n;
    PyObject *input;
    PyArrayObject *array;
    char *aptr;

    if (!PyArg_ParseTuple(args, "O", &input))
        return NULL;

    /* Convert the input to a contiguous int array of 0 to 3 dimensions */
    array = (PyArrayObject *) PyArray_ContiguousFromObject(input, PyArray_INT, 0, 3);
    if (array == NULL)
        return NULL;

    /* Compute the number of elements (1 for a zero-dimensional array) */
    n = 1;
    for (i = 0; i < array->nd; i++)
        n = n * array->dimensions[i];

    /* Raw pointer to the array's n contiguous elements */
    aptr = (char *)(array->data);

    /* Do Your Array Operation(s) */

    /* Return the result */
    return PyArray_Return(array);
}
Create Array
#include <Python.h>
#include <Numeric/arrayobject.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>

/* This function takes 1 argument, N, and creates a 1D array of that size
   and returns it to Python with its values initialized with 0 to N-1. */
static PyObject *create_array(PyObject *self, PyObject *args)
{
    int N, i;
    PyArrayObject *result;
    int dimensions[1];
    int *buffer;

    if (!PyArg_ParseTuple(args, "i", &N))
    {
        return NULL;
    }

    dimensions[0] = N;
    result = (PyArrayObject *) PyArray_FromDims(1, dimensions, PyArray_INT);
    if (result == NULL)
    {
        /* Creation failed; the Python exception is already set */
        return NULL;
    }

    /* data is declared as char *, so cast it to the element type */
    buffer = (int *) result->data;
    for (i = 0; i < N; i++)
    {
        buffer[i] = i;
    }
    return PyArray_Return(result);
}
The important parts of the above examples are
- PyArg_ParseTuple
- PyArray_ContiguousFromObject
- PyArray_Return
- PyArray_FromDims
PyArg_ParseTuple takes a variable number of arguments. After the args tuple, its next argument is a constant format string (such as "cilO|si") that describes the required and optional arguments to this function. ("cilO|si" says: my first argument is a character, my next argument an int, then a long, then a Python object, and optionally ("|" means what follows is optional) a string and another int.) The remaining arguments are the addresses of the C variables that receive the parsed values. You can get more info on PyArg_ParseTuple and PyArg_ParseTupleAndKeywords at http://docs.python.org/api/arg-parsing.html on the Python website.
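As a rough illustration of how the "|" splits a format string, here is a toy counter in plain C. It is not Python's parser: count_format_units and ArgCounts are hypothetical names invented for this sketch, and it assumes every format unit is a single letter (real format strings also allow multi-character units such as "s#" or "O!", which this ignores).

```c
/* Hypothetical helper type; not part of the Python C API. */
typedef struct {
    int required; /* units before the '|' */
    int optional; /* units after the '|'  */
} ArgCounts;

/* Count single-letter format units on each side of '|'.
   Assumption: one character per unit, no "s#", "O!", "(...)" etc. */
static ArgCounts count_format_units(const char *format)
{
    ArgCounts counts = {0, 0};
    int seen_bar = 0;
    const char *p;

    for (p = format; *p != '\0'; p++) {
        if (*p == '|')
            seen_bar = 1;       /* everything after this is optional */
        else if (seen_bar)
            counts.optional++;
        else
            counts.required++;
    }
    return counts;
}
```

For "cilO|si" this counts four required units and two optional ones, which is also the number of C variable addresses you would pass after the format string.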
PyArray_ContiguousFromObject will make any sequence (list, tuple, Numeric Array) into a contiguous Numeric Array object. Its arguments are, in order: the object you want to convert, the type you want to convert it into, the minimum number of dimensions it should have, and the maximum number of dimensions you'll accept. (If you set the max to 0 then there will be no upper limit.) It returns NULL if the conversion fails.
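The minimum/maximum rule can be written down as a small check in plain C. This is only a model of the rule as described here, not Numeric's actual code, and rank_is_acceptable is a hypothetical name.

```c
/* Model of the dimension check described above (hypothetical helper).
   nd       : number of dimensions of the converted array
   min_dims : smallest rank accepted
   max_dims : largest rank accepted, or 0 for "no upper limit"
   Returns 1 if the rank is acceptable, 0 otherwise. */
static int rank_is_acceptable(int nd, int min_dims, int max_dims)
{
    if (nd < min_dims)
        return 0;
    if (max_dims != 0 && nd > max_dims)
        return 0;
    return 1;
}
```

With the (0, 3) pair used in identity, ranks 0 through 3 pass and a 4-dimensional input would be rejected; with (1, 0) anything of rank 1 or higher passes.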
PyArray_Return simply does the extra work of properly returning an array. (If an array has zero dimensions, it returns its single element as a Python scalar instead of an array.)
PyArray_FromDims allows you to create a Numeric array with uninitialized data. The first argument is the length of the second argument (the dimensions array). The dimensions array is just a 1D C array where each element is the size of that dimension. (int dimensions[2] = { 4, 3 }; /*defines a 4 by 3 array*/.) The third argument is just the desired type.
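To see what a dimensions array implies for the data buffer, here is a small sketch in plain C. The helper names element_count and flat_index are made up for this illustration; the code assumes a contiguous array stored in C (row-major) order, which is what PyArray_ContiguousFromObject and PyArray_FromDims give you.

```c
/* Total number of elements described by a dimensions array.
   With nd == 0 the loop never runs and the result is 1,
   the same answer the identity function computes. */
static long element_count(const int *dimensions, int nd)
{
    long n = 1;
    int i;

    for (i = 0; i < nd; i++)
        n *= dimensions[i];
    return n;
}

/* Offset (in elements, not bytes) of a multi-dimensional index
   inside a contiguous, row-major buffer. */
static long flat_index(const int *dimensions, int nd, const int *index)
{
    long offset = 0;
    int i;

    for (i = 0; i < nd; i++)
        offset = offset * dimensions[i] + index[i];
    return offset;
}
```

For int dimensions[2] = { 4, 3 }, element_count gives 12, and the element at row 1, column 2 sits at offset 1 * 3 + 2 = 5 in the buffer. Inside an extension you would apply that offset to the data pointer after casting it to the element type.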
I hope that the above functions are at least somewhat understandable at this point. I recommend that you download the code (see the link at the bottom of this page) and play around with it a bit.
In the meantime, here is the makefile that you can use to build the above code:
The Makefile
This is a Linux Makefile that builds the above code into a Python module. Because the makefile script below is specific to Linux you might also check out the swig page as it shows the appropriate flags for Mac OS X as well.
Makefile
CC=gcc
PYTHON_INCLUDE=/usr/include/python2.4
PYTHON_LIBRARY=/usr/lib/python2.4

default: _myarray.so

_myarray.so: mymodule.c
	$(CC) -fPIC -I$(PYTHON_INCLUDE) -c mymodule.c -o mymodule.o
	$(CC) -shared mymodule.o -o _myarray.so -L$(PYTHON_LIBRARY) -lpython2.4

clean:
	rm -f mymodule.o
	rm -f _myarray.so
Wrapping it up
It's really quite common to wrap your compiled module in a higher-level Python module. The goal of this sort of wrapping is to hide some of the ugliness of your C code and provide a friendlier interface to your users at the Python level.
For a module as simple as this there isn't much of a point, but these wrapper files can still be useful. For instance, it's much easier to add docstrings and optional arguments in Python. You might also add some error checking or exception handling code there. It's generally preferable to catch errors in your arguments before they reach the C code.
A runtime error in Python simply throws an exception.
A runtime error occurring inside of a C module like ours (that doesn't provide error checking or exception handling of its own) can cause an interpreter crash.
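The same defensive habit applies on the C side. As a sketch in plain C (no Python headers; make_range is a hypothetical helper, not code from the module), this is the kind of checking that keeps a bad argument or a failed allocation from turning into a crash: every failure path returns NULL so the caller can report an error instead of writing through a bad pointer.

```c
#include <stdlib.h>

/* Allocate a buffer holding 0 .. n-1, in the spirit of create_array.
   Returns NULL for a negative size or a failed allocation;
   the caller owns the result and must free() it. */
static int *make_range(int n)
{
    int *buffer;
    int i;

    if (n < 0)
        return NULL; /* reject bad input before touching memory */

    /* Ask for at least one element so n == 0 still yields a valid pointer */
    buffer = malloc((size_t)(n > 0 ? n : 1) * sizeof *buffer);
    if (buffer == NULL)
        return NULL; /* allocation failed */

    for (i = 0; i < n; i++)
        buffer[i] = i;
    return buffer;
}
```

In a real extension function the NULL return would be paired with setting a Python exception, so that the user sees an error message rather than a dead interpreter.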
Notice, in the if __name__ == "__main__" block of myarray.py, the extra argument passed to Numeric.zeros(...). The Numeric.Int32 argument indicates the C type used to represent the data in this Array. It is important to set the type of arrays that you wish to pass to C. If you don't set it explicitly, Numeric will pick a type for you, and if your C code doesn't account for this (by checking the type itself) your code may run into problems.
Conclusion
I hope that this page was helpful. As this material is a little more difficult, and this is my first time presenting it this way, I would love feedback on this page. So if you read this and would like to comment or complain, I'd appreciate hearing from you. My e-mail is steder@gmail.com. Thanks!
The Source
You can get the source for the examples shown above by downloading numpyc.zip