Python setup with HomeBrew

HomeBrew offers a nice way of installing a Python distribution with a single command.  As usual, it puts the files related to the installation in /usr/local/Cellar and symlinks the necessary things into /usr/local/bin, /usr/local/lib and /usr/local/include.  It is also a bit more careful about the site-packages directory, so that it and anything you install yourself don’t get removed when you upgrade the HomeBrew Python distribution, as described here: https://github.com/mxcl/homebrew/wiki/Homebrew-and-Python.  The result is that pip and easy_install put things in /usr/local/lib/python2.7/site-packages/.

To install:

brew install python

As described in the link above, to be able to use the HomeBrew pip and easy_install, you have to modify your PATH variable as follows, perhaps in ~/.bashrc:

export PATH="/usr/local/bin:/usr/local/share/python:${PATH}"
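To confirm that the PATH change took effect, one option is to list every copy of a command in the order the shell would find it. The following is only a minimal sketch and is not taken from the HomeBrew documentation; the helper name find_on_path is made up for illustration.

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on PATH, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # The first entry printed for each command is the one the shell will run.
    for command in ("python", "pip", "easy_install"):
        print(command, find_on_path(command))
```

If the HomeBrew copies under /usr/local are not listed first, the export line has probably not been picked up by the current shell.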

Numpy, Scipy, Matplotlib

These commonly-used python modules can be easily installed with the new HomeBrew Python installation.  Installing Numpy and Scipy is described here, and detailed below.

## Go to a directory where the downloaded files can be placed
cd /usr/local/src
## Download the module distributions
git clone https://github.com/numpy/numpy.git
git clone https://github.com/scipy/scipy.git

I had to add the following commands to get the compiler flags correct for gfortran and gcc. Otherwise some of the Fortran libraries were compiled for the “i386” architecture, and though the modules appeared to install correctly, Python would complain later with errors like “mach-o, but wrong architecture” because the library had been compiled for i386.

export MACOSX_DEPLOYMENT_TARGET=10.6
ARCH="-arch x86_64"
export CFLAGS="${ARCH}"
export FFLAGS="-static -ff2c ${ARCH}"
export LDFLAGS="-Wall -undefined dynamic_lookup -bundle ${ARCH}"

cd /usr/local/src/numpy
python setup.py build
python setup.py install

cd ../scipy
python setup.py build
python setup.py install

Other useful modules appeared to install simply with pip:

pip install ipython
pip install matplotlib
pip install pyfits
pip install sphinx
pip install cython

The result of this is a fresh, clean Python installation that can easily be upgraded and to which it is easy to add modules. I’m still using the STSCI_PYTHON distribution for other things, as I didn’t figure out what would be necessary to plug IRAF and PyRAF into the HomeBrew distribution of python. The two are kept separate by the fact that STSCI_PYTHON is all run in csh, which sets all of the necessary environment path variables when you invoke that shell.

I noticed in the HomeBrew installation that there was a newer IPython version (0.12) than the version installed with STSCI_PYTHON (0.10). I used pip to upgrade the STSCI_PYTHON IPython as follows:

$ csh
% pip install --upgrade ipython

However, starting ipython in csh then resulted in a rather terrifying error:

Traceback (most recent call last):
  File "/usr/stsci/pyssg/Python-2.7/bin/ipython", line 9, in <module>
    load_entry_point('ipython==0.12.1', 'console_scripts', 'ipython')()
  File "/usr/stsci/pyssg/2.7/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 299, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/stsci/pyssg/2.7/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 2229, in load_entry_point
    return ep.load()
  File "/usr/stsci/pyssg/2.7/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 1948, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
ImportError: No module named terminal.ipapp

This resulted from the fact that the old IPython was still sitting around and conflicting with the upgrade. Removing the old version from the path as follows solved the problem:

cd /usr/stsci/pyssg/2.7
sudo mv IPython IPython.10 ## don't remove, just change the path
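A quick way to spot this kind of conflict is to look for more than one copy of a package along the import path. The sketch below is an assumption-laden illustration: the helper name find_package_copies is hypothetical, and it only looks for plain package directories, not eggs or zipped packages.

```python
import os
import sys


def find_package_copies(name, search_path=None):
    """List directories on the import path that contain a package `name`."""
    if search_path is None:
        search_path = sys.path
    copies = []
    for entry in search_path:
        # An empty sys.path entry stands for the current directory.
        candidate = os.path.join(entry or ".", name)
        if os.path.isdir(candidate) and candidate not in copies:
            copies.append(candidate)
    return copies


if __name__ == "__main__":
    # More than one line of output means an older copy may shadow the upgrade.
    for location in find_package_copies("IPython"):
        print(location)
```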

UPDATE! The above fix allowed me to start IPython with the STSCI_PYTHON, but now “pyraf --ipython” is broken. Be wary of updating the modules piecemeal within the STSCI_PYTHON distribution as some are customized for wrapping the “RAF” part of PyRAF.

UPDATE#2: Easily reverted to ipython-0.10 with

sudo pip install http://pypi.python.org/packages/source/i/ipython/ipython-0.10.tar.gz

and “pyraf --ipython” now appears to work as before.


Switch from MacPorts to HomeBrew

Homebrew appears to be a nice replacement for MacPorts for installing unix applications not provided by Apple. It puts installed libraries in their own folders in /usr/local/Cellar/ and then sym-links the executables to /usr/local/bin/. It also appears to provide a nice way of keeping Python 2.X up-to-date, but I haven’t tried that yet.

Install Homebrew:

/usr/bin/ruby -e "$(/usr/bin/curl -fksSL https://raw.github.com/mxcl/homebrew/master/Library/Contributions/install_homebrew.rb)"

To remove the installed MacPorts, I first just rename the MacPorts directory to a dummy name, so that its contents won’t be found in the PATH but can easily be reinstated if any problems with homebrew arise:

sudo mv /opt/local /opt/local_macports

If you used the “tar” trick described below to replace the Mac tar, you first need to revert to the old tar and then use homebrew to re-install gnu-tar:

sudo ln -sf /usr/bin/mactar /usr/bin/tar
brew install gnu-tar
sudo ln -sf /usr/local/bin/gtar /usr/bin/tar

Some useful initial homebrew libraries:

brew install wget
brew install imagemagick
brew install fftw
brew install cfitsio
brew install gnuplot

Install PyMC on OSX, Python2.7

I came across a number of compile problems when trying to install PyMC with the latest STSCI_PYTHON and Python2.7. The main problem appears to stem from the version of gfortran that the install scripts found in /sw/bin, perhaps shipped with Apple developer tools. I found a fix by forcing /usr/local/bin/gfortran, which you can obtain from HERE.

cd /sw/bin
sudo mv gfortran gfortran-xx

Then to install:

cd /usr/local/share/  ### place to store downloaded source code

### Get source from http://pypi.python.org/pypi/pymc/
wget http://pypi.python.org/packages/source/p/pymc/pymc-2.1beta.zip
unzip pymc-2.1beta.zip
cd pymc-2.1beta

csh  ### Hack to use the STSCI_PYTHON environment variables
bash  

export F77=/usr/local/bin/gfortran   ### Make sure to use the correct gfortran
ARCH="-arch x86_64"                  ### The following seem to be necessary for building PyMC with setup.py
export MACOSX_DEPLOYMENT_TARGET=10.6
export CFLAGS="${ARCH}"
export FFLAGS="-static -ff2c ${ARCH}"
export LDFLAGS="-Wall -undefined dynamic_lookup -bundle ${ARCH}"

### Run the build script
rm -rf build
python setup.py config_fc --fcompiler gfortran build

### Put it somewhere in PYTHONPATH
sudo cp -R build/lib.macosx-10.6-x86_64-2.7/pymc /usr/stsci/pyssgx/Python-2.7/lib/python2.7/site-packages/

Without the “-arch” flags above that force 64-bit compilation with x86_64, Python2.7 chokes with an error like:
/usr/stsci/pyssgx/Python-2.7/lib/python2.7/site-packages/pymc/flib.so: mach-o, but wrong architecture
This happens because the library was compiled for i386, which you can confirm with lipo:

lipo -info /usr/stsci/pyssgx/Python-2.7/lib/python2.7/site-packages/pymc/flib.so
Non-fat file: /usr/stsci/pyssgx/Python-2.7/lib/python2.7/site-packages/pymc/flib.so is architecture: i386
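The extension has to match the architecture of the interpreter that loads it, so it can help to check what the interpreter itself is running as. This is a small sketch using only the standard library; the function name interpreter_bits is made up for illustration, and for a universal binary the result reflects the slice that is currently running.

```python
import platform
import struct


def interpreter_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("machine: %s, interpreter: %d-bit" % (platform.machine(), interpreter_bits()))
```

A 64-bit interpreter paired with an i386-only flib.so is the mismatch described above.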

Finally, test it with the simple example script listed at http://github.com/pymc-devs/pymc .


sudo not using $PYTHONPATH

I’ve come across this problem before, but today I was trying to install some Python packages after downloading the newest version of STSCI_PYTHON. Using easy_install requires root access to write the install files, so one would have to use, for example:

sudo easy_install pip

I got an error like ImportError: No module named pkg_resources, which resulted from the fact that the “sudo” account wasn’t using the PYTHONPATH that was set up in the ~/.bashrc file. You can see this by running:

sudo python
>>> import os
>>> os.getenv('PYTHONPATH')
()

I found a fix for this here:

Edit /etc/sudoers (with sudo) to add the following line:

Defaults env_keep += "PYTHONPATH"

Now the PYTHONPATH variable should be preserved when you invoke sudo and sudo easy_install ... should work.
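To check that the sudoers change worked, one could save something like the following in a file (the name check_pythonpath.py and the helper describe_pythonpath are hypothetical) and run it once with plain python and once with sudo python. It is a minimal sketch, not part of the fix itself.

```python
import os


def describe_pythonpath(environ=None):
    """Return the PYTHONPATH entries as a list, or an empty list if unset."""
    if environ is None:
        environ = os.environ
    value = environ.get("PYTHONPATH")
    if not value:
        return []
    return value.split(os.pathsep)


if __name__ == "__main__":
    entries = describe_pythonpath()
    print("PYTHONPATH entries: %r" % (entries,))
```

If both runs print the same entries, sudo is preserving the variable.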


Python C-extensions with Cython

Cython is a package for developing C-extensions that can be easily used in python. The primary benefit is that the code compiled in Cython can (optionally) be written in pure python, just as it would be used in a python script. Significant improvements in speed can be achieved by modifying the code to behave more c-like (but still written in python syntax) and making use of specific c-type definitions for certain variables. Some quickstart guides can be found here: basic or numpy-related.

Cython can be installed with “pip”:

csh
sudo pip install cython

Note that pip is run from within “csh” above because my STSCI_PYTHON installation has its environment variables set in the c-shell, referring to the python2.5 version installed in /usr/stsci.

The typical workflow seems to be as follows:

  1. Make a pyx file (e.g., mytest.pyx) that may or may not contain cython-specific code
  2. Create the c file translated from python: cython -a mytest.pyx
  3. Compile the c code, e.g., gcc -o mytest.o mytest.c
  4. Import the compiled module within python to use it: import mytest

Running Step #3 as above will generate something like the following error:

gcc -o mytest.o mytest.c
mytest.c:4:20: error: Python.h: No such file or directory
mytest.c:6:6: error: #error Python headers needed to compile C extensions, please install development version of Python.

This is because the compiler needs specific CFLAGS and linked libraries, especially if you want to link against, e.g., numpy routines. This can be painful, but there is an easy way to automatically generate all of the necessary flags by using the python-provided “distutils” package as helpfully described here by “Rob Wolfe”. For this, put the following python code in a file like “mytest_setup.py”:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext as build_pyx

setup(name = 'mytest', ext_modules=[Extension('mytest', ['mytest.pyx'])], cmdclass = { 'build_ext': build_pyx })

To compile, you would then run

csh
setenv CC gcc-4.0
python mytest_setup.py build_ext -i

Without the “gcc-4.0” line above, I get the typical cc1: error: unrecognized command line option "-Wno-long-double" error mentioned in some of the other posts below. This is likely a result of my (outdated) STSCI_PYTHON distribution. The first time I tried to include numpy in a pyx file, I got an error like arrayobject.h not found. This was a result of the numpy include files not living in the include paths provided by distutils, which I solved with:

cd /usr/stsci/pyssg/Python-2.5.4/include/
sudo ln -s /usr/stsci//pyssg/2.5.4/numpy/core/include/numpy .
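As an alternative to the symlink, the header directories can be looked up from Python and handed to distutils through the include_dirs argument of Extension. The sketch below only prints them; the function name extension_include_dirs is made up for illustration, and whether this works unchanged with an old STSCI_PYTHON installation is an assumption I have not tested.

```python
import sysconfig


def extension_include_dirs():
    """Header directories a C extension typically needs at compile time."""
    dirs = [sysconfig.get_paths()["include"]]  # directory holding Python.h
    try:
        import numpy
    except ImportError:
        pass  # numpy not installed; only the Python headers are reported
    else:
        dirs.append(numpy.get_include())  # directory holding numpy/arrayobject.h
    return dirs


if __name__ == "__main__":
    for directory in extension_include_dirs():
        print(directory)
```

The same directories are what a hand-written gcc command would need as -I options to get past the Python.h error shown earlier.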

With the above setup, you should be able to compile and import both simple and numpy-specific cython source code. A detailed description of some techniques to increase the speed of cython code is provided at the link at the top of the page. The basic lesson is that you can improve speed by providing c-like data types for variables used within a function and by making your loops as c-like as possible, without python-specific (or numpy-specific) calls within the loops. An example is provided below, where we generate a large 2D array filled with random integers from 0 to N-1 and then simply count the number of times each integer appears in the array. While this example is a bit silly, the operation of looping through 2D arrays occurs frequently when processing astronomical images, and very simple cython code can be much faster at doing this than even clever boolean operations in python, which are already reasonably fast.

To start, set up your matrix filled with random numbers:

import numpy as np
NX, NY = 1000, 1000
N = 10
rand = (np.random.random(size=(NY, NX))*N).astype(np.int32)  # np.cast no longer exists in NumPy 2.0

Since the random distribution is uniform from 0-9, there should be (NX*NY)/N instances of each number in the full array. To check this, here is simple python code where you loop through the test integers and sum the boolean array (rand == i), which is interpreted as zero where false and one where true:

def sum_boolean(rand, N):
    sums = np.zeros(N, dtype=np.int32)  # every element is overwritten in the loop below
    for i in range(N):
        sums[i] = np.sum(rand == i)
    return sums

print sum_boolean(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]

You could make the code more “c-like” by actually looping through the indices of the 2D-matrix. This is normally a very bad idea in Python (or IDL for that matter):

def sum_clike(rand, N):
    sums = np.zeros(N, dtype=np.int32)
    NY, NX = np.shape(rand)
    for i in range(NX):
        for j in range(NY):
            ival = rand[j,i]
            sums[ival] += 1
    return sums

print sum_clike(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]
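For this particular counting task numpy also has a built-in, np.bincount, which does the whole job in one vectorized call. It was not part of the original comparison, so no timing is claimed for it here; the wrapper name sum_bincount is made up to match the other functions, and the sketch assumes the array holds non-negative integers.

```python
import numpy as np


def sum_bincount(rand, N):
    """Count occurrences of each integer 0..N-1 in a non-negative integer array."""
    # bincount works on 1D input, so flatten the 2D array first;
    # minlength guarantees N bins even if the largest values never occur.
    return np.bincount(rand.ravel(), minlength=N)


if __name__ == "__main__":
    demo = np.array([[0, 1, 1], [2, 2, 2]], dtype=np.int32)
    print(sum_bincount(demo, 4))
```

This only helps when a ready-made numpy routine happens to exist for the operation, which is not generally the case for loops over image pixels.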

You could put the `sum_clike` function directly in a cython pyx file and compile it as above, but you’ll notice that it takes about the same time to run as the pure-Python version. To really see the speedup, you need to use the c type definitions with “cdef” within the cython file (mytest.pyx):

import numpy as np
cimport numpy as np
INT32 = np.int32
ctypedef np.int32_t INT32_t

def sum_cython(np.ndarray[INT32_t, ndim=2] rand, unsigned int N):
    cdef np.ndarray[INT32_t, ndim=1] sums 
    cdef unsigned int NX, NY, i, j, ival
    
    sums = np.zeros(N, dtype=INT32)
    NY, NX = np.shape(rand)
    
    for i in range(NX):
        for j in range(NY):
            ival = rand[j,i]
            sums[ival] += 1
    
    return sums

Note that other than the type declarations at the top of the file and the “cdef” lines within the function itself, the code syntax is identical to that in the pure-Python “sum_clike” function. Compile the pyx file with the “mytest_setup.py” script as above and then run it with:

import mytest
print mytest.sum_cython(rand, N)
[100420  99438 100235 100205 100245  99812 100321  99729  99897  99698]

Finally, put a timer around the function calls to compare them:

import numpy as np
import time

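import mytest  # the compiled Cython module
# sum_boolean and sum_clike are the pure-Python functions defined earlier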

N = 10
NX, NY = 1000, 1000
rand = (np.random.random(size=(NY, NX))*N).astype(np.int32)  # np.cast no longer exists in NumPy 2.0

t0 = time.time()
s_bool = sum_boolean(rand, N)
t1 = time.time()

s_clike = sum_clike(rand, N)
t2 = time.time()

s_cython = mytest.sum_cython(rand, N)
t3 = time.time()

print 'Bool  : %.3f \nC-like: %.3f  \nCython: %.3f\n' %(t1-t0, t2-t1, t3-t2)

===============

Bool  : 0.157 
C-like: 2.578  
Cython: 0.021

The Boolean operation is much faster than the python loop, but the cythonized loop is nearly 8 times faster still (and more than 100 times faster than the identical code executed in pure python). That is, the loop executes nearly as fast as it would in C, with the extreme benefit that passing the input data in and getting the output data back is trivial (N-dimensional arrays in particular require significantly more code overhead in standalone C).
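Single time.time() differences like those above can be noisy, so for a more careful comparison one might repeat each call and keep the best time. The following sketch uses the standard timeit module with a pure-Python stand-in workload; the names best_time and count_digits_pure_python are made up for illustration, and the real functions would be passed in the same way.

```python
import timeit


def best_time(func, repeat=3, number=1):
    """Best wall-clock time in seconds for one call of `func`, over `repeat` trials."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def count_digits_pure_python(values, n):
    """Stand-in workload: count occurrences of 0..n-1 in a flat list."""
    sums = [0] * n
    for value in values:
        sums[value] += 1
    return sums


if __name__ == "__main__":
    data = [i % 10 for i in range(100000)]
    seconds = best_time(lambda: count_digits_pure_python(data, 10))
    print("pure Python loop: %.4f s" % seconds)
```

For the functions in this post the call would look like best_time(lambda: mytest.sum_cython(rand, N)).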

There is a handy cython option that can help you to analyze the “c-ness” of your cython code by specifying the “-a” option at the command line when converting the pyx file:

cython -a mytest.pyx

This generates an HTML file with the more-pythonic and less efficient lines highlighted in increasing intensities of yellow. In the example below, you can see that there are still two pythonic lines within the optimized “sum_cython” function (np.zeros and np.shape). These operations are fast compared to the “for” loops, so they don’t slow things down significantly. In the version of the function without the variable definitions shown at the bottom, everything is highlighted and will be interpreted in python, and the loop will run very slowly even though it has been compiled through cython.

In summary, while there appear to be numerous ways to incorporate C-optimized extensions into python code, Cython seems to provide a simple way to maintain the read- and write-ability of Python syntax while obtaining significant increases in the code execution speed for particular algorithms.


Install standalone aXe v2.1 “taxe21”

aXe is a software package developed by the ST-ECF for reducing slitless spectroscopic observations with HST. Version 2.1 is distributed as part of STSDAS 3.12, but I recently needed to install the stand-alone version, called “taxe21”.

The software requires CFITSIO, GSL, and WCSTools. The first two of these are available via MacPorts, but the MacPorts GSL wasn’t working for me: the taxe tools were choking with the following error, so the steps further below build GSL from source instead.


dyld: Library not loaded: /opt/local/lib/libgsl.0.dylib
Referenced from: /iraf/extern/taxe/iraf/bin/aXe_GOL2AF
Reason: Incompatible library version: aXe_GOL2AF requires version 16.0.0 or later, but libgsl.0.dylib provides version 11.0.0

## Set up an install directory
cd /iraf/extern/
mkdir taxe
cd taxe

## Install CFITSIO
sudo port install cfitsio

## Install GSL
wget http://ftp.cw.net/pub/gnu/gsl/gsl-1.14.tar.gz
tar xzvf gsl-1.14.tar.gz
mkdir gsl 
cd gsl-1.14
./configure --prefix=/iraf/extern/taxe/gsl --disable-shared
make 
make install

## Install WCSTools
cd ..
wget http://tdc-www.harvard.edu/software/wcstools/wcstools-3.8.1.tar.gz
tar xzvf wcstools-3.8.1.tar.gz
cd wcstools-3.8.1
make

## Install taxe2.1
cd ../../
wget http://www.stecf.org/software/slitless_software/axe/source/taxe21_taxesim14_src.tar.gz
tar xzvf taxe21_taxesim14_src.tar.gz
cd ccc
./configure --with-cfitsio-prefix=/opt/local --with-gsl-prefix=/iraf/extern/taxe/gsl --disable-gsltest --build=i386-pc-macosx --with-wcstools-prefix=/iraf/extern/taxe/wcstools-3.8.1/
make
make install
cd ../iraf
sudo python compileaXe.py

######    DONE !  ######

To be able to load taxe21 in PyRAF, add the following lines to ~/iraf/login.cl:


reset taxe21 = /iraf/extern/taxe/iraf/
task taxe21.pkg = taxe21$taxe21.cl
reset helpdb = (envget("helpdb") // ",taxe21$lib/helpdb.mip")

reset taxesim14 = /iraf/extern/taxe/iraf/
task taxesim14.pkg = taxesim14$taxesim14.cl
reset helpdb = (envget("helpdb") // ",taxesim14$lib/helpdb.mip")


ESO stand-alone FITS routines

I recently came across a nice, simple way to access FITS header keywords at the command line via the “Stand-alone FITS tools” provided by ESO. The combination of dfits and fitsort works like the IRAF hselect routine, but they are run at the command line and therefore have access to all of the shell tools like grep.

To install:

cd /tmp/   ### or anywhere else
mkdir ESOFITS
cd ESOFITS
wget http://archive.eso.org/saft/dfits/dfits.c
wget http://archive.eso.org/saft/fitsort/fitsort.c
gcc -o dfits dfits.c
gcc -o fitsort fitsort.c
sudo cp dfits fitsort /usr/local/bin    ### or somewhere else in your $PATH

Now you can run it with, e.g.,

dfits f140w.fits | fitsort NAXIS1 NAXIS2 EXPTIME
   FILE         	NAXIS1	NAXIS2	EXPTIME        	
   f140w.fits	   7017  	   8361  	4.058728020E+03

Note that dfits also works with wildcards, e.g. dfits ib*flt.fits | fitsort FILTER.
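If the table is needed inside a script rather than at the shell, the output can be parsed with a few lines of Python. This is a sketch under the assumption that the columns are tab-separated, as they appear in the sample above; the function name parse_fitsort is made up for illustration.

```python
def parse_fitsort(text):
    """Parse fitsort-style output into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        # Drop empty fields left by trailing tabs at the end of a line.
        while fields and not fields[-1]:
            fields.pop()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank line names the columns
        else:
            rows.append(dict(zip(header, fields)))
    return rows


if __name__ == "__main__":
    sample = (
        "   FILE         \tNAXIS1\tNAXIS2\tEXPTIME        \t\n"
        "   f140w.fits\t   7017  \t   8361  \t4.058728020E+03\n"
    )
    for row in parse_fitsort(sample):
        print(row["FILE"], int(row["NAXIS1"]), float(row["EXPTIME"]))
```

All values come back as strings, so numeric keywords need an explicit int() or float() as in the usage lines.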
