miriad_in_python - BADGrads

Berkeley Quickstart
Relevant Systems
Exploring with IPython
Numpy Tips

This is documentation on my system for interacting with MIRIAD data via the Python language.

Berkeley Quickstart

If you're at Berkeley and want to use Miriad-Python scripts, hopefully all you need to do is:

source /cosmic1/pkwill/usemir.sh

If you're unlucky, the above command will tell you that there isn't a build for your machine yet. Peter will happily attempt to build for new configurations, though it can take a lot of effort to get all of the necessary dependencies configured correctly. (If there's no build for your computer and you really need to run the scripts, you can SSH into cosmic and run them there. This will always work, though with a hefty performance hit if your data don't live on cosmic.)

Peter's scripts (which serve as examples of how to use Miriad-Python) live at the HCRO MMM SVN repository in this directory for scripts and this directory for modules.

Relevant Systems

Here are some links for the relevant software:

We use MIRIAD data. Familiarity with this is assumed.
We want to use the Python language (http://www.python.org/). Documentation for Python is all around; try the official documentation site (especially the tutorial) or a Google search if you have a specific question. You will need to get the hang of what Python code looks like, but you should be able to get away without knowing all of its nuances.
We'll interact with code and data using the nice IPython shell, which provides an IDL-style interactive programming interface. IPython has a documentation site.
Python itself doesn't have IDL-style numerical and array capabilities. These are added to the language with the Numpy module. There is also documentation for Numpy available. The documentation site mentions a Guide to Numpy book; I've bought an electronic copy of this book, and can lend it to you if you want.
The rest is my software, which generally isn't too well-documented. The modules I use include miriad, mirtask, mirexec, and omega.

Exploring with IPython

Here are some general tips for using IPython. These may not make much sense until you have a basic feel for the elements of the Python language.

Use the built-in help! At the prompt, you can type blah? to get information on “blah”, whether it's a function, class, variable, or whatever. Sometimes there's no documentation, but usually there is, and it should be enough to get you oriented.
Use tab-completion! If you're typing a line, you can hit the Tab key in the middle of a word to have IPython automatically finish it for you. This is kind of fun in itself, but what's more important is that if there's not a unique valid completion to what you've typed, a second press of Tab will print out the possibilities. This is a great way to explore the structure of unfamiliar code. If you're working with a Spectrum object called spec and don't remember what you can do with it, typing spec.<TAB><TAB> will print out the various methods you can call and fields you can access. Or %<TAB><TAB> will print out the various “magic” functions built into IPython.
Use the source. One useful magic function is %pfile blah, which will let you view the file associated with blah, whether blah is a variable, a class, a function, a module, etc. %edit blah will open up the file in an editor and then reload the definition when you close the editor.
There are a lot more fun features in IPython: logging, saving and restoration of variables, interaction with the Unix shell, and more. Check out the magic functions mentioned above and read their documentation.
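
Outside of IPython, plain Python offers rough equivalents of the help, tab-completion, and source-viewing features described above. The following is only a sketch, assuming that Numpy is installed; inspect and dir are standard Python, and N.linspace and N.ndarray are picked purely as things to explore:

```python
import inspect
import numpy as N

# Rough equivalent of typing "N.linspace?": fetch the documentation string.
doc = inspect.getdoc(N.linspace)
print(doc.splitlines()[0])

# Rough equivalent of "N.ndarray.<TAB><TAB>": list the public attribute names.
names = [n for n in dir(N.ndarray) if not n.startswith('_')]
print(names[:10])

# Rough equivalent of finding the file that %pfile would show for a module.
path = inspect.getsourcefile(N)
print(path)
```

IPython's versions are more convenient, but these work in any Python script or plain interpreter.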

Numpy Tips

To actually work with the data, you'll need to know a little about what Numpy supports. The main tool provided by Numpy is its N-dimensional array, or ndarray. The array objects that you'll be working with are instances of this class, and it pays to be familiar with the tools it offers you. The numpy module (imported here under the short name N) also has a bunch of useful routines.

Numpy ndarrays support IDL-style math:

import numpy as N
 
a = N.linspace (1, 100, 30) # create an array of 30 elements spanning from 1 to 100 linearly
a = a + 1 # now 'a' spans from 2 to 101 linearly
a += 1 # shorthand for 'a = a + 1'; now 'a' spans from 3 to 102 linearly
a = a / 3 # now 'a' spans from 1 to 34 linearly
a *= 3 # now 'a' is back to 3 to 102.
a = a**2 # square every element of a
 
b = N.logspace (0, 3, 30) # create an array of 30 elements spanning from 10**0 to 10**3 logarithmically
 
a = a / b # divide each element of 'a' by the corresponding element of 'b' and store the results in 'a'.
 
print (a.ndim) # print the number of dimensions of 'a': 1
print (a.size) # print the number of elements in 'a': 30
print (a.shape) # print the size of each dimension of 'a': (30,)
print (a.dtype) # print the kind of data that 'a' contains: floating-point

It's important to remember that, unlike in IDL, you can do fun things with array objects directly:

print (a.mean ()) # compute the mean of the elements of 'a'
print (a.std ()) # standard deviation of the elements of 'a'
print (a.min (), a.max (), a.sum ()) # and so on
a.sort () # sort 'a' in-place (i.e., 'a' itself is modified)
if a.any (): print ('something in a is not zero')
if a.all (): print ('everything in a is not zero!')
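
Because a.sort () modifies the array itself, it helps to know that the numpy module also has a function form that leaves the original alone. A small sketch of the difference, using a made-up three-element array for illustration:

```python
import numpy as N

a = N.array([3.0, 1.0, 2.0])

s = N.sort(a)            # returns a sorted copy; 'a' is left untouched
unchanged = a.tolist()   # snapshot of 'a' taken before the in-place sort

result = a.sort()        # sorts 'a' itself; the method returns None
print(result)
print(a)
```

A common slip is writing a = a.sort (), which replaces the array with None.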

Numpy also has many complex ways that you can index into arrays:

a[0] # first element of a
a[-1] # last element of a
a[-2] # second-to-last element of a
a[1:3] # a subarray starting at a[1] and going up to, BUT NOT INCLUDING, a[3]
# more generally: for 0 <= p <= q <= a.size, a[p:q] has q - p elements and starts at a[p]
a[1:-1] # every element of 'a' except the first and last ones
a[0:1000000] # if you go off the edge, you'll just get everything up until the end.
a[-1:3] # this gives an empty array, since we're trying to go from the end to the beginning
 
b = N.zeros ((20, 10)) # create a 2-dimensional 20-by-10 array of zeros
b.shape # = (20, 10)
b.size # = 200
b.ndim # = 2
 
b[0] # the first row of 'b', a 10-element array
b[0,0] # the first element of the first row of 'b'
b[:,0] # the first column of 'b', a 20-element 1D array
b[:,0] += 3 # add 3 to each element in the first column of b
b[1:5,2:8] # a 4-by-6 subarray of b
 
c = N.ones ((4, 3, 2, 6, 2)) # create a 5-dimensional 4-by-3-by-2-by-6-by-2 array of ones ...
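
One thing worth knowing about the slices above is that they are views onto the original array rather than copies, which is why b[:,0] += 3 changes b itself. A short sketch of what that means in practice (the names sub and safe are just for illustration):

```python
import numpy as N

b = N.zeros((20, 10))

sub = b[1:5, 2:8]           # a 4-by-6 view onto 'b', not a separate array
sub[:] = 7                  # writes through to the matching elements of 'b'
print(b[1, 2])              # the element of 'b' covered by the view has changed

safe = b[1:5, 2:8].copy()   # an independent copy of the same region
safe[:] = -1                # 'b' is not affected by this assignment
print(b[1, 2])
```

Use .copy () whenever you want to modify a subarray without touching the array it came from.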

Numpy supports IDL-style where statements, and they work in much the same way:

import numpy as N
x = N.linspace (0, 10, 100)
 
x[N.where (x < 4)] = 4 # set every element of x that is less than four, to four.
 
y = x > 8 # y is an array of 100 booleans, True where x > 8, False where not.
x[N.where(y)] = -1
 
condition = N.logical_and (x > 5, x < 7) # True only where both conditions hold
x[N.where (condition)] = -17
 
a = N.linspace (0, 10, 100)
b = N.linspace (0, 100, 100)
w = N.where (a > 5) # you can save the result of a 'where' and use it several times
a[w] = 0
b[w] = -1
print (w[0]) # shows the indices where the condition was true
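
Numpy also lets you skip the where call entirely and index with the boolean array itself, which is often shorter. A minimal sketch of that style, with the variable names mask, n_selected, and y chosen only for this example:

```python
import numpy as N

x = N.linspace(0, 10, 100)

# A boolean array can be used as an index directly, without N.where.
x[x < 4] = 4

# The & operator combines conditions element by element. The parentheses
# are required because & binds more tightly than < and >.
mask = (x > 5) & (x < 8)
n_selected = mask.sum()   # True counts as 1, so this counts the matches
x[mask] = 0

# The three-argument form of N.where chooses between two alternatives.
y = N.where(x > 8, 1, -1)
```

The saved-result trick still applies: mask can be reused to index any other array of the same shape.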

Finally, a rundown of the interesting functions in the numpy module:

import numpy as N
 
print (N.pi, N.e) # constants.
 
x = N.linspace (0, 10, 100) # 100-element array going from 0 to 10
 
N.sin (x) # returns 100-element array of sin values
N.cos (x) # similar. N.tan (x), N.abs (x), N.log (x), etc.
 
counts, edges = N.histogram (x) # compute a histogram of x: see the builtin help on N.histogram
 
b = N.zeros_like (x) # create an array of zeros in the same shape as x
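
Since N.histogram returns two arrays, it is worth seeing how they relate. This is only a sketch; the choice of 5 bins and the names counts5, edges5, and centers are assumptions made for the example:

```python
import numpy as N

x = N.linspace(0, 10, 100)

# With no other arguments, N.histogram uses 10 equal-width bins.
counts, edges = N.histogram(x)
print(len(counts), len(edges))   # there is one more edge than there are bins

# The number of bins and the range can also be given explicitly.
counts5, edges5 = N.histogram(x, bins=5, range=(0, 10))

# Midpoint of each bin, handy as the x-axis when plotting the counts.
centers = 0.5 * (edges5[:-1] + edges5[1:])
print(centers)
```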