PyCon 2008 ElasticWulf Slides

4 8 15 16 23 42

Here are the ElasticWulf slides from my talk. The video will eventually be posted to the PyCon site.

The cluster management scripts I used to run the EC2 Beowulf cluster are hosted on Google Code:

ElasticWulf Project

I should make the initial check-in by Monday; until then, you can try out the 32-bit images from the old EC2 tutorial.

PyCon highlights for me included Guido popping his head in towards the end of my talk (unless I was so tired from last-minute preparations that I was hallucinating?) and meeting the great essayist Zed Shaw. The “Birds of a Feather” (BOF) sessions have been my favorite part of the conference so far. Tonight, I caught the tail end of an interesting Natural Language Processing discussion. Chris McAvoy talked me into holding a Netflix Prize BOF session where we exchanged insights about using Python for collaborative filtering. Later that night, my coworker Chris Gemignani organized a Data Visualization session where he did some cool things with Python and NodeBox. We also hung out with Peter Fein from the job search engine JuJu, who pulled together an engaging Distributed Computing BOF. JuJu just released a neat RESTful Python search engine project called GrassyKnoll, which you should check out. I’ll post more on PyCon when I get back to DC, along with a tutorial on using IPython1 with ElasticWulf.

Oh yeah, the back of the conference shirt featured the xkcd Python comic:

python

Amazon EC2 Considered Harmful

“The TruckNumber is the size of the smallest set of people in a project such that, if all of them got hit by a truck, the project would be in trouble.” - Portland Pattern Repository

bigbus

I’m taking an “Introduction to Beowulf Design” course this week from the Georgetown University Advanced Research Computing (ARC) division. The class definitely hasn’t been boring. By a strange coincidence, it turns out that the guy sitting next to me is Mike Cariaso, an MPIBlast developer with whom I have been corresponding this month in some nodalpoint posts. The course gave us an opportunity to hash out some details around running MPI on EC2. He had just booted up a 10-node Amazon EC2 cluster with MPIBlast when a bus crashed into our building…

(more…)

MPI Cluster with Python and Amazon EC2 (part 2 of 3)

Today I posted a public AMI which can be used to run a small Beowulf cluster on Amazon EC2 and do some parallel computations with C, Fortran, or Python. If you prefer another language (Java, Ruby, etc.), just install the appropriate MPI library and rebundle the EC2 image. The following set of Python scripts automates the launch and configuration of an MPI cluster on EC2 (currently limited to 20 nodes while EC2 is in beta):

Update (3-19-08): Code for running a cluster with large or xlarge 64-bit EC2 instances is now hosted on Google Code. The new images include NFS, Ganglia, IPython1, and other useful Python packages.

http://code.google.com/p/elasticwulf/

Update (7-24-07): I’ve made some important bug fixes to the scripts to address issues mentioned in the comments. See the README file for details.

The file contains some quick scripts I threw together using the AWS Python example code. This is the approach I’m using to bootstrap an MPI cluster until one of the major Linux cluster distros is ported to run on EC2. Details on what is included in the public AMI were covered in Part 1 of the tutorial. Part 3 will cover cluster operation on EC2 in more detail and show how to use Python to carry out some neat parallel computations.

The cluster launch process is pretty simple once you have an Amazon EC2 account and keys: just download the Python scripts and you can be running a compute cluster in a few minutes. In a later post I will look at cluster bandwidth and performance in detail. If you only occasionally need to run large jobs, $2/hour for a 20-node MPI cluster on EC2 is not a bad deal considering the ~$20K price of building your own comparable system.
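To put those two figures side by side, here is a small back-of-the-envelope sketch. It uses only the $2/hour and ~$20K numbers quoted above; the function names are made up for illustration, and it deliberately ignores power, cooling, admin time, and hardware depreciation, all of which would push the comparison further one way or the other.

```python
# Figures quoted in the post; everything else is an illustrative assumption.
CLUSTER_RATE_PER_HOUR = 2.00    # USD per hour for a 20-node EC2 cluster
OWNED_CLUSTER_COST = 20000.0    # USD, rough price of a comparable system
NODES = 20


def break_even_hours(owned_cost, hourly_rate):
    """Hours of EC2 use at which rental spend equals the hardware price."""
    return owned_cost / hourly_rate


def rental_cost(hours, hourly_rate=CLUSTER_RATE_PER_HOUR):
    """Total EC2 spend for running the whole cluster for `hours` hours."""
    return hours * hourly_rate


hours = break_even_hours(OWNED_CLUSTER_COST, CLUSTER_RATE_PER_HOUR)
print(f"Per-node rate: ${CLUSTER_RATE_PER_HOUR / NODES:.2f}/hour")
print(f"Break-even: {hours:.0f} cluster-hours "
      f"(~{hours / 24:.0f} days of continuous use)")
```

On these numbers the rental bill only catches up with the hardware price after roughly 10,000 cluster-hours, which is why occasional use favors EC2.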

(more…)

On-Demand MPI Cluster with Python and EC2 (part 1 of 3)

In this post, we will build a 20-node Beowulf cluster on Amazon EC2 and run some computations using both MPI and its Python wrapper pyMPI. This tutorial will only describe how to get the cluster running and show a few example computations. I’ll save detailed benchmarking for a later write-up.

One way to build an MPI cluster on EC2 would be to customize something like Warewulf or rebundle one of the leading Linux cluster distributions like Parallel Knoppix or the Rocks Cluster Distribution onto an Amazon AMI. Both of these distros have kernels which should work with EC2. To get things running quickly as a proof of concept, I implemented a “roll-your-own” style cluster based on a Fedora Core 6 AMI managed with some simple Python scripts. I’ve found this approach suitable for running occasional parallel computations on EC2 with 20 nodes and have been running a cluster off and on for several months without any major issues. If you need to run a much larger cluster or require more complex user management, I’d recommend modifying one of the standard distributions. This will save you from some maintenance headaches and give you the additional benefit of the user/developer base for those systems.

The main task I use the cluster for is distributing large matrix computations, which is a problem well suited to existing libraries based on MPI. Depending on your needs, another platform such as Hadoop, Rinda, or cow.py might make more sense. I use Hadoop for some other projects, including MapReduce style tasks with Jython, and highly recommend it. That said, let’s start building the MPI cluster…
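As an aside on why matrix work maps so naturally onto a cluster: the usual trick is to give each node a contiguous block of rows and stitch the partial results back together. The sketch below simulates that split in plain Python with no MPI calls at all, so it runs on a laptop. The function names are hypothetical, and the loop over "ranks" merely stands in for what separate MPI processes would each do on their own node before a gather step.

```python
def row_ranges(n_rows, n_workers):
    """Split n_rows into contiguous (start, stop) blocks, one per worker.

    Block sizes differ by at most one row; workers beyond the row count
    get an empty block.
    """
    base, extra = divmod(n_rows, n_workers)
    ranges = []
    start = 0
    for rank in range(n_workers):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def local_matvec(matrix, vector, start, stop):
    """The work one rank would do: dot products for its own block of rows."""
    return [sum(a * x for a, x in zip(row, vector))
            for row in matrix[start:stop]]


def simulated_parallel_matvec(matrix, vector, n_workers):
    """Run each rank's share in turn and concatenate the pieces.

    The concatenation plays the role of an MPI gather onto one node.
    """
    result = []
    for start, stop in row_ranges(len(matrix), n_workers):
        result.extend(local_matvec(matrix, vector, start, stop))
    return result


matrix = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
vector = [1, -1]
print(row_ranges(len(matrix), 2))                     # [(0, 3), (3, 5)]
print(simulated_parallel_matvec(matrix, vector, 2))   # [-1, -1, -1, -1, -1]
```

Because each block of rows needs only the shared vector and nothing from the other blocks, the nodes barely have to talk to each other, which is the property that makes this kind of job a good fit for MPI.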

(more…)