Today I posted a public AMI which can be used to run a small beowulf cluster on Amazon EC2 and do some parallel computations with C, Fortran, or Python. If you prefer another language (Java, Ruby, etc) just install the appropriate MPI library and rebundle the EC2 image. The following set of Python scripts automate the launch and configuration of an MPI cluster on EC2 (currently limited to 20 nodes while EC2 is in beta):
Update (3-19-08): Code for running a cluster with large or xlarge 64 bit EC2 instances is now hosted on google code. The new images include NFS, ganglia, IPython1, and other useful python packages.
http://code.google.com/p/elasticwulf/
Update (7-24-07): I’ve made some important bug fixes to the scripts to address issues mentioned in the comments. See the README file for details
The file contains some quick scripts I threw together using the AWS Python example code. This is the approach I’m using to bootstrap an MPI cluster until one of the major linux cluster distros is ported to run on EC2. Details on what is included in the public AMI were covered in Part 1 of the tutorial, Part 3 will cover cluster operation on EC2 in more detail and show how to use Python to carry out some neat parallel computations.
The cluster launch process is pretty simple once you have an Amazon EC2 account and keys, just download the Python scripts and you can be running a compute cluster in a few minutes. In a later post I will look at cluster bandwidth and performance in detail. If you have only an occasional need for running large jobs, $2/hour for a 20 node MPI cluster on EC2 is not a bad deal considering the ~ $20K price for building your own comparable system.
(more…)