PyCon 2008 ElasticWulf Slides


Here are the ElasticWulf slides from my talk. The video will eventually be posted to the PyCon site.

The cluster management scripts I used to run the EC2 Beowulf cluster are hosted on Google Code:

ElasticWulf Project

I should make the initial check-in by Monday; until then, you can try out the 32-bit images from the old EC2 tutorial.

PyCon highlights for me included Guido popping his head in towards the end of my talk (unless I was so tired from last-minute preparations that I was hallucinating?) and meeting the great essayist Zed Shaw. The “Birds of a Feather” (BOF) sessions have been my favorite part of the conference so far. Tonight, I caught the tail end of an interesting Natural Language Processing discussion. Chris McAvoy talked me into holding a Netflix Prize BOF session, where we exchanged insights about using Python for collaborative filtering. Later that night, my coworker Chris Gemignani organized a Data Visualization session where he did some cool things with Python and NodeBox. We also hung out with Peter Fein from the job search engine JuJu, who pulled together an engaging Distributed Computing BOF. JuJu just released a neat RESTful Python search engine project called GrassyKnoll, which you should check out. I’ll post more on PyCon when I get back to DC, along with a tutorial on using IPython1 with ElasticWulf.

I thought it was a nice touch that the back of the conference shirt featured the xkcd Python comic.

  • Steve:


    I'm looking at resurrecting this on Ubuntu now. I just want to point readers to your site, since I get a lot of questions about running distributed Matlab on EC2: Econ Steve: distributed matlab (part 1 of 3) - compiling.


    One of the posts I'm working on is replicating this Microsoft risk calculation example using open source alternatives on Amazon EC2: http://blog.jonudell.net/2008/03/27/cluster-computing-with-large-data-for-the-classroom/


    -Pete

  • Thanks for the quick response. Let me know if I can help with testing or anything. This is really cool stuff.

  • Stephen: Back when I first created the AMI images (late 2006?), Fedora was the most stable base image on EC2. Most of the MPI/NFS docs I ran across were also Fedora / Red Hat based, so it was the easiest option at the time. I'm working with the infochimps.org guys on a new Swiss Army knife AMI based on Ubuntu, and I'm planning on migrating the elasticwulf launch utilities over as well to take better advantage of boto and EBS data volumes. Stay tuned.

  • Thank you for posting this. I am going to try this out this weekend or next week. I need to install R and snowfall/sfCluster and rebundle, and this should fit my needs just fine. I am going to try some simulation for solving stochastic differential equations. This is just the ticket.


    Quick question: Is there a particular reason why you chose Fedora Core 6 as opposed to a newer version? I am more of an Ubuntu guy, so I don't know the difference between the Fedora versions. Would it be worth trying to recreate the environment in a more recent version?

  • Patrick,


    No problem, the cluster management code is checked in now at Google Code:


    http://code.google.com/p/elasticwulf/


    You can grab it from the subversion repository with the following command:


    svn checkout http://elasticwulf.googlecode.com/svn/trunk/ elasticwulf-read-only


    Some people have run into an issue where the configuration script asks for a password. I'll try to track down what is causing it and check in a fix.


    In the meantime, there is a workaround in the comments of the old tutorial:


    http://www.datawrangling.com/mpi-cluster-with-python-and-amazon-ec2-part-2-of-3.html#comment-1148


    -Pete

  • I think tomorrow is the promised "Monday after PyCon" ;) Sorry to pester, but just wanted you to know that there's an audience for your cluster management code. Thanks!
