It's spring cleaning time at Data Wrangling. I've bookmarked 230 new datasets since publishing my first dataset linkdump in January 2008, so at the request of @mrflip, I've appended them to the original post along with a JSON dump of the tagged links. Flip and the other Infochimps will be pulling anything they might have missed into the infochimps.org dataset repository.
You can check out the new list of datasets at the same URL:
"Some Datasets Available on the Web"
Around 85 of these datasets can be redistributed publicly: http://delicious.com/pskomoroch/redistributable+dataset. The rest are mostly free for academic use, but the license conditions vary. Some appear to adhere to the Open Definition terms (http://opendefinition.org/).
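If you want to pull the redistributable subset out of a JSON dump of tagged links yourself, something like the following rough sketch might work. The layout here is an assumption, not the actual format of the dump: I'm pretending each bookmark is an object with "url", "title", and a space-separated "tags" string, and the sample entries and the `filter_by_tags` helper are made up for illustration.

```python
import json

# Hypothetical sample mirroring an ASSUMED dump layout: a list of objects
# with "url", "title", and a space-separated "tags" string. The real dump
# may be structured differently.
SAMPLE_DUMP = """
[
  {"url": "http://example.org/a", "title": "Dataset A", "tags": "dataset redistributable"},
  {"url": "http://example.org/b", "title": "Dataset B", "tags": "dataset academic"},
  {"url": "http://example.org/c", "title": "Lecture C", "tags": "video lectures"}
]
"""


def filter_by_tags(bookmarks, required_tags):
    """Return the bookmarks that carry every tag in required_tags."""
    required = set(required_tags)
    return [b for b in bookmarks if required <= set(b.get("tags", "").split())]


bookmarks = json.loads(SAMPLE_DUMP)

# Same tag intersection as the redistributable+dataset bookmark URL.
redistributable = filter_by_tags(bookmarks, ["redistributable", "dataset"])
for b in redistributable:
    print(b["title"], b["url"])
```

Adjust the field names to whatever the dump actually uses; the set-subset check is the only part that matters.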
New Video Courses
In addition to the datasets, I've bookmarked 20 new video courses since the original video lecture post was published in April 2008. These are mostly graduate and advanced undergraduate courses in Physics, Mathematics, and Computer Science. Among these are full video courses in Parallel Programming, Loop Quantum Gravity, Machine Learning, Financial Markets, and other fun subjects.
The new videos have been added to the post:
"Hidden Video Courses in Math, Science, and Engineering"
Videos of Talks & Seminars
As an added bonus, here is a completely unorganized list of interesting programming, machine learning, and visualization talks that caught my eye in 2008:
- YouTube - Google University Inaugural Lecture: Expanding the Frontiers of Computer S...
- ResearchChannel @ iTunes U
- Videos Posted by Facebook Engineering Tech Talks: Memcached Tech Talk with Mark Zuckerberg | Facebook
- Cloud Camp Atlanta Recap | IT Management and Cloud Blog
- NIH VideoCasting Past Events
- Applications of Query Mining
- NIPS '08 Workshop: Machine Learning Open Source Software
- Extending Selenium | Software Development Videos
- Michael Nielsen » Lectures on the Google Technology Stack: Syllabus and Background Reading
- Google News Personalization: Scalable Online Collaborative Filtering
- Predictive Discrete Latent Factor Models for Large Scale Dyadic Data
- Practical Statistical Relational Learning
- doug cutting - Google Video
- Semisupervised Learning Approaches
- Murray Gell-Mann on beauty and truth in physics | Video on TED.com
- reddit all: alien artist blog: startup insights from dharmesh shah - in video!
- YouTube - Similarity Search: A Web Perspective
- Google Technology RoundTable: Map Reduce
- Hadoop Panel Discussion
- Entropy of Search Logs: How Hard is Search? With Personalization? With Backoff?
- YouTube - Slightly Advanced Python: Some Python Internals
- YouTube - BayPIGgies Meeting - SF Bay Area Python Interest Group - Python Callbacks
- YouTube - Knowledge-based Information Retrieval with Wikipedia.
- Some Excellent ISV Advice from Jason Fried | The Balsamiq Blog
- YouTube - DjangoCon 2008 Panel: Django Success Stories
- YouTube - DjangoCon 2008 Keynote: Cal Henderson
- InfoQ: Designing RESTful Rails Applications
- Large Scale Learning Which Is Actually Useful
- YouTube - Explorations of the Mind: Intuition
- Early YouTube Engineer Tells All - GigaOM
- Video of Rich Wolski's EUCALYPTUS talk at Velocity - O'Reilly Radar
- Relational Learning as Collective Matrix Factorization
- Google I/O Sessions (Google I/O Session Videos and Slides)
- Martin Wattenberg, "Money is Beautiful: Looking at Markets in New Ways"
- O'Reilly Money:Tech Conference 2008 — O'Reilly Conferences, February 06 - 07, 2008, New York, NY
- deus ex machine: We Use All 100 Percent of Our Brain
- Confreaks: MountainWest Ruby Conference 2008
- The Science and Art of User Experience at Google
- Code4Lib 2007: Erik Hatcher Keynote
- Amazon Web Services Blog: Use Amazon SQS to Build Self-Healing Applications
- LinuxConf.Au: Puppet: A System Administration Abstraction and Automation Framework :: Tech Videos, Screencasts, Tutorials, Webinars, Techtalks, Tutorials
- YouTube - Supporting Scalable Online Statistical Processing
- next.yahoo » Blog Archive » Big Data: Viewpoints from the Facebook Data Team
- Scaling Mania at MySQL Conference 2008 | High Scalability
- Hadoop Summit and Data-Intensive Computing Symposium Videos and Slides | Yahoo! Research
- Biodefense Video Lectures and Seminars
- Google Developers Day US - Theorizing from Data
- YouTube - Visual Perception with Deep Learning
- Learning with Large Datasets [Video]