2014 highlight (2): One of the best courses on Big Data and Data Mining

I already mentioned the Hastie & Tibshirani course on statistical learning as one of my personal highlights in data science last year. My second highlight is also an online course, also by leading experts in their field (this time: Big Data and data mining), also based on a (freely available) book and also by Stanford University professors: Jure Leskovec, Anand Rajaraman and Jeff Ullman’s course on “Mining Massive Datasets”.


If you’re interested in data science or data mining, chances are high that you have already come across their book. It can safely be considered a standard work on the fascinating intersection of data mining algorithms, machine learning and Big Data. The 7-week course is the online version of the Stanford courses CS246 and the earlier version of CS345A.

The course is very dense and covers a lot of territory from the book, for example:

  • How does MapReduce work and why is it important?
  • How can I retrieve frequently appearing combinations from very large sets of items such as shopping baskets?
  • How to retain information about a data stream that does not fit in memory?
  • What are the most common tasks in supervised machine learning and how to implement them?
  • How do I program an intelligent system for recommending movies?
  • How to compute optimal placements of online advertisements?
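
To give a flavour of the shopping basket question, here is a minimal sketch of my own (not taken from the course material) that counts frequent item pairs in two passes, in the spirit of the A-Priori idea from the book: a pair can only be frequent if both of its items are frequent. The function name `frequent_pairs`, the `min_support` parameter and the toy baskets are made up for illustration, and everything is held in memory here, which a real massive dataset would not allow.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {pair: count} for item pairs found in at least min_support baskets."""
    # Pass 1: count in how many baskets each single item appears.
    item_counts = Counter()
    for basket in baskets:
        item_counts.update(set(basket))
    frequent_items = {item for item, n in item_counts.items() if n >= min_support}

    # Pass 2: count only pairs whose members are both frequent items.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical toy data, just for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
result = frequent_pairs(baskets, min_support=3)
print(result)
```

The first pass is what keeps the second one manageable: rare items such as "eggs" never generate candidate pairs at all.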

Some of the lectures are at a beginner to intermediate level, but some cover very advanced topics. What I especially liked about this course is that a lot of the material covered really is state-of-the-art in data mining. Some algorithms (e.g. the BIGCLAM community detection and CUR matrix decomposition) had only been developed about a year ago.

So, take a look at the book, and if you haven’t already, enroll on the Coursera course website to make sure you won’t miss the next session of this course.

New Podcast on Machine Learning

This new machine learning podcast “Talking Machines – Human Conversations on Machine Learning” really sounds like a lot of fun (and deep insight of course):

We start with Kevin Murphy of Google talking about his textbook that has become a standard in the field. Then we turn to Hanna Wallach of Microsoft Research NYC and UMass Amherst and hear about the founding of WiML (Women in Machine Learning). Next we discuss academia’s relationship with business with Max Welling from the University of Amsterdam, program co-chair of the 2013 NIPS conference (Neural Information Processing Systems). Finally, we sit down with three pillars of the field Yann LeCun, Yoshua Bengio, and Geoff Hinton to hear about where the field has been and where it might be headed.

Downloading the first episode from January 1st right now.