What I like most about the R and Python developer and user communities, is their incredible openness and generosity. One of the finest examples in the past year was the online course “Statistical Learning” taught by Stanford professors Trevor Hastie and Rob Tibshirani.
In this MOOC they explain very understandably (even […]
If you look at the investments in Big Data companies in the last few years, one thing is obvious: This is a very dynamic and fast growing market. I am producing regular updates of this network map of Big Data investments with a Python program (actually an IPython Notebook).
But what insights can be […]
Open foresight is a great way to look into future developments. Open data is the foundation to do this comprehensively and in a transparent way. As with most big data projects, the difficult part in open foresight is to collect the data and wrangle it to a form that can actually be processed. While […]
To fill the gap until this year’s Strata Conference in Santa Clara, I thought of a way to find out trends in big data and data science. As this conference should easily be the leading edge gathering of practitioners, theorists and followers of big data analytics, the abstracts submitted and accepted for Strataconf should […]
A lot of people still have a lot of respect for Hadoop and MapReduce. I experience it regularly in workshops with market researchers and advertising people. Hadoop’s image is quite comparable with Linux’ perceived image in the 1990s: a tool for professional users that requires a lot of configuration. But in the same way, there […]
I am a regular visitor of Google’s research page where they post all of their latest and upcoming scientific papers. Lately I have thought whether it would be possible to statistically extract some of the meta-information from the papers. Here’s the result of the analysis of the papers’ titles produced with just a few […]
“So, what’s the mood of America?”
One of the most fascinating novels so far on data-driven politics is Neal Stephenson’s and J. Frederick George’s “Interface“, first published in 1994. Although written almost 20 years ago, many of the technologies discussed in this book, would still be […]
A very clear indicator that a topic is not only an ephemeral hype is when there will be a scientific journal for this new topic. This has just happened with Big Data as Liebert publishers just announced at the Strata conference the launch of their peer-reviewed journal “Big Data” for 2013. It will […]