I like to use the analogy of building bridges. If I have no principles, and I build thousands of bridges without any actual science, lots of them will fall down, and great disasters will occur.

Similarly here, if people use data and inferences they can make with the data without any concern about error bars, about heterogeneity, about noisy data, about the sampling pattern, about all the kinds of things that you have to be serious about if you’re an engineer and a statistician—then you will make lots of predictions, and there’s a good chance that you will occasionally solve some real interesting problems. But you will occasionally have some disastrously bad decisions. And you won’t know the difference a priori. You will just produce these outputs and hope for the best.

Today I learned there’s another Michael Jordan who is as awesome at machine learning as #23 is at basketball. IEEE Spectrum’s article “Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts” is worth a read and a re-read.

It’s also worth noting that Professor Jordan did an AMA on Reddit, disagreed with the title and characterization of the IEEE interview, and wrote a follow-up response on a WordPress-powered blog.
