Laplacian Ambitions: In Which I Will Master Bayesian Data Analysis
There are a few reasons for this pursuit, to be honest.
Gaining additional marketable skills, for one. Becoming a Bayesian samurai can only help me in my quest to continue to get paid to play with data, right? Data science street cred, of course. Very difficult to come by honestly. General fame and fortune, really. Mostly, though, this undertaking is driven by a long-held and burning desire for most, not just some, of the things I say in any work I do to be useful and meaningful to real people in the real world.
Most of my formal statistical training is through the study of applied economics, where I've studied both quantitative methods and theory. I've learned a lot and am grateful for the education, but by far the most obvious thing I've learned is that I am decidedly not a theory guy. My economic theory classes have often felt uncomfortably similar to the Sunday school of my childhood, the major difference being the expression of the core theological principles in equations. My quantitative and modeling classes have mostly been real pleasures, and I value the additional tools I've gained through them. But another thing I've learned is that a major part of the reason so many have difficulty grasping statistics and probability is that the traditional interpretations of probability and asymptotic justifications for the most common methods of statistical inference are fundamentally and unnecessarily unintuitive.
There are other people, far more able than me, currently engaged in laying out many of the problems with traditional statistical inference. John Myles White just posted the third in a great series articulating the weaknesses of Null Hypothesis Significance Testing (NHST). His Twitter feed is abuzz with links to other great sources on why p-values, besides being unintuitive, are not nearly as useful in judging the quality of research results as their ubiquity would suggest. (The first post in the series is here, the second here.) My first really useful overview of Bayesian methods of inference was a recent talk John gave at George Mason University, the content and related code of which are available through his GitHub repository, and he and Drew Conway wrote a really great O'Reilly book called Machine Learning for Hackers that I've used at work. Incredibly useful guy to know about.
In the comments of John Myles White's first post, Ethan Fosse effectively articulates the fundamental strangeness of the p-value and the null hypothesis:
It’s incredibly difficult to interpret them correctly, in part because they really are very weird constructions. As is well-known, the p-values and confidence intervals of a particular parameter describe neither the properties of the data set at hand nor the particular model fit to the data. Instead, the p-values and confidence intervals describe the imagined properties of an imagined distribution of an unobserved parameter from a set of unobserved models that we imagine have been fit repeatedly to imagined data sets gathered in a similarly imagined way from the same unobserved population. Thus, a p-value never gives the probability that our parameter is above a certain observed threshold, and a confidence interval never indicates the probability that our parameter lies within a certain set of observed values.
Read that again.
I know that if you have studied statistics, it is very likely that you knew, on some level, that this is what a p-value is. But are you really confident that everyone you studied with, every professor you had, and all the people who are now likely employing these techniques to analyze data in all sorts of fields truly understand, and could articulate, what these values actually mean in the context of their research? Especially in the private sector, p-values and confidence intervals are abused and repackaged in all sorts of meaningless ways. Part of my plan with this research is to show examples of this and rework them in more useful and interpretable ways in the Bayesian framework.
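To make the "imagined repetitions" idea concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not anything taken from the posts mentioned above: the population mean, the standard deviation (treated as known), the sample size, and all function names are hypothetical. The first part draws many imagined data sets and counts how often a 95% interval covers the true mean, which is the only thing the "95%" describes. The second part shows one simple Bayesian counterpart: under a flat prior and known standard deviation, the posterior for the mean is normal, so a direct statement like "the probability that the mean exceeds a threshold, given this data" can be computed.

```python
import random
from statistics import NormalDist, fmean

TRUE_MEAN = 10.0   # hypothetical population mean (assumed for the demo)
SIGMA = 2.0        # population standard deviation, assumed known
N = 30             # observations per imagined data set
Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def confidence_interval(sample, sigma=SIGMA):
    """95% z-interval for the mean when sigma is known."""
    center = fmean(sample)
    half_width = Z95 * sigma / len(sample) ** 0.5
    return center - half_width, center + half_width


def coverage(reps=2000, seed=0):
    """Fraction of imagined repeated samples whose interval covers TRUE_MEAN."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
        low, high = confidence_interval(sample)
        hits += low <= TRUE_MEAN <= high
    return hits / reps


def posterior_prob_above(sample, threshold, sigma=SIGMA):
    """P(mean > threshold | data), assuming a flat prior and known sigma.

    Under those assumptions the posterior for the mean is
    Normal(sample mean, sigma / sqrt(n)).
    """
    posterior = NormalDist(fmean(sample), sigma / len(sample) ** 0.5)
    return 1.0 - posterior.cdf(threshold)


if __name__ == "__main__":
    # A statement about the procedure over many imagined data sets.
    print("long-run coverage:", coverage())

    # A statement about the parameter given the one data set in hand.
    rng = random.Random(1)
    one_sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    print("P(mean > 9.5 | this data):", posterior_prob_above(one_sample, 9.5))
```

The coverage number is a property of the interval-building procedure across repetitions, and says nothing about any single interval. The posterior probability is conditional on the one sample actually observed, but it depends on the prior and model assumptions named above.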
Pierre-Simon Laplace, of the "Laplacian ambitions", was an 18th-century French mathematician, and he is my favorite. I'm calling this pursuit my Laplacian ambition not JUST to show off that I am a huge nerd and have a favorite mathematician, and not JUST to add an absurd level of grandiosity to the whole thing. I'm calling the man out because he was really the first to apply the Bayesian idea to attempts to learn useful things in astronomy, demography and other areas. That's why I'm working to master these methods: to rigorously and continuously learn and communicate broadly useful, intuitively meaningful things about our world.
I'm not entirely sure why I'm blogging my attempt. I do feel a need for some very public accountability, to ensure both that I'm not doing it wrong and that I'm continuing to do it. At the very least, a long period between posts will make me look foolish to my professors and coworkers and the few friends and family members nerdy enough to actually read this stuff. Then again, the risk of severely embarrassing myself by doing it wrong may end up reducing my aforementioned 'marketability'. I guess we'll just have to see how it goes. I can always drop this whole thing down the memory hole...
I'll be keeping all the code and output in a repository on GitHub and posting condensed versions of my work here. Please feel free to embarrass me; it's the only way I'll learn. At least that's what my parents always said.