Three books that helped me learn Bayesian statistics

In a previous post, I wrote about my journey into learning (and continuing to learn) Bayesian statistics. Making the jump into Bayes would have been impossible without some great resources (books, articles, packages, and blogs) that have come out in the last few years. Here’s my quick review of books that have been most influential for me (a practicing ecologist). In later posts, I’ll talk about packages, articles, and blogs.


Bayesian Data Analysis in Ecology Using Linear Models with R, BUGS, and Stan, by Franzi Korner-Nievergelt et al.

As someone who used frequentist statistics for over a decade, I found this book essential for understanding Bayesian models. Unlike other Bayesian books I’ve read, this book does a side-by-side comparison of frequentist and Bayesian analysis of the same models, instead of pretending that frequentism doesn’t exist. That approach really helped me understand a fundamental lesson: learning Bayesian statistics did not require learning new model structures. A linear regression y ~ a + bx is a linear regression, whether it’s Bayesian or frequentist. The main difference is in how we interpret the parameters, in this case the intercept a and slope b. This book helped me clear up confusion over common questions, such as “Do you think this would work with a Bayesian approach?”. After reading this book, I now know that the answer is: of course it will work with a Bayesian approach.

The book comes with an R package and well-described R code in lmer() syntax that links to Stan for exploring the posterior. But it starts off by using a simple function, sim(), from the arm package. I really liked this, because it generates a posterior (assuming flat priors) without the need for external programs, and it allowed me to see the power of analyzing things like treatment comparisons using the full posterior (hint: it’s really easy once you get comfortable thinking about the iterations in the posterior).
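To make the hint concrete, here is a minimal sketch in base R of drawing a posterior for a linear model under flat priors and then comparing treatments by doing arithmetic on the draws. It is not the book's code. The data are simulated, the object names (d, beta, diff_high_low) are made up for illustration, and the two sampling steps are the textbook result for a normal linear model with a uniform prior on the coefficients and on log(sigma), which I am assuming is close to what sim() does for an lm() fit.

```r
set.seed(1)

# Simulated experiment (made-up numbers): three treatments, 20 replicates each
d <- data.frame(
  treatment = factor(rep(c("control", "low", "high"), each = 20),
                     levels = c("control", "low", "high"))
)
d$y <- rnorm(60, mean = c(10, 12, 15)[as.integer(d$treatment)], sd = 2)

fit <- lm(y ~ treatment, data = d)

# Pieces of the fit needed to simulate the posterior
n_draws <- 4000
k       <- length(coef(fit))          # number of coefficients
df      <- df.residual(fit)           # n - k
s2      <- summary(fit)$sigma^2       # residual variance estimate
V       <- summary(fit)$cov.unscaled  # (X'X)^-1

# Step 1: draw the residual variance (scaled inverse chi-square)
sigma2 <- df * s2 / rchisq(n_draws, df)

# Step 2: draw coefficients, multivariate normal given each sigma2
z    <- matrix(rnorm(n_draws * k), nrow = n_draws)
beta <- sweep(sqrt(sigma2) * (z %*% chol(V)), 2, coef(fit), "+")
colnames(beta) <- names(coef(fit))

# Each row of beta is one iteration of the posterior.
# Treatment means and their difference are just arithmetic on columns.
mu_control    <- beta[, "(Intercept)"]
mu_low        <- mu_control + beta[, "treatmentlow"]
mu_high       <- mu_control + beta[, "treatmenthigh"]
diff_high_low <- mu_high - mu_low

quantile(diff_high_low, c(0.025, 0.5, 0.975))  # median and 95% interval
mean(diff_high_low > 0)                        # P(high > low | data)
```

Any other comparison (control vs. low, the average of low and high vs. control, and so on) works the same way: compute it for every iteration, then summarize the resulting vector.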

Bayesian Models: A Statistical Primer for Ecologists, by Tom Hobbs and Mevin Hooten.

This was the first Bayesian book I ever read, and I learned Bayesian statistics from the authors at an NSF-funded workshop that they taught with Kiona Ogle and Maria Uriarte.

What I like most about this book are the clear ecological examples, and the emphasis on choosing the right likelihood with clear descriptions of the method of moments. My own work uses the gamma likelihood almost exclusively now, and their examples of the gamma in this book are excellent. In the appendix, there is also an extremely useful table that compares the different likelihoods, and what types of ecological data are relevant for each one. (Also see Sean Anderson’s excellent vignettes for gamma examples in Bayes and non-Bayes.)

The book does not have any code, instead using detailed mathematical notation and DAGs. For me, this was difficult to digest as a first Bayesian text. I like trying to replicate someone else’s work by trying to code it, failing, trying again, failing, etc… That’s not the most efficient way, but it works for me. However, the authors of this book also rightly point out that adding code or specific software will limit their audience. New packages come out all the time, instantly dating anything that would be in the book. Because of that, this book will be useful regardless of the programming language you use (or will use in the future).

Statistical Rethinking, by Richard McElreath

A lot has been written on this book already (e.g. here), and for good reason. It really is “a pedagogical masterpiece”. When I teach Bayesian statistics to our graduate students, this is the book we use. It comes with its own R package (rethinking), which is used throughout the book.

One of the things I like best about it is the clear description of what the code and formulas mean. Its use of R code and non-mathematical formulas is a godsend for readers who have very little recall of algebra or calculus. In that sense, it provides a nice contrast to the Hobbs and Hooten book, or to other well-known Bayesian books, such as Gelman et al.’s Bayesian Data Analysis.

This book is most helpful if you read the whole thing. That probably sounds obvious, but I say it because, as the name suggests, it really is a new style of thinking and writing about statistics. It is designed as a complement to a semester-long course, in which each chapter builds on the others and references past analyses. It would be difficult to drop in on chapter 12 to only learn multilevel models if you’re not already familiar with the syntax and examples of earlier chapters. Of course, you should plan to learn Bayesian statistics over months to years, anyway. Shortcuts to understanding any new statistical philosophy and re-wiring your statistical workflow don’t exist.

Importantly, as an example of the clarity of writing, McElreath has done away with traditional statistical lexicons that often confuse non-statisticians. If you have to pause every time you see “i.i.d” or “moments” or “jth group”, then this book is for you. Sure, it contains all of those concepts (often as separate “Overthinking” sections), but describes them in fresh ways, without resorting to verbal shortcuts. Brevity is not always a pedagogical friend, and McElreath understands that.

The part of this book that I don’t like as much is that the plots use base R, often with for loops. That’s just a personal preference, as I tend to use tidyverse and ggplot. The good news is that Solomon Kurz earned a lifetime’s worth of good academic karma by recoding everything in this book, from models to figures, with tidyverse, brms, and ggplot.

The other thing I’d hoped for is examples with categorical predictors that contain more than two levels. There are lots of examples of models with continuous predictors and with categorical predictors with two levels (i.e. 0 or 1). But I’m an experimental ecologist and we often have treatments with 4-5 levels, typically measured repeatedly over time, where we want to derive the posterior distribution for each treatment and compare them. The rethinking package can actually do this quite easily (hint: look at the end of Chapter 5), using the correct 0/1 matrix of predictors. But if you are used to using a shortcut like y ~ time*treatment to specify an interaction in base R models, there is nothing like that in rethinking.
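If you are used to the formula shortcut, a small base R sketch shows the 0/1 columns that the formula builds behind the scenes. The data frame here is hypothetical (four treatments, a to d, each measured at three times), and how you then write those columns into a rethinking model is not shown. This only illustrates the indicator matrix itself.

```r
# Hypothetical design: 4 treatment levels, each measured at 3 time points
d <- expand.grid(time = c(1, 2, 3),
                 treatment = factor(c("a", "b", "c", "d")))

# Intercept plus one 0/1 column for each non-reference level (b, c, d)
X <- model.matrix(~ treatment, data = d)
head(X)

# One 0/1 column per treatment and no intercept, which can be handy
# when you want a parameter for each treatment directly
X_each <- model.matrix(~ 0 + treatment, data = d)
colnames(X_each)

# What the shortcut time*treatment expands to:
# time, the treatment indicators, and time-by-indicator products
X_int <- model.matrix(~ time * treatment, data = d)
colnames(X_int)
```

The interaction columns are simply the time column multiplied by each indicator column, so writing them out by hand is tedious rather than difficult.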

My statistical journey as an ecologist

When I was in grad school, Ken Burnham gave a seminar in my department about model selection and met with our research group. His book with David Anderson had been out for ~3 years at the time (it now has more than 45,000 citations!), but I had zero idea of what it was or why everyone was so excited about it. My understanding of statistical analysis was so poor that when a professor suggested that I should use model selection in my dissertation, I could only nod silently. In reality, I didn’t even know what a model was.

Sure, I had run t-tests and ANOVAs in SPSS and PROC MIXED in SAS, but they were just names for things that I didn’t really understand. The idea that there were underlying similarities between them, that they were models, was baffling to me. I was happy enough just getting the software to work. Then I’d google how to interpret the output, and try to add the stats to my paper with as little explanation as I could get away with, hoping no one would ask about the stats.

I don’t think I was alone. Like most ecologists I know, especially those of us who use controlled experiments, my training in statistics was limited to a few graduate courses in biostatistics that followed a familiar pattern.

We learned tests. 

If you have two groups, then use a t-test.
If you have more than two, then use an ANOVA.

We learned rules.

If your data are not normally distributed, then it’s Kruskal-Wallis time. Heterogeneity of variance is something you should really be afraid of.

But we didn’t learn what any of this meant. At least I didn’t. And for a while, that was just fine. My experiments were going well, showing big effects that hardly needed a p-value to convince anyone. So what if they weren’t analyzed with the perfect models (whatever that meant)? The science was still sound, and we were replicating the findings. All was OK.

As I moved into postdocs and began to collaborate with a wider group of people, I felt a nagging discontent. Everyone had a unique set of rules to apply or ignore, often couched in folklore, such as the idea that statistics are only around to make up for poorly designed experiments. Or that you should always use Tukey. Or that you should always use Bonferroni. Or that pseudoreplication was something to be terrified of (rather than just modeled).

But there were cracks in the wall.

Then, in 2011, I came across several papers by Shinichi Nakagawa that blew me away. The papers were critical of ecology’s blind allegiance to Bonferroni-style corrections and emphasized effect sizes as critical measures over p-values (written with Innes Cuthill). These papers were a revelation to me, partly for their sensible approach, but mostly for the simple existence of debate. Before then I had assumed that statisticians agreed on all the rules (even if we non-statisticians didn’t). After all, what were our textbooks and classes in statistics if not a slew of rules to be wary of? Instead, these papers brought a sense of excitement. Perhaps I was not alone in my confusion and frustration with arbitrary cutoffs. Perhaps the weirdness of it all wasn’t just a reflection of my poor math skills. Perhaps there was more to a statistical analysis than whether p was above or below 0.05.

Perhaps there was more…but I didn’t know what. For the next several years, I plodded along, analyzing data with the standard tools, but becoming increasingly disillusioned with them. Then, in 2014 (or maybe 2015), I analyzed some data for the first time using lmer() in R. The output had all the familiar summary statistics that come with any linear analysis, but to my dismay, there were no p-values. StackExchange quickly confirmed that this was no mistake. It also confirmed that I was not alone in wondering where this cornerstone of my statistical understanding had gone. The question has been viewed over 100,000 times.

Here was my opening. I had already committed myself to using R full-time when I started my faculty position, and the model I needed to run was a linear mixed model. I was stuck with lmer(). This was my chance to break free, to embrace effect sizes or confidence intervals or bootstrapping or…something…and let go of p-value shackles. And so, like any good transition from a comfort zone, the first thing I did was scramble straight back to safety. I googled a solution to produce p-values from lmer(), and that was that.

Eventually, I did try to publish a paper without p-values. In that paper, I tried to use some god-forsaken confidence interval approach I’d read in an ecotox journal. Something about comparing overlap with 84% intervals instead of 95%, because 84% was better at replicating the alpha of 0.05. I honestly can’t remember. What I do remember is that reviewer one hated it and refused to read beyond page 9, where I’d introduced the 84% idea. I can’t blame them. It sucked. I eventually abandoned that approach and published it in a different journal with all the traditional statistical approaches.

I needed some help

Clearly, I was not going to learn a better way to analyze my data on my own. I needed help, so I attended a workshop in Fort Collins, Colorado in 2015 (link is for a different year, but is the same workshop). It was targeted at ecologists who wanted to learn Bayesian statistics. I didn’t know what Bayesian statistics was, but I knew it was different than what I’d been doing and it seemed hip (sometimes that matters, too). The workshop was intense, and I was struck by how quickly I reverted to my habits from undergrad – sit in the back, never talk, wait too long to clarify a simple misunderstanding. Even though I was a college professor, I was once again just a so-so student. In between classes, I couldn’t make myself ignore the grant I needed to write, the paper I was finishing, or the summer field season my grad students were starting. Plus I was back in Fort Collins, where I’d lived and worked before, and I had lots of reminiscing to do.

Year(s) of the books

When I got back to my office in Vermillion, I half-heartedly tried to run a Bayesian analysis, using the approach I’d learned at the workshop (in rjags). But it failed miserably, and I went straight back to analyzing data with the lmer p-value hack. But 2015/2016 was an incredible year for learning Bayes, due to the publication of several books that offered a fresh way of thinking about data analysis in general:

Bayesian Models: A Statistical Primer for Ecologists, by Tom Hobbs and Mevin Hooten (who taught the workshop I had attended), Statistical Rethinking, by Richard McElreath, and Bayesian Data Analysis in Ecology…, by Franzi Korner-Nievergelt et al.

These books touched on similar topics, such as defining Bayesian analysis, or describing a hierarchical model, but they did so in unique ways. In the year after I attended the workshop, I would constantly shift between them to understand some component of a model. Hobbs and Hooten described how to use the posterior distribution to compute derived quantities, akin to the “post-hoc” tests that had always flummoxed me. It was so simple that I still have to re-read it every few months just to make sure I haven’t missed something. McElreath’s description of hierarchical models, and of the underlying structure of all of those “tests”, is as good as it gets.

But it was Korner-Nievergelt et al.’s side-by-side comparisons of Bayesian and frequentist results, along with their bare-bones R code, that were most revealing to me. The first Bayesian regression I could run myself and understand came from the introductory chapters of their book. Numerically, the results weren’t any different than what I would have gotten from a frequentist analysis (i.e. the slope and intercept were numerically similar regardless of the method). But their book also contained perfectly understandable descriptions of why numerical similarity is not the point. With Bayes, I could now make direct statements about hypotheses that I couldn’t make otherwise. It is hard to describe what a relief it is to be able to say: “the probability that the slope is greater than zero is 93%”, instead of “the probability of obtaining data as extreme or more extreme than we obtained under the null hypothesis that the slope is exactly zero is 0.03”, which of course is never actually written out, but is instead short-circuited as something like “the slope was positive (p=0.03)”. The straightforward way of putting the results of Bayesian analyses into sentence format is easily one of the best arguments I have when someone asks why they ought to learn Bayes. It’s one less thing to worry about, so you can get on with what is most important, your scientific question and results.
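For a simple regression under flat priors, both kinds of statement can be computed from the same fit, which makes the contrast easy to see. The sketch below uses simulated data and assumes a normal linear model with a uniform prior on the coefficients and on log(sigma). Under that assumption the posterior for the slope is a shifted and scaled t distribution, so the probability that the slope is positive has a closed form. With informative priors or other likelihoods you would instead count posterior draws, e.g. mean(draws > 0).

```r
set.seed(42)

# Simulated data (made-up numbers): a weak positive slope with noise
x <- runif(30, min = 0, max = 10)
y <- 2 + 0.15 * x + rnorm(30, sd = 1.5)

fit   <- lm(y ~ x)
slope <- coef(summary(fit))["x", ]  # estimate, SE, t value, p-value

# Frequentist statement: probability of data this extreme or more,
# if the slope were exactly zero (two-sided)
p_two_sided <- unname(slope["Pr(>|t|)"])

# Bayesian statement under the flat-prior assumption above:
# probability that the slope is greater than zero, given the data
post_prob_positive <- unname(pt(slope["t value"], df = df.residual(fit)))

round(c(p_value = p_two_sided, prob_slope_positive = post_prob_positive), 3)
```

In this special case the two numbers are tied together (for a positive estimate, the posterior probability equals one minus half the two-sided p-value), which echoes the point that numerical similarity is not what changes. What changes is the sentence you are allowed to write.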

Year of the packages

As if the publication of these books wasn’t helpful enough, the rstanarm package was released around this time, and Paul Bürkner released the brms package soon after. The latter uses the typical R syntax to specify models that I had become accustomed to, essentially removing any excuse I had left to not run Bayesian models as a default. Now, this frequentist regression in base R

lm(y ~ x, data=data)

became this Bayesian regression in brms

brm(y ~ x, data=data)

Since ~2016, my graduate students and I have exclusively used Bayesian analysis in our publications. To my surprise, we have had zero trouble convincing reviewers that this approach is acceptable. My transition from frequentist to Bayesian analysis has easily been one of the most intellectually satisfying things I’ve ever done. It only took, uh, 15 or so years.

How to write a research article in ecology

I wrote this for my students last year. They are thoughts I have to constantly remind myself of in my own writing. They are far from universal. I hope they help.  Jeff Wesner (21 August 2017)

Readers and reviewers are desperate to learn new and exciting science. They are not desperate to tear your science apart (with few exceptions, whom no one likes). Write for the first group, not for the second.

Readers and reviewers will always know less about your study than you do. Your writing should be crystal clear in its justification – why the study is done and who cares. That justification is obvious to you but not to almost anyone else in the world. As a reviewer, I often get stuck in the first few paragraphs, wondering why I’m spending time on this paper.

Here’s a hypothetical example of a vague justification for research on subsidies:

Not crystal clear – “Subsidies are clearly important for ecosystems (cite), though not always and in every case. We need to better understand their effects under X conditions. We measured the effects of d on insects.”

How to fix it? – Each of the above sentences would need its own paragraph. For example, you’ll need to convince most readers that subsidies are important (paragraph 1), explain why we need to see another study of them under X conditions (paragraph 2), explain what the importance of d is (paragraph 3), and state what your hypotheses are (paragraph 4). Even though these things might seem clear to you and me, they won’t be to readers. This is the job of your introduction: to use four paragraphs to get across a point that makes sense to you in four sentences.

Papers are single ideas

An individual paper is a single idea that takes 5000 words to get across. All words should be in service of this single idea. Though it pains me to write this, one way to think about it is to ask – If someone tweeted this paper, what would they say about it in 140 characters?

Loose goals for the structure of your paper:

Abstract – ~200-250 words

  • No detailed methods
  • 1-2 sentences of background
  • 1-2 sentences on your approach (“To test these hypotheses, we measured the effects of X on Y in artificial ponds.”)
  • 2-3 sentences of results (with a couple of the most important summary statistics)
  • 1 sentence that summarizes the importance of the results

The abstract will always feel sparse to you, because you know all of the details behind the study, and all of the cool things left out. But the abstract is key. It’s an invitation to read more, not the final story. It’s your elevator pitch.

Introduction – 4 paragraphs.

The first paragraph sets the scope. Don’t limit yourself. Write for all ecologists, not just someone interested in freshwater, or in plants, insects, or bacteria, but anyone interested in how the world works. That usually means you need to tie your study to a key concept in the broader field (energy flow, predation, food webs, pollution, biodiversity, co-evolution, etc.). Those are broad concepts that transcend ecosystem types, scales, and organisms. Start there, then narrow down.  

The fourth paragraph is simply a description of what questions you asked that addressed the big ideas in the first paragraph. Sometimes more than four paragraphs are required, but rarely. Aim for 4 and add only if necessary.

Methods – variable, but it should be clear how each of your methods relates to the questions you promised in the introduction.

Results – variable, but they need to explicitly answer the questions you laid out in the introduction. This is the #1 reason that papers often get bad reviews or are rejected. They set up some great questions, but don’t answer them in the results in any explicit way (or have a fatal flaw in the methods). Don’t make readers search for the answer. Give it to them.

Great results sections can be as short as a single paragraph (~4 sentences). When papers report every single p-value they came across (or credible interval), it signals that they aren’t sure what they’re studying. Report everything, but think hard about what to put in the supplementary information versus the actual paper.

Discussion – 4 paragraphs (on average). Common pitfalls of discussions:

  • Simply rehashes the results in more flowery language
  • Doesn’t tie the results to the main questions in the introduction.
  • Repeatedly says things like “We found such and such. It was similar to what so and so found, but not similar to what so and so found. [next topic].” The problem here is that there is no context. What have we learned from these similarities and dissimilarities to others’ work?
  • Doesn’t state the most important results. Don’t leave those up to the reader to interpret. State them explicitly.

Discussions are hard work, and the hardest part is knowing how your results fit into previous knowledge but also being explicit about what we’ve learned now as a result of your work. How did your study shed light on the contrasting results you mentioned?

Discussion approach to consider:

Start with the following sentence. “The most important finding of this study is….” That forces you to be confident in the importance of your work but also sets the stage for the reader, who will really want to know why they’ve invested time in this paper. What do you want them to remember? They may disagree with what is most important in your work, but at least they know where you stand.

Writer’s block

  • Will not be fixed by staring
  • 1 – Take a walk
  • 2 – Sleep
  • 3 – Read, read, read. The most effective tonic for my own writer’s block is to read other papers; typically a seminal paper that inspired the work is best. It takes a lot of effort to shut down your mind and focus on someone else’s work for a bit. Go someplace quiet, commit 2 hours to the paper. Thoughts will come that help your writing. I promise.

Read your paper out loud. Does it sound as if someone would talk that way (scientifically speaking)? It should.

Don’t utilize “utilize”, just use “use”.

You will write lots of things over your career. Any single paper is like an idea in a conversation that spans decades. Get it out and into the conversation, then move on to the next topic.


You will get harsh reviews. They will not matter to your career. Everyone gets them and it hurts every time. Chances are that a) famous person X didn’t really review your paper, b) even if they did, they wouldn’t remember it when you’re talking to them at a meeting, and c) all reviewers are human and may have given a different opinion at a different time. In other words, harsh reviews can be harsh depending on the reviewer’s mood – did they just give good reviews to a few other papers? Maybe they felt they weren’t careful enough on a previous review. Maybe they just got a really bad review themselves. Maybe their mom just died. Maybe they have no time for this review they signed up for 5 weeks ago and for which they’re now getting reminder emails from the editor so they spit out a review that is not as careful or nuanced as they intended. They’ll do better next time. Promise. Maybe they’re just terrible people (some are). Reviews are a snapshot of the quality of the work and also the mindset of the reviewer (and editor) at the time. I promise, the next journal will give completely different assessments.

For better advice: see here, and here

Eric Sazama’s first article is published! Wolbachia in aquatic insects.

Wolbachia is a fascinating critter. It’s a bacterial genus that infects lots of arthropods, and does all kinds of things to them that make great headlines, like killing males or making them eat brains. However, its commonness is disputed, particularly among insects that live in rivers and lakes (i.e. freshwater insects). So in this study, Eric Sazama answered the question, How many freshwater insect species are infected with Wolbachia? See the answer here, for free.

State Wildlife Grant awarded to Kerby and Wesner labs

Jake Kerby (PI) and Jeff Wesner (Co-PI) received a State Wildlife Grant to study the effects of tile drains on prairie pothole wetlands. This comprehensive study will measure chemical contaminant levels (e.g. Se, neonicotinoids, nutrients) in 18 wetlands and their effects on amphibians, insects, and fish.

Brianna Henry is an NSF Graduate Research Fellow!

Congratulations to Brianna Henry! Brianna is an undergraduate at Clarion University of Pennsylvania who was just awarded an NSF GRFP fellowship to conduct research on herbicides and wetland ecosystem ecology. This award is highly competitive – only 12% of the 16,500 submitted proposals were funded. She will join our lab at USD this summer, and we’re excited to learn what she discovers!

Habitat selection paper published in Ecology and Evolution

Wesner JS, Meyers P, Billman EJ, Belk MC. Habitat selection and consumption across a landscape of multiple predators. Ecology and Evolution.

We tested whether egg-laying female insects could detect differences in predator community composition. Because some predators are more lethal than others, the ability to differentiate predator risk when laying eggs can have large fitness consequences. To test this, we allowed insects to oviposit in tanks that contained a native dragonfly (Ophiogomphus sp.) or a non-native brown trout (Salmo trutta). Predators were housed in isolated outdoor tanks either alone (single species) or combined (both species together). Predators were also caged to avoid direct consumption during colonization.

Surprisingly, insect colonization (number of larval insects after 21 days) did not depend on whether predators were present or not, regardless of community composition. However, follow-up consumption trials suggested that laying eggs in predator pools had clear negative consequences for larvae, particularly in trout pools, which reduced larval survival by ~47%. Thus, egg-laying insects either did not or could not detect differences in larval habitat quality.
