In Defense of Beer-Drinking Scientists

I read an interesting story in the New York Times on Tuesday – interesting, but perplexing. It seems that a Czech ornithologist (more specifically, an avian evolutionary biologist and behavioral ecologist) surveyed other Czech ornithologists (more specifically, other avian evolutionary biologists and behavioral ecologists) on their beer drinking habits. He then correlated their scientific output (as measured by publications/year and citations/paper) with their annual beer consumption. The result was counterintuitive – higher beer consumption led to lower scientific output.

My first thought was to scoff at the study – after all, I drink a lot of beer, and my scientific output has been pretty good. Further, I hang out with quite a few other prolific scientists who also drink their fair share of man’s greatest beverage. There must be something strange about those Czech bird watchers.

But as I began to think further on the subject (and enjoy a fine Pale Ale to settle me down), I realized I was making two cardinal mistakes in my approach to this startling scientific development: 1) I trusted my limited anecdotal evidence over a statistically valid scientific study, and 2) I based my understanding of the science on a journalist’s description of a technical paper. Recognizing my initial flaws, I moved on to a smooth and especially bitter IPA and got on the internet. After a few minutes I had located the original paper in the biology journal Oikos. Here is the citation:

Tomáš Grim, “A possible role of social activity to explain differences in publication output among ecologists”, Oikos, OnlineEarly Articles, 8-Feb-2008.

The paper is only three pages long, so it was a quick read. It was also fairly easy to find the defects in the work. First, there was the common mistake of confusing correlation with causation. The author implied that increased beer drinking caused reduced scientific output. An equally likely explanation is that poor performance in one’s chosen career (in this case ornithology) led to increased beer drinking (and after all, the subjects live in a country with the world’s highest per capita beer consumption). Alternatively, a third, unmeasured factor could be leading to both poor job performance and higher beer consumption (a nagging spouse, for example).
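
To make the third-factor possibility concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the hidden factor, the unit effect sizes, and the sample size are assumptions, not anything taken from the paper. Beer and output each depend on the hidden factor and never on each other, yet they still come out correlated.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounder(n=5000, seed=1):
    """Beer and output both depend on a hidden factor, never on each other."""
    rng = random.Random(seed)
    # Hypothetical hidden factor (assumption for illustration only).
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    # The hidden factor raises beer intake and lowers output, plus noise.
    beer = [h + rng.gauss(0, 1) for h in hidden]
    output = [-h + rng.gauss(0, 1) for h in hidden]
    return pearson_r(beer, output)


r_confounded = simulate_confounder()
print(f"r between beer and output: {r_confounded:.2f}")
```

With these made-up effect sizes the theoretical correlation is -0.5, even though neither variable appears in the other's formula. The sketch only shows that such a pattern is possible; it says nothing about which explanation holds for the Czech ornithologists.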

As I looked more carefully at the data, I found a much more significant problem. The total number of data points (as these bird-watching scientists had been reduced to) was 34. This is not an exceptionally large number of subjects when one wishes to draw conclusions about all beer-drinking scientists. The discovered linear relationship between beer consumption and scientific output had a correlation coefficient (R-squared) of only about 0.5 – not very high by my standards, though I suspect many biologists would be happy to get one that high in their work.
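
For readers who want the arithmetic, a short sketch follows. It takes the two numbers quoted above (34 subjects, R-squared of about 0.5) and assumes a straight-line fit with a single predictor, in which case R-squared is simply the square of Pearson's r. The t statistic is the textbook one for testing r = 0 and assumes independent observations, an assumption that may not hold here.

```python
import math

n = 34            # number of subjects quoted in the post
r_squared = 0.5   # approximate R-squared quoted in the post

# With one predictor, R-squared is the square of Pearson's r.
r_magnitude = math.sqrt(r_squared)

# Textbook t statistic for testing r = 0, with n - 2 degrees of freedom.
t_stat = r_magnitude * math.sqrt((n - 2) / (1 - r_squared))

print(f"|r| = {r_magnitude:.3f}, t = {t_stat:.2f} on {n - 2} degrees of freedom")
# prints: |r| = 0.707, t = 5.66 on 32 degrees of freedom
```

So an R-squared of 0.5 corresponds to a correlation coefficient of roughly 0.71 in magnitude. The two quantities are related but are not the same number.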

But it was while I was switching to a magnificent Pacific Northwest microbrew porter that I saw the real problem. Looking at the graph of the 34 data points, it was clear that the entire correlation was caused by the five lowest-output scientists. Without those five data points, the remaining 29 – showing a wide range of scientific output and beer consumption habits – exhibited absolutely no correlation. Thus, the entire study came down to only one conclusion: the five worst ornithologists in the Czech Republic drank a lot of beer.
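
The mechanism described here, a small cluster of points producing a correlation that the rest of the sample does not show, can be sketched with synthetic data. The numbers below are invented and are not the paper's data: 29 points are built in mirrored pairs so that their correlation is zero by construction, and 5 hypothetical heavy drinkers with low output are then added.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_synthetic_sample(seed=0):
    """Return (beer, output) in arbitrary units: 29 uncorrelated points, then 5 clustered ones."""
    rng = random.Random(seed)
    beer, output = [3.0], [3.0]  # one scientist at the centre
    for _ in range(14):
        offset = rng.uniform(0.0, 1.0)
        result = 3.0 + rng.uniform(-1.0, 1.0)
        # Mirrored pair: same output, beer intake reflected about 3.0,
        # which forces the covariance of these 29 points to zero.
        beer += [3.0 - offset, 3.0 + offset]
        output += [result, result]
    # Five hypothetical heavy drinkers with low output.
    beer += [5.0] * 5
    output += [0.5] * 5
    return beer, output


beer, output = make_synthetic_sample()
r_all = pearson_r(beer, output)
r_rest = pearson_r(beer[:29], output[:29])
print(f"all 34 points: r = {r_all:.2f}, R^2 = {r_all ** 2:.2f}")
print(f"first 29 only: r = {r_rest:.2f}")
```

In this toy sample the full set shows a clearly negative correlation while the first 29 points show none. That demonstrates only that the pattern can arise; whether it describes the real data depends on the untransformed measurements, which a reader of the graph does not have.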

Other significant problems were also evident. Standard linear regression, with all the fanciest statistics one can muster, still makes the assumption that each data point is independent. But this study was specifically looking for the impact of social habits on scientific output. Isn’t it likely that some, or many, of these scientists socialized together? After all, the Czech avian evolutionary biology community is not that large. I know that much (possibly most) of my beer drinking is done with fellow lithographers. For all we know, the five lowest-output scientists that created this whole controversy were all part of a drinking club – they’re probably enjoying a fine pilsner and having a fine joke at our expense right now!
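
A rough simulation sketch of why independence matters. The setup is entirely hypothetical: 36 scientists in 6 drinking clubs of 6, with members of a club sharing part of their beer habits and part of their output, and with nothing at all linking beer to output. The quantity of interest is how widely the sample correlation swings by chance alone.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_r(n_groups, group_size, shared, rng):
    """One sample with no true beer/output link.

    `shared` is the fraction of each variable's variance that club
    members have in common (0.0 means fully independent scientists).
    """
    beer, output = [], []
    for _ in range(n_groups):
        club_beer = rng.gauss(0, 1)
        club_output = rng.gauss(0, 1)  # drawn independently of club_beer
        for _ in range(group_size):
            beer.append(shared ** 0.5 * club_beer
                        + (1 - shared) ** 0.5 * rng.gauss(0, 1))
            output.append(shared ** 0.5 * club_output
                          + (1 - shared) ** 0.5 * rng.gauss(0, 1))
    return pearson_r(beer, output)


def spread_of_r(shared, trials=2000, seed=0):
    """Standard deviation of r across many simulated samples."""
    rng = random.Random(seed)
    return statistics.stdev(simulate_r(6, 6, shared, rng) for _ in range(trials))


sd_independent = spread_of_r(0.0)
sd_clustered = spread_of_r(0.8)
print(f"spread of r, independent scientists: {sd_independent:.2f}")
print(f"spread of r, clustered scientists:   {sd_clustered:.2f}")
```

When club members resemble each other, chance correlations of a given size turn up more often than the standard formulas assume, so a test that treats all subjects as independent overstates its own confidence. The club sizes and the 0.8 sharing fraction are assumptions chosen to make the effect visible.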

In the end, though, I was pleased to see that careful reading and analysis of the original published work led to an easy debunking of the silly notion reported in the press that somehow beer drinking was bad for scientific performance. With the reputation of beer-loving scientists restored to its rightful glory, I sat back and sipped my double-chocolate stout. Ah, the life of a Gentleman Scientist.

41 thoughts on “In Defense of Beer-Drinking Scientists”

  1. Yes, you should ask your drinking buddies about their publication history and base ‘scientific’ facts on that.

  2. 0.5 for R^2 is ok, but R^2 isn’t the best for correlation, Pearson or Spearman or Kendall should be used. And since this is more or less a ranking, a non-parametric test like Spearman or Kendall would be better. The fact that there aren’t a lot of samples would be evident in the crappy p-values you get from the correlations which would probably be 95% or worse.

  3. Scott and Christian are correct – you should do your own study rather than empirically shred the Czech study. However, they are also royal jackasses. How dare they sneer at such impeccable (to borrow Mark’s comment) logic! The only reason I say you should do a study is because the most prolific parasitologists I know also can drink any ornithologist (regardless of nationality) under the table. Consistently. Damn the statistics, order another beer…

  4. Nice try,

    how much did u get paid to debunk?
    There is something affecting the thought processes when we drink, find that out.

  5. Tsk tsk. Both the author and "Anonymous" are confused.

    You confuse correlation coefficient (r) with the coefficient of determination (r-squared). Both are in the Pearson family of statistics, with r being THE Pearson’s product-moment correlation (PPMC) coefficient. Anonymous’ recommendation to use Spearman’s rho doesn’t really make any sense either, as that is more of a scale-of-measurement question, and not anything to do with parametric vs. non-parametric tests. Although it doesn’t state so here (perhaps the author can verify?), I can only assume that amount of beer recorded was in ounces/glasses/pints. Scientific output was measured as publications/year, which makes both a ratio-level scale of measurement, which means Pearson’s r or linear regression are quite appropriate.

    Back to the main concern though: Is it r^2 of .5 or r of .5?

    As this is essentially a study in social science, either way that is quite a large correlation for predicting human behavior. There is substantially more variability in human behavior than in the beakers of a chemist, so it is much more difficult to get a high correlation (and should not be expected).

    Having said that – your critique of influential data points/outliers, violations of the assumptions of independence, and small sample size are all spot on.

    Perhaps a new study should be conducted, with a much larger and better selected sample? Volunteers?

    Signed – a social scientist with better things to do!

    P.S. And anonymous – give up your antiquated hypothesis testing. p-values are of no use to anyone!

  6. If you read the blog post it says R^2 right in the blog post.

    Ha, the day I trust a social scientist who read a bunch of hip propaganda about ignoring p-values and confidence intervals is the day I stop worrying about validation and the possibility I could be wrong. P-Values help warn those analyzing the results if perhaps issues such as sample size are being ignored.

    Rank based correlations such as spearman and kendall are just fine in this case, and probably preferable since a .5 although interesting, suggests that

    You are ranking performance of profs here, there is a comparison, you can apply rank based correlations.
    Rank based (which is NON-PARAMETRIC, go talk to your school’s Stats Consultant, yes that Masters student who does it for the TA credit) correlations tell you, if you order two sets, how likely it is that large values of A correlate with large values of B. It is pretty easy to do and is arguably most useful to this study because it is the rank of the achievement. You even admitted rank based correlations are for "scale-of-measurement" questions, testing linearity isn’t that important here.

  7. p-values are only dangerous to those who don’t understand them and their limitations (especially social scientists who have their second year stats for social science course and their grad course on research methods under their belt).

  8. To Anonymous: Pretty sure it says "correlation coefficient (R-squared)," which would imply that they are the same thing. They are not.

    p-values are useful only if you have some reason to assume that the null hypothesis is true aside from wishful thinking (which is rarely the case) which as I said before means that it isn’t very useful.

    Yes, Spearman’s rho is non-parametric, but there is no reason to think that a non-parametric test is called for here. Yes, the vague construct "professor ranking" would call for such a thing, but we have a much more concrete scale: publications/year and citations/paper. There is indeed a population mean for each of these, assuming a population of "all researchers," and seeing if there is a linear relationship between ounces of beer consumed yearly and number of pubs per year is indeed interesting. Why isn’t it important to you?

    I wouldn’t call it "hip propaganda." The limitations of hypothesis testing have been known since it was introduced (nearly a century now, I think?); it’s just that most universities (especially social science programs) tend to ignore them. The typical approach is: Reject the null = my theory is true, relevant, and important. Which is obviously not the case.

    To Karver: I agree fully. But even when they aren’t dangerous, they are usually not very informative. Adequate sample size minimizes the possibility of both Type I and Type II Errors. If people would just stop trying to get away with cheap low-N studies, most of these issues would resolve themselves. Estimation of parameters should be the focus of science, not an arbitrary rejection/retention of the null.

  9. "Looking at the graph of the 34 data points, it was clear that the entire correlation was caused by the five lowest-output scientists."

    It is equally clear to me that the lack of a strong correlation is caused by the 6 individuals who drink moderate amounts of beer but maintain high output. I suggest that you consider your justification for removing those 5 points over a strong cup of coffee.

    "The discovered linear relationship between beer consumption and scientific output had a correlation coefficient (R-squared) of only about 0.5 – not very high by my standards, though I suspect many biologists would be happy to get one that high in their work."

    Yes, It is a little easier when one simply removes offending points. We should be pleased that most studies don’t live to your standards.

  10. I forgot – I never said anything about ignoring confidence intervals. CIs are very useful. Seeing if they happen to include 0 is less so.

  11. Well done! Correlation does not equal causation… the forgotten fact that is the downfall of many "discoveries".

  12. I haven’t bothered to read the study. But did the methodology sort results by age?

    Presumably young scientists have less publications and drink more. While older scientists have more publications and have settled down and drink wine or whatever.

  13. Since this study and comments pertain to beer and scientific effort I wanted to pass along a "Theorem" that a fellow graduate student (~35 years ago) developed. His name is HB. (to protect his identity) and he was working on his MS degree in engineering when he developed this Theorem.

    HB enjoyed beer as much as any person I have ever met.

    As graduate students we spent considerable time taking classes, studying and doing research. Many of our graduate engineering classes started early and we typically worked into the late of night, so we spent a lot of time in our offices on campus.

    HB and a few other graduate students would occasionally take an evening break and go have a beer or two or three. HB found that after one or two beers he could return to his office and continue studying or performing research with a reasonable level of proficiency. After the third beer, although not visibly impaired, he found his productivity had fallen to an unproductive level so he might as well stay in the bar and continue drinking.

    After much investigation of this phenomenon (18-24 months worth) HB formulated the "Three Beer Theorem". The short version is: after three beers, don’t waste your time trying to return to studying and/or doing research.

    Give’m hell HB you are truly unique.

  14. There’s one little flaw in this otherwise most excellent analysis, but it takes nothing away from its correctness.

    Are we sure that the 5 ornithologists who were the "worst" in the study are the worst in the Czech republic?

  15. Hmmm. I think that we need to conduct a large, international, cross-disciplinary study and get the NIH (perhaps the NIAAA) to supply the beer; oops I mean "provide the funding".

  16. As the marketing guy at Brewery Ommegang (Belgian-style ales from Cooperstown, NY) my understanding of the scientific method doesn’t go much beyond faintly-retained theory from college and the practice of looking at sales data, which tends to make my eyes glaze over faster than a high-gravity beer.

    However, I’m glad to hear that there are two sides to this story. Or three. Or more.

    I’d add that my experience with a small group of highly successful Czech friends (who admittedly are all artists) is that their prodigious beer drinking doesn’t seem to have any effect on their ability to operate in their chosen field. It may even enhance it.

    Finally. If anyone out there wishes to put on a beer drinking/scientific output competition (however one would do that) I’d be willing to consider a small sponsorship. Mainly in beer, of course.

  17. As a post-doctoral research fellow in molecular pharmacology, avid beer drinker, and friend of Brewery Ommegang, I fully endorse the idea of such a competition. We could even hold it on the lovely Brewery Ommegang grounds. Seeing as writing a journal article can take many months, perhaps we can just do feats of SCIENCE that can be done in a day, while drunk, and with proper notification of local fire marshals/emergency personnel.

  18. As an irregularly published scientist, who brews and consumes beer steadily, I wonder how many additional papers I could have published if I THREW OUT FIVE DATA POINTS.

    I know you can toss an outlier or two out of 34, but you CAN NOT toss 5 out of 34, especially when they are clustered. I wouldn’t trust/hire this blogger to run any of my experiments.

    I’m also pretty sure that there is strong correlation between productivity and leisure, provided no sleep is controlled for.

    -S

  19. I can only speak from experience but some of my most intuitive leaps came while at least three pints into the evening. That said I would highly recommend that the resultant data be reviewed with a sober mind prior to emailing all of your colleagues. Whilst the basic idea may indeed be valid the presentation will clearly be lacking.

  20. Mr. Braden, I think you may be on to something. During sessions at our local bar and while the creative sides of our approach full lubrication, my fellow graduate students and I will often plan FEATS OF SCIENCE. However, performing these in such a state is rarely a good idea — see the 3-drink theorem above.

    I suggest that such a competition would best involve riddles, perhaps difficult ones (http://www.ocf.berkeley.edu…).
    And of course a fine brew, perhaps a fine RARE VOS from our friends at OMMEGANG.

    There are a number of establishments here in NYC where such activities are welcome without prior notification of authorities.

  21. "He" wrote: Presumably young scientists have less publications and drink more. While older scientists have more publications and have settled down and drink wine or whatever.

    Uneducated one, the young scientists don’t yet have tenure, so publish prolifically. The older ones, once getting tenure, can relax and ponder the bottom of their glass. Now the young ones seeking tenure may live by the "work hard play hard" rule. But presuming that age means no pubs means you have not lived it. Have you?

  22. IIRC, the study did take age and length of time in the field into account. They were found to be very highly correlated (duh), and were formed into a common factor which was partialed out of the correlation.

    There are some legitimate limitations of this study, but considering that it was apparently the first on this topic that’s really no surprise and no reason to reject or "debunk" the theory at this point.

  23. As a thesis-pending solid-state physicist and software engineer I will refer you to the Ballmer Peak – unequivocal evidence of the three drink theory.

    http://imgs.xkcd.com/comics

    I am thinking an international investigation is certainly in order.

  24. As a customer, brewer, and occasional visitor to Ommegang, I would be glad to apply my experience to judging such an investigation. I worked 30 years in an R&D lab, and conclude that beer is useful for "R" and not so much for "D."

  25. All things in moderation I suppose. The data in the report could almost be a toxicology curve… notice that the "curve" doesn’t seem to drop until they exceed 4L/yr… so it might be good to keep a running tally! Maybe those belt mounted bird counters should be co-opted to count pints.
    Another unreported variable is the size and scope of the publications generated. He only cites number of citations but that doesn’t always explain the amount of work that goes into each publication or whether it was mostly grad student work that the PI essentially signed off on.
    I love the part about publication rate affecting biological success!

  26. A generalized least squares model would help to take a possible lack of independence into account. Not sure if it would prove anything, but it would be interesting to at least see if the targeted population has a common drinking habit.

  27. From the drinking author of the drinking study

    Hi Chris,
    thanks for the advertisement:-) I just feel like noting that there are several misunderstandings in your comments:

    – in addition to your confusion between R2 and correlation coefficients (see comments by others), the “magnitude of R2” issue is out of touch with biological reality – I would be happy to reach R2 up to 0.52 (see Table 1) in my standard research papers (see also e.g. Moller and Jennions, Oecologia 2002, http://www.springerlink.com…)

    – the correlation-causation – see Abstract "I predicted negative CORRELATIONS…" ("causes" or "causation" does not appear in my paper. I am not that stupid:-) See also what I explicitly said in New York Times: “More important, as Dr. Grim pointed out, the study documents a correlation between beer drinking and scientific performance without explaining why they are correlated. That leaves open the possibility that it is not beer drinking that causes poor scientific performance, but just the opposite.” Next time, read more carefully, please.

    – the relationship is not maintained by the five "outliers" (by the way to exclude any data like you suggested is a clear scientific mis-conduct and should be penalized:-) The data in the Fig. 1 are transformed (read at least the caption of Fig. 1, please) so I have to ask ironically how do you know the SHAPE of the real relationship? The real spread of the data is from teetotallers to guys who drink hundreds of litres per year. The relationship is surprisingly strong all the way from the former to the latter, I can tell you. It is a pity I promised my respondents NOT to give away how much they drink (then you would see your naive mistake very clearly).

    – most importantly: my half-joke-half-study was just a preliminary survey. Even just stating the logic of a hypothesis without any empirical data is worth publishing, see e.g. Moreno J, Osorno JL 2003: Avian egg colour and sexual selection: does eggshell pigmentation reflect female condition and genetic quality? ECOLOGY LETTERS 6: 803-806. Doing that is just a standard in ecology.

    Cheers!:-)
    Tom

  28. I guess if the author performed a study at the Department of Optics at the same University, he would have obtained exactly opposite correlations…

