Part 5: Statistics

By Alma Laney and Alison Bernstein

This post is the fifth in a series about transgenerational inheritance, epigenetics, and glyphosate that addresses questions raised by the publication of the paper, Assessment of Glyphosate Induced Epigenetic Transgenerational Inheritance of Pathologies and Sperm Epimutations: Generational Toxicology.

Statistics

In this paper, the researchers used a Student’s t-test to test the effects of glyphosate exposure on the biological measures that were taken in the experiment. Student’s t-tests compare the means of two groups and assume that the data in each group are approximately normally distributed. Normally distributed data form a bell-shaped curve, with most data points clustered around the mean (which, for a normal distribution, is also the median).
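To make the comparison concrete, here is a minimal sketch of how the t statistic for the classic equal-variance Student’s t-test is calculated, using only Python’s standard library. The two groups of numbers are invented for illustration and are not data from the paper, and the function name student_t is our own.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for Student's equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: a weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical measurements, invented for illustration only.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
exposed = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, df = student_t(control, exposed)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The t statistic is the difference in means divided by its estimated standard error. Turning it into a p-value requires the t distribution with the stated degrees of freedom, which is the step that leans on the normality assumption.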

Photo caption from Wikimedia Commons: “For the normal distribution, the values less than one standard deviation away from the mean account for 68.27% of the set; while two standard deviations from the mean account for 95.45%; and three standard deviations account for 99.73%.” Photo credit: Dan Kernler via Wikimedia Commons and used under CC BY-SA 4.0 with no alterations.
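The percentages in the caption can be checked directly from the normal distribution’s cumulative distribution function. This is a small sketch using NormalDist from Python’s standard library; the helper name share_within is our own.

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.2%}")
```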

Are the data normally distributed?

The authors do not mention in the manuscript whether they tested for normality in this specific dataset, but they cite a previous paper from their group that states that normality was confirmed for these outcome measures. However, neither paper says which tests were used or what the results were. Thus, as readers, we cannot verify that the data are normally distributed, because the reporting of methods and results is incomplete.

It is generally recommended to name all statistical tests used in the methods and to report the results of these tests, although these details are often omitted from publications. A t-test can tolerate some deviation from a normal distribution, but if the data greatly violate the assumption of normality, there are alternative non-parametric tests for a two-group comparison, such as the Mann-Whitney U test, that would be appropriate.
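As an illustration of a non-parametric two-group comparison, here is a sketch of the Mann-Whitney U test with an exact p-value, written with Python’s standard library only. It is one common choice, not a test the authors used. The data are invented, with a deliberate outlier in one group, and the function names are our own. The exact p-value is obtained by enumerating every possible relabeling of the pooled data, which is only practical for small groups.

```python
from itertools import combinations


def u_statistic(group_a, group_b):
    """Count pairs where the group_a value exceeds the group_b value (ties count 0.5)."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


def exact_two_sided_p(group_a, group_b):
    """Share of all relabelings whose U is at least as far from its null mean as observed."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    null_mean = n_a * len(group_b) / 2
    observed = abs(u_statistic(group_a, group_b) - null_mean)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(a, b) - null_mean) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical measurements, invented for illustration; note the outlier (30.0).
control = [1.1, 2.3, 2.9, 3.4, 4.0]
exposed = [4.5, 5.2, 6.1, 7.3, 30.0]

u = u_statistic(control, exposed)
p_value = exact_two_sided_p(control, exposed)
print(f"U = {u}, exact two-sided p = {p_value:.4f}")
```

Because the test uses only the ordering of the values, the outlier has no more influence than any other value that is simply the largest in the dataset.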

Is a t-test or non-parametric alternative appropriate for this study design?

The bigger issue with a t-test in this scenario is not whether the data are normally distributed. It is whether a simple two-group test is appropriate at all, given the lack of control for the other potential sources of variation (genetics, litter effects, breeding effects, etc.).

Because the experimental design did not adequately account for these other sources of variation, a mixed-effects model would be more appropriate here. A t-test would only be appropriate if all the other variables were controlled for or were demonstrated not to affect the outcome. Even in a well-designed transgenerational experiment, this would be difficult, so a mixed-effects model would be strongly preferred. A mixed-effects model would allow researchers to account statistically for issues such as litter size and cage effects if they were not controlled for experimentally.
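A small simulation can show why ignoring litter matters. The sketch below is not a mixed-effects model and is not a reanalysis of the paper’s data. It simulates two groups with no true treatment effect, where pups from the same litter share a random litter-level shift, and compares a naive t-test that treats every pup as independent against a t-test on litter means. All numbers (litters per group, pups per litter, the two standard deviations) are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance

# Assumed design and variance components, chosen only for illustration.
LITTERS_PER_GROUP = 4
PUPS_PER_LITTER = 5
LITTER_SD = 1.0  # spread between litters
PUP_SD = 1.0     # spread between pups within a litter
# Two-sided 5% critical values of Student's t for 38 and 6 degrees of freedom.
T_CRIT_PUPS = 2.024
T_CRIT_LITTERS = 2.447


def t_statistic(a, b):
    """Equal-variance two-sample t statistic."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def simulate_group(rng):
    """One group with no true effect: a list of litters, each a list of pup values."""
    litters = []
    for _ in range(LITTERS_PER_GROUP):
        litter_shift = rng.gauss(0.0, LITTER_SD)
        litters.append(
            [litter_shift + rng.gauss(0.0, PUP_SD) for _ in range(PUPS_PER_LITTER)]
        )
    return litters


def false_positive_rates(n_sims=2000, seed=1):
    """Fraction of null simulations declared significant by each analysis."""
    rng = random.Random(seed)
    naive_hits = litter_hits = 0
    for _ in range(n_sims):
        control, exposed = simulate_group(rng), simulate_group(rng)
        # Naive analysis: every pup treated as an independent observation.
        pups_c = [pup for litter in control for pup in litter]
        pups_e = [pup for litter in exposed for pup in litter]
        naive_hits += abs(t_statistic(pups_c, pups_e)) > T_CRIT_PUPS
        # Litter-level analysis: one mean per litter is the unit of observation.
        means_c = [mean(litter) for litter in control]
        means_e = [mean(litter) for litter in exposed]
        litter_hits += abs(t_statistic(means_c, means_e)) > T_CRIT_LITTERS
    return naive_hits / n_sims, litter_hits / n_sims


naive_rate, litter_rate = false_positive_rates()
print(f"false positives, pups as independent: {naive_rate:.3f}")
print(f"false positives, litter means:        {litter_rate:.3f}")
```

Under these assumptions, the naive test flags a "significant" difference far more often than the nominal 5%, while the litter-level test stays close to it. A mixed-effects model addresses the same problem more flexibly by treating litter (or cage) as a random effect, which also handles unequal litter sizes and additional covariates.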

