How to Do the Shapiro-Wilk Normality Test for Biomedical Research
- Mar 23

Is your data truly normal? In biomedical research, this single question often determines whether your p-values are "significant" or just "statistical noise."
If you are analyzing pre-clinical data (cell culture viability, mouse tumor volumes, or protein expression levels), you likely rely on t-tests and ANOVAs. But these parametric tests rely on a key assumption: normality. If your data violates this badly, your results could be invalid.
This guide is the definitive resource on performing, interpreting, and troubleshooting the Shapiro-Wilk test, the "gold standard" method for small sample sizes common in basic research (n < 50). We move beyond simple definitions to provide step-by-step protocols for GraphPad Prism, SPSS, and R, ensuring your statistical rigor meets the highest publication standards.
What is the Shapiro-Wilk Test and Why Use It?
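The Shapiro-Wilk (SW) test assesses whether a dataset is consistent with having come from a normally distributed population. It is widely considered one of the most powerful normality tests for small sample sizes, which makes it a common choice in "wet lab" biology where N=3 or N=6 is standard. Keep in mind that at such tiny sample sizes any normality test has little power, so a non-significant result is only weak evidence of normality.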
The Core Logic
Null Hypothesis (H_0): The population is normally distributed.
Alternative Hypothesis (H_1): The population is not normally distributed.
The P-Value Rule:
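p > 0.05: You fail to reject the null hypothesis. The test found no significant evidence that your data deviates from normality, which is not the same as proving it is normal. (Result: Parametric Tests are reasonable)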
p < 0.05: You reject the null hypothesis. Your data deviates significantly from normality. (Result: Use Non-Parametric Tests or Transform)
Why Not Just Use a Histogram?
Visual inspection (histograms) is subjective, especially with small datasets (e.g., n=10) where "bins" can distort the shape. The SW test provides an objective, standardized metric (W statistic) to quantify how well your data fits the Gaussian bell curve.
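For reference, the W statistic is usually written as below. This is the standard textbook definition rather than something stated in this guide: the x_(i) are the data sorted in ascending order, and the a_i are fixed weights derived from the expected order statistics of a standard normal sample.

```latex
W = \frac{\left( \sum_{i=1}^{n} a_i \, x_{(i)} \right)^{2}}
         {\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2}},
\qquad 0 < W \le 1
```

Values of W close to 1 mean the sorted data line up well with what a normal sample would look like; smaller values indicate a poorer fit.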
Step-by-Step Protocols: Performing the Test
Option 1: GraphPad Prism (The Lab Standard)
Most biological researchers use Prism. Here is the modern workflow.
Enter Data: Input your data into a Column data table.
Analyze: Click Analyze > Column Analyses > Normality and Lognormality Tests.
Select Test: Check the box for Shapiro-Wilk.
Note: Older versions of Prism may calculate this slightly differently. Ensure you are using Prism 6+ for the updated Royston approximation algorithm.
Run: Click OK.
Interpret: Look at the "P value summary" column.
If it says ns (not significant), the test found no significant deviation from normality.
If it shows asterisks (*), your data deviates significantly from normality.
Option 2: SPSS (The Clinical Standard)
Navigate: Go to Analyze > Descriptive Statistics > Explore.
Select Variables: Drag your variable of interest (e.g., "TumorSize") into the Dependent List.
Split by Group (Optional): If you have groups (e.g., "Treatment" vs "Control"), drag the grouping variable into the Factor List.
Configure Plots: Click the Plots button. Check "Normality plots with tests". Uncheck "Stem-and-leaf" if you don’t need it.
Output: Look for the table titled "Tests of Normality". Focus on the Shapiro-Wilk column, specifically the Sig. (significance) value.
Option 3: R (The Data Science Standard)
Best for high-throughput data or automated pipelines.
The basic command is built into the stats package:
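# Basic syntax: shapiro.test(x), where x is a numeric vector
# Example with interpretation
values <- c(2.1, 3.4, 2.8, 3.1, 2.9)  # your biological data (n = 5)
result <- shapiro.test(values)
print(result)
# The printed output has this layout (the numbers depend on your data):
#   Shapiro-Wilk normality test
#   data:  values
#   W = <statistic>, p-value = <p-value>
# The p-value is also available programmatically:
result$p.value
Interpretation: Compare the reported p-value with 0.05. For these five values the p-value is well above 0.05, so the test gives no evidence against normality and a parametric test is reasonable.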
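When the data come in groups (the situation the SPSS Factor List step handles), the same test can be run once per group in R. The following is a minimal sketch: the data frame `tumor`, its column names, and all values are made up for illustration.

```r
# Illustrative data only: hypothetical tumor volumes (mm^3) for two groups.
tumor <- data.frame(
  group  = rep(c("Control", "Treatment"), each = 6),
  volume = c(410, 455, 432, 398, 470, 441,
             305, 352, 298, 340, 325, 310)
)

# Run shapiro.test() separately within each group and keep W and the p-value.
by_group <- sapply(split(tumor$volume, tumor$group), function(v) {
  res <- shapiro.test(v)
  c(W = unname(res$statistic), p_value = res$p.value)
})

# Transpose so that each group is one row, then round for display.
print(round(t(by_group), 3))
```

Each row of the printed matrix corresponds to one group, which mirrors the layout of the SPSS "Tests of Normality" table.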
Troubleshooting & "Edge Cases"
Scenario A: One Group Fails, Others Pass
Example: You have 3 treatment groups. Control and Drug A are normal (p > 0.05), but Drug B is not (p = 0.03).
The Problem: You cannot mix Parametric (ANOVA) and Non-Parametric (Kruskal-Wallis) tests in a single analysis.
The Solution:
Transform: Try a Log10 transformation on all groups and re-test (all values must be greater than zero). This often fixes right-skewed biological data.
Go Non-Parametric: If transformation fails, switch to Kruskal-Wallis (for >2 groups) or Mann-Whitney (for 2 groups) for the entire experiment. It is safer and more conservative.
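The transform-then-fall-back logic of Scenario A could be sketched in R roughly as follows. The data frame `expr`, the helper `sw_p`, and the values are hypothetical, and the 0.05 cutoff is simply the rule used throughout this guide.

```r
# Illustrative, made-up right-skewed values for three groups.
expr <- data.frame(
  group = rep(c("Control", "DrugA", "DrugB"), each = 6),
  level = c(1.2, 1.5, 1.1, 1.9, 1.4, 1.6,
            2.3, 2.9, 2.1, 3.4, 2.6, 2.4,
            3.1, 3.8, 2.9, 12.5, 4.2, 3.5)
)

# Hypothetical helper: Shapiro-Wilk p-value within each group.
sw_p <- function(values, groups) {
  tapply(values, groups, function(v) shapiro.test(v)$p.value)
}

p_raw <- sw_p(expr$level, expr$group)
p_log <- sw_p(log10(expr$level), expr$group)  # log10 needs strictly positive values
print(round(rbind(raw = p_raw, log10 = p_log), 3))

if (all(p_log > 0.05)) {
  # Every group passes after transformation: one-way ANOVA on the log scale.
  fit <- aov(log10(level) ~ group, data = expr)
  print(summary(fit))
} else {
  # At least one group still fails: rank-based test on the raw values.
  print(kruskal.test(level ~ group, data = expr))
}
```

Whichever branch runs, the whole experiment is analyzed with a single test, so parametric and non-parametric methods are never mixed.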
Scenario B: "My Sample Size is Huge (n > 5000)"
The Problem: The SW test is too sensitive at large sample sizes. It will return p < 0.05 for trivial deviations from normality that don't actually affect the validity of a t-test. In R, shapiro.test() also refuses inputs with more than 5000 values.
The Solution: Do not rely solely on SW once samples get large (this guide uses n > 50 as the cutoff). Use a Q-Q Plot (Quantile-Quantile plot). If the dots lie roughly on the diagonal line, treat the data as approximately normal regardless of the SW p-value.
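A Q-Q plot takes two lines in base R. The sketch below uses simulated data purely for illustration; the sample size, mean, and SD are arbitrary assumptions.

```r
set.seed(1)                          # reproducible simulated data, illustration only
x <- rnorm(2000, mean = 50, sd = 5)

# shapiro.test() only accepts between 3 and 5000 non-missing values.
if (length(x) <= 5000) print(shapiro.test(x))

# Q-Q plot: sample quantiles against theoretical normal quantiles.
qq <- qqnorm(x, main = "Normal Q-Q plot")
qqline(x, col = "red")               # reference line through the quartiles
```

Points that hug the red line across the middle of the range, with only mild wobble at the extremes, are what "roughly on the diagonal" looks like in practice.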
Scenario C: Raw Data vs. Residuals
Advanced Statistician Note: Technically, parametric tests like ANOVA assume the residuals (errors) are normally distributed, not necessarily the raw data.
For simple 1-way designs, testing raw data is an acceptable proxy.
For complex models (e.g., Two-Way ANOVA), you should extract the residuals and run shapiro.test(residuals) for the most accurate assessment.
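Extracting residuals from a two-way ANOVA might look like the sketch below. The data frame `mice`, its factor names, and the simulated responses are assumptions made for illustration only.

```r
# Illustrative, simulated data: 2 genotypes x 2 treatments, 5 animals per cell.
set.seed(42)
mice <- expand.grid(
  animal    = 1:5,
  genotype  = c("WT", "KO"),
  treatment = c("Vehicle", "Drug")
)
mice$response <- rnorm(nrow(mice), mean = 10, sd = 2)

# Fit the two-way model with interaction, then test its residuals.
fit <- aov(response ~ genotype * treatment, data = mice)
res <- residuals(fit)
print(shapiro.test(res))
```

Testing the pooled residuals uses all 20 observations at once, instead of running four separate tests on 5 values each.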
What does a p-value above or below 0.05 in the Shapiro-Wilk test indicate?
Result | P-Value | Conclusion | Recommended Next Step |
Passed | > 0.05 | No significant deviation from normality | t-test / ANOVA |
Failed | < 0.05 | Significant deviation from normality | Check Outliers -> Log Transform -> Non-Parametric Test |





