How to do The Shapiro-Wilk Normality Test for Biomedical Research

Mar 23

Is your data truly normal? In biomedical research, this single question often determines whether your p-values are "significant" or just "statistical noise."

If you are analyzing pre-clinical data (cell culture viability, mouse tumor volumes, or protein expression levels, for example), you likely rely on t-tests and ANOVAs. But these parametric tests rest on a key assumption: normality. If your data violates it badly, your p-values and confidence intervals can be misleading.

This guide is the definitive resource on performing, interpreting, and troubleshooting the Shapiro-Wilk test, the "gold standard" method for small sample sizes common in basic research (n < 50). We move beyond simple definitions to provide step-by-step protocols for GraphPad Prism, SPSS, and R, ensuring your statistical rigor meets the highest publication standards.

What is the Shapiro-Wilk Test and Why Use It?

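The Shapiro-Wilk (SW) test assesses whether a dataset is consistent with having been drawn from a normally distributed population. It is widely considered the most powerful normality test for small sample sizes, which makes it a common choice in "wet lab" biology where N=3 or N=6 is standard. Bear in mind that at sample sizes this small any normality test has limited power, so a passing result is weak evidence rather than confirmation.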

The Core Logic

Null Hypothesis (H_0): The population is normally distributed.
Alternative Hypothesis (H_1): The population is not normally distributed.
The P-Value Rule:
- p > 0.05: You fail to reject the null hypothesis. The test found no significant evidence against normality, which is not the same as proof that the data is normal. (Result: Use Parametric Tests)
- p ≤ 0.05: You reject the null hypothesis. Your data deviates significantly from normality. (Result: Use Non-Parametric Tests or Transform)
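The decision rule above can be sketched in R as follows. This is a minimal illustration: the helper name `interpret_sw` and its `alpha` argument are hypothetical, not part of any package, and the input values are made up.

```r
# Hypothetical helper: maps a Shapiro-Wilk p-value onto the decision rule above.
interpret_sw <- function(x, alpha = 0.05) {
  res <- shapiro.test(x)   # shapiro.test() ships with base R (stats package)
  p <- res$p.value
  decision <- if (p > alpha) {
    "fail to reject H0: consider parametric tests"
  } else {
    "reject H0: transform or use non-parametric tests"
  }
  list(W = unname(res$statistic), p_value = p, decision = decision)
}

# Illustrative values only (not real measurements)
out <- interpret_sw(c(2.1, 3.4, 2.8, 3.1, 2.9))
print(out$decision)
```

Returning W and the p-value alongside the verdict keeps the numbers available for reporting, since a bare pass/fail label hides how close the call was.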

Why Not Just Use a Histogram?

Visual inspection (histograms) is subjective, especially with small datasets (e.g., n=10) where "bins" can distort the shape. The SW test provides an objective, standardized metric (W statistic) to quantify how well your data fits the Gaussian bell curve.
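To see how the W statistic behaves, here is a small deterministic sketch. It compares ten evenly spaced normal quantiles (an idealized bell shape) with the same values after a right-skewing transform. Both vectors are constructed for illustration and are not real measurements.

```r
# Deterministic illustration: idealized normal quantiles vs. a skewed version.
ideal  <- qnorm(ppoints(10))   # n = 10 values spaced like a perfect bell curve
skewed <- exp(ideal)           # the same values after a right-skewing transform

w_ideal  <- unname(shapiro.test(ideal)$statistic)
w_skewed <- unname(shapiro.test(skewed)$statistic)

# W approaches 1 when the data tracks a normal distribution closely
cat("W (bell-shaped):", round(w_ideal, 3), "\n")
cat("W (right-skewed):", round(w_skewed, 3), "\n")
```

W is bounded above by 1, and the skewed vector yields the lower value of the two, which is the kind of difference a ten-point histogram can easily hide.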

Step-by-Step Protocols: Performing the Test

Option 1: GraphPad Prism (The Lab Standard)

Most biological researchers use Prism. Here is the modern workflow.

Enter Data: Input your data into a Column data table.
Analyze: Click Analyze > Column Analyses > Normality and Lognormality Tests.
Select Test: Check the box for Shapiro-Wilk.
- Note: Older versions of Prism may calculate this slightly differently. Ensure you are using Prism 6+ for the updated Royston approximation algorithm.
Run: Click OK.
Interpret: Look at the "P value summary" row in the results.
- If it says ns (not significant), the test found no significant departure from normality.
- If it shows asterisks (*), your data deviates significantly from normality, so the normality test failed.

Option 2: SPSS (The Clinical Standard)

Navigate: Go to Analyze > Descriptive Statistics > Explore.
Select Variables: Drag your variable of interest (e.g., "TumorSize") into the Dependent List.
Split by Group (Optional): If you have groups (e.g., "Treatment" vs "Control"), drag the grouping variable into the Factor List.
Configure Plots: Click the Plots button. Check "Normality plots with tests". Uncheck "Stem-and-leaf" if you don’t need it.
Output: Look for the table titled "Tests of Normality". Focus on the Shapiro-Wilk column, specifically the Sig. (significance) value.

Option 3: R (The Data Science Standard)

Best for high-throughput data or automated pipelines.

The basic command is built into the stats package:

# Basic syntax
shapiro.test(numeric_vector)

# Example with interpretation
measurements <- c(2.1, 3.4, 2.8, 3.1, 2.9) # Your biological data
result <- shapiro.test(measurements)

print(result)
# Output has the form:
# Shapiro-Wilk normality test
# data:  measurements
# W = <statistic>, p-value = <p>

# Pull out the numbers for reporting
result$statistic  # W
result$p.value    # p-value

Interpretation: For the five values above, the reported p-value is well above 0.05, so we fail to reject the null hypothesis and treat the data as normal.
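When the data has groups (the equivalent of SPSS's Factor List), the test is usually run within each group rather than on the pooled values. A minimal sketch, assuming a hypothetical data frame `tumor` with made-up values:

```r
# Hypothetical example data: values are illustrative, not real measurements.
tumor <- data.frame(
  group  = rep(c("Control", "Treatment"), each = 6),
  volume = c(102, 98, 110, 105, 95, 101,
             80, 76, 88, 91, 73, 84)
)

# Run shapiro.test() separately within each group and collect the p-values.
p_by_group <- tapply(tumor$volume, tumor$group,
                     function(v) shapiro.test(v)$p.value)

print(round(p_by_group, 3))
```

`tapply()` returns a named vector with one p-value per group, which is convenient for a pipeline that screens many variables.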

Troubleshooting & "Edge Cases"

Scenario A: One Group Fails, Others Pass

Example: You have 3 treatment groups. Control and Drug A are normal (p > 0.05), but Drug B is not (p = 0.03).

The Problem: You cannot mix Parametric (ANOVA) and Non-Parametric (Kruskal-Wallis) tests in a single analysis.
The Solution:
1. Transform: Try a Log10 transformation on all groups and re-test. This often fixes right-skewed biological data.
2. Go Non-Parametric: If transformation fails, switch to Kruskal-Wallis (for >2 groups) or Mann-Whitney (for 2 groups) for the entire experiment. It is safer and more conservative.
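The two-step fallback above might look like this in R. It is a sketch under stated assumptions: the data frame `expr`, its values, and the helper `sw_p` are all hypothetical, and the 0.05 cut-off follows the rule used throughout this guide.

```r
# Hypothetical data: illustrative right-skewed values for three groups.
expr <- data.frame(
  group = rep(c("Control", "DrugA", "DrugB"), each = 6),
  level = c(1.0, 1.3, 0.9, 1.6, 1.1, 2.0,
            2.2, 1.8, 3.1, 2.5, 1.9, 4.0,
            3.0, 2.4, 2.8, 3.5, 2.6, 19.0)
)

# Hypothetical helper: Shapiro-Wilk p-value per group.
sw_p <- function(values, groups) {
  tapply(values, groups, function(v) shapiro.test(v)$p.value)
}

p_raw <- sw_p(expr$level, expr$group)
p_log <- sw_p(log10(expr$level), expr$group)  # same transform for every group

if (all(p_log > 0.05)) {
  # Step 1 worked: run the parametric test on the transformed scale.
  print(summary(aov(log10(level) ~ group, data = expr)))
} else {
  # Step 2: switch the whole experiment to a non-parametric test.
  print(kruskal.test(level ~ group, data = expr))
}
```

Whichever branch runs, every group goes through the same test on the same scale, which is the point of Scenario A.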

Scenario B: "My Sample Size is Huge (n > 5000)"

The Problem: The SW test becomes overly sensitive as sample size grows. With thousands of observations it will return p < 0.05 for trivial deviations from normality that don't actually affect the validity of a t-test. (R's shapiro.test() also refuses samples larger than 5000.)
The Solution: Do not rely solely on SW once n exceeds about 50. Use a Q-Q Plot (Quantile-Quantile plot). If the dots lie roughly on the diagonal line, you can treat the data as approximately normal even when the SW p-value is below 0.05.
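A short R sketch of the large-sample workflow follows. The vector `big` is a deterministic stand-in for a large dataset, built from evenly spaced normal quantiles rather than real measurements. The correlation at the end is only an optional numeric companion to the plot, not a formal test.

```r
# Deterministic stand-in for a large dataset: 6000 evenly spaced normal quantiles.
big <- qnorm(ppoints(6000))

# shapiro.test() in base R accepts 3 to 5000 values, so this call errors out.
sw_error <- tryCatch(shapiro.test(big),
                     error = function(e) conditionMessage(e))
print(sw_error)

# Q-Q plot: points hugging the reference line suggest approximate normality.
qqnorm(big)
qqline(big)

# Optional summary: correlation between theoretical and sample quantiles.
qq <- qqnorm(big, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)
print(qq_cor)
```

With real data the points will scatter around the line. What matters is whether the departures are systematic (curved tails, an S-shape) or just noise.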

Scenario C: Raw Data vs. Residuals

Advanced Statistician Note: Technically, parametric tests like ANOVA assume the residuals (errors) are normally distributed, not necessarily the raw data.

For simple 1-way designs, testing raw data is an acceptable proxy.
For complex models (e.g., Two-Way ANOVA), you should extract the residuals and run shapiro.test(residuals) for the most accurate assessment.
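For a two-way design, extracting and testing the residuals could look like the sketch below. The data frame `cells`, its column names, and its values are hypothetical and exist only to make the example runnable.

```r
# Hypothetical 2 x 2 design: illustrative values, three replicates per cell.
cells <- data.frame(
  drug      = rep(c("Vehicle", "Drug"), each = 6),
  genotype  = rep(rep(c("WT", "KO"), each = 3), times = 2),
  viability = c(98, 95, 101, 90, 87, 93,
                75, 79, 72, 52, 58, 49)
)

# Fit the two-way ANOVA with interaction, then test its residuals.
fit <- aov(viability ~ drug * genotype, data = cells)
res <- residuals(fit)

sw_res <- shapiro.test(res)
print(sw_res)
```

Testing the residuals pools all 12 observations into one check after the group effects have been removed, instead of running four separate tests on three values each.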

What does a p-value above or below 0.05 in the Shapiro-Wilk test indicate?

Result	P-Value	Conclusion	Recommended Next Step
Passed	> 0.05	No significant evidence against normality	T-Test / ANOVA
Failed	≤ 0.05	Data is Not Normal	Check Outliers -> Log Transform -> Non-Parametric Test
