

In the age of high-throughput biomedical research, we are drowning in data. A single genomics, transcriptomics (RNA-seq), or proteomics experiment can generate tens of thousands of data points, each one a separate statistical test. This "data deluge" is a scientific goldmine, but it hides a critical statistical minefield: the multiple hypothesis testing problem.
If you run one test at the standard significance level (α = 0.05), you have a 5% chance of a "false positive"—incorrectly claiming a discovery. But if you run 10,000 tests (e.g., checking 10,000 genes for differential expression), you can expect about 500 false positives (0.05 × 10,000) by sheer random chance, even if nothing is truly changed. Your "groundbreaking" discovery list could be overwhelmingly junk.
To save our science from being swamped by these false alarms, we must apply a correction. The two most common methods are the classic Bonferroni correction and the modern False Discovery Rate (FDR). But which one is right? As we'll see, the answer depends entirely on your scientific goal: are you confirming a single, high-stakes fact, or are you exploring for a list of promising new leads?
Before comparing the solutions, let's solidify the problem. The core issue is the inflation of the Family-Wise Error Rate (FWER).
The FWER is the probability of getting at least one false positive (Type I error) in your entire set of tests. The math is stark:
Probability of no false positive in one test = 0.95 (95%)
Probability of no false positives in 100 independent tests = (0.95)¹⁰⁰ ≈ 0.006 (0.6%)
This means the probability of at least one false positive (the FWER) is 1 − 0.006 = 0.994, or 99.4%
With 100 tests, you are almost guaranteed to have a false positive. This is the problem Bonferroni was designed to solve.
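The FWER arithmetic above is easy to verify in a few lines (a minimal Python sketch; the function name is just for illustration, and the formula assumes independent tests):

```python
def fwer(m, alpha=0.05):
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

print(round(fwer(1), 3))    # 0.05
print(round(fwer(100), 3))  # 0.994
```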
The Bonferroni correction is the simplest and most traditional method for controlling the FWER.
It's brutally simple and conservative. To maintain an overall FWER of 5% (0.05), you simply divide that alpha by the number of tests (n) you are performing.
Bonferroni-corrected significance threshold = 0.05 / n
If you are testing 10,000 genes, your new significance threshold becomes 0.05 / 10,000 = 0.000005. A gene is only "significant" if its p-value is less than this tiny number.
Pro: It guarantees that your chance of making even one false positive claim stays low (at most 5%).
Con: It is too strict. By being so conservative, Bonferroni massively increases the rate of Type II errors (false negatives)—where you miss real, genuine discoveries. In a genomic study, you might correctly find no false positives, but you may also find no true positives, even when they exist.
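As a sketch, the Bonferroni cutoff looks like this in code (the function name and p-values are hypothetical, assuming a plain list of p-values):

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Indices of tests passing the Bonferroni-corrected threshold alpha / n."""
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p < threshold]

# Four hypothetical tests; the corrected threshold is 0.05 / 4 = 0.0125.
pvals = [0.00001, 0.004, 0.03, 0.2]
print(bonferroni_significant(pvals))  # [0, 1]
```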
In the 1990s, researchers (most famously Benjamini and Hochberg) proposed a paradigm shift. They argued that in large-scale exploratory studies, we shouldn't be so afraid of any false positives. Instead, we should focus on controlling the proportion of false positives among our list of "discoveries."
This is the False Discovery Rate (FDR).
The most common method, the Benjamini-Hochberg (BH) procedure, is more powerful because it's adaptive. Instead of applying one punishing threshold to all tests, it ranks the p-values from smallest to largest and evaluates them sequentially against a rising threshold.
The result is a new metric: the q-value, which is the FDR-analog of the p-value.
Pro: FDR is far more powerful. It controls the rate of false discoveries, not the chance of one. An FDR of 5% means "of all the genes on your 'significant' list, you can expect 5% of them to be false positives." This is a fantastic trade-off for discovery-based science. You get a much larger list of candidates that is still overwhelmingly correct.
Con: It does permit a small number of false positives in your final results, by its very definition.
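The BH step-up procedure itself is short enough to sketch (a minimal implementation with illustrative names, assuming a plain list of p-values):

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up procedure.

    Sort the p-values ascending and find the largest rank k such that
    p_(k) <= (k / n) * alpha; reject the hypotheses with the k smallest
    p-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / n * alpha:
            k = rank
    return sorted(order[:k])

# Six hypothetical tests: Bonferroni (0.05 / 6 ≈ 0.0083) keeps only the
# first, while BH's rising threshold keeps three.
pvals = [0.001, 0.01, 0.012, 0.04, 0.27, 0.6]
print(benjamini_hochberg(pvals))  # [0, 1, 2]
```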
The choice between Bonferroni (FWER) and FDR is not about which is "better," but which question you are asking.
You should control the FWER when the cost of a single false positive is catastrophic.
Example: A final clinical trial for a new drug. You are testing one primary hypothesis ("Does this drug work?"). A false positive means approving an ineffective or unsafe drug. In this "confirmatory" setting, you must be as strict as possible.
You should control the FDR when your goal is discovery and screening—especially in high-throughput biomedical studies.
Example: A microarray or RNA-seq study to find which of 20,000 genes are differentially expressed. Your goal is to generate a list of promising candidates for future study (e.g., follow-up with PCR).
The Trade-off: Missing a key gene (a false negative) is often worse than having a few duds on your list (false positives). As one source notes, controlling the FDR gives you "many more significant scores" than Bonferroni, allowing you to "identify as many significant features as possible."
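The trade-off is easy to see in a toy simulation (a pure-Python sketch with made-up numbers: 9,000 null tests with uniform p-values, plus 1,000 "real" effects drawn from a distribution skewed toward zero):

```python
import random

random.seed(0)
n_null, n_true = 9000, 1000

# Simulated p-values: nulls are uniform on [0, 1]; true effects skew
# toward zero (a Beta(0.1, 10) is one arbitrary way to get that shape).
pvals = [random.random() for _ in range(n_null)]
pvals += [random.betavariate(0.1, 10) for _ in range(n_true)]
n = len(pvals)

# Bonferroni: one fixed threshold, alpha / n, for every test.
bonf_hits = sum(p < 0.05 / n for p in pvals)

# Benjamini-Hochberg: a rising threshold (rank / n) * alpha on the
# sorted p-values; keep everything up to the last rank that passes.
bh_hits = 0
for rank, p in enumerate(sorted(pvals), start=1):
    if p <= rank / n * 0.05:
        bh_hits = rank

print(bonf_hits, bh_hits)
```

BH always rejects at least as many tests as Bonferroni (its rank-1 threshold equals the Bonferroni threshold), and on data like this it typically rejects far more, which is exactly the extra power the exploratory setting wants.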
The multiple testing problem is a fundamental challenge in modern biomedical data analysis.
The Bonferroni correction is a simple, conservative method that controls the Family-Wise Error Rate (FWER), minimizing the chance of any false positives. It is best for high-stakes, confirmatory studies.
The False Discovery Rate (FDR), typically using the Benjamini-Hochberg procedure, is a more powerful and modern method that controls the proportion of false positives among your discoveries.
For the vast majority of large-scale genomic, proteomic, and other "-omic" studies, the goal is exploration. You want to find all the promising leads. For this, controlling the FDR is the overwhelmingly preferred, more powerful, and more practical statistical approach.
References
https://www.statsig.com/blog/controlling-type-i-errors-bonferroni-benjamini-hochberg
https://stats.stackexchange.com/questions/242458/bonferonni-versus-fdr
https://www.publichealth.columbia.edu/research/population-health-methods/false-discovery-rate
https://edu.abi.am/statistics-theory/multiple-test-correction-bonferroni-fdr
https://www.researchgate.net/post/What_is_your_prefered_p-value_correction_for_multiple_tests
https://www.reddit.com/r/statistics/comments/2ymzyb/to_bonferroni_or_not_to_bonferroni_multiple/
https://physiology.med.cornell.edu/people/banfelder/qbio/resources_2008/1.5_Bonferroni_FDR.pdf

