Per-sample missingness (F_MISS) quantifies the fraction of genotype calls that failed for a given individual.
Elevated missingness typically indicates technical failures during genotyping: degraded DNA, low input concentration,
partial hybridisation on the array, or scanning artefacts. Retaining such samples introduces differential
missingness bias: variants appear to differ in frequency between high-quality and low-quality samples even when
no true genetic difference exists, producing spurious associations and inflated test statistics.
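
As a minimal sketch of the definition above, per-sample missingness is the number of failed calls divided by the number of variants attempted for that individual. The function names, the toy genotype data, the use of `None` to encode a failed call, and the 0.05 cutoff are all assumptions made for illustration; they are not taken from this pipeline.

```python
from typing import Dict, List, Optional

# Hypothetical cutoff for illustration only; not the threshold used in this report.
EXAMPLE_THRESHOLD = 0.05


def per_sample_missingness(
    genotypes: Dict[str, List[Optional[int]]]
) -> Dict[str, float]:
    """Return F_MISS per sample: failed calls / variants attempted.

    Assumes each sample maps to a list of genotype calls, with None
    marking a call that failed.
    """
    f_miss = {}
    for sample_id, calls in genotypes.items():
        if not calls:
            raise ValueError(f"sample {sample_id!r} has no genotype calls")
        n_miss = sum(1 for call in calls if call is None)
        f_miss[sample_id] = n_miss / len(calls)
    return f_miss


def flag_high_missingness(
    f_miss: Dict[str, float], threshold: float = EXAMPLE_THRESHOLD
) -> List[str]:
    """Return the sorted IDs of samples whose F_MISS exceeds the threshold."""
    return sorted(s for s, value in f_miss.items() if value > threshold)


# Toy data: three samples typed at ten variants each.
toy_genotypes = {
    "S1": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],                # no failed calls
    "S2": [0, None, 2, 0, 1, 2, 0, 1, 2, 0],             # 1 of 10 failed
    "S3": [None, None, None, 0, 1, None, 0, 1, 2, 0],    # 4 of 10 failed
}

toy_f_miss = per_sample_missingness(toy_genotypes)
toy_flagged = flag_high_missingness(toy_f_miss)
```

With this toy input, `S2` and `S3` exceed the illustrative cutoff and `S1` does not. In practice the cutoff is a study-specific choice and should be read from the pipeline's own configuration rather than from this sketch.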
Key Metrics
Overlap with Step 0: All 99 removed samples were independently flagged by the chip-level investigation (84 chip failures, 12 high-missingness outliers, 3 contaminated). See Section 4 for the full cross-reference.