Question: Does the exposure-phenotype association differ by a modifier (e.g., sex, race)?
\[P = \beta_0 + \beta_1 E + \beta_2 M + \beta_3 (E \times M) + \text{covariates}\]
\(\beta_3\) is the coefficient on the interaction term. Testing \(\beta_3 = 0\) asks whether the E-P association differs across levels of M.
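As a minimal sketch of the model above, the following R code simulates data and fits the interaction with base R's lm(). The data, the covariate age, and the true coefficient values are illustrative assumptions, and the sketch ignores the survey weights and design that a real NHANES analysis would need to account for.

```r
# Simulated data only: P, E, and M mirror the symbols in the equation above.
set.seed(1)
n   <- 2000
M   <- rbinom(n, 1, 0.5)     # binary modifier (e.g., sex coded 0/1)
E   <- rnorm(n)              # standardized exposure
age <- rnorm(n, 50, 10)      # one illustrative covariate

# True E-P slope is 0.2 when M = 0 and 0.2 + 0.3 = 0.5 when M = 1
P <- 1 + 0.2 * E + 0.5 * M + 0.3 * E * M + 0.01 * age + rnorm(n)

# In an R formula, E * M expands to E + M + E:M
fit   <- lm(P ~ E * M + age)
coefs <- coef(fit)

beta3    <- coefs[["E:M"]]          # estimated interaction coefficient
slope_M0 <- coefs[["E"]]            # E-P slope when M = 0
slope_M1 <- coefs[["E"]] + beta3    # E-P slope when M = 1

print(round(c(beta3 = beta3, slope_M0 = slope_M0, slope_M1 = slope_M1), 3))
```

The two stratum-specific slopes make the meaning of \(\beta_3\) concrete: it is the difference between the E-P slope at M = 1 and the slope at M = 0.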
Interaction with nhanespewas
# Test interaction between lead and sex on BMI
# Some pe_flex_adjust implementations support interact_with parameter
result_interaction <- pe_flex_adjust(
  phenotype = "BMXBMI",
  exposure = "LBXBPB",
  adjustment_model = adjustment_models[[4]],
  con = con,
  interact_with = "RIAGENDR"
)
Wald Test for Interaction
The Wald test assesses whether the interaction term is significantly different from zero:
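# Extract interaction term from the model
# Assumes dplyr, purrr, broom, knitr, and kableExtra are loaded
result_interaction %>%
  map_dfr(~ tidy(.), .id = "cycle") %>%              # one tidy coefficient table per cycle
  filter(grepl(":", term)) %>%                       # interaction terms contain ":"
  select(cycle, term, estimate, std.error, p.value) %>%
  kable(digits = 4) %>%
  kable_styling()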
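A small p-value for the interaction term (for example, below 0.05) suggests that the lead-BMI association differs between males and females. A large p-value does not show that the associations are the same; interaction tests are often underpowered.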
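To make the Wald test concrete, the sketch below computes it by hand for an interaction coefficient: the estimate divided by its standard error, compared against a standard normal reference. It uses simulated data and a plain lm() fit as assumptions; it does not use nhanespewas output, and a survey-weighted model would supply a design-based standard error instead.

```r
# Simulated data only; the effect sizes are arbitrary illustrations.
set.seed(2)
n <- 1500
M <- rbinom(n, 1, 0.5)
E <- rnorm(n)
P <- 0.2 * E + 0.4 * M + 0.25 * E * M + rnorm(n)

fit <- lm(P ~ E * M)

# Wald statistic: estimate divided by its standard error
est    <- coef(fit)[["E:M"]]
se     <- sqrt(vcov(fit)["E:M", "E:M"])
wald_z <- est / se

# Two-sided p-value from a large-sample normal reference
p_wald <- 2 * pnorm(-abs(wald_z))

# summary.lm reports the same statistic with a t reference distribution
p_t <- summary(fit)$coefficients["E:M", "Pr(>|t|)"]

print(c(estimate = est, std.error = se, statistic = wald_z,
        p_wald = p_wald, p_t = p_t))
```

With a sample this large, the normal and t reference distributions give nearly identical p-values.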
Part C: Microbiome-Exposome Studies
The Microbiome as an Exposure
The gut microbiome is part of the exposome:
Shaped by diet, medications, environment
Influences metabolism, immunity, disease risk
Compositional data requiring special statistical treatment
CLR Transformation
Microbiome data is compositional (proportions sum to 1). The Centered Log-Ratio (CLR) transformation addresses this:
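One common definition takes the log of each taxon's abundance and subtracts the mean log abundance within that sample (equivalently, divides by the sample's geometric mean before taking logs). The base R sketch below applies this to a made-up count table. The function name clr_transform and the pseudocount of 0.5 used to avoid log(0) are assumptions for illustration, not part of any particular package.

```r
# Toy count table: rows are samples, columns are taxa (made-up numbers)
counts <- matrix(c(120, 30,  0, 50,
                    10, 80, 40, 70,
                    60, 60, 60, 20),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:4)))

clr_transform <- function(x, pseudocount = 0.5) {
  # The pseudocount is an assumption; it avoids log(0) for unobserved taxa
  log_x <- log(x + pseudocount)
  # Subtract each sample's mean log value (the log of its geometric mean)
  sweep(log_x, 1, rowMeans(log_x), "-")
}

clr_values <- clr_transform(counts)
print(round(clr_values, 3))
print(rowSums(clr_values))  # each row sums to zero up to rounding error
```

Because each row of CLR values sums to zero, the transformed taxa are still linearly dependent, which matters if all of them enter one regression model together.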
Can be applied to ExWAS hits as a second-stage validation
Key limitation: For many environmental exposures, GWAS have not been conducted or are severely underpowered — making valid genetic instruments hard to find. Unlike well-studied traits (e.g., BMI, lipids), most chemical exposures lack large-scale GWAS, so MR may only be feasible for a small subset of ExWAS hits.
Mitigation Strategy 3: Multiple Adjustment Models
Already built into nhanespewas (Module 4):
9 adjustment models from unadjusted to fully adjusted
If an association is sensitive to covariate choice, it is more likely confounded
If it is robust across models, residual confounding is less likely (though not ruled out)
The atlas (Module 9) found 15% of associations reversed sign between models — a direct measure of confounding sensitivity
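The sign-reversal check can be sketched in a few lines of base R. The table below is entirely hypothetical (four exposure-phenotype pairs, three adjustment models, invented estimates); it only shows how a proportion of sign flips could be tallied from a long table of estimates.

```r
# Hypothetical estimates: one row per exposure-phenotype pair and adjustment model
est <- data.frame(
  pair     = rep(c("A", "B", "C", "D"), each = 3),
  model    = rep(c("unadjusted", "age_sex", "full"), times = 4),
  estimate = c( 0.30,  0.25,  0.22,   # A: sign is stable
                0.10,  0.02, -0.05,   # B: sign flips under full adjustment
               -0.40, -0.35, -0.33,   # C: sign is stable
                0.08, -0.01, -0.06)   # D: sign flips
)

# A pair "flips" if its estimates take more than one sign across models
sign_flips <- tapply(est$estimate, est$pair,
                     function(b) length(unique(sign(b))) > 1)

prop_flipped <- mean(sign_flips)

print(sign_flips)
print(prop_flipped)
```

A fuller version might count only flips between estimates that are individually significant, since estimates close to zero change sign easily through noise alone.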
Emerging methods that select confounders per association:
Double/Debiased Machine Learning (DML)
Uses ML (e.g., LASSO, random forest) to flexibly model both the exposure and the outcome as functions of potential confounders
Estimates the causal effect after partialing out the ML-predicted confounding
Does not require pre-specifying which variables are confounders
Targeted Maximum Likelihood Estimation (TMLE)
Semiparametric method that combines ML-based nuisance estimation with targeted bias correction
Provides valid inference even with flexible confounder models
Both approaches allow the confounders to differ per exposure without manually specifying each DAG.
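The partialling-out idea behind DML can be sketched in base R. This is a simplified illustration under stated assumptions: the data are simulated, the model is partially linear with a constant exposure effect, and polynomial regression stands in for the ML learner (a LASSO or random forest would take its place in practice). Cross-fitting with five folds keeps each prediction out-of-sample.

```r
# Simulated data with nonlinear confounding; true effect of E on P is 0.5
set.seed(3)
n  <- 2000
X1 <- rnorm(n)
X2 <- rnorm(n)
g  <- sin(X1) + X2^2                 # confounding function (simulation only)
E  <- g + rnorm(n)                   # exposure depends on the confounders
P  <- 0.5 * E + 2 * g + rnorm(n)     # outcome depends on exposure and confounders
dat <- data.frame(P, E, X1, X2)

# Cross-fitting: predict each fold from models trained on the other folds
K     <- 5
fold  <- sample(rep(seq_len(K), length.out = n))
res_P <- numeric(n)
res_E <- numeric(n)

for (k in seq_len(K)) {
  train <- dat[fold != k, ]
  test  <- dat[fold == k, ]
  # Stand-in learner: polynomial regression on the candidate confounders
  m_P <- lm(P ~ poly(X1, 5) + poly(X2, 5), data = train)
  m_E <- lm(E ~ poly(X1, 5) + poly(X2, 5), data = train)
  res_P[fold == k] <- test$P - predict(m_P, newdata = test)
  res_E[fold == k] <- test$E - predict(m_E, newdata = test)
}

# Regress outcome residuals on exposure residuals (no intercept)
theta_hat <- sum(res_E * res_P) / sum(res_E^2)

# Naive unadjusted estimate for comparison
naive <- coef(lm(P ~ E, data = dat))[["E"]]

print(c(dml = theta_hat, naive = naive))
```

The residual-on-residual step is what "partialing out" refers to: whatever the learners can predict from the confounders is removed from both the exposure and the outcome before the effect is estimated.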
Mitigation Strategy 5: Triangulation
No single method eliminates confounding. Triangulation combines evidence from multiple approaches with different biases:
| Approach | Bias Profile |
|---|---|
| ExWAS (observational) | Unmeasured confounding, reverse causation |
| Mendelian randomization | Pleiotropy, weak instruments |
| Negative controls | Detects but doesn’t correct bias |
| Longitudinal studies | Residual confounding, attrition |
| Cross-population replication | Different confounding structures |
If an association survives across approaches with different biases, it is more likely to be real.
Key references:
Munafò MR, Davey Smith G. Robust research needs many lines of evidence. Nature 2018; 553:399-401.
Lawlor DA, Tilling K, Davey Smith G. Triangulation in aetiological epidemiology. Int J Epidemiol 2016; 45(6):1866-1886.
Munafò MR, Higgins JPT, Davey Smith G. Triangulating evidence through the inclusion of genetically informed designs. Cold Spring Harb Perspect Med 2021; 11:a040659.
A Two-Stage Workflow
A practical approach that addresses per-association confounding:
Stage 1 — Screen broadly (ExWAS)
Apply a uniform covariate set to hundreds of exposures
Accept that some findings are confounded
Use FDR and replication to reduce false positives
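The FDR and replication steps of Stage 1 can be sketched with base R's p.adjust(). The ten exposures and all p-values below are invented for illustration, and treating nominal significance (p < 0.05) in the second sample as "replicated" is one simple convention assumed here, not a rule stated in the text.

```r
# Hypothetical screening results for ten exposures (invented p-values)
results <- data.frame(
  exposure      = paste0("exposure_", 1:10),
  p_discovery   = c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
                    0.0278, 0.0298, 0.0344, 0.0459, 0.3240),
  p_replication = c(0.003, 0.020, 0.400, 0.010, 0.650,
                    0.030, 0.045, 0.200, 0.010, 0.500)
)

# Benjamini-Hochberg adjustment controls the false discovery rate
results$fdr_discovery <- p.adjust(results$p_discovery, method = "BH")

# Keep discoveries at FDR < 0.05
hits <- results[results$fdr_discovery < 0.05, ]

# Assumed replication rule: nominal significance in the second sample
hits$replicated <- hits$p_replication < 0.05

print(hits[, c("exposure", "fdr_discovery", "replicated")])
```

Hits that pass both filters would be the natural candidates to carry into Stage 2.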
Stage 2 — Investigate deeply (per-hit)
For each top hit, construct an exposure-specific DAG