Dataset for: Mendelian randomization using semiparametric linear transformation models
Dataset posted on 2019-12-27, 09:24, authored by Yen-Tsung Huang
Mendelian randomization (MR) uses genetic information as an instrumental variable (IV) to estimate the causal effect of an exposure of interest on an outcome in the presence of unknown confounding. We are interested in the causal effect of cigarette smoking on lung cancer survival, which is subject to confounding by underlying pulmonary function. Although IV analyses are well developed for continuous and binary outcomes, the scarcity of methodology for survival outcomes limits their utility for the time-to-event data collected in many observational studies. We propose an IV analysis method in the survival context, estimating causal effects on a transformed survival time and on survival probabilities using semiparametric linear transformation models. We study the conditions under which the hazard ratio and the effect on survival probability can be approximated. For statistical inference, we construct estimating equations to circumvent the difficulty of deriving the joint likelihood of the exposure and the outcome, which arises from the unknown confounding. Asymptotic properties of the proposed estimators are established without parametric assumptions about the confounders. We study the finite-sample performance in extensive simulation studies. The MR analysis of a lung cancer study suggests a harmful prognostic effect of smoking pack-years that would have been missed by the crude association.
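The general IV idea in the abstract can be illustrated with a small simulation. The sketch below is not the proposed survival method: it assumes a continuous outcome, a linear model, and a single hypothetical genotype instrument, and it uses the textbook ratio (Wald) estimator instead of the estimating equations for linear transformation models. All names, effect sizes, and the data-generating model are assumptions made for illustration only. It shows how an unmeasured confounder can hide a harmful effect from the crude exposure-outcome slope while the instrument recovers it.

```python
import random


def simulate(n=20000, beta=0.5, seed=0):
    """Simulate (genotype, exposure, outcome) under unmeasured confounding.

    Hypothetical data-generating model, chosen for illustration only:
      g ~ uniform on {0, 1, 2}      instrument (e.g. allele count)
      u ~ N(0, 1)                   unmeasured confounder
      x = 0.8 * g + u + noise       exposure
      y = beta * x - 1.5 * u + noise  continuous outcome (not a survival time)
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = rng.choice([0, 1, 2])
        u = rng.gauss(0.0, 1.0)
        xi = 0.8 * gi + u + rng.gauss(0.0, 1.0)
        yi = beta * xi - 1.5 * u + rng.gauss(0.0, 1.0)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def cov(a, b):
    """Sample covariance of two equal-length sequences (divisor n)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def crude_slope(x, y):
    """Least-squares slope of y on x, ignoring the confounder."""
    return cov(x, y) / cov(x, x)


def wald_ratio(g, x, y):
    """Ratio IV estimator: cov(g, y) / cov(g, x)."""
    return cov(g, y) / cov(g, x)


if __name__ == "__main__":
    g, x, y = simulate()
    print("crude slope:", round(crude_slope(x, y), 3))
    print("IV estimate:", round(wald_ratio(g, x, y), 3))
```

Under these assumed parameters the crude slope is pulled toward or below zero by the confounder, whereas the ratio estimator lands near the simulated effect of 0.5. Extending this idea to censored time-to-event outcomes is the problem the paper addresses, and it requires the transformation-model machinery described in the abstract.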