Semiparametric estimators for the regression coefficients in. Missing data arise in almost all scientific disciplines. In this paper, we consider a semiparametric regression model in the presence of missing covariates for nonparametric components under a Bayesian framework. Abstract: We consider the efficiency bound for the estimation of the parameters of semiparametric models defined solely by restrictions on the means of a vector of correlated outcomes, y, when the data on y are missing at random. Rod Little and Don Rubin have contributed massively to the development of theory and methods for handling missing data, Rubin being the originator of multiple imputation. This book combines much of what is known in regard to the theory of estimation for semiparametric models with missing data in an organized and comprehensive manner. A semiparametric estimation of mean functionals with. Kalbfleisch and Menggang Yu (University of Michigan, University of Michigan and Indiana University): We consider a class of doubly weighted rank-based estimating methods for the transformation or accelerated failure time model with. Semiparametric efficiency in GMM models of nonclassical measurement errors, missing data and treatment effects, by Xiaohong Chen, Han Hong, and Alessandro Tarozzi, March 2008, Cowles Foundation Discussion Paper no. Semiparametric estimation with data missing not at. Missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. Multiple imputation (MI; Rubin, 1987) is a principled method for addressing item-level missing data. Semiparametric efficiency in multivariate regression models.
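The multiple imputation method mentioned above is usually completed by pooling the results obtained from each imputed data set. The following is a minimal sketch of the standard pooling rules (commonly called Rubin's rules), assuming each of the m imputed data sets has already produced a point estimate and a variance estimate for a scalar parameter. The function name `pool_rubin` and the example numbers are illustrative and do not come from any of the works listed here.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates of a scalar parameter.

    estimates: point estimates, one per imputed data set
    variances: estimated variances of those point estimates
    Returns (pooled estimate, total variance).
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = q.mean()                      # pooled point estimate
    within = u.mean()                     # average within-imputation variance
    between = q.var(ddof=1)               # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Illustrative inputs only: three imputations of the same parameter.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term is what distinguishes this from simply averaging the variances: it carries the extra uncertainty due to the missing values themselves.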
The theory and methods for measurement errors and missing. The theory of missing data applied to semiparametric models is scattered throughout the literature with no thorough comprehensive treatment of the subject. Analysis of generalized semiparametric regression models for. Semiparametric Theory and Missing Data, Anastasios A. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Statistical Analysis with Missing Data, by Little and Rubin, 2002, 408 pages. Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. The main results are given in a more relevant format for. Semiparametric estimating equations inference with. The description of the theory of estimation for semiparametric. International Conference on Robust Statistics 2016. This paper considers the problem of parameter estimation in a general class of semiparametric models when observations are subject to missingness at random.
Semiparametric regression models with missing data. Semiparametric Theory and Missing Data. We develop inference tools in a semiparametric partially linear regression model with missing response data. Asymptotic theory for the semiparametric accelerated failure time model with missing data.
Analysis of semiparametric regression models for repeated outcomes in the presence of missing data, James M. Strategies for Bayesian modeling and sensitivity analysis, M. Some covariates, however, may be measured with substantial errors. Kriging regression imputation method to semiparametric. Semiparametric Theory and Missing Data, by Anastasios. In this article, we propose a nonparametric imputation method based on the propensity score in a general class of semiparametric models for nonignorable missing data.
An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on. Estimation in semiparametric models with missing data. From the imputed estimating function gn. To overcome the lack of robustness of the traditional complete-data estimation method and the regression imputation estimation technique, we propose a modified imputation estimation approach called kriging-regression imputation. Semiparametric methods for missing data and causal. Missing data is a big issue in the world of clinical trials. Semiparametric Theory and Missing Data, 2006. If we are interested in studying the time to an event such as death due to cancer or failure of a light bulb, the Cox model specifies the following distribution function for. We study a class of semiparametric skewed distributions arising when the sample selection process produces nonrandomly sampled observations. Semiparametric regression analysis with missing response at random, Qihua Wang, Oliver Linton and Wolfgang H. Missing data is a very important issue in clinical trials, and many models have been devised to handle them, including multiple imputation (Rubin), pattern mixture models (Little) and mixed effects linear and nonlinear models (Pinheiro and Bates, Molenberghs and Verbeke, and Davidian). We show that any of our class of estimators is asymptotically normal.
Semiparametric model-based inference in the presence of. Semiparametric theory for causal mediation analysis. Based on semiparametric theory and taking into account the symmetric nature of the population distribution, we propose both consistent estimators, i. In particular, we investigate a class of regression-like (mean regression, quantile regression) models with missing data, an example of a supply and demand simultaneous equations model and a. Bounded, efficient and doubly robust estimation with.
If time permits, it will also cover other advanced topics on handling incomplete data, such as in high-dimensional data. The semiparametric models allow for estimating functions that are nonsmooth with respect to the parameter. Theory and practice, Guohua Feng, Bin Peng, Liangjun Su. The theory is applied to a semiparametric missing data model where it is shown that the two-step GEL-weighted estimator possesses good efficiency and robustness properties when nuisance models are. This paper investigates a class of estimation problems of the semiparametric model with missing data. In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation. The description of the theory of estimation for semiparametric models is both rigorous and intuitive, relying on geometric ideas to reinforce the intuition and understanding of the theory. Semiparametric models allow at least part of the data-generating process to be. Semiparametric methods for missing data and causal inference. Abstract: In this dissertation, we propose methodology to account for missing data as well as a strategy to account for outcome heterogeneity. It starts with the study of semiparametric methods when there are no missing data. A semiparametric logistic regression model is assumed for the response probability, and a nonparametric regression approach for missing data discussed in Cheng (1994) is used in the estimator.
We consider a semiparametric model that parameterizes the conditional density of the response, given covariates, but allows the marginal distribution. Semiparametric modelling is, as its name suggests, a hybrid of the parametric and nonparametric approaches to construction, fitting, and validation of statistical models. While many of the other missing data books do mention clinical trials, some quite extensively, this book focuses exclusively on missing data in trials. While estimation of the marginal total causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years investigators have also become increasingly interested in mediation analysis. Our theoretical results provide new insight for the literature on semiparametric efficiency bounds and open the door to new applications. A class of estimators is defined that includes as special cases a semiparametric regression imputation estimator, a marginal average estimator, and a marginal propensity score weighted estimator. Classical semiparametric inference with missing outcome data is not robust to contamination of the observed data, and a single observation can have arbitrarily large influence on estimation of a parameter of interest. Tchetgen Tchetgen and Ilya Shpitser.
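Two of the estimator types named above, regression imputation and propensity score weighting, can be illustrated on simulated data. The sketch below is a simplified illustration, not the construction used in any of the cited papers: it assumes a single covariate, a linear outcome model, a response probability that is known rather than estimated, and an outcome that is missing at random. It also includes the augmented (doubly robust) combination of the two for comparison. All function and variable names are hypothetical.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Simulate an outcome y that is missing at random given a covariate x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)   # true E[y] is 1.0
    p = 1.0 / (1.0 + np.exp(-(0.5 + x)))     # response probability depends on x only
    r = rng.random(n) < p                    # r[i] is True when y[i] is observed
    y_obs = np.where(r, y, np.nan)           # unobserved outcomes are masked
    return x, y_obs, r, p

def mean_estimators(x, y_obs, r, p):
    """Return four estimates of E[y] from partially observed outcomes."""
    # Outcome model fitted by least squares on the complete cases only.
    design = np.column_stack([np.ones(r.sum()), x[r]])
    beta, *_ = np.linalg.lstsq(design, y_obs[r], rcond=None)
    m = beta[0] + beta[1] * x                # fitted E[y | x] for every unit
    y0 = np.where(r, y_obs, 0.0)             # zero-fill so masked values never enter sums
    w = r / p                                # inverse probability weights (0 if missing)
    return {
        "complete_case": y_obs[r].mean(),
        "regression_imputation": np.where(r, y_obs, m).mean(),
        "ipw": np.sum(w * y0) / np.sum(w),   # normalized weighting estimator
        "aipw": np.mean(m + w * (y0 - m)),   # augmented, doubly robust form
    }

x, y_obs, r, p = simulate()
for name, value in mean_estimators(x, y_obs, r, p).items():
    print(f"{name}: {value:.3f}")
```

In this setup the complete-case mean is biased upward because units with larger x respond more often, while the other three estimators target the true mean. Because the weights are 1/p, a contaminated observation with a small response probability receives a large weight, which is the sensitivity the passage above refers to.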
It has just been published, and I've not looked at it yet, but my guess is that it will be of use to many statisticians and trialists. In statistics, a semiparametric model is a statistical model that has parametric and nonparametric components. Efficiency bounds, multiple robustness and sensitivity analysis, Eric J. This book summarizes knowledge regarding the theory of estimation for semiparametric models with missing data.
We propose a nonparametric imputation method for the missing values, which then leads to imputed estimating equations for the finite dimensional parameter of interest. Semiparametric Theory and Missing Data, Anastasios A. A well-known example of a semiparametric model is the Cox proportional hazards model. In many cases, missing data in an analysis is treated in a casual and ad hoc manner, leading to invalid inferences and erroneous conclusions. Missing data often appear as a practical problem while applying classical models in the statistical analysis. Semiparametric regression analysis with missing response at. This sensitivity is exacerbated when inverse probability weighting methods are used, which may overweight contaminated observations. By adopting nonparametric components for the model, the estimation method can be made robust. Semiparametric Theory and Missing Data, by Tsiatis, A. When the proportion of item-level missing values is nontrivial and the data are not missing completely at random (MCAR), typical solutions for missing data like complete case analysis often lead to increased bias and reduced statistical power. In this paper, based on the exponential tilting model, we propose a semiparametric estimation method of mean functionals with nonignorable missing data.
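For reference, the Cox proportional hazards model mentioned above is usually written in terms of the hazard function. The notation here (covariate vector Z, baseline hazard \lambda_0) is a common convention and is an assumption of this note, not taken from the snippets:

```latex
\lambda(t \mid Z) = \lambda_0(t)\, \exp\!\left(\beta^{\top} Z\right)
```

The regression coefficient \beta is the finite-dimensional parametric component, while the baseline hazard \lambda_0(t) is left unspecified and plays the role of the nonparametric component. This is what makes the model semiparametric.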
Semiparametric Theory and Missing Data, Anastasios A. Tsiatis. Asymptotic theory for the semiparametric accelerated. These methods are also illustrated using data from a breast cancer stage II clinical trial. Simulation studies demonstrate the relevance of the theory in finite samples. Tsiatis takes a new approach with semiparametric models. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. The asymptotic properties of these estimators are developed using theory of counting processes and semiparametric theory for missing data problems.
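The MNAR definition above can be placed next to the other two standard missingness mechanisms. In the sketch below, R is an indicator that the outcome Y is observed and X denotes the variables that are always observed. These symbols are introduced here for illustration and do not appear in the snippets:

```latex
\begin{align*}
\text{MCAR:}\quad & P(R = 1 \mid Y, X) = P(R = 1), \\
\text{MAR:}\quad  & P(R = 1 \mid Y, X) = P(R = 1 \mid X), \\
\text{MNAR:}\quad & P(R = 1 \mid Y, X) \text{ depends on } Y \text{ even after conditioning on } X.
\end{align*}
```

Under MNAR the observed data alone do not identify the distribution of Y, which is why additional assumptions are needed in that setting.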
Zhao (1994) and Robins and Rotnitzky (1992) are revisited for semiparametric regression models with missing data using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993). We consider a class of doubly weighted rank-based estimating methods for the transformation or accelerated failure time model with missing data as arise, for. The main results are given in a more relevant format for the calculations of efficient score functions. In such settings, identification is generally not possible without imposing additional. Semiparametric Theory and Missing Data, Anastasios A. Tsiatis. Tsiatis, Andrea Rotnitzky. Missing data in longitudinal studies. For a more thorough discussion of semiparametric efficiency theory and precise definitions of pathwise derivatives and the tangent space we refer to. Semiparametric Theory and Missing Data, pp. 53-99. Moreover, the responses may be missing and the missingness may be nonignorable. Semiparametric single-index panel data models with interactive fixed effects. A statistical model is a parameterized family of distributions.
The generalized semiparametric regression models for the cumulative incidence functions with missing covariates are investigated. Semiparametric inverse propensity weighting for nonignorable. Semiparametric location estimation under nonrandom sampling. Kriging regression imputation method to semiparametric model. The cumulative incidence function quantifies the probability of failure over time due to a specific cause for competing risks data. Missing data is a pervasive problem in data analyses, resulting in datasets that contain censored realizations of a target distribution. Semiparametric regression analysis with missing response.
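The cumulative incidence function described above has a compact standard definition. Writing T for the failure time and \varepsilon for the cause of failure among K competing causes (notation assumed here for illustration):

```latex
F_k(t) = P(T \le t,\ \varepsilon = k), \qquad k = 1, \dots, K,
\qquad \sum_{k=1}^{K} F_k(t) = P(T \le t).
```

Each F_k is a subdistribution function: it does not rise to one as t grows, because some failures are due to the other causes.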
Handling data with the missing not at random (MNAR) mechanism is still a challenging problem in statistics. Zhiqiang Tan, Bounded, efficient and doubly robust estimation with inverse weighting, Biometrika, Volume 97, Issue 3. Statistics in the Pharmaceutical Industry, 3rd edition. We also refer to Chen, Hong and Tarozzi (2008), who study semiparametric e. Estimation in semiparametric models with missing data. Asymptotic theory for the semiparametric accelerated failure time model with missing data, by Bin Nan, John D. Covariates are usually introduced in the models to partially explain inter-individual variations.