Summary: | In this work we introduce a new frailty model for clustered survival data using the Generalized Inverse-Gaussian (GIG) distribution for the frailty. This choice yields a flexible model that is mathematically advantageous, since closed-form expressions are available for the unconditional survival and density functions. The parametric and semiparametric versions of the GIG frailty model are presented. We focus on the semiparametric approach, which is based on the piecewise exponential distribution. An EM algorithm is proposed to estimate the parameters under this approach. The flexibility of the proposed model comes from working with a two-parameter frailty distribution. One of the parameters determines the form of the frailty distribution, and our interest is in fitting the different special cases of the GIG distribution obtained by changing the value of this parameter. These include the Inverse-Gaussian, Reciprocal Inverse-Gaussian, Hyperbolic and Positive Hyperbolic distributions. This gives us the flexibility to test different frailties, making it possible to accommodate distinct correlation structures that might not be captured by fitting a single model. We present simulation studies under both parametric and semiparametric approaches. In the parametric simulation study, we explore parameter estimation in finite sample sizes under correct model specification. A comparison with other models in the literature, such as the gamma and generalized exponential frailty models, is made under the semiparametric approach, where the proposed frailty shows competitive results under misspecification. We illustrate the applicability of the GIG frailty model through two real data examples. The first consists of data obtained from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative, where we chose to investigate the effect of two genetic variables on the lifetime of children diagnosed with neuroblastoma. 
To illustrate the application of the proposed methodology to clustered survival data, we also include the fit to the well-known kidney catheter data set. In the real data examples we compared the fit of the proposed model with the fits of the gamma and generalized exponential frailty models under parametric and semiparametric approaches. Through the TARGET Neuroblastoma data set we were able to show that the gamma frailty model, despite being the most popular choice, suffers from convergence issues that the other models did not present. In addition, in this example, the GIG frailty proved to be the most robust with respect to the specification of the baseline hazard function.
|