Summary: | The HIV/AIDS epidemic is an important public health problem. The burden of the epidemic is estimated from surveillance system data. The collected information is incomplete, which makes estimation challenging and often biases the reported trends. The most common incomplete-data problems in this kind of data are under-diagnosis and reporting delays, which mainly affect the most recent years. This is a classical problem for imputation methodologies. In this paper we study the distribution of AIDS reporting delays through a mixed approach that combines longitudinal K-means with the generalized least squares method. While the former identifies homogeneous delay patterns, the latter estimates longitudinal regression curves. We found that a 2-cluster structure is appropriate to accommodate the heterogeneity in reporting delay in HIV/AIDS data and that the corresponding estimated delay curves are almost stationary over time.
|