Summary: | This paper presents an extended comparison of 16 linear and nonlinear regression methods for predicting the sugar, pH, and anthocyanin contents of grapes from hyperspectral imaging (HSI). Although numerous studies on this subject can be found in the literature, they often rely on one or a very limited set of predictive methods. Because the literature on multivariate regression methods is quite extensive, the analytical domain explored in those studies is too narrow to guarantee that the best solution has been found. Therefore, we developed an integrated linear and nonlinear predictive analytics comparison framework (L&NL-PAC), which combines five preprocessing techniques and five different classes of regression methods and compares all alternatives through a robust Monte Carlo double cross-validation scheme with stratified data splitting. L&NL-PAC allowed for the identification of the most promising preprocessing approaches, the best regression methods, and the wavelengths that contribute most to explaining the variability of each enological parameter for the target dataset, providing important insights for the development of precision viticulture technology based on the HSI of grapes. Overall, the results suggest that the combination of the Savitzky-Golay first derivative and ridge regression can be a good choice for the prediction of the three enological parameters.
|
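To make the recommended combination concrete, the following is a minimal sketch, not the L&NL-PAC implementation, of a Savitzky-Golay first derivative followed by ridge regression, scored by repeated random train/test splits. Everything here is an assumption made for illustration: the function names, the 5-point window, the fixed penalty `alpha`, the 70/30 split, and the synthetic data. It also simplifies the scheme described in the summary: the splits are neither stratified nor nested, so no inner loop tunes `alpha` as a double cross-validation would.

```python
import numpy as np


def savgol_first_derivative(spectra, half_window=2):
    """Savitzky-Golay first derivative along each row (unit band spacing).

    For a symmetric window and a quadratic local fit, the fitted slope at
    the window center is sum(k * y_k) / sum(k**2), k = -m..m.
    Edge bands without a full window are dropped.
    """
    k = np.arange(-half_window, half_window + 1)
    weights = k / np.sum(k ** 2)
    # np.convolve flips its kernel, so reverse the weights to correlate.
    return np.array(
        [np.convolve(row, weights[::-1], mode="valid") for row in spectra]
    )


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def monte_carlo_rmse(X, y, alpha=1.0, n_splits=20, test_fraction=0.3, seed=0):
    """Mean test RMSE over repeated random splits (simple Monte Carlo CV)."""
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * len(y))))
    errors = []
    for _ in range(n_splits):
        order = rng.permutation(len(y))
        test, train = order[:n_test], order[n_test:]
        coef, intercept = fit_ridge(X[train], y[train], alpha)
        pred = X[test] @ coef + intercept
        errors.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Synthetic stand-in for spectra (rows = samples, columns = bands).
    rng = np.random.default_rng(42)
    spectra = rng.normal(size=(60, 40)).cumsum(axis=1)
    target = spectra[:, 10] - spectra[:, 25] + rng.normal(scale=0.1, size=60)

    derivative = savgol_first_derivative(spectra)
    print("RMSE, raw spectra:     ", monte_carlo_rmse(spectra, target))
    print("RMSE, first derivative:", monte_carlo_rmse(derivative, target))
```

On real spectra, the window length, polynomial order, and ridge penalty would normally be selected inside an inner validation loop, which is the role of the double cross-validation described in the summary.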