This project compared model updating methods with and without labeled secondary samples.
Updating a calibration model built under original (primary) sample and spectral measurement conditions so that it predicts analyte values under novel (secondary) conditions is an essential activity in analytical chemistry because it avoids a complete recalibration. Established model updating methods require analyte reference values for a small set of secondary domain samples (labeled data). Because obtaining reference values is the time-consuming and costly part of any calibration, methods are needed that do not require labeled secondary samples, thereby enabling on-demand model updating. A hybrid model updating approach was also developed and evaluated in the current project. A major impediment to adapting a model without secondary analyte reference values has been model selection: because model updating methods commonly involve multiple tuning parameters, thousands of candidate models are formed, which makes model selection complex. A recently developed framework was evaluated for automatic model selection, without secondary analyte reference values (labels), for several model updating methods that each have two to three tuning parameters. The model selection method is based on model diversity and prediction similarity (MDPS) of the unlabeled samples to be predicted. The new secondary samples to be predicted can be used both to form the updated models and to select the final predicting models. Because models are formed and selected on demand to directly predict target samples, complicated cross-validation processes are not needed. Four near-infrared data sets covering 40 model updating situations were evaluated, showing that MDPS can select reliable updated models whose prediction errors rival or outperform those from total recalibrations with secondary reference values. (publisher abstract modified)
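
The general idea of forming many candidate updated models from a tuning-parameter grid and then choosing among them using only predictions on the unlabeled secondary samples can be illustrated with a minimal sketch. This is not the published MDPS algorithm, whose details are not given in this abstract. Everything here is an assumption made for illustration: the synthetic data, the ridge penalty `lam`, the hypothetical shift penalty `tau` (which discourages the model from responding to the mean spectral difference between conditions), and the agreement-with-the-median selection rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for spectra (assumption: not real near-infrared data).
n_primary, n_secondary, n_wavelengths = 40, 15, 20
true_b = rng.normal(size=n_wavelengths)
X_primary = rng.normal(size=(n_primary, n_wavelengths))
y_primary = X_primary @ true_b + 0.05 * rng.normal(size=n_primary)
# Unlabeled secondary spectra: rescaled and offset measurements, no reference values.
X_secondary = 1.1 * rng.normal(size=(n_secondary, n_wavelengths)) + 0.2


def ridge_coefficients(X, y, lam):
    """Closed-form ridge regression coefficients (no intercept term)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Hypothetical two-parameter grid: ridge penalty and weight on the mean shift.
lams = np.logspace(-3, 2, 6)
taus = np.array([0.0, 1.0, 10.0, 100.0])
shift = X_secondary.mean(axis=0) - X_primary.mean(axis=0)

grid, predictions = [], []
for lam in lams:
    for tau in taus:
        # Augment with one pseudo-sample asking the model to predict 0 for the
        # mean spectral shift, so the shift alone changes predictions less.
        X_aug = np.vstack([X_primary, tau * shift])
        y_aug = np.append(y_primary, 0.0)
        b = ridge_coefficients(X_aug, y_aug, lam)
        grid.append((lam, tau))
        predictions.append(X_secondary @ b)
predictions = np.array(predictions)  # shape: (n_models, n_secondary)

# Label-free selection (illustrative rule only): keep the candidates whose
# predictions for the unlabeled samples lie closest to the median prediction.
consensus = np.median(predictions, axis=0)
disagreement = np.sqrt(np.mean((predictions - consensus) ** 2, axis=1))
selected = np.argsort(disagreement)[:5]
final_prediction = predictions[selected].mean(axis=0)

for idx in selected:
    print(f"lam={grid[idx][0]:.3g}, tau={grid[idx][1]:.3g}")
```

Because the selection step uses only `X_secondary`, no secondary reference values and no cross-validation split are involved, which mirrors the on-demand setting described above. A real application would replace the simple agreement rule with the diversity and similarity measures defined in the MDPS framework.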