Face recognition from video typically involves two steps, face detection and face recognition, so poor face detection is one cause of poor recognition results. In the current work, the authors devise a three-step strategy to prune the frames of a probe video to the subset in which the face is actually found and in which the matching scores are more reliable.
The authors also exploit the temporal continuity of video frames to improve recognition, weighting the match scores by the results from previously seen frames. The authors show that, even on a very challenging real-world dataset, combining these approaches improves recognition over the baseline. (Publisher abstract provided)
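
The abstract does not specify how match scores are weighted across frames. As a rough illustration of the general idea only, the sketch below blends each frame's scores with a running average of the scores from earlier frames (exponential smoothing). The function name `smooth_scores`, the parameter `alpha`, and the identity labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def smooth_scores(
    frame_scores: List[Dict[str, float]], alpha: float = 0.5
) -> List[Dict[str, float]]:
    """Blend each frame's match scores with those of earlier frames.

    frame_scores: one dict per frame, mapping a gallery identity to its
                  match score (higher means a better match).
    alpha:        weight given to the current frame; (1 - alpha) goes to
                  the running value carried over from previous frames.
    NOTE: exponential smoothing is an assumption made for illustration,
    not the weighting scheme used by the authors.
    """
    running: Dict[str, float] = {}
    smoothed: List[Dict[str, float]] = []
    for scores in frame_scores:
        current: Dict[str, float] = {}
        for identity, score in scores.items():
            # An identity seen for the first time has no history yet,
            # so its raw score is used as the starting value.
            previous = running.get(identity, score)
            current[identity] = alpha * score + (1 - alpha) * previous
        running.update(current)
        smoothed.append(current)
    return smoothed


# Hypothetical scores for two gallery identities over two frames.
frames = [
    {"subject_a": 0.9, "subject_b": 0.2},
    {"subject_a": 0.1, "subject_b": 0.3},  # a single noisy frame
]
result = smooth_scores(frames, alpha=0.5)
best_raw = max(frames[1], key=frames[1].get)
best_smoothed = max(result[1], key=result[1].get)
```

In this toy input, the second frame taken alone favors `subject_b`, while the smoothed scores still favor `subject_a`, which is the kind of stabilizing effect that temporal weighting is meant to provide.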