Comparative Evaluation of Structural and Appearance-Based Gait Representations Under Viewpoint and Appearance Variations

Gait recognition remains challenging under viewpoint changes and appearance-related covariates such as clothing variation and carried objects. While silhouette-based representations perform strongly in controlled conditions, their robustness across mismatched scenarios is limited. This work evaluates three gait representations on a multi-view benchmark: the Gait Energy Image and two pose-derived motion templates encoding normalized joint trajectories. Sixteen training–testing configurations were defined to assess cross-view performance and generalization under clothing and carrying variations. All experiments used an identical convolutional neural network to isolate representation effects. Results show that the silhouette-based representation achieves the highest accuracy in matched conditions, especially at lateral viewpoints, whereas pose-derived templates exhibit smoother degradation under appearance shifts and moderate view changes. Diversified training improves robustness for all methods but does not remove intrinsic representation differences. The findings highlight a trade-off between discriminative strength and covariate resilience, supporting the complementary role of structural motion representations in realistic gait recognition systems.

João Ferreira Nunes
Instituto Politécnico de Viana do Castelo
Portugal

Débora Gonçalves
Instituto Politécnico de Viana do Castelo
Portugal

Rúben Ferreira
Instituto Politécnico de Viana do Castelo
Portugal

Pedro Miguel Moreira
Instituto Politécnico de Viana do Castelo
Portugal