ST CrossingPose: A Spatial-Temporal Graph Convolutional Network for Skeleton-Based Pedestrian Crossing Intention Prediction

Predicting whether a pedestrian intends to cross is crucial for pedestrian safety around both autonomous and conventional vehicles, and the problem has attracted widespread interest recently. Among the various methods proposed for this task, skeleton-based methods have been especially popular in recent years. However, most existing studies rely on manually designed features to process skeleton data, which limits their performance. To address this issue, we propose ST CrossingPose, which predicts pedestrian crossing intention from skeleton data using spatial-temporal graph convolutional networks. The proposed method learns both spatial and temporal patterns from skeleton data and therefore has a good feature representation ability. Extensive experiments on a public dataset demonstrate that the proposed method achieves very competitive performance in predicting crossing intention while maintaining a fast inference speed. We also analyze how several factors, e.g., pedestrian size, time to event, and occlusion, affect the proposed method.
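
To make the idea of learning spatial and temporal patterns from skeleton data concrete, the following is a minimal pure-Python sketch of one generic spatial-temporal graph convolution block. It is not the ST CrossingPose architecture itself. The function names, the three-joint toy skeleton, the row-normalized adjacency, the odd-length temporal kernel shared across channels, and the fixed (rather than learned) weights are all assumptions made for illustration.

```python
def normalize_adjacency(edges, num_joints):
    """Build A + I for the skeleton graph and row-normalize it (assumed scheme)."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def spatial_graph_conv(frame, adj, weight):
    """frame: V x C_in joint features; weight: C_in x C_out. Returns V x C_out."""
    num_joints = len(frame)
    c_in, c_out = len(weight), len(weight[0])
    # Aggregate each joint's features from itself and its skeleton neighbors.
    agg = [[sum(adj[v][u] * frame[u][c] for u in range(num_joints))
            for c in range(c_in)] for v in range(num_joints)]
    # Project the aggregated features (the weight would normally be learned).
    return [[sum(agg[v][c] * weight[c][k] for c in range(c_in))
             for k in range(c_out)] for v in range(num_joints)]


def temporal_conv(sequence, kernel):
    """1D convolution over time per joint and channel, zero-padded to keep length.

    Assumes an odd-length kernel shared by all joints and channels.
    """
    t_len, pad = len(sequence), len(kernel) // 2
    out = []
    for t in range(t_len):
        frame_out = []
        for v in range(len(sequence[0])):
            feats = []
            for c in range(len(sequence[0][0])):
                acc = 0.0
                for k, w in enumerate(kernel):
                    s = t + k - pad
                    if 0 <= s < t_len:
                        acc += w * sequence[s][v][c]
                feats.append(acc)
            frame_out.append(feats)
        out.append(frame_out)
    return out


def st_gcn_block(sequence, adj, weight, kernel):
    """sequence: T x V x C_in. Spatial conv per frame, temporal conv, then ReLU."""
    spatial = [spatial_graph_conv(frame, adj, weight) for frame in sequence]
    temporal = temporal_conv(spatial, kernel)
    return [[[max(0.0, x) for x in joint] for joint in frame]
            for frame in temporal]


if __name__ == "__main__":
    # Hypothetical toy skeleton: 3 joints in a chain, 4 frames, (x, y) per joint.
    adj = normalize_adjacency([(0, 1), (1, 2)], num_joints=3)
    seq = [[[0.1 * t, 0.2 * v] for v in range(3)] for t in range(4)]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    features = st_gcn_block(seq, adj, identity, kernel=[0.25, 0.5, 0.25])
    print(len(features), len(features[0]), len(features[0][0]))
```

In a full model, several such blocks would typically be stacked and followed by pooling and a classifier that outputs a crossing or not-crossing score. Those parts are omitted here because the abstract does not specify them.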

IEEE Transactions on Intelligent Transportation Systems