The images are collected from real-world scenarios and show humans in challenging poses and views, with heavy occlusion, varied appearance and low resolution. We provide 14K images annotated with 85 kinds of labels and 31 kinds of relations. Each image contains 10 instances and 17 relations on average. In total, we label 136K instances and 235K relations. We define two main kinds of relations: position relations and action relations. Several example images from the dataset are shown below.
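As a quick sanity check, the per-image averages can be recomputed from the totals quoted above. This is a minimal sketch that assumes the rounded figures (14K, 136K, 235K), so the results are approximate; the constant and function names are illustrative and not part of any dataset toolkit.

```python
# Rounded totals quoted in the description (approximate by assumption).
NUM_IMAGES = 14_000
NUM_INSTANCES = 136_000
NUM_RELATIONS = 235_000


def per_image_average(total: int, num_images: int) -> float:
    """Return the mean count per image."""
    return total / num_images


avg_instances = per_image_average(NUM_INSTANCES, NUM_IMAGES)
avg_relations = per_image_average(NUM_RELATIONS, NUM_IMAGES)

print(f"instances per image: {avg_instances:.1f}")
print(f"relations per image: {avg_relations:.1f}")
```

With the rounded totals this gives roughly 9.7 instances and 16.8 relations per image, consistent with the stated averages of 10 and 17.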
| Dataset | Total | Train | Val | Test | Labels | Relations | Instance Num | Relation Num |