Human-Object-Interaction Detection


Human-Object Interaction (HOI) detection has received increasing attention recently. Given an image, HOI detection aims to detect triplets of the form <human, interaction, object>. The subject of the triplet is always a human, and the interaction is an action. HOI detection has many applications, such as activity analysis, human-machine interaction, and intelligent monitoring. Existing datasets such as HICO-DET and VCOCO have greatly advanced related research. However, practical applications involve a limited set of frequent HOI categories that deserve special attention and are not emphasized in previous datasets. To this end, we introduced a new dataset called Human-Object Interaction for Application (HOI-A).

Important Dates

Time zone: Beijing, UTC+8

April 1st, 10:00:00, 2021    Training / validation set released
May 6th, 10:00:00, 2021      Testing set released and submission opened
June 4th, 10:00:00, 2021     Submission deadline
June 8th, 10:00:00, 2021     Challenge winners notified
June 20th, 2021              Winners present at CVPR 2021 Workshop

Task Rules

  1. Make sure the submission format meets the requirements. See results.json.
  2. Using extra training data is not allowed, but there are no restrictions on the use of pretrained models.
  3. You have 10 submission chances in total.
  4. The evaluation process can take some time. A failed submission does not reduce your remaining submission chances.

Task Metric

  1. Following the standard settings in the HICO-DET and VCOCO benchmarks, we evaluate HOI detection using mean average precision (mAP). An HOI detection is considered a true positive when the human detection, the object detection, and the interaction class are all correct. A human or object bounding box is considered correct if it overlaps a ground-truth bounding box of the same class with an intersection over union (IoU) greater than 0.5.
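The true-positive criterion above can be illustrated with a minimal Python sketch. It is not the official evaluation code: the `HOI` record, its field names, the `(x1, y1, x2, y2)` corner format for boxes, and the `cigarette` object class are all assumptions made for illustration. It also checks only one prediction against one ground-truth triplet, and leaves out the rest of the mAP computation, such as ranking predictions by score and matching each ground truth at most once.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOI:
    """Hypothetical record for one <human, interaction, object> triplet."""
    human_box: Box
    object_box: Box
    object_class: str
    interaction: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: HOI, gt: HOI, thresh: float = 0.5) -> bool:
    """True when the classes match and both boxes have IoU strictly above thresh."""
    return (
        pred.interaction == gt.interaction
        and pred.object_class == gt.object_class
        and iou(pred.human_box, gt.human_box) > thresh
        and iou(pred.object_box, gt.object_box) > thresh
    )


# Illustrative values only; these are not taken from the dataset.
gt = HOI((0, 0, 10, 10), (5, 5, 15, 15), "cigarette", "smoke")
good = HOI((0, 0, 10, 9), (5, 5, 15, 14), "cigarette", "smoke")
wrong_action = HOI((0, 0, 10, 9), (5, 5, 15, 14), "cigarette", "ride")
loose_box = HOI((0, 0, 10, 5), (5, 5, 15, 15), "cigarette", "smoke")

print(is_true_positive(good, gt))
print(is_true_positive(wrong_action, gt))
print(is_true_positive(loose_box, gt))
```

Because the threshold is strict ("greater than 0.5"), a human box with an IoU of exactly 0.5, as in `loose_box`, does not count as a match in this sketch.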
Dataset

    Human-Object Interaction in the Wild (HOI-W) focuses on a limited set of typical relationships with practical significance, such as smoke, ride, and call telephone. We provide 38K images covering different scenes and varied illumination. The HOI-Det annotations include the bounding boxes of humans and objects, and the human-human/object relationships. An example annotation of HOI-W is shown as follows:

    Data Statistics

    Dataset  Total  Trainval  Test  Objects  Relations  Instance Num  HOI Num
    HOI-W    38636  29842     8794  11       10         104K          96K

    Image Example


