State of the art on the Market-1501 dataset

On this page, we summarize the state-of-the-art methods on the Market-1501 dataset. We report both mAP and rank-1, 5, 10, and 20 accuracies. Note that these are not the only relevant performance measures: other metrics, such as recognition time, are also important.

When a paper reports only CMC curves, we roughly estimate the numbers from the curves and fill in the blanks. Authors are welcome to contact me with the accurate numbers. Priority is given to papers whose code is published. Should you have any inquiry, please contact me at liangzheng06@gmail.com.
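
For readers unfamiliar with the two metrics, the following is a minimal sketch of how rank-k accuracy (one point on the CMC curve) and mAP can be computed from ranked gallery lists. The function names `cmc_and_ap` and `evaluate` are hypothetical and are not taken from the official evaluation code. As far as we understand, the official protocol additionally ignores same-camera and junk gallery images and may compute AP with a slightly different interpolation; both are omitted here, so this sketch should not be expected to reproduce the numbers in the table exactly.

```python
def cmc_and_ap(ranked_matches, ks=(1, 5, 10, 20)):
    """Score one query.

    ranked_matches: list of bools, one per gallery image, sorted from the
    most similar to the least similar gallery image. True means the gallery
    image shows the same identity as the query.
    """
    # 0-based position of the first correct match, or None if there is none.
    first_hit = next((i for i, hit in enumerate(ranked_matches) if hit), None)
    # Rank-k is 1 if a correct match appears within the top k results.
    cmc = {k: 1.0 if first_hit is not None and first_hit < k else 0.0 for k in ks}

    # Non-interpolated average precision: mean of the precision values
    # measured at each position where a correct match occurs.
    hits = 0
    precision_sum = 0.0
    for position, hit in enumerate(ranked_matches, start=1):
        if hit:
            hits += 1
            precision_sum += hits / position
    ap = precision_sum / hits if hits else 0.0
    return cmc, ap


def evaluate(all_ranked_matches, ks=(1, 5, 10, 20)):
    """Average the per-query scores over all queries (assumes at least one query)."""
    results = [cmc_and_ap(r, ks) for r in all_ranked_matches]
    n = len(results)
    rank_acc = {k: sum(cmc[k] for cmc, _ in results) / n for k in ks}
    mean_ap = sum(ap for _, ap in results) / n
    return rank_acc, mean_ap


if __name__ == "__main__":
    # Two toy queries, each ranked against a three-image gallery.
    toy = [[True, False, True], [False, True, False]]
    rank_acc, mean_ap = evaluate(toy)
    print(rank_acc, mean_ap)
```

Rank-k only rewards the first correct match, whereas mAP also accounts for how early the remaining correct matches appear, which is why the two are reported together.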

Results on Market-1501 (all values in %).

| Reference | rank-1 | rank-5 | rank-10 | rank-20 | rank-30 | rank-50 | mAP | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| "Scalable person re-identification: a benchmark", Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, Qi Tian, ICCV 2015 | 8.28 | - | - | - | - | - | 2.23 | gBiCov [1], Euclidean distance, single query |
| | 9.62 | - | - | - | - | - | 2.72 | HistLBP [2], Euclidean distance, single query. Super thanks to Mengran Gou for sending us the evaluation results |
| | 26.07 | - | - | - | - | - | 7.75 | LOMO [3], Euclidean distance, single query |
| | 35.84 | 52.40 | 60.33 | 67.64 | 71.88 | 75.80 | 14.75 | BoW, Euclidean distance, single query |
| | 44.36 | 60.24 | 66.48 | 73.25 | 76.19 | 79.69 | 19.42 | BoW, Euclidean distance, multiple query |
| | 34.00 | - | - | - | - | - | 15.66 | BoW + LMNN, single query |
| | 38.21 | - | - | - | - | - | 17.05 | BoW + ITML, single query |
| | 44.42 | 63.90 | 72.18 | 78.95 | 82.51 | 87.05 | 20.76 | BoW + KISSME, single query |
| "Person re-identification: Past, Present and Future", Liang Zheng, Yi Yang, Alexander Hauptmann, arXiv 2016 | 55.49 | 76.28 | 83.55 | 88.98 | 91.72 | 93.97 | 32.36 | AlexNet identification model, using FC7 (4,096-dim) and Euclidean distance for testing, single query. This method is also used in [4,5] |
| | 73.90 | 87.68 | 91.54 | 94.80 | 96.02 | 97.21 | 47.78 | ResNet-50 identification model, using Pool5 (2,048-dim) and Euclidean distance for testing, single query |
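
Many rows distinguish "single query" from "multiple query" and rank the gallery by Euclidean distance. The following is a minimal sketch of that ranking step, assuming features are plain lists of floats and assuming the multiple-query setting pools the descriptors of several query images of the same person by averaging (max pooling is another common choice). The helper names `euclidean`, `pool_queries`, and `rank_gallery` are hypothetical, and individual papers may pool differently.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pool_queries(features):
    """Average-pool several query descriptors into one (multiple-query setting)."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def rank_gallery(query, gallery):
    """Return gallery indices sorted from the nearest to the farthest."""
    return sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))


if __name__ == "__main__":
    gallery = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]

    # Single query: one descriptor is compared against the gallery.
    print(rank_gallery([0.9, 0.9], gallery))

    # Multiple query: descriptors of the same person are pooled first.
    pooled = pool_queries([[4.0, 4.0], [6.0, 6.0]])
    print(rank_gallery(pooled, gallery))
```

Pooling several views of the same person tends to give a more robust descriptor, which is consistent with the multiple-query rows scoring higher than their single-query counterparts throughout the table.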
State of the art in Supervised Learning

| Reference | rank-1 | rank-5 | rank-10 | rank-20 | rank-30 | rank-50 | mAP | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| "Multiregion Bilinear Convolutional Neural Networks for Person Re-Identification", Evgeniya Ustinova, Yaroslav Ganin, Victor Lempitsky, AVSS 2017. | 66.36 | 85.01 | 90.17 | - | - | - | 41.17 | Multiregion Bilinear DML, single query. |
| "Scalable Metric Learning via Weighted Approximate Rank Component Analysis", Cijo Jose, François Fleuret, ECCV 2016 | 45.16 | 68.12 | 76 | 84 | 87 | - | - | Uses the baseline BoW descriptor and the proposed WARCA metric learning method. |
| "A Comprehensive Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets", Srikrishna Karanam, Mengran Gou, Ziyan Wu, Angels Rates-Borras, Octavia Camps, Richard J. Radke, arXiv 2016 | 46.5 | 71.1 | 79.9 | 86.9 | - | - | - | HistLBP+kLFDA. Single query. |
| "Temporal Model Adaptation for Person Re-Identification", Niki Martinel, Abir Das, Christian Micheloni, Amit K. Roy-Chowdhury, ECCV 2016 | 47.92 | - | - | - | - | - | 22.31 | Using 13.58% of the labeled data. Single query. |
| "Deep Linear Discriminant Analysis on Fisher Networks: A Hybrid Architecture for Person Re-identification", Lin Wu, Chunhua Shen, Anton van den Hengel, arXiv 2016 | 48.15 | - | - | - | - | - | 29.94 | Combines Fisher vector and deep neural network. Not sure whether multiple queries are used. |
| "Learning a Discriminative Null Space for Person Re-identification", Li Zhang, Tao Xiang, Shaogang Gong, CVPR 2016. | 55.43 | - | - | - | - | - | 29.87 | LOMO+Discriminative Null Space, single query. |
| | 71.56 | - | - | - | - | - | 46.03 | Both multiple query (MQ) and score-level feature fusion are used. |
| "Similarity Learning with Spatial Constraints for Person Re-identification", Dapeng Chen, Zejian Yuan, Badong Chen, Nanning Zheng, CVPR 2016 | 51.90 | - | - | - | - | - | 26.35 | Extracts HSV, LAB, HOG, and SILTP features from patches and uses the proposed SCSP method. Single query. |
| "PersonNet: Person Re-identification with Deep Convolutional Neural Networks", Lin Wu, Chunhua Shen, Anton van den Hengel, arXiv 2016. | 37.21 | - | - | - | - | - | 18.57 | Single query. Similarity between boxes is learnt end-to-end through a deep network. |
| "End-to-End Comparative Attention Networks for Person Re-identification", Hao Liu, Jiashi Feng, Meibin Qi, Jianguo Jiang, Shuicheng Yan, arXiv 2016. | 48.24 | - | - | - | - | - | 24.43 | Single query. Features are learned by the Comparative Attention Network. |
| "Deep Attributes Driven Multi-Camera Person Re-identification", Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian, ECCV 2016. | 39.4 | - | - | - | - | - | 19.6 | Single query. |
| | 49.0 | - | - | - | - | - | 25.8 | Multiple query. |
| "Multi-Scale Triplet CNN for Person Re-Identification", Jiawei Liu, Zheng-Jun Zha, Qi Tian, Dong Liu, Ting Yao, Qiang Ling, Tao Mei, A 2016. | 45.1 | 70.1 | 78.4 | - | 88.7 | - | - | Single query. Uses a triplet-loss CNN model with multi-scale improvement. |
| | 55.4 | 78.9 | 85.6 | - | 93.7 | - | - | Multiple query. |
| "Learning Deep Embeddings with Histogram Loss", Evgeniya Ustinova and Victor Lempitsky, NIPS 2016. | 59.47 | 80.73 | 86.94 | 91.09 | - | - | - | It seems the single query mode is chosen. A previously introduced deep metric learning framework is adopted, but with new loss functions. |
| "A Siamese Long Short-Term Memory Architecture for Human Re-Identification", Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang, ECCV 2016. | 61.6 | - | - | - | - | - | 35.3 | Multiple query. The LSTM model processes image regions sequentially. |
| "Gated Siamese Convolutional Neural Network Architecture for Human Re-Identification", Rahul Rama Varior, Mrinal Haloi, Gang Wang, ECCV 2016. | 65.88 | - | - | - | - | - | 39.55 | Single query. Features are learned by the Gated Siamese CNN. |
| | 76.04 | - | - | - | - | - | 48.45 | Multiple query. |
| "Point to Set Similarity Based Deep Feature Learning for Person Re-identification", Sanping Zhou, Jinjun Wang, Jiayun Wang, Yihong Gong, Nanning Zheng, CVPR 2017. | 70.72 | - | - | - | - | - | 44.27 | Single query. The pairwise loss, triplet loss, and a regularizer are jointly optimized in the loss function. |
| | 85.78 | - | - | - | - | - | 55.73 | Multiple query. |
| "Person Re-Identification by Camera Correlation Aware Feature Augmentation", Ying-Cong Chen, Xiatian Zhu, Wei-Shi Zheng, Jian-Huang Lai, TPAMI 2017. | 71.8 | - | - | - | - | - | 45.5 | Single query. Uses CRAFT-MFA+LOMO. |
| | 79.7 | - | - | - | - | - | 54.3 | Multiple query. |
| "Consistent-Aware Deep Learning for Person Re-identification in a Camera Network", Ji Lin, Liangliang Ren, Jiwen Lu, Jianjiang Feng, Jie Zhou, CVPR 2017. | 73.84 | - | - | - | - | - | 47.11 | Single query. Pairwise similarities are considered across multiple cameras for samples in a batch. |
| | 80.85 | - | - | - | - | - | 55.58 | Multiple query. |
| "Looking Beyond Appearances: Synthetic Training Data for Deep CNNs in Re-identification", Igor Barros Barbosa, Marco Cristani, Barbara Caputo, Aleksander Rognhaugen and Theoharis Theoharis, arXiv 2017. | 73.87 | 88.03 | 92.22 | 95.07 | 96.20 | 97.39 | 47.89 | Single query. Uses SOMAnet and Market1501 as training set. |
| | 81.29 | 92.61 | 95.31 | 97.12 | 97.68 | 98.43 | 56.98 | Multiple query. |
| "Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion", Haiyu Zhao, Maoqing Tian, Shuyang Sun, Jing Shao, Junjie Yan, Shuai Yi, Xiaogang Wang, Xiaoou Tang, CVPR 2017. | 76.9 | 91.5 | 94.6 | 96.7 | - | - | - | Single query. CPM is trained on MPII for pose estimation and part localization. |
| "Re-ranking Person Re-identification with k-reciprocal Encoding", Zhun Zhong, Liang Zheng, Donglin Cao and Shaozi Li, CVPR 2017. | 77.11 | - | - | - | - | - | 63.63 | Single query. Re-ranking is performed. |
| "Pose Invariant Embedding for Deep Person Re-identification", Liang Zheng, Yujia Huang, Huchuan Lu, and Yi Yang, arXiv 2017. | 79.33 | 90.76 | 94.41 | 96.52 | - | - | 55.95 | Single query. The PIE descriptor and KISSME are used. |
| "Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro", Zhedong Zheng, Liang Zheng, Yi Yang, ICCV 2017. | 78.06 | - | - | - | - | - | 56.23 | Single query. GAN images are used in the ResNet baseline. |
| | 85.12 | - | - | - | - | - | 68.52 | Multiple query. |
| "A Discriminatively Learned CNN Embedding for Person Re-identification", Zhedong Zheng, Liang Zheng, Yi Yang, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2017. | 79.51 | 90.91 | 94.09 | 96.23 | 97.33 | 98.25 | 59.87 | Single query. Identification and verification losses are used in a siamese network based on ResNet-50. |
| | 85.84 | 94.54 | 96.41 | 97.51 | 98.07 | 98.81 | 70.33 | Multiple query. |
| "Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification", Dangwei Li, Xiaotang Chen, Zhang Zhang, Kaiqi Huang, CVPR 2017 | 80.31 | - | - | - | - | - | 57.53 | Single query. Latent body parts are discovered by the spatial transformer network instead of rigid partitioning. |
| | 86.79 | - | - | - | - | - | 66.70 | Multiple query. |
| "Deeply-Learned Part-Aligned Representations for Person Re-Identification", Liming Zhao, Xi Li, Jingdong Wang, Yueting Zhuang, ICCV 2017. | 81.0 | - | - | - | - | - | 63.4 | Single query. Body parts are detected from feature maps and their respective features are concatenated later. |
| "Scalable Person Re-identification on Supervised Smoothed Manifold", Song Bai, Xiang Bai, Qi Tian, CVPR 2017. | 82.21 | - | - | - | - | - | 68.80 | Single query. IDE+re-ranking. |
| | 88.18 | - | - | - | - | - | 76.18 | Multiple query. |
| "Divide and Fuse: A Re-ranking Approach for Person Re-identification", Rui Yu, Zhichao Zhou, Song Bai, Xiang Bai, BMVC 2017. | 82.30 | - | - | - | - | - | 72.42 | Single query. Features are divided into sub-vectors before being re-encoded into new vectors, which are then fused into one vector for ranking. |
| "SVDNet for Pedestrian Retrieval", Yifan Sun, Liang Zheng, Weijian Deng, Shengjin Wang, ICCV 2017. | 82.3 | - | - | - | - | - | 62.1 | Single query. The 1,024-dim pool5 feature from SVDNet is used. |
| "Pose-driven Deep Convolutional Model for Person Re-identification", Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian, ICCV 2017. | 84.14 | 92.73 | 94.92 | 96.82 | - | - | 63.41 | Single query. Human parts are discovered with pose models. Local and global images are used for feature learning. |
| "Deep Transfer Learning for Person Re-identification", Mengyue Geng, Yaowei Wang, Tao Xiang, Yonghong Tian, arXiv 2016. | 83.7 | - | - | - | - | - | 65.5 | Single query. Identification and verification losses are used in a siamese network based on GoogLeNet. |
| | 89.6 | - | - | - | - | - | 73.8 | Multiple query. |
| "Improving Person Re-identification by Attribute and Identity Learning", Yutian Lin, Liang Zheng, Zhedong Zheng, Yu Wu and Yi Yang, arXiv 2017. | 84.29 | 93.20 | 95.19 | 97.00 | - | - | 64.67 | Single query. Attributes and ID classification are jointly learned. |
| "Pedestrian Alignment Network for Person Re-identification", Liang Zheng, Zhedong Zheng, Yi Yang, arXiv 2017. | 82.81 | - | - | - | - | - | 63.35 | Single query. Pedestrians are aligned by the Spatial Transformer Network. Results could be higher when fine-tuning on the GAN model [7]. |
| | 85.78 | 93.38 | - | - | - | - | 76.56 | Single query + re-ranking [6] |
| | 88.18 | - | - | - | - | - | 71.72 | Multiple query. |
| | 89.79 | - | - | - | - | - | 83.79 | Multiple query + re-ranking [6] |
| "Deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-free Approach", Lingxiao He, Jian Liang, Haiqing Li, and Zhenan Sun, CVPR 2018. | 83.58 | - | - | - | - | - | 64.25 | Single query. Deep Spatial feature Reconstruction (DSR) is further developed to avoid explicit alignment. |
| "Person re-identification by deep joint learning of multi-loss classification", Wei Li, Xiatian Zhu, and Shaogang Gong, IJCAI 2017. | 83.9 | - | - | - | - | - | 64.4 | Single query. Stripes and global images are jointly considered in a multi-stream classification CNN. |
| | 88.8 | - | - | - | - | - | 72.9 | Single query + re-ranking [6] |
| | 85.1 | - | - | - | - | - | 65.5 | Single query, 4 body parts. |
| | 89.7 | - | - | - | - | - | 74.5 | Multiple query, 4 body parts. |
| "In Defense of the Triplet Loss for Person Re-Identification", Alexander Hermans, Lucas Beyer and Bastian Leibe, arXiv 2017. | 84.92 | 94.21 | - | - | - | - | 69.14 | Single query. The triplet-loss based network is fine-tuned. Image size: 256x128. The last layer in ResNet is replaced with one 1,024-dim layer and one 128-dim layer. Batch normalization is used as well. |
| | 86.67 | 93.38 | - | - | - | - | 81.07 | Single query + re-ranking [6] |
| | 90.53 | 96.29 | - | - | - | - | 76.42 | Multiple query. |
| | 91.75 | 95.78 | - | - | - | - | 87.18 | Multiple query + re-ranking [6] |
| "CamStyle Augmentation", Zhun Zhong, Liang Zheng, Zhedong Zheng, Shaozi Li, Yi Yang, CVPR 2018. | 88.12 | - | - | - | - | - | 68.72 | Single query. A new data augmentation approach that transfers images from one camera to the style of another camera. |
| | 89.49 | - | - | - | - | - | 71.55 | Re-ranking. |
| "Deep Mutual Learning", Ying Zhang, Tao Xiang, Timothy Hospedales, Huchuan Lu, CVPR 2018. | 87.73 | - | - | - | - | - | 68.83 | Single query. Two MobileNets learn from each other, and the average re-ID result of the two individual networks is reported. |
| | 91.66 | - | - | - | - | - | 77.14 | Multiple query. |
| "A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking", M. Saquib Sarfraz, Arne Schumann, Andreas Eberle, Rainer Stiefelhagen, CVPR 2018. | 87.7 | - | - | - | - | - | 69.0 | Single query. Camera view and body joints are integrated in the network. |
| | 90.3 | - | - | - | - | - | 84.0 | Single query + ECN (Expanded Cross Neighborhood) re-ranking |
| "Random Erasing Data Augmentation", Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, Yi Yang, arXiv 2017. | 87.08 | - | - | - | - | - | 71.31 | Single query. SVDNet + random erasing data augmentation. |
| | 89.13 | - | - | - | - | - | 83.93 | Re-ranking is applied to the rank list obtained by single query. |
| "Features for Multi-Target Multi-Camera Tracking and Re-Identification", Ergys Ristani, Carlo Tomasi, CVPR 2018. | 89.46 | - | - | - | - | - | 75.67 | Single query. Based on DPFL, called AWTL (2-stream). |
| "Harmonious Attention Network for Person Re-Identification", Wei Li, Xiatian Zhu, Shaogang Gong, CVPR 2018. | 91.2 | - | - | - | - | - | 75.7 | Single query. Pixel-level attention, regional-level attention, and feature learning are jointly optimized. |
| | 93.8 | - | - | - | - | - | 82.8 | Multiple query. |
State of the art in Unsupervised Learning / Domain Adaptation

| Reference | rank-1 | rank-5 | rank-10 | rank-20 | rank-30 | rank-50 | mAP | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| "Efficient Online Local Metric Adaptation via Negative Samples for Person Re-Identification", Jiahuan Zhou, Pei Yu, Wei Tang and Ying Wu, ICCV 2017. | 40.93 | - | - | 74.06 | - | - | - | Single query. LOMO is used for initialization. This method does not need any positive pairs. |
| | 51.45 | - | - | 80.98 | - | - | - | Multiple query. |
| "Unsupervised Person Re-identification: Clustering and Fine-tuning", Hehe Fan, Liang Zheng and Yi Yang, arXiv 2017. | 44.7 | 59.1 | 65.6 | 71.7 | - | - | 20.1 | Single query. An IDE model trained on DukeMTMC-reID [7] is used for initialization. K-means is used for label estimation. |
| | 41.9 | 57.3 | 64.3 | 70.5 | - | - | 18.0 | Single query. An IDE model trained on CUHK03 is used for initialization. |
| "Cross-view Asymmetric Metric Learning for Unsupervised Person Re-identification", Hong-Xing Yu, Ancong Wu, and Wei-Shi Zheng, ICCV 2017. | 54.5 | - | - | - | - | - | 26.3 | Multiple query. JSTL is used for initialization. A clustering method is used for label estimation. |
| "Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification", Weijian Deng, Liang Zheng, Guoliang Kang, Yi Yang, Qixiang Ye, Jianbin Jiao, CVPR 2018. | 51.5 | 70.1 | 76.8 | - | - | - | 22.8 | Single query. DukeMTMC [7] labels are used for domain adaptation. SPGAN is an improved version of CycleGAN. |
| | 57.7 | 75.8 | 82.4 | - | - | - | 26.7 | Single query. Local max pooling is used in addition to SPGAN. |
| | 57.0 | 73.9 | 80.3 | - | - | - | 27.1 | Multiple query. SPGAN is used without local max pooling. |
| "Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification", Jingya Wang, Xiatian Zhu, Shaogang Gong, Wei Li, CVPR 2018. | 58.2 | 74.8 | 81.1 | 86.5 | - | - | 26.5 | Single query. DukeMTMC [7] labels are used as the source for unsupervised domain adaptation. Source attributes are used in addition to the ID labels. |
| "Unsupervised Cross-dataset Person Re-identification by Transfer Learning of Spatial-Temporal Patterns", Jianming Lv, Weihang Chen, Qing Li, and Can Yang, CVPR 2018. | 60.75 | 74.44 | 79.25 | - | - | - | - | Single query. Pedestrians’ spatio-temporal patterns in the target domain are learned, in addition to model fusion and learning-to-rank methods. |
| "Unsupervised Person Re-identification by Deep Learning Tracklet Association", Minxian Li, Xiatian Zhu, and Shaogang Gong, ECCV 2018. | 63.7 | - | - | - | - | - | - | Single query. Tracklets are associated across cameras to provide labels for subsequent learning. |
Papers that use the dataset but do not report results or use different evaluation protocols

| Reference | rank-1 | rank-5 | rank-10 | rank-20 | rank-30 | rank-50 | mAP | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| "Constrained Deep Metric Learning for Person Re-identification", Hailin Shi, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Yang Yang, Stan Z. Li, arXiv 2015. | - | - | - | - | - | - | - | Used together with CUHK03 as training data for the proposed Constrained Deep Metric Learning. Tested on CUHK01 and VIPeR. |
| "An Enhanced Deep Feature Representation for Person Re-identification", Shangxuan Wu, Ying-Cong Chen, Xiang Li, An-Cong Wu, Jin-Jie You, Wei-Shi Zheng, WACV 2016. | - | - | - | - | - | - | - | Used as training data for the proposed Feature Fusion Net. Testing is performed on other benchmarks. |
| "Semantics-Aware Deep Correspondence Structure Learning for Robust Person Re-identification", Yaqing Zhang, Xi Li, Liming Zhao, Zhongfei Zhang, IJCAI 2016. | - | - | - | - | - | - | - | Used as training data for the proposed DCSL model. |
| "Human-In-The-Loop Person Re-Identification", Hanxiao Wang, Shaogang Gong, Xiatian Zhu, Tao Xiang, ECCV 2016. | 78.0 | - | - | - | - | 86.0 | - | 1000 identities, 300 queries are used. Single shot. 6 random splits. |
| | 33.8 | 61.0 | 73.6 | 85.3 | - | - | - | 501 identities, single shot, 6 random splits. We assume 501 queries are used. |

References

[1] B. Ma, Y. Su, and F. Jurie. Covariance descriptor based on bioinspired features for person re-identification and face verification. Image and Vision Computing, 2014.
[2] F. Xiong, M. Gou, O. Camps, and M. Sznaier. Person reidentification using kernel-based metric learning methods. In ECCV, 2014.
[3] S. Liao, Y. Hu, X. Zhu, and S. Z. Li. Person re-identification by local maximal occurrence representation and metric learning. In CVPR, 2015.
[4] L. Zheng, Z. Bie, Y. Sun, J. Wang, C. Su, S. Wang, and Q. Tian. MARS: A Video Benchmark for Large-Scale Person Re-identification. In ECCV, 2016.
[5] L. Zheng, H. Zhang, S. Sun, M. Chandraker, Y. Yang, and Q. Tian. Person re-identification in the Wild. In CVPR, 2017.
[6] Z. Zhong, L. Zheng, D. Cao, and S. Li. Re-ranking Person Re-identification with k-reciprocal Encoding. In CVPR, 2017.
[7] Z. Zheng, L. Zheng, and Y. Yang. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro. In ICCV, 2017.