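School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan, Shanxi 030006, China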
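GUO Husheng (郭虎升), Ph.D., is a professor and doctoral supervisor, and was selected for the "Sanjin Talents" (三晋英才) program. He is a senior member of the China Computer Federation (CCF), an executive committee member of the CCF Technical Committee on Artificial Intelligence and Pattern Recognition, and a member of both the Machine Learning and the Knowledge Engineering technical committees of the Chinese Association for Artificial Intelligence (CAAI). He has served as publication chair, forum chair, and program committee member for international and national conferences including AAAI, CCFAI, CCDM, CCML, and NCIIP. In recent years he has led 2 National Natural Science Foundation of China projects and more than 10 provincial and ministerial research, teaching-reform, and industry-commissioned projects. As first or corresponding author he has published more than 40 papers in Neural Networks, Data Mining & Knowledge Discovery, Pattern Recognition, Knowledge & Information Systems, Information Sciences, Journal of Software (软件学报), Journal of Computer Research and Development (计算机研究与发展), and other venues, and has published 1 national-level textbook. His honors include the Second Prize of the Shanxi Province Science and Technology Progress Award, the Baosteel Education Award of the Ministry of Education, Shanxi University Top Ten Young Teachers, and the Outstanding Doctoral Dissertation award of the ACM Council Taiyuan Chapter. His main research interests are data mining, machine learning, and computer vision.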
Print publication date: 2024-06-15
Received: 2024-05-02
Revised: 2024-05-30
GUO Husheng. Object detection: From traditional methods to deep learning[J]. Emerging Science and Technology (新兴科学和技术趋势), 2024, 3(2): 128-145. DOI: 10.12405/j.issn.2097-1486.2024.02.002.
Object detection is a fundamental yet challenging research area in computer vision, and its broad application prospects have drawn considerable attention from academia and industry in recent years. This paper traces the history and latest developments of object detection, focusing on the evolution from traditional image processing techniques to deep learning-based models. It examines several landmark algorithms of the deep learning era and evaluates their performance and advantages in real-world scenarios. The survey also analyzes the challenges that object detection currently faces, including multi-scale object detection, occlusion handling, and real-time processing requirements. In response to these challenges, it discusses current solutions and future research directions. Finally, it looks ahead to future trends in object detection, with particular attention to the potential impact of frontier techniques such as self-supervised learning and algorithmic optimization.
Keywords: computer vision; deep learning; object detection; technical evolution