{"id":841,"date":"2020-08-28T13:06:00","date_gmt":"2020-08-28T04:06:00","guid":{"rendered":"https:\/\/arithmer.blog\/?p=841"},"modified":"2022-03-08T15:45:19","modified_gmt":"2022-03-08T06:45:19","slug":"centernet","status":"publish","type":"post","link":"https:\/\/arithmer.blog\/blog\/centernet","title":{"rendered":"CenterNet"},"content":{"rendered":"\n<p class=\"has-small-font-size\">This material was originally distributed as an internal document on August 28, 2020, and has been revised for publication as a web page.<\/p>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"purpose\"><strong>\u25a0Purpose<\/strong><\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>Purpose of this material<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Understand an anchor-free object detection algorithm<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"agenda\"><strong>\u25a0Agenda<\/strong><\/h3>\n\n\n\n<ul style=\"font-size:16px\"><li>Current object detection approaches<\/li><li>CenterNet approach<\/li><li>Objects as Points<\/li><li>Training<ul><li>Keypoint heatmap<\/li><li>Local offset<\/li><li>Size prediction<\/li><li>Loss function<\/li><\/ul><\/li><li>Network Architecture<ul><li>DLA<\/li><li>Modified DLA<\/li><\/ul><\/li><li>Inference<\/li><li>Results<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"background\">\u25a0Background<\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>Current approaches<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>Object detection models (such as YOLO, SSD, etc.) 
rely on anchor boxes<\/li><li>Anchor boxes have several drawbacks:<ul><li>Wasteful: SSD300 makes 8732 detections per class and YOLO448 makes 98 detections per class, so most of the boxes are discarded<\/li><li>Inefficient: every box has to be processed, even those that are discarded later, which increases processing time<\/li><li>Require post-processing, such as non-maximum suppression<\/li><li>Fixed: SSD requires fixed box scales and steps, while YOLOv3 fixes the anchor sizes per detection level<\/li><\/ul><\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"centernet\"><strong>\u25a0CenterNet<\/strong><\/h3>\n\n\n\n<p style=\"font-size:18px\"><strong>CenterNet approach<\/strong><\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>End-to-end differentiable solution<\/li><li>Relies on keypoint estimation to find the center points and regresses all other object properties (such as size)<\/li><li>According to the authors, the model is simpler, faster, and more accurate than corresponding bounding-box-based detectors<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"249\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_01.jpg\" alt=\"\" class=\"wp-image-848\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_01.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_01-300x73.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_01-768x187.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_01-304x74.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"object-as-points\"><strong>\u25a0Objects as Points<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"468\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_02.jpg\" 
alt=\"\" class=\"wp-image-849\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_02.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_02-300x137.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_02-768x351.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_02-304x139.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"training\"><strong>\u25a0Training<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"164\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_03.jpg\" alt=\"\" class=\"wp-image-850\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_03.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_03-300x48.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_03-768x123.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_03-304x49.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"keypoint-heatmap\"><strong>\u25a0Keypoint Heatmap<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"262\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_04.jpg\" alt=\"\" class=\"wp-image-851\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_04.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_04-300x77.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_04-768x197.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_04-304x78.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img 
decoding=\"async\" width=\"1024\" height=\"403\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_05.jpg\" alt=\"\" class=\"wp-image-852\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_05.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_05-300x118.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_05-768x302.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_05-304x120.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"330\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_06.jpg\" alt=\"\" class=\"wp-image-853\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_06.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_06-300x97.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_06-768x248.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_06-304x98.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"local-offset\"><strong>\u25a0Local Offset<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"155\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_07.jpg\" alt=\"\" class=\"wp-image-854\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_07.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_07-300x45.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_07-768x116.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_07-304x46.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image 
size-full\"><img decoding=\"async\" width=\"1024\" height=\"406\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_08.jpg\" alt=\"\" class=\"wp-image-855\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_08.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_08-300x119.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_08-768x305.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_08-304x121.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"size-prediction\"><strong>\u25a0Size Prediction<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"162\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_09.jpg\" alt=\"\" class=\"wp-image-856\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_09.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_09-300x47.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_09-768x122.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_09-304x48.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"269\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_10.jpg\" alt=\"\" class=\"wp-image-857\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_10.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_10-300x79.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_10-768x202.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_10-304x80.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 
class=\"has-medium-font-size wp-block-heading\" id=\"loss-function\"><strong>\u25a0Loss Function<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"208\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_11.jpg\" alt=\"\" class=\"wp-image-858\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_11.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_11-300x61.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_11-768x156.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_11-304x62.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"network-architecture\"><strong>\u25a0Network Architecture<\/strong><\/h3>\n\n\n\n<ul style=\"font-size:16px\"><li>The authors experiment with different backbone architectures, which yield different results:<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"145\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_12.jpg\" alt=\"\" class=\"wp-image-859\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_12.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_12-300x42.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_12-768x109.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_12-304x43.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<p class=\"has-small-font-size\">Results with no test augmentation (N.A.), flip testing (F), and multi-scale augmentation (MS). 
Hardware: Intel Core i7-8086K CPU, Titan Xp GPU<\/p>\n\n\n\n<ul style=\"font-size:16px\"><li>The backbone with the best speed\/accuracy trade-off is DLA-34 (as modified by the authors)<\/li><\/ul>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"network-architecture-1\"><strong>\u25a0Network Architecture<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"423\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_13.jpg\" alt=\"\" class=\"wp-image-860\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_13.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_13-300x124.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_13-768x317.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_13-304x126.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"465\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_14.jpg\" alt=\"\" class=\"wp-image-861\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_14.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_14-300x136.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_14-768x349.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_14-304x138.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"inference\"><strong>\u25a0Inference<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"317\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_15.jpg\" alt=\"\" class=\"wp-image-862\" 
srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_15.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_15-300x93.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_15-768x238.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_15-304x94.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"results\"><strong>\u25a0Results<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"438\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_16.jpg\" alt=\"\" class=\"wp-image-863\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_16.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_16-300x128.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_16-768x329.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_16-304x130.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"1024\" height=\"425\" src=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_17.jpg\" alt=\"\" class=\"wp-image-846\" srcset=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_17.jpg 1024w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_17-300x125.jpg 300w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_17-768x319.jpg 768w, https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/NS20200828_17-304x126.jpg 304w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n\n\n<h3 class=\"has-medium-font-size wp-block-heading\" id=\"\u30c0\u30a6\u30f3\u30ed\u30fc\u30c9\"><strong>\u25a0Download<\/strong><\/h3>\n\n\n\n<p style=\"font-size:16px\"><a 
rel=\"noreferrer noopener\" href=\"https:\/\/arithmer.blog\/wp-content\/uploads\/2022\/02\/10_CenterNet_Centernet2020_0828.pdf\" target=\"_blank\">Centernet.pdf<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This material was originally distributed as an internal document on August 28, 2020, and has been revised for publication as a web page. \u25a0Purpose Purpose of this material Understand an anc &#8230; <\/p>\n","protected":false},"author":3,"featured_media":847,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[20,85,35,86,45],"_links":{"self":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/841"}],"collection":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=841"}],"version-history":[{"count":5,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/841\/revisions"}],"predecessor-version":[{"id":871,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/posts\/841\/revisions\/871"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=\/wp\/v2\/media\/847"}],"wp:attachment":[{"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=841"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=841"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/arithmer.blog\/index.php?rest_
route=%2Fwp%2Fv2%2Ftags&post=841"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}