Dilation and Deconvolution Single-Shot Detector for Small Objects

pipeline

Authors

Abstract

SSD (Single Shot Multibox Detector) is one of the most successful object detectors thanks to its high accuracy and fast speed. However, the features from the shallow layers of SSD (mainly Conv4_3) lack semantic information, which results in poor performance on small objects. In this paper, we propose DDSSD (Dilation and Deconvolution Single Shot Multibox Detector), an enhanced SSD with a novel feature fusion module that improves on SSD for small object detection. In the feature fusion module, a dilated convolution module enlarges the receptive field of the features from the shallow layer, and a deconvolution module increases the size of the feature maps from the deeper layer. On the Pascal VOC 2007 test set, our network achieves 79.7% mAP at 41 FPS with a 300x300 input on a single Nvidia 1080 GPU. DDSSD outperforms many state-of-the-art object detection algorithms in both accuracy and speed.

Highlights

Receptive field

feature maps 

This figure visualizes the feature maps from the L2 Norm layer (an operation applied after Conv4_3) of the conventional SSD and from the Dilation Module of our DDSSD. At the same resolution (38x38), only a few pixels are activated in the original SSD, which makes detection difficult. By contrast, DDSSD produces a wider range of activation, which helps detection, especially for small objects. The drawback is that dilated convolution also introduces additional noise, as the feature maps show.

Dilation and Deconvolution Module

Dilation and Deconvolution Module 

Illustration of the Dilation and Deconvolution Module. The dilation branch, consisting of a 3x3 Conv followed by a 3x3 Conv with dilation 2, enlarges the receptive field of the Conv4_3 features. The deconvolution branch uses a Deconv with a 2x2 kernel followed by a 3x3 Conv to increase the resolution of the feature maps from fc_7.
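The arithmetic behind the two branches can be checked with a small sketch. It is not the authors' implementation. It only applies the standard formulas for the effective size of a dilated kernel and for the output size of a transposed convolution. The helper names are hypothetical. The deconvolution stride of 2, the zero padding, and the 19x19 size of the fc_7 map (the usual value for SSD with a 300x300 input) are assumptions, since the description above gives only the kernel sizes and the dilation.

```python
def effective_kernel(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convs, given (kernel, dilation) pairs.

    The result is measured in pixels of the feature map that enters the stack.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf


def deconv_output_size(size: int, kernel: int, stride: int,
                       padding: int = 0, output_padding: int = 0) -> int:
    """Output size of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Dilation branch as described: a 3x3 conv, then a 3x3 conv with dilation 2.
dilation_branch_rf = stacked_receptive_field([(3, 1), (3, 2)])

# Baseline for comparison: two plain 3x3 convs.
plain_rf = stacked_receptive_field([(3, 1), (3, 1)])

# Deconvolution branch. ASSUMED: stride 2, no padding, 19x19 fc_7 input.
upsampled = deconv_output_size(19, kernel=2, stride=2)

print(dilation_branch_rf, plain_rf, upsampled)
```

Under these assumptions the dilation branch covers a 7x7 window of the Conv4_3 features where two plain 3x3 convolutions would cover 5x5, and the 2x2 deconvolution doubles the fc_7 map to 38x38, matching the resolution of the Conv4_3 branch so the two can be fused.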

Comparisons

t1

t2

t3

t4

Visualization on the PASCAL VOC 2007 test set

vis

Related Publications

Hao Zhang, Xianggong Hong, Li Zhu, and Shifen Zhou. Dilation and Deconvolution Single-Shot Detector for Small Objects, 2019 (Under Review)