Object detection is one of the most important computer vision tasks, and researchers have proposed numerous object detection methods based on Convolutional Neural Networks (CNNs). Still, the performance of these object detectors is hindered by the diversity of object sizes and categories.
To obtain better feature representations, the use of multiscale features has been proposed. The way multiscale features are extracted and used, as in the Feature Pyramid Network (FPN), has a great influence on the performance of the final detector. However, because FPN fuses features in only one direction, its feature fusion is insufficient to express objects of similar size but different appearance.
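The one-directional fusion described above can be illustrated with a minimal, dependency-free sketch. It is a toy under stated assumptions, not FPN itself: each pyramid level is reduced to a single 1-D row of numbers, fusion is plain addition, and the lateral and smoothing convolutions of a real FPN are omitted. The names `upsample2x` and `top_down_fusion` are hypothetical and chosen only for this illustration.

```python
def upsample2x(row):
    """Nearest-neighbour upsampling of a 1-D feature row by a factor of 2."""
    return [v for v in row for _ in range(2)]


def top_down_fusion(levels):
    """Fuse a toy pyramid from the deepest level towards the shallowest.

    `levels` is ordered shallow -> deep, and each level is half as long
    as the previous one. Information flows in one direction only: deep
    features are added into shallower ones, never the reverse.
    (Assumption: plain addition stands in for FPN's learned layers.)
    """
    fused = [None] * len(levels)
    fused[-1] = list(levels[-1])  # deepest level passes through unchanged
    for i in range(len(levels) - 2, -1, -1):
        upsampled = upsample2x(fused[i + 1])
        fused[i] = [a + b for a, b in zip(levels[i], upsampled)]
    return fused


pyramid = [
    [1.0, 1.0, 1.0, 1.0],  # shallow level, high resolution
    [2.0, 2.0],            # middle level
    [4.0],                 # deep level, low resolution
]
fused = top_down_fusion(pyramid)
```

In this toy, the deepest level comes out unchanged while every shallower level receives deep information, which is the asymmetry that a bidirectional scheme is meant to remove.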
A research team led by Prof. Dr. LU Xiaoqiang from the Xi'an Institute of Optics and Precision Mechanics (XIOPM) of the Chinese Academy of Sciences (CAS) proposed a new multiscale feature fusion method, called Adaptive Multiscale Feature (AMF), which uses bidirectional feature fusion to overcome the one-directional fusion of FPN. The results were published in Neurocomputing.
The main problem of the backbone network is how to integrate deep and shallow features reasonably, because using only the last layer of features makes it difficult to handle objects of different sizes. To avoid the unidirectional feature fusion of FPN, AMF is employed in the detector instead.
According to the researchers, the AMF module consists of two parts, Feature Scattering (FS) and Feature Redistribution (FR), which handle feature fusion and feature redistribution. Based on Convolutional Long Short-Term Memory (ConvLSTM) networks, the fusion is carried out in two directions: the shallow features are enhanced by the deep features, and the deep features are also enhanced by the shallow features. The two sets of features are then further fused, and for each level, channel-wise attention is used to assign features to the corresponding layer.
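The two-direction idea and the channel-wise weighting can be sketched in a few lines of dependency-free Python. This is a loose illustration under several assumptions, not the published AMF module: plain addition replaces the ConvLSTM-based fusion, each level is a single 1-D row, and the attention weights come from a fixed sigmoid of the channel mean rather than learned layers. All function names here are hypothetical.

```python
import math


def upsample2x(row):
    """Nearest-neighbour upsampling of a 1-D feature row by a factor of 2."""
    return [v for v in row for _ in range(2)]


def downsample2x(row):
    """Average each pair of neighbours, halving the resolution."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]


def bidirectional_fusion(levels):
    """Return (top_down, bottom_up) pyramids for levels ordered shallow -> deep.

    Assumption: addition stands in for the ConvLSTM fusion in the article.
    """
    n = len(levels)
    top_down = [None] * n
    top_down[-1] = list(levels[-1])
    for i in range(n - 2, -1, -1):  # deep features enhance shallow ones
        up = upsample2x(top_down[i + 1])
        top_down[i] = [a + b for a, b in zip(levels[i], up)]
    bottom_up = [None] * n
    bottom_up[0] = list(levels[0])
    for i in range(1, n):  # shallow features enhance deep ones
        down = downsample2x(bottom_up[i - 1])
        bottom_up[i] = [a + b for a, b in zip(levels[i], down)]
    return top_down, bottom_up


def channel_attention(channels):
    """Scale each channel by sigmoid(mean of that channel).

    Assumption: a parameter-free stand-in for learned channel attention.
    """
    weights = [1.0 / (1.0 + math.exp(-sum(c) / len(c))) for c in channels]
    return [[w * v for v in c] for w, c in zip(weights, channels)]


def merge_level(td_row, bu_row):
    """Treat the two directions as two channels, reweight them, then sum."""
    weighted = channel_attention([td_row, bu_row])
    return [a + b for a, b in zip(*weighted)]


pyramid = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0], [4.0]]
top_down, bottom_up = bidirectional_fusion(pyramid)
merged = [merge_level(td, bu) for td, bu in zip(top_down, bottom_up)]
```

Unlike a purely top-down pass, the bottom-up pyramid here changes the deepest level as well, so every level ends up carrying information from both shallower and deeper levels before the attention step reweights the two directions.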
To demonstrate the effectiveness of the proposed AMF for both anchor-free and anchor-based detectors, they used Fully Convolutional One-Stage Object Detection (FCOS) and RetinaNet as baselines, representing anchor-free and anchor-based detectors, respectively.
Experimental results on the COCO 2014 dataset show that the proposed AMF module outperforms the popular FPN-based detector. For both anchor-free and anchor-based detectors, performance can be improved through AMF.
The proposed AMF method also exceeds the most advanced current object detector in accuracy.