A semi-supervised generative framework with deep learning features for high-resolution remote sensing image scene classification
Introduction
Presently available earth-observation technologies (e.g., multi/hyperspectral sensors, synthetic aperture radar) generate many types of airborne and satellite images with high spatial, spectral, and temporal resolutions (Cheng et al., 2017, Plaza et al., 2011, Gamba, 2013, Cantalloube and Nahum, 2013, Lu et al., 2017, Li et al., 2016, Yuan et al., 2017, Liu et al., 2017). The main task has shifted to intelligent earth observation over massive high-resolution remote sensing (HRRS) images, i.e., automatically classifying land use and land cover (LULC) scenes acquired from airborne or space platforms (Gómez-Chova et al., 2015). Remote sensing image scene classification, which plays an important role in earth observation and is receiving significant attention, categorizes scene images into an independent set of semantic-level LULC class labels according to image content. Over the past few decades, remarkable efforts have been made to develop methods for HRRS image scene classification in a wide range of applications (Wang et al., 2016, Li and Wang, 2015, Dou et al., 2014, Cheng and Han, 2016, Ma et al., 2016, Yu et al., 2016, Dópido et al., 2013, Li et al., 2014), such as LULC determination, urban planning, environmental protection, and crop monitoring.
Deep-learning-based methods, which have improved on state-of-the-art records in many research fields, have been widely applied to natural image classification, object recognition, and natural language and text processing (Chatfield et al., 2014, Simonyan and Zisserman, 2015, He et al., 2016, Krizhevsky et al., 2012, Szegedy et al., 2015). Owing to their remarkable performance, these methods have also been used to analyze HRRS images and have achieved more impressive scene classification results than traditional shallow methods (Castelluccio et al., xxxx, Hu et al., 2015, Zhang et al., 2016, Zhao and Du, 2016, Luo et al., 2017, Wang et al., 2017, Cheng et al., 2016). Although satellite and aerial images are increasing dramatically in both quality and quantity, deep-learning-based methods in a fully-supervised fashion (Zhang et al., 2015) require a large-scale, manually-annotated dataset to obtain ideal classifiers. However, no HRRS dataset of a scale comparable to ImageNet (Deng et al., 2009) exists to meet the requirements of deep-learning-based methods in remote sensing. Additionally, in contrast to natural images, HRRS images need to be labeled by experts and engineers, which greatly increases the difficulty of acquiring a large-scale annotated HRRS dataset.
Acquiring unlabeled images is much easier than building a manually-annotated dataset. Hence, using the original, unlabeled data to generate labeled data could solve the problem of lacking labeled samples. The self-label technique (Triguero et al., 2015) is one available solution, aiming to obtain an enlarged annotated dataset from unlabeled samples via semi-supervised learning. However, existing self-label methods have a significant weakness: they annotate samples using handcrafted features. Handcrafted features, designed by experts and engineers for specific classification tasks, suffer from severe limitations. On the one hand, extensive domain expertise and engineering skill are needed to design them. On the other hand, their representational capability is strongly limited by human ingenuity in feature design. Ideal features should be generated automatically and have powerful representation ability. Fortunately, deep learning features, which are learned automatically from data by a deep neural network architecture with remarkable performance, can address the limitations of handcrafted features. Therefore, we replace the handcrafted features of the self-label techniques with deep learning features in this work.
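To make the self-label idea concrete, the toy sketch below (our own illustration, not the paper's implementation) runs a confidence-thresholded pseudo-labeling loop over feature vectors. A nearest-centroid scorer stands in for a real classifier; in practice the feature vectors would come from a CNN, and all names here are hypothetical:

```python
import numpy as np

def nearest_centroid_proba(X, centroids):
    """Soft class scores from a softmax over negative centroid distances."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def self_label(X_lab, y_lab, X_unlab, n_classes, threshold=0.9, max_rounds=5):
    """Iteratively move high-confidence unlabeled samples into the labeled set."""
    X_lab, y_lab, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        # re-fit the stand-in classifier on the current labeled set
        centroids = np.stack([X_lab[y_lab == c].mean(axis=0)
                              for c in range(n_classes)])
        proba = nearest_centroid_proba(pool, centroids)
        keep = proba.max(axis=1) >= threshold
        if not keep.any():
            break  # nothing confident enough; stop enlarging
        X_lab = np.vstack([X_lab, pool[keep]])
        y_lab = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
        pool = pool[~keep]
    return X_lab, y_lab, pool
```

With two well-separated classes, the loop absorbs the whole unlabeled pool in one round; with a higher threshold or overlapping classes, ambiguous samples simply remain in the pool rather than being force-labeled.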
We propose a semi-supervised generative framework with deep learning features (SSGF) for HRRS image scene classification to address the lack of sufficiently annotated HRRS datasets. The details of this framework are summarized below:
- 1.
Deep convolutional neural network (CNN) features are transferred to replace traditional handcrafted features owing to their powerful representation ability. This enables the discovery of the rich diversities and variations hidden in HRRS images and provides a better understanding of scene classes.
- 2.
A co-training (Blum and Mitchell, 1998) self-label method is used to learn valuable information from unlabeled samples and to obtain an annotated dataset. It not only makes use of low-confidence samples but also suppresses misclassification.
- 3.
A discriminative evaluation method enhances the separation of confusing classes with similar texture structures and visual appearance, which further improves the reliability of the generated samples.
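The co-training idea in the list above can be sketched minimally (our own simplification, not the authors' exact scheme): two feature views of the same images, e.g. features from two different CNNs, pseudo-label the pool jointly, with each view's confidence contributing. The max-confidence acceptance rule below is a simple stand-in, not the paper's discriminative evaluation:

```python
import numpy as np

def centroid_scores(X, Xl, yl, n_classes):
    """Softmax over negative distances to per-class centroids of (Xl, yl)."""
    c = np.stack([Xl[yl == k].mean(axis=0) for k in range(n_classes)])
    d = np.linalg.norm(X[:, None] - c[None], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def co_training(Xa, Xb, y, Ua, Ub, n_classes, thr=0.8, rounds=3):
    """Xa/Xb: labeled features in view A/B of the same images; Ua/Ub: the
    unlabeled pool in both views. Each round, whichever view is more
    confident supplies the pseudo-label, and accepted samples augment
    BOTH views' labeled sets."""
    Xa, Xb, y = Xa.copy(), Xb.copy(), y.copy()
    for _ in range(rounds):
        if len(Ua) == 0:
            break
        pa = centroid_scores(Ua, Xa, y, n_classes)
        pb = centroid_scores(Ub, Xb, y, n_classes)
        conf = np.maximum(pa.max(1), pb.max(1))
        lab = np.where(pa.max(1) >= pb.max(1), pa.argmax(1), pb.argmax(1))
        keep = conf >= thr
        if not keep.any():
            break
        Xa = np.vstack([Xa, Ua[keep]])
        Xb = np.vstack([Xb, Ub[keep]])
        y = np.concatenate([y, lab[keep]])
        Ua, Ub = Ua[~keep], Ub[~keep]
    return Xa, Xb, y, Ua
```

The point of using two views is that a sample ambiguous in one feature space may be confidently (and correctly) labeled in the other, which is how co-training recovers low-confidence samples that plain self-training would discard.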
By combining these three techniques, the proposed SSGF is able to learn effective information from unlabeled data to improve classification ability. Therefore, an ideal model can be obtained from a limited number of annotated samples and a large number of unlabeled samples, and the enlarged set generated by the model becomes available for supervised learning. To evaluate the performance of SSGF, we further develop an extended algorithm (SSGA-E). The major contributions of this work are summarized as follows:
- 1.
Focusing on the problem of insufficient annotated datasets in remote sensing, we propose a semi-supervised generative framework. It can continually improve scene classification capability by learning from unlabeled instances and generate a reliable annotated dataset for supervised learning.
- 2.
On this basis, we further develop an extended algorithm. We perform extensive experiments to evaluate the proposed method on four public HRRS datasets. The results show that the proposed method outperforms most fully-supervised methods, achieving the third best accuracy on the UCM dataset and the second best accuracy on the WHU-RS, NWPU-RESISC45, and AID datasets. These results demonstrate that the proposed SSGA-E is effective in addressing the problem of insufficient annotated datasets for HRRS image scene classification.
The remainder of this paper is organized as follows: Section 2 briefly reviews related work on deep learning methods and scene classification. Section 3 briefly introduces the deep neural networks used in this work. Section 4 proposes and explains the semi-supervised generative framework and an extended algorithm in detail. Section 5 presents and discusses the experimental results. Finally, Section 6 draws conclusions.
Related work and background
In the early 1970s, the spatial resolution of satellite images was extremely coarse, and pixel sizes were similar to or larger than the objects of interest (Janssen and Middelkoop, 1992). Therefore, available methods for analyzing remote sensing images were based on the pixel level (Blaschke et al., 2008, Blaschke, 2010). With the advance of remote sensing technology, a growing number of HRRS images have become obtainable, such as the UCMerced Land Use dataset (Yang and Newsam,
Deep Convolutional Neural Networks (CNNs)
In this section, we first discuss the typical structure of a CNN and the backpropagation algorithm used to compute gradients with respect to the network's weight parameters. Next, we briefly introduce all CNNs used in this paper.
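The two core operations mentioned above, a convolutional forward pass and the backpropagated gradient of the loss with respect to the kernel weights, can be sketched in NumPy. This is an illustrative toy for a single channel and kernel, not the networks used in the paper:

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D cross-correlation: the core operation of a conv layer."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def conv2d_weight_grad(x, upstream):
    """Backprop step: dL/dw is the cross-correlation of the input with the
    upstream gradient, and has the same shape as the kernel."""
    return conv2d(x, upstream)
```

A quick way to trust the gradient is a finite-difference check: perturb one kernel weight by a small epsilon, recompute the loss, and compare the numerical slope with the analytic gradient entry.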
A semi-supervised generative framework for HRRS scene classification
In this section, we first mathematically define the problem. Next, we propose the semi-supervised generative framework and an extended algorithm in detail.
Experimental results
In this section, we detail the series of experiments conducted to evaluate the performance of the proposed SSGA-E for annotation of remote sensing images over four HRRS image datasets. The detailed experimental setup and experimental results with reasonable analysis are presented below.
Conclusion and future work
In this paper, we develop a semi-supervised generative framework that addresses the problem of insufficient manually-labeled samples in remote sensing, using a limited number of labeled samples together with extensive unlabeled samples to build reliable annotated datasets for HRRS scene classification. The proposed framework combines deep-learning-based features, the co-training-based self-label method, and the discriminative evaluation method to complete the annotation task. The
Acknowledgments
We gratefully acknowledge the editor, associate editor, and reviewers for their comments in helping us to improve this work. We also acknowledge the support of the National Natural Science Foundation of China (No. 41571413 and No. 41701429); the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUG170625); and NVIDIA Corporation for the donation of the Titan X GPU used in this research.
References (60)
- Object based image analysis for remote sensing. ISPRS J. Photogram. Rem. Sens. (2010).
- A survey on object detection in optical remote sensing images. ISPRS J. Photogram. Rem. Sens. (2016).
- Modeling and simulation for natural disaster contingency planning driven by high-resolution remote sensing images. Fut. Gener. Comp. Syst. (2014).
- Semisupervised classification for hyperspectral image based on multi-decision labeling and deep feature learning. ISPRS J. Photogram. Rem. Sens. (2016).
- Sparse coding with an overcomplete basis set: a strategy employed by V1? Vis. Res. (1997).
- Semi-supervised learning based on nearest neighbor rule and cut edges. Knowl.-Based Syst. (2010).
- Rotation-and-scale-invariant airplane detection in high-resolution satellite images based on deep-Hough-forests. ISPRS J. Photogram. Rem. Sens. (2016).
- Change detection based on deep feature representation and mapping transformation for multi-spatial-resolution remote sensing images. ISPRS J. Photogram. Rem. Sens. (2016).
- Learning multiscale and deep representations for classifying remotely sensed imagery. ISPRS J. Photogram. Rem. Sens. (2016).
- Object-Based Image Analysis: Spatial Concepts for Knowledge-Driven Remote Sensing Applications (2008).
- Combining labeled and unlabeled data with co-training.
- Airborne SAR-efficient signal processing for very high resolution. Proc. IEEE.
- Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Trans. Geosci. Rem. Sens.
- Remote sensing image scene classification: benchmark and state of the art. Proc. IEEE.
- Histograms of oriented gradients for human detection.
- Knowledge management: semantic drift or conceptual shift? J. Educ. Lib. Inf. Sci.
- ImageNet: a large-scale hierarchical image database.
- Semisupervised self-learning for hyperspectral image classification. IEEE Trans. Geosci. Rem. Sens.
- Unsupervised feature learning for scene classification of high resolution remote sensing image.
- Human settlements: a global challenge for EO data processing and interpretation. Proc. IEEE.
- Multimodal classification of remote sensing images: a review and future directions. Proc. IEEE.
- Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning. IEEE Trans. Geosci. Rem. Sens.
- A fast learning algorithm for deep belief nets. Neural Comput.
- Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. Rem. Sens.
- Knowledge-based crop classification of a Landsat Thematic Mapper image. Int. J. Rem. Sens.
- Caffe: convolutional architecture for fast feature embedding.
- Principal Component Analysis.