Statistical Analysis of Radar and Hyperspectral Remote Sensing Data
Younan, Nicolas H.
In this dissertation, three studies applied statistical techniques to radar and hyperspectral remote sensing applications. The first study investigated the relationship between synthetic aperture radar backscatter and in situ soil properties for levee monitoring. A series of statistical analyses was performed to investigate potential correlations between three independent polarization channels of radar backscatter and various soil properties. The results showed a weak but noteworthy correlation between the cross-polarized (HV) radar backscatter coefficients and several soil properties. The second study performed effective statistical feature extraction for levee slide classification. Images of a levee are often very large, so monitoring levee conditions quickly is difficult because of the high computational cost and large memory requirements. Therefore, a time-efficient method to monitor levee conditions is necessary. A traditional support vector machine (SVM) did not perform well on the original three-band radar images, so discriminative features had to be extracted. The gray level co-occurrence matrix is a powerful method for extracting textural information from gray-scale images, but its computation time may make it impractical for large datasets. In this study, efficient feature extraction methods based on spatial filtering were used, including a weighted average filter and a majority filter in conjunction with a nonlinear band normalization process. Feature extraction with these filters, along with normalized bands, yielded results comparable to those of the gray level co-occurrence matrix at a much lower computational cost. The third study focused on hyperspectral image classification when only a small number of ground truth labels were available. To overcome the lack of training samples, a semi-supervised method was proposed. The main idea was to expand the ground truth using a relationship between labeled and unlabeled data.
A fast self-training algorithm was developed in this study. Reliable unlabeled samples were chosen based on SVM output with majority voting or weighted majority voting, and were then added to the labeled data to build a better SVM classifier. The results showed that both majority voting and weighted majority voting could effectively select reliable unlabeled data, and that weighted majority voting yielded better performance than majority voting.
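The selection step of such a self-training scheme can be illustrated with a minimal sketch. The abstract does not specify who the voters are, how the weights are computed, or what acceptance threshold is used, so everything below is an assumption made for illustration: each unlabeled sample is given a list of predicted labels (for example, SVM predictions from several hypothetical voters), optional per-voter weights, and a hypothetical `min_share` threshold. The function names `weighted_vote` and `select_reliable` are likewise invented here and are not taken from the dissertation.

```python
from collections import defaultdict


def weighted_vote(labels, weights=None):
    """Return (winning_label, share_of_total_weight) for one sample.

    With weights=None every voter counts equally (plain majority voting).
    """
    if weights is None:
        weights = [1.0] * len(labels)
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(weights)


def select_reliable(votes, weights=None, min_share=0.75):
    """Pick unlabeled samples whose winning vote share reaches min_share.

    votes:   dict mapping sample id -> list of predicted labels, one per voter
    weights: optional dict mapping sample id -> list of voter weights
    Returns a dict mapping sample id -> pseudo-label.
    min_share is an assumed threshold, not a value from the source.
    """
    selected = {}
    for sample_id, labels in votes.items():
        sample_weights = weights.get(sample_id) if weights else None
        label, share = weighted_vote(labels, sample_weights)
        if share >= min_share:
            selected[sample_id] = label
    return selected


# Made-up predictions for three unlabeled pixels, four voters each.
votes = {"p1": [1, 1, 1, 2], "p2": [1, 2, 2, 3], "p3": [3, 3, 3, 3]}
# Made-up voter weights (e.g. a confidence score per voter).
weights = {
    "p1": [0.9, 0.8, 0.7, 0.2],
    "p2": [0.9, 0.05, 0.05, 0.0],
    "p3": [0.5, 0.5, 0.5, 0.5],
}

pseudo_majority = select_reliable(votes)           # equal-weight voting
pseudo_weighted = select_reliable(votes, weights)  # weighted voting
```

In a full self-training loop, the selected samples and their pseudo-labels would be appended to the labeled set and the SVM retrained, repeating until no new samples pass the threshold. In this toy input, the weighted variant accepts `p2` because one highly weighted voter dominates, whereas equal-weight voting rejects it; this only shows how the two rules can differ and says nothing about which is more accurate.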