Optics Express

  • Editor: C. Martijn de Sterke
  • Vol. 19, Iss. 27 — Dec. 19, 2011
  • pp: 26816–26826

Use of weighting algorithms to improve traditional support vector machine based classifications of reflectance data

Bin Qi, Chunhui Zhao, Eunseog Youn, and Christian Nansen  »View Author Affiliations


Optics Express, Vol. 19, Issue 27, pp. 26816-26826 (2011)
http://dx.doi.org/10.1364/OE.19.026816



Abstract

Support vector machine (SVM) is widely used in classification of hyperspectral reflectance data. In traditional SVM, features are generated from all or subsets of spectral bands, with each feature contributing equally to the classification. In classification of small hyperspectral reflectance data sets, a common challenge is the Hughes phenomenon, in which many redundant features lead to poor classification accuracy. In this study, we examined two approaches to assigning weights to SVM features to increase classification accuracy and reduce adverse effects of the Hughes phenomenon: 1) “RSVM”, support vector machine with a relief feature weighting algorithm, and 2) “FRSVM”, support vector machine with a fuzzy relief feature weighting algorithm. We used standardized weights to extract a subset of features with high classification contribution. Analyses were conducted on a reflectance data set of individual corn kernels from three inbred lines and on a public data set with three selected land-cover classes. Both weighting methods and reduction of features increased the classification accuracy of traditional SVM and thereby reduced adverse effects of the Hughes phenomenon.

© 2011 OSA

1. Introduction

With increasing use of hyperspectral reflectance data in research, commercial and military applications, there is a continuous demand for improving the accuracy of classification algorithms. Classification accuracy may be defined as the ability to correctly classify a given object or pixel, and the Cohen Kappa coefficient [1] is often used as a measure of classification performance. The support vector machine (SVM) was proposed by Vapnik and colleagues as a classification approach in pattern recognition and machine learning based on the structural risk minimization principle [2–4]. That is, SVM searches for a decision boundary that provides a tradeoff between hypothesis space complexity and quality of fit to the training data [5,6]. Different SVMs have been applied successfully to analyses of hyperspectral reflectance data in pattern recognition (e.g., endmember extraction [7], geometric camera calibration [8], text categorization [9], handwritten character recognition [10], and face recognition [11]) and in classification of objects into discrete classes [12]. Traditional SVM treats each feature (spectral band or variable) with equal weight [13], even though features are unlikely to contribute equally to the classification. Thus, it may be advantageous to assign the highest weights to the features with the largest contribution to the classification and to down-weight or simply omit those associated with noise or stochasticity. Weighting of features is widely used in statistically based classifications (e.g., forward stepwise band/feature selection), such as stepwise discriminant analysis [14] and other regression-based classifications [15,16]. The basic relief algorithm was originally proposed as a statistically based feature selection method that assigns different weights to features according to their statistical contribution [17]. However, one potential challenge with the basic relief algorithm is that the adjustment of weights is sensitive to outliers or noise in the training data set. Consequently, this approach may reduce classification robustness and increase the risk of over-fitting [18]. To reduce the risk of over-fitting, we incorporated fuzzy theory into the basic relief algorithm and adjusted the contribution of features based on each pixel's distance to the centroid of its class. The design presented in this study is based on the assumption that the membership degree in fuzzy theory can capture the distribution of the training samples. Moreover, to exploit the statistical information in the training data set, we used the relief algorithm as a feature weighting method within the SVMs. A new weighting formula was used to increase differences among classes and reduce differences within each class. Thus, the relief weighting algorithm may increase the class separation of SVM [13,19,20] by giving comparatively higher weights to features with high classification contribution. For convenience, we use the following abbreviations: “SVM” refers to the original support vector machine, “RSVM” to SVM with the relief feature weighting algorithm, and “FRSVM” to SVM with the fuzzy relief feature weighting algorithm.

A problem often noted in the classification of reflectance data is the Hughes phenomenon, which tends to occur when the number of classification features exceeds the number of training samples [21]. As a consequence of the Hughes phenomenon, classification accuracy progressively increases with the addition of features but reaches a maximum and subsequently declines [5]. An important aspect of classification accuracy is therefore to select the most appropriate number of classification features to avoid these adverse effects [22].

The objective of this study was to compare traditional SVM with RSVM and FRSVM regarding: 1) classification accuracy, and 2) effect of feature reduction. We conducted this evaluation on the basis of two reflectance data sets: 1) individual corn kernels of three inbred lines and 2) a public data set with three selected land-cover classes. With this study, we intended to demonstrate that weighting and feature reduction methods can increase the accuracy of SVM based classifications.

2. Methods and concepts

The basic classification method used in this study is SVM, which has shown high performance in machine learning applications, especially with high-dimensional features [19,23]. For additional theory about SVM, we refer to [2–4].

2.1 Relief feature selection algorithm

The relief feature selection algorithm was proposed by Kira and Rendell [17] and is briefly presented here to support the discussion of our proposed feature weighting methods. Consider a reflectance data set X = {x_1, x_2, …, x_n}, x_i ∈ ℝ^d: a training set of p classes and n pixels, where each pixel has d features. Let λ be a d×1 vector representing the weight of each feature. For an arbitrary pixel x_i, the L pixels of the same class as x_i closest to x_i are selected, denoted h_j, j = 1, 2, …, L, h_j ∈ ℝ^d. Then the L closest pixels from each class l different from class(x_i) are selected, denoted m_lj, j = 1, 2, …, L, l ∈ {1, 2, …, p} \ class(x_i), m_lj ∈ ℝ^d. diff_hit is a d×1 vector representing the difference between the h_j and x_i:

$$\mathrm{diff\_hit} = \sum_{j=1}^{L} \frac{|x_i - h_j|}{\max(X) - \min(X)} \tag{1}$$

where max(X) (min(X)) is the maximum (minimum) element in X. diff_miss is a d×1 vector representing the difference between the m_lj and x_i:

$$\mathrm{diff\_miss} = \sum_{l \neq \mathrm{class}(x_i)} \frac{P(l)}{1 - P(\mathrm{class}(x_i))} \sum_{j=1}^{L} \frac{|x_i - m_{lj}|}{\max(X) - \min(X)} \tag{2}$$

where P(l) is the probability of class l and |·| denotes the element-wise absolute value. The update formula for λ is

$$\lambda_{\mathrm{new}} = \lambda_{\mathrm{old}} - \mathrm{diff\_hit}/L + \mathrm{diff\_miss}/L \tag{3}$$

λ is updated for each x_i, i = 1, 2, …, n, with its initial vector set to zero. Features with weights above a given threshold τ are retained; features with smaller weights are discarded.
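The update in Eqs. (1)–(3) can be sketched in Python/NumPy as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `relief_update` and the use of the L1 distance to find nearest neighbors are assumptions (the text does not specify the neighbor metric).

```python
import numpy as np

def relief_update(X, y, L=1):
    """Relief weight update (Eqs. 1-3): for each pixel, subtract the scaled
    distance to its L nearest same-class neighbors (diff_hit) and add the
    prior-weighted distance to its L nearest other-class neighbors (diff_miss)."""
    n, d = X.shape
    rng_all = X.max() - X.min()                    # max(X) - min(X), over all elements
    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / n))        # P(l)
    lam = np.zeros(d)                              # weight vector, initialized to zero
    for i in range(n):
        diff_hit = np.zeros(d)
        diff_miss = np.zeros(d)
        for l in classes:
            # candidate neighbors in class l, excluding x_i itself
            idx = np.where((y == l) & (np.arange(n) != i))[0]
            dists = np.abs(X[idx] - X[i]).sum(axis=1)      # L1 distance (assumed)
            nearest = X[idx[np.argsort(dists)[:L]]]
            contrib = np.abs(X[i] - nearest).sum(axis=0) / rng_all
            if l == y[i]:
                diff_hit = contrib                          # Eq. (1)
            else:                                           # Eq. (2)
                diff_miss += priors[l] / (1 - priors[y[i]]) * contrib
        lam += -diff_hit / L + diff_miss / L                # Eq. (3)
    return lam
```

On toy data where one feature separates the classes and another is constant, the discriminative feature accumulates a large positive weight while the constant feature stays at zero.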

2.2 Relief feature weighting algorithm

The original relief algorithm was proposed for feature selection: diff_hit captures the difference within each class, and diff_miss the difference among classes. In this study, we used the relief algorithm as a feature weighting method, so that only features with high diff_miss/diff_hit ratios were selected, according to:

$$w_q = \frac{\sum_{i=1}^{n} \sum_{l \neq \mathrm{class}(x_i)} \frac{P(l)}{1 - P(\mathrm{class}(x_i))} \sum_{j=1}^{L} \frac{|x_{i,q} - m_{lj,q}|}{\max(X) - \min(X)}}{\sum_{i=1}^{n} \sum_{j=1}^{L} \frac{|x_{i,q} - h_{j,q}|}{\max(X) - \min(X)}}, \quad q = 1, 2, \ldots, d \tag{4}$$

where w_q is the weight of the q-th feature, and x_{i,q}, m_{lj,q}, and h_{j,q} are the q-th elements of the vectors x_i, m_lj, and h_j, respectively.

2.3 Fuzzy relief feature weighting algorithm

Assume that all pixels are divided into p classes (α_1, α_2, …, α_p) with class centroids r_i, i = 1, 2, …, p. For x_i ∈ α_k, the distance between x_i and r_k is

$$D(x_i, r_k) = \|x_i - r_k\| \tag{5}$$

where ‖·‖ denotes the Euclidean distance. The membership degree of x_i to class α_k is defined as:

$$u_{ik} = \frac{\sum_{j \neq k} D^2(x_i, r_j)}{D^2(x_i, r_k)} \tag{6}$$

The corresponding diff_hit and diff_miss are given by

$$\mathrm{diff\_hit} = \sum_{j=1}^{L} \frac{|x_i - h_j|\, u_{ik}}{\max(X) - \min(X)}, \quad x_i \in \alpha_k \tag{7}$$

$$\mathrm{diff\_miss} = \sum_{l \neq k} \frac{P(l)}{1 - P(k)} \sum_{j=1}^{L} \frac{|x_i - m_{lj}|\, u_{ik}}{\max(X) - \min(X)}, \quad x_i \in \alpha_k \tag{8}$$

w = (w_1, w_2, …, w_d) is the vector of feature weights, given by

$$w_q = \frac{\sum_{i=1}^{n} \sum_{l \neq k} \frac{P(l)}{1 - P(k)} \sum_{j=1}^{L} \frac{|x_{i,q} - m_{lj,q}|\, u_{ik}}{\max(X) - \min(X)}}{\sum_{i=1}^{n} \sum_{j=1}^{L} \frac{|x_{i,q} - h_{j,q}|\, u_{ik}}{\max(X) - \min(X)}}, \quad q = 1, 2, \ldots, d \tag{9}$$

where w_q is the weight of the q-th feature, and x_{i,q}, m_{lj,q}, and h_{j,q} are the q-th elements of the vectors x_i, m_lj, and h_j, respectively. Both the training and the test data sets were weighted by the feature weighting vector w prior to analysis.
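The ratio weighting of Eq. (9) can be sketched as follows; setting u_ik ≡ 1 recovers the non-fuzzy weighting of Eq. (4). This is an illustrative sketch under assumptions (function name, L1 neighbor metric), not the authors' code.

```python
import numpy as np

def fuzzy_relief_weights(X, y, L=1, fuzzy=True):
    """Feature weights from Eq. (9) (or Eq. (4) when fuzzy=False, i.e. u_ik = 1):
    per-feature ratio of accumulated among-class differences (diff_miss terms)
    to within-class differences (diff_hit terms)."""
    n, d = X.shape
    rng_all = X.max() - X.min()
    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / n))
    centroids = {l: X[y == l].mean(axis=0) for l in classes}
    num = np.zeros(d)   # numerator: among-class (miss) accumulation
    den = np.zeros(d)   # denominator: within-class (hit) accumulation
    for i in range(n):
        k = y[i]
        if fuzzy:
            # membership degree, Eq. (6): large when x_i is near its own centroid
            d2 = {l: float(np.sum((X[i] - centroids[l]) ** 2)) for l in classes}
            u = sum(d2[j] for j in classes if j != k) / d2[k]
        else:
            u = 1.0
        for l in classes:
            idx = np.where((y == l) & (np.arange(n) != i))[0]
            dists = np.abs(X[idx] - X[i]).sum(axis=1)    # L1 distance (assumed)
            nearest = X[idx[np.argsort(dists)[:L]]]
            contrib = np.abs(X[i] - nearest).sum(axis=0) * u / rng_all
            if l == k:
                den += contrib
            else:
                num += priors[l] / (1 - priors[k]) * contrib
    return num / den
```

Features that separate the classes receive a high miss/hit ratio, while uninformative features receive weights near or below one.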

3. Materials and experimental design

3.1 Experimental data samples

The corn kernel samples used in this study were provided by Dr. Kolomiets at Texas A&M University. In brief, they represent three proprietary inbred lines: a wild type without genetic modification, and two mutants, each with suppression of one of two genes in the lipoxygenase pathway. Genetically, the homozygous corn mutants are near-isogenic to the recurrent wild type parent and share about 97.5% of the parent genome; one mutant (mutant 1) shows negligible visual/phenotypic difference from the wild type, and the other (mutant 2) is slightly darker in color than the wild type (Fig. 1(a) and 1(b)).

Fig. 1. Digital image of the corn genotypes: Wild type (left column), Mutant 1 (center column), and Mutant 2 (right column) (a), and corresponding average reflectance profiles (b).

Consequently, kernels from these inbred corn lines were considered an ideal, challenging model data set for evaluating classification accuracy. Reflectance data from 15 individual corn kernels (five from each of the three genotypes) were used. The kernels were positioned on white Teflon, and hyperspectral images were acquired with a spatial resolution of 169 pixels per cm². A subsample of 100 pixels was selected from each kernel, for a total of 500 pixels per class.

3.2 Hyperspectral imaging system

Hyperspectral imaging data of corn kernels were acquired with a line-scanning push-broom hyperspectral camera (PIKA II, www.resonon.com), which has 640 sensors producing hyperspectral images with 160 wavelength channels in the range from 405 to 907 nm (wavelength resolution of 3.1 nm). The objective lens has a 35 mm focal length optimized for the visible and near-infrared (NIR) spectra, and the angular field of view is 7° [24]. The hyperspectral camera was mounted on an aluminum tower structure 60 cm above the target object platform. Hyperspectral image acquisition was conducted inside a darkroom with four halogen lamps (http://www.resonon.com/scanning-systems-and-accesories.html) as the only light source, to keep unwanted light from contaminating the signal. To ensure consistent acquisition conditions, the hyperspectral camera and lighting system were turned on at least 30 min prior to image acquisition. Dark calibration was conducted at the beginning of data acquisition by covering the lens with its cap, and white Teflon was used for white calibration immediately before image acquisition. Based on the dark and white calibrations, reflectance values from hyperspectral image cubes were converted into proportions (denoted relative reflectance) ranging from 0 to 1.
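The conversion to relative reflectance can be sketched with the standard dark/white correction, R = (raw − dark) / (white − dark). The paper does not print the formula, so this is the conventional formulation, not a quote of the authors' processing chain; the function name and array layout are assumptions.

```python
import numpy as np

def to_relative_reflectance(raw, dark, white):
    """Convert raw hyperspectral counts to relative reflectance in [0, 1]
    using the standard dark/white correction:
        R = (raw - dark) / (white - dark)
    raw:   (rows, cols, bands) image cube
    dark:  (bands,) dark-current reference (lens capped)
    white: (bands,) white reference (Teflon panel)"""
    refl = (raw - dark) / (white - dark)
    return np.clip(refl, 0.0, 1.0)   # guard against sensor noise outside [0, 1]
```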

3.3 AVIRIS data set

Public vegetation reflectance data from northwest Indiana's Indian Pines (AVIRIS sensor, June 12, 1992: ftp://ftp.rcn.purdue.edu/biehl/MultiSpec/92AV3C) were also included in this study (Fig. 2); this data set has been used in multiple published studies [7,12]. The hyperspectral image consists of a scene of 145 by 145 pixels, with a spatial resolution of 20 m/pixel and 200 spectral bands. From the 16 land-cover classes available in the original ground truth data, three classes (Corn-min till (834 pixels), Grass/Pasture (497 pixels), and Soybean-clean till (614 pixels)) were selected to test the effectiveness of the different classifiers.

Fig. 2. Pseudo-color image of the AVIRIS data set (composed of bands 17, 27 and 57) (a), and corresponding average reflectance profiles (b).

3.4 Training and test data sets

The experimental analysis was organized into two main parts. The first compared average classification accuracies, based on 10-fold cross-validations, of the proposed classifiers (RSVM and FRSVM) with that of traditional SVM. In the second part, we examined effects of feature reduction on RSVM and FRSVM to evaluate the Hughes phenomenon on two training set sizes, small and large. For the small training sets, the input data were randomly partitioned into 10 subsamples; one subsample was retained as the training data set, and the remaining nine were used as the test data set. This process was repeated 10 times, with each subsample used once for training. For the large training sets, the input data were randomly partitioned into five subsamples; a single subsample was retained as the training data set, and the remaining four were used as the test data set. The process was repeated five times, with each subsample used once for training.
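The partitioning described above is an inverted k-fold scheme (train on one fold, test on the remaining k−1). A sketch, with the helper name `inverted_kfold` as an assumption:

```python
import numpy as np

def inverted_kfold(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for the scheme described above:
    the data are split into k random subsamples, ONE subsample is used
    for training and the remaining k-1 for testing, repeated k times."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    folds = np.array_split(perm, k)
    for i in range(k):
        train = folds[i]
        test = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

With k = 10 this yields the small-training-set design (10% train / 90% test); with k = 5, the large one (20% / 80%).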

3.5 SVM and parameter settings

Similar to [12], we used the “one-against-one” SVM classification strategy without feature weighting as the baseline SVM method. The kernel function used here is the Gaussian RBF:

$$K(x, y) = \exp(-\gamma \|x - y\|^2) \tag{10}$$

where γ determines the width and tunes the smoothing of the discriminant function. The penalty C is another important factor in the SVM classifier; it controls the trade-off between the margin and the size of the slack variables [25]. Consequently, to reliably optimize γ and C, a cross-validation framework was applied with both γ and C ranging from 2^−2 to 2^5 (Fig. 3). Based on this initial analysis, γ = 2^−1 and C = 2^1.5 were selected as suitable parameter values for the corn kernel data set (Fig. 3(a)), while γ = 2^2 and C = 2^2.5 were selected for the AVIRIS data set (Fig. 3(b)).

Fig. 3. Cross validation of the corn kernel data set (a), and cross validation of the AVIRIS data set (b).
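A grid search of this kind can be sketched with scikit-learn, which is used here purely for illustration (the paper does not state which SVM implementation was used); the toy data stand in for the reflectance pixels, and the grid mirrors the 2^−2 to 2^5 range above.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Toy two-class data standing in for the reflectance pixels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(2, 1, (40, 5))])
y = np.array([0] * 40 + [1] * 40)

# Grid over gamma and C on powers of two from 2^-2 to 2^5.
grid = {"gamma": 2.0 ** np.arange(-2, 6), "C": 2.0 ** np.arange(-2, 6)}
# "one-against-one" decision strategy, as in the text (SVC's default
# internal multiclass scheme is also one-vs-one).
search = GridSearchCV(SVC(kernel="rbf", decision_function_shape="ovo"),
                      grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```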

Parameter L is an integer representing the number of closest pixels used to calculate the difference within each class (diff_hit) and the difference among classes (diff_miss). As part of testing the sensitivity of RSVM and FRSVM to parameter settings, we compared classification accuracies for L values ranging from 1 to 4. We also tested L > 4, but those results are not presented, as classification accuracy decreased markedly with increasing L. The Cohen Kappa coefficient [7] was used to measure the classification accuracy of each classifier.
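The Kappa coefficient used throughout is agreement corrected for chance, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed accuracy and p_e the chance agreement from the marginal label frequencies. A small self-contained sketch (the function name is an assumption; scikit-learn's `cohen_kappa_score` computes the same quantity):

```python
import numpy as np

def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: kappa = (p_o - p_e) / (1 - p_e), with p_o the observed
    accuracy and p_e the chance agreement from marginal label frequencies."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels = np.unique(np.concatenate([y_true, y_pred]))
    p_o = np.mean(y_true == y_pred)
    p_e = sum(np.mean(y_true == l) * np.mean(y_pred == l) for l in labels)
    return (p_o - p_e) / (1 - p_e)
```

Perfect agreement gives κ = 1; agreement at chance level gives κ = 0.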

4. Results and discussion

4.1. Reflectance data and weights assigned to spectral bands

Figure 1(b) shows the average reflectance profiles from kernels of the three inbred corn lines; reflectance values acquired from mutant 2 were consistently lower than those from the wild type and mutant 1, especially in spectral bands from 600 to 907 nm. Relative reflectance values were consistently higher for wild type kernels than for mutant kernels, and a difference of about 6% in the average reflectance curves was observed at 885 nm between the wild type and mutant 1. For comparison, the largest difference in the average reflectance curves between the wild type and mutant 2 was 19%, at 724 nm. With as little as 6% difference in average reflectance between the wild type and mutant 1, and only about 19% between the wild type and mutant 2, this challenging data set was considered highly suitable for testing novel SVM approaches to reflectance data classification.

Figure 2(b) shows the average reflectance profiles from three land cover classes with reflectance values acquired from Grass/Pasture showing visual difference from the other two classes. It is evident that the average reflectance profiles of Corn-min till and Soybean-clean till are very similar across the examined spectrum. Careful evaluation reveals that the average reflectance curve from Corn-min till was slightly above Soybean-clean till in several regions.

Figures 4(a) and 5(a) show the standardized weights assigned by RSVM and FRSVM to the corn kernel data and the AVIRIS data, respectively. In both data sets, the two classification methods assigned similar weights to spectral bands, and spectral bands clearly did not contribute equally to the classifications. For the corn kernel data, both RSVM and FRSVM assigned the highest standardized weights to spectral bands between 550 and 700 nm, and careful evaluation revealed that the weights assigned by FRSVM between 550 and 700 nm were slightly higher than those assigned by RSVM (Fig. 4(b)). RSVM also assigned higher standardized weights than FRSVM to spectral bands at both ends of the examined spectrum. Regarding the AVIRIS data, both RSVM and FRSVM assigned the highest standardized weights to spectral bands between 500 and 750 nm, 780–1200 nm, 1550–1850 nm, and 1950–2400 nm. The weights assigned by FRSVM between 570 and 770 nm, 820–880 nm, and 900–1200 nm were slightly higher than those assigned by RSVM (Fig. 5(b)). We attribute the slight difference in the weights assigned by RSVM and FRSVM to the way the two methods operate. In RSVM, the weighting scores assigned to each spectral band are based on all pixels contributing equally. In FRSVM, by contrast, a class centroid (a vector representing the spectral mean of each class) is identified; FRSVM then assigns high weighting contributions to pixels near this class centroid and lower contributions to pixels far from it. As a consequence, the standardized weights assigned by RSVM are almost exclusively determined by spectral information, while those assigned by FRSVM are determined by a combination of spectral and spatial (distance from class centroid) information within the hyperspectral image cube.

Fig. 4. Comparison of standardized weights obtained with two weighting methods, RSVM and FRSVM, on the corn kernel data set (a), and the ratio of standardized weights obtained with FRSVM and RSVM (b). Parameter L was equal to 1.

Fig. 5. Comparison of standardized weights obtained with two weighting methods, RSVM and FRSVM, on the AVIRIS data set (a), and the ratio of standardized weights obtained with FRSVM and RSVM (b). Parameter L was equal to 1.

4.2. Classification accuracy

Classification accuracies based on 10-fold cross-validations showed that both weighting methods outperformed the traditional SVM, and FRSVM exhibited the highest overall accuracy (i.e., the percentage of correctly classified pixels among all test pixels) (Tables 1 and 2).

Table 1. Comparison of classification accuracies (%), overall accuracies (%) and Cohen Kappa coefficients obtained with the SVM, RSVM and FRSVM algorithms on the corn kernel data set.

Table 2. Comparison of classification accuracies (%), overall accuracies (%) and Cohen Kappa coefficients obtained with the SVM, RSVM and FRSVM algorithms on the AVIRIS data set.

In the analysis of the corn kernel data, RSVM and FRSVM increased average overall classification accuracy by 0.86% and 1.07%, respectively. As expected, the highest classification accuracy was obtained when differentiating mutant 2 from the other two inbred lines. In the analysis of the AVIRIS data, RSVM and FRSVM showed average increases in overall accuracy of 1.67% and 1.82% compared to SVM, respectively. The slightly better classification accuracy of FRSVM is likely explained by the fact that FRSVM is less influenced by outliers.

4.3. Feature reduction and classification

As part of assessing the effect of feature reduction on classification accuracy, features were selected based on Eqs. (4) and (9).
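Selection by standardized weight can be sketched as follows. The helper `select_features` and the choice of keeping the top-m features are hypothetical (the text does not specify the exact threshold scheme); the weights are standardized by their maximum and also applied multiplicatively, as described for RSVM/FRSVM.

```python
import numpy as np

def select_features(X_train, X_test, w, m):
    """Keep the m features with the highest standardized weights and scale
    both data sets by those weights prior to SVM training (hypothetical
    helper; the paper's exact threshold scheme is not specified)."""
    w_std = w / w.max()                     # standardize weights to (0, 1]
    keep = np.argsort(w_std)[::-1][:m]      # indices of the top-m features
    return X_train[:, keep] * w_std[keep], X_test[:, keep] * w_std[keep], keep
```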

For the corn kernel data set, the largest difference between the peak accuracy and that obtained using all 160 features was 1.44% (RSVM) and 1.13% (FRSVM) (Table 3), when 1/10 of the original data was selected as training data.

Table 3. Difference between peak accuracy and that derived from the use of all 160 features acquired from corn kernel data.

A similar general trend was observed with the AVIRIS data set: the largest difference between the peak accuracy and that obtained using all 200 features was 0.19% (RSVM) and 0.40% (FRSVM) (Table 4), when 1/10 of the original data was selected as training data.

Table 4. Difference between peak accuracy and that derived from the use of all 200 features acquired from AVIRIS data.

These results highlight the adverse effects of the Hughes phenomenon when a small training data set is used, but they also show that RSVM and FRSVM reduced its negative effects.

Conclusion

Comparing the two weighting methods with the traditional SVM, weighting of features was shown to increase the classification accuracy of reflectance data sets. The accuracy of classification was influenced by the number of features used and was therefore affected by the Hughes phenomenon. Compared with RSVM, FRSVM had slightly higher overall classification accuracy, which is likely explained by the fact that FRSVM uses the spatial distribution of pixels within each class and thereby reduces the effect of noisy pixels.

Acknowledgments

This study was partially supported by the National Natural Science Foundation of China (Grant No. 61077079), by the Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20102304110013) and by the Academic Leader Foundation of Harbin City in China (Grant No. 2009RFXXG034). The authors would like to thank the support from the China Scholarship Council. Dr. Kolomiets at Texas A&M University is thanked for providing the corn kernels used in this study.

References and links

1. J. Cohen, “A coefficient of agreement for nominal scales,” Educ. Psychol. Meas. 20(1), 37–46 (1960). [CrossRef]

2. V. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 2000), Chap. 1.

3. C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn. 20(3), 273–297 (1995). [CrossRef]

4. B. E. Boser, I. M. Guyon, and V. Vapnik, “A training algorithm for optimal margin classifiers,” in COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, D. Haussler, ed. (ACM, New York, NY, 1992), pp. 144–152.

5. M. Pal and G. M. Foody, “Feature selection for classification of hyperspectral data by SVM,” IEEE Trans. Geosci. Remote Sens. 48(5), 2297–2307 (2010). [CrossRef]

6. F. Bovolo, L. Bruzzone, and L. Carlin, “A novel technique for subpixel image classification based on support vector machine,” IEEE Trans. Image Process. 19(11), 2983–2999 (2010). [CrossRef]

7. A. M. Filippi, R. Archibald, B. L. Bhaduri, and E. A. Bright, “Hyperspectral agricultural mapping using support vector machine-based endmember extraction (SVM-BEE),” Opt. Express 17(26), 23823–23842 (2009). [CrossRef] [PubMed]

8. B. Ergun, T. Kavzoglu, I. Colkesen, and C. Sahin, “Data filtering with support vector machines in geometric camera calibration,” Opt. Express 18(3), 1927–1936 (2010). [CrossRef] [PubMed]

9. M. A. Kumar and M. Gopal, “A comparison study on multiple binary-class SVM methods for unilabel text categorization,” Pattern Recognit. Lett. 31(11), 1437–1444 (2010). [CrossRef]

10. N. Shanthi and K. Duraiswamy, “A novel SVM-based handwritten Tamil character recognition system,” Pattern Anal. Appl. 13(2), 173–180 (2010). [CrossRef]

11. X. Xu, D. Zhang, and X. Zhang, “An efficient method for human face recognition using nonsubsampled contourlet transform and support vector machine,” Opt. Appl. 39, 601–615 (2009).

12. B. Guo, S. R. Gunn, R. I. Damper, and J. B. Nelson, “Customizing kernel functions for SVM-based hyperspectral image classification,” IEEE Trans. Image Process. 17(4), 622–629 (2008). [CrossRef] [PubMed]

13. J. Li, X. Gao, and L. Jiao, “A new feature weighted fuzzy cluster algorithm,” Acta. Electron. 34, 89–92 (2006).

14. C. Nansen, A. J. Sidumo, and S. Capareda, “Variogram analysis of hyperspectral data to characterize the impact of biotic and abiotic stress of maize plants and to estimate biofuel potential,” Appl. Spectrosc. 64(6), 627–636 (2010). [CrossRef] [PubMed]

15. L. R. LaMotte and A. McWhorter, “A regression-based linear classification procedure,” Educ. Psychol. Meas. 41(2), 341–347 (1981). [CrossRef]

16. L. Gao, F. Gao, X. Guan, D. Zhou, and J. Li, “A regression algorithm based on AdaBoost,” in WCICA 2006: Sixth World Congress on Intelligent Control and Automation, D. M. Zhou, ed. (IEEE Computer Society Press, Dalian, Liaoning, 2006), pp. 4400–4404.

17. K. Kira and L. A. Rendell, “A practical approach to feature selection,” in Proceedings of the 9th International Workshop on Machine Learning, D. Sleeman, ed. (Morgan Kaufmann, San Francisco, CA, 1992), pp. 249–256.

18. T. Kayikcioglu and O. Aydemir, “A polynomial fitting and k-NN based approach for improving classification of motor imagery BCI data,” Pattern Recognit. Lett. 31(11), 1207–1215 (2010). [CrossRef]

19. F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machine,” IEEE Trans. Geosci. Remote Sens. 42(8), 1778–1790 (2004). [CrossRef]

20. L. Wang, C. Zhao, Y. Qiao, and W. Chen, “Research on all-around weighting methods of hyperspectral imagery classification,” Int. J. Infrared Millim. Waves 27, 442–446 (2008).

21. P.-H. Hsu, “Feature extraction of hyperspectral images using wavelet and matching pursuit,” ISPRS J. Photogramm. Remote Sens. 62(2), 78–92 (2007). [CrossRef]

22. C. Lee and D. A. Landgrebe, “Analyzing high-dimensional multispectral data,” IEEE Trans. Geosci. Remote Sens. 31(4), 792–800 (1993). [CrossRef]

23. D. J. Sebald and J. A. Bucklew, “Support vector machine techniques for nonlinear equalization,” IEEE Trans. Signal Process. 48(11), 3217–3226 (2000). [CrossRef]

24. C. Nansen, T. Herrman, and R. Swanson, “Machine vision detection of bonemeal in animal feed samples,” Appl. Spectrosc. 64(6), 637–643 (2010). [CrossRef] [PubMed]

25. F. A. Mianji and Y. Zhang, “Robust hyperspectral classification using relevance vector machine,” IEEE Trans. Geosci. Remote Sens. 49(6), 2100–2112 (2011). [CrossRef]

OCIS Codes
(100.0100) Image processing : Image processing
(280.0280) Remote sensing and sensors : Remote sensing and sensors

ToC Category:
Image Processing

History
Original Manuscript: August 16, 2011
Revised Manuscript: December 1, 2011
Manuscript Accepted: December 1, 2011
Published: December 15, 2011

Citation
Bin Qi, Chunhui Zhao, Eunseog Youn, and Christian Nansen, "Use of weighting algorithms to improve traditional support vector machine based classifications of reflectance data," Opt. Express 19, 26816-26826 (2011)
http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-19-27-26816


References

1. J. Cohen, “A coefficient of agreement for nominal scales,” Educ. Psychol. Meas. 20(1), 37–46 (1960). [CrossRef]
2. V. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 2000), Chap. 1.
3. C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn. 20(3), 273–297 (1995). [CrossRef]
4. B. E. Boser, I. M. Guyon, and V. Vapnik, “A training algorithm for optimal margin classifiers,” in COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, D. Haussler, ed. (ACM, New York, NY, 1992), pp. 144–152.
5. M. Pal and G. M. Foody, “Feature selection for classification of hyperspectral data by SVM,” IEEE Trans. Geosci. Remote Sens. 48(5), 2297–2307 (2010). [CrossRef]
6. F. Bovolo, L. Bruzzone, and L. Carlin, “A novel technique for subpixel image classification based on support vector machine,” IEEE Trans. Image Process. 19(11), 2983–2999 (2010). [CrossRef]
7. A. M. Filippi, R. Archibald, B. L. Bhaduri, and E. A. Bright, “Hyperspectral agricultural mapping using support vector machine-based endmember extraction (SVM-BEE),” Opt. Express 17(26), 23823–23842 (2009). [CrossRef] [PubMed]
8. B. Ergun, T. Kavzoglu, I. Colkesen, and C. Sahin, “Data filtering with support vector machines in geometric camera calibration,” Opt. Express 18(3), 1927–1936 (2010). [CrossRef] [PubMed]
9. M. A. Kumar and M. Gopal, “A comparison study on multiple binary-class SVM methods for unilabel text categorization,” Pattern Recognit. Lett. 31(11), 1437–1444 (2010). [CrossRef]
10. N. Shanthi and K. Duraiswamy, “A novel SVM-based handwritten Tamil character recognition system,” Pattern Anal. Appl. 13(2), 173–180 (2010). [CrossRef]
11. X. Xu, D. Zhang, and X. Zhang, “An efficient method for human face recognition using nonsubsampled contourlet transform and support vector machine,” Opt. Appl. 39, 601–615 (2009).
12. B. Guo, S. R. Gunn, R. I. Damper, and J. B. Nelson, “Customizing kernel functions for SVM-based hyperspectral image classification,” IEEE Trans. Image Process. 17(4), 622–629 (2008). [CrossRef] [PubMed]
13. J. Li, X. Gao, and L. Jiao, “A new feature weighted fuzzy cluster algorithm,” Acta Electron. 34, 89–92 (2006).
14. C. Nansen, A. J. Sidumo, and S. Capareda, “Variogram analysis of hyperspectral data to characterize the impact of biotic and abiotic stress of maize plants and to estimate biofuel potential,” Appl. Spectrosc. 64(6), 627–636 (2010). [CrossRef] [PubMed]
15. L. R. LaMotte and A. McWhorter, “A regression-based linear classification procedure,” Educ. Psychol. Meas. 41(2), 341–347 (1981). [CrossRef]
16. L. Gao, F. Gao, X. Guan, D. Zhou, and J. Li, “A regression algorithm based on AdaBoost,” in WCICA 2006: Sixth World Congress on Intelligent Control and Automation, D. M. Zhou, ed. (IEEE Computer Society Press, Dalian, Liaoning, 2006), pp. 4400–4404.
17. K. Kira and L. A. Rendell, “A practical approach to feature selection,” in Proceedings of the 9th International Workshop on Machine Learning, D. Sleeman, ed. (Morgan Kaufmann, San Francisco, CA, 1992), pp. 249–256.
18. T. Kayikcioglu and O. Aydemir, “A polynomial fitting and k-NN based approach for improving classification of motor imagery BCI data,” Pattern Recognit. Lett. 31(11), 1207–1215 (2010). [CrossRef]
19. F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machine,” IEEE Trans. Geosci. Remote Sens. 42(8), 1778–1790 (2004). [CrossRef]
20. L. Wang, C. Zhao, Y. Qiao, and W. Chen, “Research on all-around weighting methods of hyperspectral imagery classification,” Int. J. Infrared Millim. Waves 27, 442–446 (2008).
21. P.-H. Hsu, “Feature extraction of hyperspectral images using wavelet and matching pursuit,” ISPRS J. Photogramm. Remote Sens. 62(2), 78–92 (2007). [CrossRef]
22. C. Lee and D. A. Landgrebe, “Analyzing high-dimensional multispectral data,” IEEE Trans. Geosci. Remote Sens. 31(4), 792–800 (1993). [CrossRef]
23. D. J. Sebald and J. A. Bucklew, “Support vector machine techniques for nonlinear equalization,” IEEE Trans. Signal Process. 48(11), 3217–3226 (2000). [CrossRef]
24. C. Nansen, T. Herrman, and R. Swanson, “Machine vision detection of bonemeal in animal feed samples,” Appl. Spectrosc. 64(6), 637–643 (2010). [CrossRef] [PubMed]
25. F. A. Mianji and Y. Zhang, “Robust hyperspectral classification using relevance vector machine,” IEEE Trans. Geosci. Remote Sens. 49(6), 2100–2112 (2011). [CrossRef]
