
Neural Networks

Volume 23, Issue 3, April 2010, Pages 365-372

TSVR: An efficient Twin Support Vector Machine for regression

https://doi.org/10.1016/j.neunet.2009.07.002

Abstract

The learning speed of classical Support Vector Regression (SVR) is low, since it is constructed by minimizing a convex quadratic function subject to two groups of linear inequality constraints over all training samples. In this paper we propose Twin Support Vector Regression (TSVR), a novel regressor that determines a pair of ϵ-insensitive up- and down-bound functions by solving two related SVM-type problems, each of which is smaller than the problem solved in classical SVR. The TSVR formulation is in the spirit of the Twin Support Vector Machine (TSVM), which works via two nonparallel planes. Experimental results on several artificial and benchmark datasets indicate that the proposed TSVR is not only fast, but also shows good generalization performance.

Introduction

The support vector machine (SVM) is an excellent kernel-based tool for binary classification and regression (Burges, 1998, Christianini and Shawe-Taylor, 2002, Vapnik, 1995, Vapnik, 1998). This learning strategy, introduced by Vapnik and co-workers (Vapnik, 1995, Vapnik, 1998), is a principled and very powerful machine learning method. Within a few years of its introduction, SVM had already outperformed most other systems in a wide variety of applications, spanning research areas such as pattern recognition (Osuna, Freund, & Girosi, 1997), text categorization (Joachims, Ndellec, & Rouveriol, 1998), biomedicine (Brown, Grundy, Lin, et al., 2000), brain–computer interfaces (Ebrahimi, Garcia, & Vesin, 2003), and financial regression (Ince & Trafalis, 2002).

Although SVM offers better generalization ability than other machine learning methods such as artificial neural networks (ANNs), its training cost is expensive, i.e., O(l^3), where l is the total number of training samples. Many fast algorithms, such as chunking (Cortes & Vapnik, 1995), SMO (Platt, 1999), ISMO (Keerthi, Shevade, Bhattacharyya et al., 2001), SVMlight (Joachims, 1999), SVMTorch (Collobert & Bengio, 2001), LS-SVM (Suykens and Vandewalle, 1999, Suykens et al., 1999), and LIBSVM (Chang & Lin, 2001), have been presented to reduce the difficulties associated with training. Traditionally, these algorithms solve the SVM optimization problem by optimizing a small subset of the dual variables at each iteration, which can easily give the impression that this is the only possible way to train an SVM. All of the above classifiers discriminate a pattern by determining in which half-space it lies. Recently, Jayadeva, Khemchandani, and Chandra (2007) proposed a Twin Support Vector Machine (TSVM) classifier for binary classification, in the spirit of the generalized eigenvalue proximal support vector machine (GEPSVM) (Mangasarian & Wild, 2006). The formulation of TSVM is very similar to that of a classical SVM, except that it generates two nonparallel planes such that each plane is closer to one class and as far as possible from the other. TSVM has become a popular machine learning method because of its low computational complexity; see e.g. Ghorai, Mukherjee, and Dutta (2009) and Kumar and Gopal, 2008, Kumar and Gopal, 2009.

For Support Vector Regression (SVR), as in classification, there exist corresponding algorithms for learning the optimal regressor, such as SMO (Shevade, Keerthi, Bhattacharyya, et al., 2000), Smooth SVR (SSVR) (Lee, Hsieh, & Huang, 2005), and LS-SVM (Suykens and Vandewalle, 1999, Suykens et al., 1999). In addition, some researchers have proposed new SVR models based on different loss functions, such as the Huber loss function (Vapnik, 1995, Vapnik, 1998); other methods include heuristic training (Wang & Xu, 2004) and a geometric method (Bi & Bennett, 2003). However, all of these SVR algorithms have at least two groups of constraints, each of which ensures that as many training samples as possible lie within the given ϵ-insensitive zone. Introducing more variables and constraints enlarges the problem size and increases the computational cost of solving the regression problem.

In this paper, we propose a new nonparallel-plane regressor in the spirit of TSVM, termed Twin Support Vector Regression (TSVR). TSVR aims at generating two nonparallel functions such that each function determines the ϵ-insensitive down- or up-bound of the unknown regressor. Similar to TSVM (Jayadeva et al., 2007), TSVR solves two smaller-sized quadratic programming problems (QPPs) instead of the single large one solved in classical SVR. The formulation of TSVR differs from that of SVR in one fundamental way: in TSVR we solve a pair of QPPs, whereas in SVR we solve one single QPP. Further, in SVR the QPP has two groups of constraints for all data points, whereas in TSVR only one group of constraints for all data points is used in each QPP. This strategy of solving two smaller-sized QPPs, rather than one large QPP, makes TSVR work faster than standard SVR; a rough complexity argument is sketched below. Computational comparisons between TSVR and SVR in terms of generalization performance and training time have been made on several artificial and UCI datasets, indicating that TSVR is not only fast, but also shows good generalization.
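To make the speed advantage concrete, a rough back-of-the-envelope sketch (our own illustration, not a claim taken verbatim from the paper): if solving a QPP with $m$ dual variables costs roughly $O(m^3)$, then standard SVR, with $2l$ dual variables (two multipliers per sample), costs on the order of $(2l)^3$, while TSVR solves two QPPs with $l$ variables each, giving the approximate ratio
$$\frac{(2l)^3}{2\,l^3}=\frac{8\,l^3}{2\,l^3}=4,$$
i.e., roughly a four-fold speedup under this crude model.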

The paper is organized as follows: Section 2 briefly dwells on SVRs and also introduces the notation used in the rest of the paper. Section 3 introduces the linear Twin Support Vector Regression, while, in Section 4, we extend TSVR for nonlinear kernels. Section 5 deals with experimental results and Section 6 contains concluding remarks.


Background

Let the training samples be denoted by a set of $l$ row vectors $A_i,\ i=1,2,\ldots,l$, in the $n$-dimensional real space $\mathbb{R}^n$, where the $i$th sample is $A_i=(A_{i1},A_{i2},\ldots,A_{in})$. Also let $A=(A_1;A_2;\ldots;A_l)$, and let $Y=(y_1;y_2;\ldots;y_l)$ denote the response vector of the training samples, where $y_i\in\mathbb{R}$. We now consider the standard SVR and LS-SVR.
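For reference, the classical $\epsilon$-SVR primal in this notation (a standard formulation, stated here for comparison) is
$$\min_{w,b,\xi,\xi^{*}}\ \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{l}(\xi_i+\xi_i^{*})
\quad\text{s.t.}\quad y_i-(A_i w+b)\le\epsilon+\xi_i,\ \ (A_i w+b)-y_i\le\epsilon+\xi_i^{*},\ \ \xi_i,\xi_i^{*}\ge 0,$$
so every training sample contributes to two groups of inequality constraints; it is exactly this structure that TSVR splits into two smaller problems.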

Twin Support Vector Regression

In this section, we introduce an efficient approach to SVR, which we term Twin Support Vector Regression (TSVR). As mentioned earlier, TSVR is similar to TSVM in spirit, as it also derives a pair of nonparallel planes around the data points. However, there are some essential differences. First, the goals of TSVR and TSVM are different: TSVR aims to find a suitable regressor, whereas TSVM constructs a classifier. Second, each of the two QPPs in the TSVM pair has the formulation sketched below.
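In a hedged sketch following Jayadeva et al. (2007) (here $A$ and $B$ collect the samples of the two classes, $e_1$ and $e_2$ are vectors of ones of appropriate dimension, and $c_1>0$ is a trade-off parameter; the second QPP interchanges the roles of $A$ and $B$), the first TSVM QPP reads
$$\min_{w_1,b_1,\xi}\ \frac{1}{2}\|A w_1+e_1 b_1\|^{2}+c_1 e_2^{T}\xi
\quad\text{s.t.}\quad -(B w_1+e_2 b_1)+\xi\ge e_2,\ \ \xi\ge 0.$$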

Kernel Twin Support Vector Regression

In order to extend our results to nonlinear regressors, we consider the following kernel-generated functions instead of linear functions:
$$f_1(x)=K(x^{T},A^{T})w_1+b_1,\qquad f_2(x)=K(x^{T},A^{T})w_2+b_2.$$
Note that if the linear kernel is used, both functions reduce to the linear ones as special cases of (25). Following the discussion in Section 3, we construct a pair of optimization problems as follows:
$$\min_{w_1,b_1,\xi}\ \frac{1}{2}\big(Y-e\epsilon_1-(K(A,A^{T})w_1+eb_1)\big)^{T}\big(Y-e\epsilon_1-(K(A,A^{T})w_1+eb_1)\big)+C_1 e^{T}\xi$$
$$\text{s.t.}\quad Y-(K(A,A^{T})w_1+eb_1)\ge e\epsilon_1-\xi,\quad \xi\ge 0,$$
$$\min_{w_2,b_2,\eta}\ \frac{1}{2}\big(Y+e\epsilon_2-(K(A,A^{T})w_2+eb_2)\big)^{T}\big(Y+e\epsilon_2-(K(A,A^{T})w_2+eb_2)\big)+C_2 e^{T}\eta$$
$$\text{s.t.}\quad (K(A,A^{T})w_2+eb_2)-Y\ge e\epsilon_2-\eta,\quad \eta\ge 0.$$
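As an illustration only (the authors' experiments used MATLAB's quadratic-programming routine, not this code), the following Python sketch solves the first, down-bound kernel QPP above with a generic QP solver. The function name tsvr_down_bound, the cvxopt dependency, and the tiny ridge term are our own assumptions, not part of the paper.

```python
# Illustrative sketch only -- not the paper's MATLAB code.
import numpy as np
from cvxopt import matrix, solvers


def tsvr_down_bound(K, Y, C1=1.0, eps1=0.1, ridge=1e-8):
    """Solve  min 0.5*||Y - e*eps1 - (K w1 + e b1)||^2 + C1 * e^T xi
       s.t.   Y - (K w1 + e b1) >= e*eps1 - xi,  xi >= 0,
       over u = [w1 (l); b1 (1); xi (l)], and return (w1, b1)."""
    l = K.shape[0]
    e = np.ones((l, 1))
    G_mat = np.hstack([K, e])                  # maps [w1; b1] to K w1 + e b1
    f = Y.reshape(-1, 1) - eps1 * e            # shifted targets Y - e*eps1

    n = 2 * l + 1
    P = np.zeros((n, n))
    P[: l + 1, : l + 1] = G_mat.T @ G_mat      # quadratic term from the squared loss
    P += ridge * np.eye(n)                     # tiny ridge so the solver sees a PD matrix
    q = np.zeros((n, 1))
    q[: l + 1] = -G_mat.T @ f                  # linear term from the squared loss
    q[l + 1:] = C1                             # C1 * e^T xi

    # Inequalities A_in u <= b_in:
    #   (K w1 + e b1) - xi <= Y - e*eps1   and   -xi <= 0
    A_in = np.vstack([
        np.hstack([G_mat, -np.eye(l)]),
        np.hstack([np.zeros((l, l + 1)), -np.eye(l)]),
    ])
    b_in = np.vstack([f, np.zeros((l, 1))])

    solvers.options["show_progress"] = False
    sol = solvers.qp(matrix(P), matrix(q), matrix(A_in), matrix(b_in))
    u = np.array(sol["x"]).ravel()
    return u[:l], float(u[l])                  # f1(x) = K(x, A) w1 + b1
```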

Experiments and discussion

To check the validity of the proposed TSVR, we compare it with the classical SVR and LS-SVR on several datasets, including two groups of artificial datasets and seven benchmark datasets. All the regression methods are implemented in MATLAB 6.5 (MATLAB, 1994–2001) on Windows XP, running on a PC with an Intel P4 processor (2.4 GHz) and 1 GB of RAM. To compare the CPU time and accuracies of the three algorithms, we use the "qp.m" function in MATLAB to implement the proposed TSVR and
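A toy, hedged usage sketch in the same spirit (synthetic sinc data, an RBF kernel, and the tsvr_down_bound helper above are our own illustrative choices, not the paper's datasets or code); the up-bound problem has the same structure with Y negated, so it is solved by a sign flip, and the final estimate averages the two bound functions:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.sort(rng.uniform(-3.0, 3.0, size=(60, 1)), axis=0)
Y = np.sinc(A).ravel() + 0.05 * rng.standard_normal(60)   # noisy toy targets


def rbf(X, Z, gamma=1.0):
    """Gaussian kernel matrix K(X, Z)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)


K = rbf(A, A)
w1, b1 = tsvr_down_bound(K, Y, C1=1.0, eps1=0.1)     # down-bound f1
w2, b2 = tsvr_down_bound(K, -Y, C1=1.0, eps1=0.1)    # up-bound f2 via Y -> -Y
w2, b2 = -w2, -b2

f1 = K @ w1 + b1
f2 = K @ w2 + b2
y_hat = 0.5 * (f1 + f2)                              # averaged bound functions
print("train RMSE:", float(np.sqrt(np.mean((y_hat - Y) ** 2))))
```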

Conclusions

In this paper we have proposed an SVR approach to data regression, termed TSVR. In TSVR, we solve two smaller-sized quadratic programming problems without any equality constraint, instead of the single large problem solved in traditional SVR. This makes TSVR much faster than standard SVR. Furthermore, in contrast to the single regressor given by traditional SVR, TSVR yields two nonparallel functions such that each function determines one of the ϵ-insensitive up- and down-bounds of the training

Acknowledgements

This work has been partly supported by the Shanghai Leading Academic Discipline Project (No. S30405), and the Natural Science Foundation of Shanghai Normal University (No. SK200937).

References (37)

  • Chang, C. C., & Lin, C. J. (2001). LIBSVM: A library for support vector machines, available from...
  • Christianini, V., & Shawe-Taylor, J. (2002). An introduction to support vector machines.
  • Chu, W., et al. (2005). An improved conjugate gradient method scheme to the solution of least squares SVM. IEEE Transactions on Neural Networks.
  • Collobert, R., & Bengio, S. (2001). SVMTorch: Support vector machines for large-scale regression problems. Journal of Machine Learning Research.
  • Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning.
  • Ebrahimi, T., et al. (2003). Joint time-frequency-space classification of EEG in a brain–computer interface application. Journal of Applied Signal Processing.
  • Eubank, R. L.
  • Fung, G., & Mangasarian, O. L. (2001). Proximal support vector machines. In Proceedings of the seventh international...