Abstract
In the modern age of Internet connectivity, advanced information systems have accumulated huge volumes of data. This fast-growing body of data, collected and stored in large databases, has far exceeded the human ability to comprehend it without proper tools. A great deal of research has explored the potential applications of Machine Learning technologies in Security Informatics. This article studies Network Security Detection problems, in which predictive models are constructed to detect network security breaches such as spamming. Owing to the overwhelming volume of data, the complexity and dynamics of computer networks, and evolving cyber threats, current security systems suffer from limited performance, with low detection accuracy and a high number of false alarms. To address these performance issues, a novel Machine Learning algorithm, the Boosted Subspace Probabilistic Neural Network (BSPNN), is proposed, which combines a Radial Basis Function Neural Network with an innovative diversity-based ensemble learning framework. Extensive empirical analyses suggest that BSPNN achieves high detection accuracy with relatively small computational complexity compared with other conventional detection methods.
Cite this article
Tran, T.P., Nguyen, T.T.S., Tsai, P. et al. BSPNN: boosted subspace probabilistic neural network for email security. Artif Intell Rev 35, 369–382 (2011). https://doi.org/10.1007/s10462-010-9198-2
DOI: https://doi.org/10.1007/s10462-010-9198-2