International Science Index


Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Abstract: Many supervised machine learning tasks require decisions across a large number of classes. Multi-class classification has many applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adaptation of Stacking for multi-class problems that combines ensembles within the ensemble itself. For this purpose, a Stacking-like training scheme with three levels was used: the final decision-maker (level 2) is trained on the combined outputs of a pair of meta-classifiers (level 1) from the tree-based and Bayesian families, and each meta-classifier is in turn trained on the outputs of a pair of base classifiers (level 0) of its own family. This strategy seeks to promote diversity among the ensembles that form the level-2 meta-classifier. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time, across three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, a three-way ANOVA was performed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A three-way interaction between the factors was observed only for time. Accuracy and area under the ROC curve presented similar results, showing a two-way interaction between level and experiment, as well as an effect of the dataset factor. It was concluded that level 2 performed better on average than the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.
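The three-level arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: scikit-learn, the Iris dataset, the specific base learners, the logistic-regression level-2 learner, the `cv=3` setting, and the helper names `build_three_level_stack` and `evaluate` are all assumptions chosen for illustration. The sketch nests two level-1 stacks (one tree-based, one Bayesian), each built on a pair of level-0 learners of its own family, under a single level-2 learner, and reports the three measures named in the abstract.

```python
import time

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def build_three_level_stack(seed=0):
    """Return a nested stacking model with levels 0, 1 and 2 (illustrative only)."""
    # Level 1, tree family: a tree meta-classifier trained on the outputs
    # of a pair of level-0 tree learners (a one-level stump and a full tree).
    tree_meta = StackingClassifier(
        estimators=[
            ("stump", DecisionTreeClassifier(max_depth=1, random_state=seed)),
            ("tree", DecisionTreeClassifier(random_state=seed)),
        ],
        final_estimator=DecisionTreeClassifier(max_depth=3, random_state=seed),
        cv=3,
    )
    # Level 1, Bayesian family: a naive Bayes meta-classifier trained on the
    # outputs of a pair of level-0 naive Bayes learners. The two smoothing
    # settings are hypothetical stand-ins for two distinct Bayesian learners.
    bayes_meta = StackingClassifier(
        estimators=[
            ("nb_default", GaussianNB()),
            ("nb_smooth", GaussianNB(var_smoothing=1e-2)),
        ],
        final_estimator=GaussianNB(),
        cv=3,
    )
    # Level 2: the final decision-maker combines the two level-1 outputs.
    # Logistic regression is an assumption; the abstract does not name it.
    return StackingClassifier(
        estimators=[("tree_meta", tree_meta), ("bayes_meta", bayes_meta)],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    )


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and return (accuracy, multi-class AUC, elapsed seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    probabilities = model.predict_proba(X_test)
    elapsed = time.perf_counter() - start
    accuracy = accuracy_score(y_test, predictions)
    # One-vs-one macro-averaged AUC, one common multi-class generalisation.
    auc = roc_auc_score(y_test, probabilities, multi_class="ovo")
    return accuracy, auc, elapsed


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
acc, auc, elapsed = evaluate(
    build_three_level_stack(), X_train, X_test, y_train, y_test
)
print(f"accuracy={acc:.3f}  auc={auc:.3f}  time={elapsed:.2f}s")
```

Nesting `StackingClassifier` objects keeps each level's training data separate through internal cross-validation, which is one way to approximate the level-by-level training the abstract describes. Comparing levels as the study does would additionally require evaluating `tree_meta` and `bayes_meta` on their own.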