A Filter-SQP strategy for training Support Vector Machine models

Tiago Lino Bello; Luiz Carlos Matioli; Lucas Garcia Pedroso; Daniela Miray Igarashi

doi:10.35819/remat2023v9i2id6241

Authors

Tiago Lino Bello Universidade Federal do Paraná (UFPR), Curitiba, PR, Brazil https://orcid.org/0000-0002-1095-9587
Luiz Carlos Matioli Federal University of Paraná (UFPR), Curitiba, PR, Brazil https://orcid.org/0000-0002-6506-3550
Lucas Garcia Pedroso Federal University of Paraná (UFPR), Curitiba, PR, Brazil https://orcid.org/0009-0003-1516-119X
Daniela Miray Igarashi Federal University of Paraná (UFPR), Curitiba, PR, Brazil https://orcid.org/0000-0002-7163-759X

DOI:

https://doi.org/10.35819/remat2023v9i2id6241

Keywords:

support vector machine, training, optimization, filter method, sequential quadratic programming

Abstract

This paper introduces a filtering strategy for addressing optimization problems arising in binary Support Vector Machine classification. The training optimization problem aims to solve the dual formulation which involves a quadratic objective function subjected to a linear and box constraints. Our approach employs a Filter algorithm with Sequential Quadratic Programming iterations that minimize the quadratic Lagrangian approximations. Notably, we utilize the exact Hessian matrix in our numerical experiments to seek the desired classification function. Moreover, we present a Filter algorithm combined with the Augmented Lagrangian method aiming to accelerate the algorithm convergence. To substantiate our method's effectiveness, we conduct numerical experiments through MATLAB, comparing outcomes with alternative methodologies detailed in existing literature. Numerical experiments shows that the Filter--SQP combined with Augmented Lagrangian method is competitive and efficient method compared with an interior-point based solver and LIBSVM software in relation of classification metrics and CPU-time.

Downloads

Download data is not yet available.

Author Biographies

Tiago Lino Bello, Universidade Federal do Paraná (UFPR), Curitiba, PR, Brazil

http://lattes.cnpq.br/1698921535580651
Luiz Carlos Matioli, Federal University of Paraná (UFPR), Curitiba, PR, Brazil

http://lattes.cnpq.br/6084324625677114
Lucas Garcia Pedroso, Federal University of Paraná (UFPR), Curitiba, PR, Brazil

http://lattes.cnpq.br/0906070603571189
Daniela Miray Igarashi, Federal University of Paraná (UFPR), Curitiba, PR, Brazil

http://lattes.cnpq.br/8676771586561454

References

BALDI, P. ; BRUNAK, S.; CHAUVIN, Y.; ANDERSEN, C.; A. F.; NIELSEN, H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics, v. 16, n. 5, p. 412-424, 2000. DOI: https://doi.org/10.1093/bioinformatics/16.5.412.

BISHOP, C. M. Pattern recognition and machine learning. [S.l.]: Springer, 2006.

BOSER, B. E.; GUYON, I. M.; VAPNIK, V. N. A training algorithm for optimal margin classifiers. In: FIFTH ANNUAL WORKSHOP ON COMPUTATIONAL LEARNING THEORY, Pittsburgh, Pennsylvania, USA, 1992. Proceedings [...]. New York, NY, USA: Association for Computing Machinery, 1992, p. 144-152. DOI: https://doi.org/10.1145/130385.130401.

CHANG, C.-C.; LIN, C.-J. LIBSVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology, v. 2, n. 3, p. 1-27, 2011. DOI: https://doi.org/10.1145/1961189.1961199.

CHAUHAN, V. K.; DAHIYA, K.; SHARMA, A. Problem formulations and solvers in linear SVM: a review. Artificial Intelligence Review, v. 52, n. 2, p. 803-855, 2019. DOI: https://doi.org/10.1007/s10462-018-9614-6.

CHEN, H. L.; YANG, B.; WANG, S. J.; Wang, G.; LIU, D. Y.; LI, H. Z.; LIU, W. B. Towards an optimal support vector machine classifier using a parallel particle swarm optimization strategy. Applied Mathematics and Computation, v. 239, p. 180-197, 2014. DOI: https://doi.org/10.1016/j.amc.2014.04.039.

CORTES, C.; VAPNIK, V. Support-vector networks. Machine Learning, v. 20, p. 273-297, 1995. DOI: https://doi.org/10.1007/BF00994018.

CRISTIANINI, N.; SHAWE-TAYLOR, J. An introduction to support vector machines and other kernel-based learning methods. [S.l.]: Cambridge University Press, 2000. DOI: https://doi.org/10.1017/CBO9780511801389.

DENG, N.; TIAN, Y.; ZHANG, C. Support Vector Machines: Optimization based theory, algorithms, and extensions. [S.l.]: Chapman & Hall, CRC Press, 2012.

DOLAN, E. D; MORÉ, J. J. Benchmarking optimization software with performance profiles. Mathematical Programming, v. 91, p. 201-213, 2002. DOI: https://doi.org/10.1007/s101070100263.

FAN, R.-E.; CHANG, K.-W.; HSIEH, C.-J.; WANG, X.-R.; LIN, C-J. Liblinear: A library for large linear classification. Journal of Machine Learning Research, v. 9, p. 1871-1874, 2008.

FERRIS, M. C.; MUNSON, T. S. Interior-point methods for massive support vector machines. SIAM Journal on Optimization, v. 13, n. 3, p. 783-804, 2002. DOI: https://doi.org/10.1137/S1052623400374379.

FLETCHER, R.; LEYFFER, S. Nonlinear programming without a penalty function. Mathematical Programming, v. 91, p. 239-269, 2002. DOI: https://doi.org/10.1007/s101070100244.

FROHLICH, H.; ZELL, A. Efficient parameter selection for support vector machines in classification and regression via model-based global optimization. In: IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, Montreal, QC, Canada, 2005. Proceedings [...]. v. 3, 2005, p. 1431-1436. DOI: https://doi.org/10.1109/IJCNN.2005.1556085.

GONZAGA, C. C.; KARAS, E.; VANTI, M. A globally convergent filter method for nonlinear programming. SIAM Journal on Optimization, v. 14, n. 3, p. 646-669, 2004. DOI: https://doi.org/10.1137/S1052623401399320.

GOULD, N. I. M.; LEYFFER, S.; TOINT, P. L. A multidimensional filter algorithm for nonlinear equations and nonlinear least-squares. SIAM Journal on Optimization, v. 15, n. 1, p. 17-38, 2004. DOI: https://doi.org/10.1137/S1052623403422637.

HASTIE, T.; TIBSHIRANI, R.; FRIEDMAN, J. The elements of statistical learning: data mining, inference, and prediction. Springer Series in Statistics. 2. ed. [S.l.]: Springer, 2009. DOI: https://doi.org/10.1007/978-0-387-84858-7.

HSIEH, C.-J.; CHANG, K.-W.; LIN, C.-J.; KEERTHI, S S.; SUNDARARAJAN, S. A dual coordinate descent method for large-scale linear SVM. In: 25TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING, Helsinki, Finland, 2008. Proceedings [...]. New York, NY, USA: Association for Computing Machinery, 2008, p. 408-415. DOI: https://doi.org/10.1145/1390156.1390208.

IZMAILOV, A. F.; SOLODOV, M. V. Newton-type methods for optimization and variational problems. Springer Series in Operations Research and Financial Engineering. [S.l.]: Springer, 2014. DOI: https://doi.org/10.1007/978-3-319-04247-3.

JOACHIMS, T. Making large-scale SVM learning practical. [S.l.]: [S.n.], 1998, p. 41-56.

JOACHIMS, T. Training linear SVMs in linear time. In: 12TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, Philadelphia, PA, USA, 2006. Proceedings [...]. New York, NY, USA: Association for Computing Machinery, 2006, p. 217-226. DOI: https://doi.org/10.1145/1150402.1150429.

LIN, C.-J.; WENG, R. C.; KEERTHI, S. S. Trust region newton method for logistic regression. Journal of Machine Learning Research, v. 9, n. 22, p. 627-650, 2008. Available in: https://www.jmlr.org/papers/v9/lin08b.html. Access at: October 29, 2023.

MANGASARIAN, O. L.; MUSICANT, D. R. Lagrangian support vector machines. Journal of Machine Learning Research, v. 1, n. 3, p. 161-177, 2001.

MATTHEWS, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure, v. 405, n. 2, p. 442-451, 1975. DOI: https://doi.org/10.1016/0005-2795(75)90109-9.

NIU, D.; WANG, C.; TANG, P.; WANG, Q.; SONG, E. A sparse semismooth Newton based augmented Lagrangian method for large-scale support vector machines. arXiv:1910.01312, 2019. DOI: https://doi.org/10.48550/arXiv.1910.01312.

NOCEDAL, J.; WRIGHT, S. Numerical optimization. Springer Series in Operations Research and Financial Engineering. 2. ed. [S.l.]: Springer, 2006. DOI: https://doi.org/10.1007/978-0-387-40065-5.

PERIÇARO, G. A.; RIBEIRO, A. A.; KARAS, E. W. Global convergence of a general filter algorithm based on an efficiency condition of the step. Applied Mathematics and Computation, v. 219, n. 17, p. 9581-9597, 2013. DOI: https://doi.org/10.1016/j.amc.2013.03.012.

PICCIALLI, V.; SCIANDRONE, M. Nonlinear optimization and support vector machines. 4OR, v. 16, p. 111-149, 2018. DOI: https://doi.org/10.1007/s10288-018-0378-2.

PLATT, J. Sequential minimal optimization: A fast algorithm for training support vector machines. [S.l.]: Microsoft, 1998. Available in: https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines. Access at: October 29, 2023.

SCHÖLKOPF, B.; SMOLA, A. J. Learning with kernels: support vector machines, regularization, optimization, and beyond. [S.l.]: MIT Press, 2001.

SMOLA, A. J.; VISHWANATHAN, S. V. N.; LE, Q. V. Bundle methods for machine learning. In: TWENTY-FIRST ANNUAL CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS, Vancouver, British Columbia, Canada, 2007. Proceedings [...]. [S.l.]: Curran Associates, 2007, p. 1377-1384. Available in: https://proceedings.neurips.cc/paper_files/paper/2007/file/26337353b7962f533d78c762373b3318-Paper.pdf. Access at: October 1, 2023.

SHALEV-SHWARTZ, S.; Singer, Y.; Srebro, N.; Cotter, A. Pegasos: primal estimated sub-gradient solver for SVM. Mathematical Programming, v. 127, p. 3-30, 2011. DOI: https://doi.org/10.1007/s10107-010-0420-4.

SHAWE-TAYLOR, J.; CRISTIANINI, N. Kernel methods for pattern analysis. [S.l.]: Cambridge University Press, 2004.

TORREALBA, E. M. R.; SILVA, J. G.; MATIOLI, L. C.; KOLOSSOSKI, O.; SANTOS, P. S. M. Augmented Lagrangian algorithms for solving the continuous nonlinear resource allocation problem. European Journal of Operational Research, v. 299, n. 1, p. 46-59, 2022. DOI: https://doi.org/10.1016/j.ejor.2021.11.027.

WOODSEND, K.; GONDZIO, J. Exploiting separability in large-scale linear support vector machine training. Computational Optimization and Applications, v. 49, p. 241-269, 2011. DOI: https://doi.org/10.1007/s10589-009-9296-8.

WOODSEND, K.; GONDZIO, J. Hybrid mpi/openmp parallel linear support vector machine training. Journal of Machine Learning Research, v. 10, p. 1937-1953, 2009.

YAN, Y.; LI, Q. An efficient augmented lagrangian method for support vector machine. Optimization Methods and Software, v. 35, n. 4, p. 855-883, 2020. DOI: https://doi.org/10.1080/10556788.2020.1734002.

YIN, J.; LI, Q. A semismooth newton method for support vector classification and regression. Computational Optimization and Applications, v. 73, p. 477-508, 2019. DOI: https://doi.org/10.1007/s10589-019-00075-z.