Application of boosting in recommender systems
- Authors: Zharova M.A.1, Tsurkov V.I.2
- Affiliations:
  - Moscow Institute of Physics and Technology (MIPT)
  - Federal Research Center “Computer Science and Control”, Russian Academy of Sciences
- Issue: No 6 (2024)
- Pages: 91-110
- Section: ARTIFICIAL INTELLIGENCE
- URL: https://medbiosci.ru/0002-3388/article/view/282642
- DOI: https://doi.org/10.31857/S0002338824060083
- EDN: https://elibrary.ru/sudevr
- ID: 282642
Abstract
Recommender systems have become an important tool for managing information flows. Demand for them is driven largely by information overload and the need to personalize content. As recommendation algorithms are applied in more settings, non-standard cases arise in which classical approaches are less effective. This paper examines one such case: a small number of objects and a relatively large number of users, with high correlation between some of the objects. For modeling, it is proposed to use gradient boosting, a machine learning algorithm based on an ensemble of decision trees.
About the authors
M. A. Zharova
Moscow Institute of Physics and Technology (MIPT)
Author for correspondence.
Email: zharova.ma@phystech.edu
Russian Federation, Dolgoprudny, Moscow oblast
V. I. Tsurkov
Federal Research Center “Computer Science and Control”, Russian Academy of Sciences
Email: v.tsurkov@frccsc.ru
Russian Federation, Moscow