Application of boosting in recommender systems

M. A. Zharova; V. I. Tsurkov

doi:10.31857/S0002338824060083

Authors: Zharova M.A.¹, Tsurkov V.I.²
Affiliations:
1. Moscow Institute of Physics and Technology (MIPT)
2. Federal Research Center “Computer Science and Control”, Russian Academy of Sciences
Issue: No 6 (2024)
Pages: 91-110
Section: ARTIFICIAL INTELLIGENCE
URL: https://rjmseer.com/0002-3388/article/view/683140
DOI: https://doi.org/10.31857/S0002338824060083
EDN: https://elibrary.ru/sudevr
ID: 683140

Abstract

Recommender systems have become an established tool for managing information flows. Demand for them is driven largely by information overload and the need to personalize data. As recommendation algorithms spread to new application areas, many non-standard cases arise in which classical approaches are less effective. This paper examines one such case: a small number of items and a relatively large number of users, with high correlation between some of the items. For modeling, the authors propose gradient boosting, a machine learning algorithm based on an ensemble of decision trees.

Keywords

recommender systems, boosting, users and items, correlation, calibration

About the authors

M. A. Zharova

Moscow Institute of Physics and Technology (MIPT)

Author for correspondence.
Email: zharova.ma@phystech.edu
Russian Federation, Dolgoprudny, Moscow oblast

V. I. Tsurkov

Federal Research Center “Computer Science and Control”, Russian Academy of Sciences

Email: v.tsurkov@frccsc.ru
Russian Federation, Moscow

Supplementary files

1. JATS XML
2. Fig. 1. Formation of training and test datasets over time (11KB).
