
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Dota 2 Hero Buff And Nerf Predictions Based On Professional Match Data Using Random Forest
Corresponding Author(s) : Muhamad Azrino Gustalika
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control,
Vol. 11, No. 3, August 2026 (Article in Progress)
Abstract
Balance updates (buffs and nerfs) are critical in Multiplayer Online Battle Arena (MOBA) games because small parameter changes can shift the competitive metagame and reduce hero diversity. This study proposes a data-driven pipeline that classifies each Dota 2 hero as overpowered, underpowered, or balanced from professional match telemetry and translates these classes into balance recommendations (nerf, buff, or balance). Most prior Dota 2 studies focus on match-outcome or micro-event prediction and do not evaluate hero-centric balance recommendations against official patch actions across patch transitions. To address this gap, this work contributes a patch-to-patch external validation protocol that compares the recommendations made on patch t with the developer actions recorded in the patch notes for patch t+1. Professional match records were collected from public sources and aggregated per hero and per patch into combat, economy, and impact features (e.g., kills, deaths, assists, gold per minute, experience per minute, damage dealt, tower damage, and healing). Labels were derived from win-rate and pick-rate distributions using statistical control limits (μ ± kσ, k = 0.3) so that labeling is transparent and repeatable. A Random Forest classifier was trained with grid-searched hyperparameters and evaluated using stratified 6-fold cross-validation, with macro-averaged F1 reported to account for class imbalance. Internal evaluation achieved 0.94 accuracy and 0.84 macro-F1. For external validation, recommendations from patch t were compared with official balance actions in patch t+1 across six consecutive transitions; accuracy ranged from 0.436 to 0.672 (mean 0.559), with the best result on the 7.39b to 7.39c transition (84/125). These results indicate that professional telemetry can support interpretable balance monitoring and provide early signals for buff/nerf candidate review.
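The labeling and external-validation steps described in the abstract can be illustrated with a minimal sketch. The abstract gives only the control limits (μ ± kσ, k = 0.3) and the class-to-recommendation mapping; how win rate and pick rate are combined is not stated, so the rule below (both metrics must cross the same limit) is an assumption, and the function names, hero names, and numbers are hypothetical.

```python
from statistics import mean, pstdev

K = 0.3  # control-limit width k reported in the abstract

# Class-to-recommendation mapping named in the abstract.
RECOMMENDATION = {"overpowered": "nerf", "underpowered": "buff", "balanced": "balance"}


def control_limits(values, k=K):
    """Return (lower, upper) control limits mu - k*sigma and mu + k*sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all heroes
    return mu - k * sigma, mu + k * sigma


def label_heroes(win_rate, pick_rate, k=K):
    """Label each hero within one patch.

    ASSUMPTION: a hero is 'overpowered' only when both win rate and pick rate
    exceed their upper limits, and 'underpowered' only when both fall below
    their lower limits. The abstract does not specify this combination rule.
    """
    w_lo, w_hi = control_limits(list(win_rate.values()), k)
    p_lo, p_hi = control_limits(list(pick_rate.values()), k)
    labels = {}
    for hero, w in win_rate.items():
        p = pick_rate[hero]
        if w > w_hi and p > p_hi:
            labels[hero] = "overpowered"
        elif w < w_lo and p < p_lo:
            labels[hero] = "underpowered"
        else:
            labels[hero] = "balanced"
    return labels


def transition_accuracy(recommended, official):
    """Share of heroes whose patch-t recommendation matches the patch t+1 action."""
    heroes = recommended.keys() & official.keys()
    if not heroes:
        raise ValueError("no heroes in common between the two patches")
    hits = sum(recommended[h] == official[h] for h in heroes)
    return hits / len(heroes)


# Hypothetical toy data for four heroes in patch t.
win_rate = {"hero_a": 0.60, "hero_b": 0.50, "hero_c": 0.40, "hero_d": 0.50}
pick_rate = {"hero_a": 0.30, "hero_b": 0.10, "hero_c": 0.02, "hero_d": 0.12}

labels = label_heroes(win_rate, pick_rate)
recommended = {hero: RECOMMENDATION[label] for hero, label in labels.items()}

# Hypothetical official actions read from the patch t+1 notes.
official = {"hero_a": "nerf", "hero_b": "buff", "hero_c": "buff", "hero_d": "balance"}
print(recommended)
print(transition_accuracy(recommended, official))
```

With k = 0.3 the band around the mean is narrow, so heroes only moderately away from the average already fall outside the "balanced" class; a larger k would widen that band. Under the same accuracy definition, the best transition reported in the abstract corresponds to 84 matches out of 125, i.e. 0.672.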