Investigating the Impact of Gameplay Hours on Player Recommendations in Steam Games: A Comparative Analysis Using Logistic Regression and Random Forest Classifiers
This study examines the relationship between gameplay hours and player recommendations on the Steam platform, using Logistic Regression and Random Forest classifiers to analyze the data. The findings show a strong correlation between hours played and the likelihood of recommending a game: longer playtime generally indicates higher engagement, which often translates into a greater propensity to recommend the game. The trend is not universal, however. A subset of users with high playtime did not recommend their games, showing that engagement alone does not guarantee satisfaction; factors such as game quality, unmet player expectations, and individual preferences may influence these outcomes. The Logistic Regression model gave a clear, linear reading of the data, indicating that hours played significantly affect recommendation likelihood. Its coefficients suggested a positive relationship, which makes the model useful for interpreting how the odds of a recommendation change with gameplay hours. Its main limitation was its inability to capture intricate, non-linear patterns in the data. In contrast, the Random Forest classifier captured complex interactions and offered robust predictive accuracy. As an ensemble method, it combines the predictions of many decision trees, which revealed more nuanced patterns in player behavior. Feature importance scores from the Random Forest confirmed that hours played was a critical variable, but also pointed to the potential significance of other factors contributing to player recommendations. Model performance metrics further reinforced these observations.
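The odds interpretation mentioned above can be illustrated with a minimal sketch. It shows how a logistic regression coefficient on hours played converts into an odds ratio and a predicted probability. The intercept and coefficient are hypothetical placeholders chosen for illustration; they are not the fitted values from the study, which this abstract does not report.

```python
import math

# Hypothetical parameters for illustration only; the study's fitted
# values are not reported in this abstract.
INTERCEPT = 0.5      # assumed log-odds of recommending at 0 hours played
COEF_HOURS = 0.02    # assumed change in log-odds per additional hour


def recommend_probability(hours: float) -> float:
    """Logistic model: P(recommend) = 1 / (1 + exp(-(b0 + b1 * hours)))."""
    log_odds = INTERCEPT + COEF_HOURS * hours
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(extra_hours: float) -> float:
    """Multiplicative change in the odds of recommending after `extra_hours` more play."""
    return math.exp(COEF_HOURS * extra_hours)


if __name__ == "__main__":
    for hours in (0, 10, 50, 100):
        print(f"{hours:>4} h -> P(recommend) = {recommend_probability(hours):.3f}")
    print(f"odds ratio per 10 extra hours: {odds_ratio(10):.3f}")
```

With a positive coefficient the odds ratio is above 1, so the predicted probability can only rise with playtime. That monotonic shape is why a model of this form cannot represent the high-playtime users who did not recommend their games.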
The Random Forest classifier outperformed Logistic Regression on accuracy (82.65% compared to 81.26%), precision, recall, and F1-score, and also achieved a higher area under the ROC curve (AUC-ROC), indicating superior discriminative power. These results suggest that Random Forest is better suited to capturing the multifaceted dynamics of player engagement and recommendations. The comparison illustrates how different modeling approaches can yield valuable, yet varying, insights into gaming data.
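For readers unfamiliar with the metrics named above, the following sketch computes accuracy, precision, recall, F1-score, and AUC-ROC from scratch. The labels and scores are a made-up toy example, not data from the study, and the function names are illustrative rather than taken from any library.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 = recommended."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made up): true labels and a model's predicted probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 decision threshold

if __name__ == "__main__":
    print(classification_metrics(y_true, y_pred))
    print("AUC-ROC:", round(roc_auc(y_true, scores), 4))
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC-ROC summarizes ranking quality across all thresholds. This is one reason a comparison such as the one above typically reports both kinds of metric.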