Predicting Cyberbullying Victimization Levels in University Students Using Psychosocial Indicators and Machine Learning
Cyberbullying research has traditionally focused on detecting harmful content in online communications. However, such approaches often require monitoring private digital interactions, which raises significant ethical and privacy concerns, particularly in educational environments. To address this limitation, this study proposes a privacy-preserving predictive framework that estimates self-reported cyberbullying victimization levels among university students from psychosocial, behavioral, and contextual survey-based indicators. The study analyzes a dataset of 604 university students. The victimization construct, derived from exposure-related Likert-scale items, demonstrated strong internal consistency (Cronbach’s α = 0.887), supporting its use as a reliable composite measure. Based on this construct, a victimization score was computed and categorized into low, medium, and high levels using interpretable thresholds grounded in the five-point Likert scale. The resulting class distribution was imbalanced, with a predominance of low-level cases and few high-victimization cases. Four supervised machine learning models were evaluated: multinomial logistic regression, Support Vector Machine, Random Forest, and Gradient Boosting. All models achieved meaningful predictive performance, particularly for the majority classes, while performance for the minority high-victimization class remained constrained. In the final hold-out evaluation, Gradient Boosting achieved the best overall weighted performance, whereas Support Vector Machine showed competitive results during cross-validation. These findings indicate that self-reported cyberbullying victimization levels can be estimated from structured survey data without requiring access to private communications, although identification of the high-victimization class remains limited.
The proposed approach provides a privacy-preserving alternative that may support screening-oriented assessment and early intervention strategies in higher education contexts.
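
As an illustration of the scoring step summarized above, the following minimal sketch computes Cronbach's α for a set of Likert items and maps a respondent's composite score to a victimization level. It is not the study's implementation. The function names, the use of the item mean as the composite score, and the cut-offs of 2.0 and 3.5 are assumptions chosen for illustration only; the abstract does not state the actual thresholds.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Transpose so each tuple holds all respondents' answers to one item.
    items = list(zip(*rows))
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(col) for col in items)
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def victimization_level(item_scores, low_cut=2.0, high_cut=3.5):
    """Map the mean of 1-5 Likert items to a level.

    low_cut and high_cut are hypothetical thresholds, not the study's values.
    """
    score = mean(item_scores)
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Toy responses (made up for illustration): 4 respondents, 3 items.
    responses = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [1, 1, 2]]
    print(round(cronbach_alpha(responses), 3))
    print([victimization_level(r) for r in responses])
```

Fixed cut-offs on the Likert scale keep the three levels interpretable, but they also let the class sizes fall where the data put them, which is consistent with the imbalance reported above.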
