Application of Machine Learning Classification Models for Intrusion Detection in the DNP3 Protocol in SDN
This study presents an SDN-centric architecture for intrusion detection in DNP3 traffic, in which the SDN controller applies dynamic rules derived from machine learning models to mitigate malicious access. From the DNP3 Stop Application Attack dataset (1,440 instances, 80 variables), 15 relevant variables were selected, and a high correlation was observed among ten variables. Seven supervised classifiers (Random Forest, XGBoost, Logistic Regression, Decision Tree, KNN, SVC, and LDA) were trained and evaluated using stratified 5-fold cross-validation. Per-fold metrics were subjected to nonparametric statistical tests (Wilcoxon signed-rank and Friedman tests) and rank correlation analysis (Spearman's rank correlation). Several models, including Random Forest, XGBoost, Logistic Regression, and Decision Tree, achieved near-perfect performance (Accuracy, Precision, Recall, F1, and ROC-AUC ≈ 1.0). Statistical tests showed no significant differences among the highest-performing models. An explanatory analysis using SHAP revealed that a small subset of features accounted for most of the predictive contribution and exhibited clear discriminatory thresholds, which explains the high observed separability. However, the poor performance of the SVC and the possible existence of dominant variables raise the possibility of data leakage or preprocessing artifacts. Therefore, sensitivity testing (elimination of dominant variables and duplicate checking) and evaluation on independent datasets are recommended. Considering accuracy, interpretability, inference latency, and ease of deployment in SDN controllers, Random Forest is proposed as the preferred operational candidate, with its adoption conditional on additional validation and on the integration of monitoring and retraining mechanisms to counter concept drift.
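The evaluation protocol summarized above (stratified 5-fold cross-validation followed by nonparametric tests on per-fold metrics) can be illustrated with a minimal sketch. It is not the authors' code: the data are a synthetic stand-in generated with scikit-learn's `make_classification` (matching only the 1,440 × 15 shape), only three of the seven classifiers are included, and all hyperparameters and variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic placeholder data with the same shape as the
# selected feature set (1,440 instances, 15 variables), not the real dataset.
X, y = make_classification(
    n_samples=1440, n_features=15, n_informative=5, random_state=0
)

# ASSUMPTION: illustrative hyperparameters; a subset of the seven classifiers.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# The same stratified folds are reused for every model so that the
# per-fold scores are paired, as the signed-rank and Friedman tests require.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_f1 = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1")
    for name, model in models.items()
}

# Friedman test: do the models differ across the paired folds?
friedman_stat, friedman_p = friedmanchisquare(*fold_f1.values())

# Pairwise Wilcoxon signed-rank test between two models.
diff = fold_f1["random_forest"] - fold_f1["decision_tree"]
if np.allclose(diff, 0.0):
    wilcoxon_p = 1.0  # identical per-fold scores: nothing to test
else:
    wilcoxon_p = wilcoxon(
        fold_f1["random_forest"], fold_f1["decision_tree"]
    ).pvalue

for name, scores in fold_f1.items():
    print(f"{name}: mean F1 = {scores.mean():.3f}")
print(f"Friedman p = {friedman_p:.3f}, Wilcoxon p = {wilcoxon_p:.3f}")
```

With only five paired folds, the exact two-sided Wilcoxon signed-rank test cannot produce a p-value below 0.0625, so a non-significant pairwise result at this fold count should be read with that limit in mind.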
