Dynamic Firewall Configuration and Real-Time Security Policy Optimization of the College English Platform Using the DDPG Algorithm

Shaojing Xiang

Abstract

This paper addresses two problems faced by the college English platform in complex network environments: inefficient dynamic firewall configuration and slow security policy response. It applies the DDPG (Deep Deterministic Policy Gradient) algorithm to optimize the firewall configuration in real time, improving the platform’s security and flexibility while accounting for system performance and user experience. A reinforcement learning environment is constructed in which network traffic features are mapped to states and firewall configuration operations are mapped to actions. A composite reward function covering security, efficiency, and cost guides the model toward better protection strategies. During training, an experience replay mechanism breaks the correlation between samples, mini-batch gradient descent updates the Actor and Critic network parameters, and a soft update mechanism dampens oscillation in the target networks. The experiments cover normal traffic, several types of abnormal traffic, and previously unseen complex traffic scenarios, including SQL (Structured Query Language) injection and DDoS (Distributed Denial of Service) attacks. In the performance evaluation, DDPG achieved the highest threat detection success rate, averaging 95.2%, ahead of DQN (Deep Q-Network), PPO (Proximal Policy Optimization), and static firewalls. DDPG held response time at 45 ms and policy adjustment frequency at 6 adjustments per minute, both better than the comparison models. Across 7 unseen traffic scenarios, the detection success rate under the DDPG-optimized firewall configuration fluctuated by only 0.4%, indicating stable behavior.
These results indicate that DDPG-based dynamic firewall configuration for the college English platform offers clear advantages in security, response efficiency, and platform performance, and can provide new ideas and practical guidance for firewall optimization in dynamic and complex network environments.
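
The three training ingredients named in the abstract (a composite reward, experience replay, and soft target updates) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the form of the reward, its weights, the buffer size, or the soft update rate, so the weighted-sum reward, the weights `w_sec`, `w_eff`, `w_cost`, the rate `tau`, and all function and class names here are assumptions chosen for illustration only.

```python
import random
from collections import deque


def composite_reward(security, efficiency, cost,
                     w_sec=0.5, w_eff=0.3, w_cost=0.2):
    """Weighted sum of the three terms named in the abstract.

    The linear form and the weights are hypothetical; the abstract only
    says the reward includes security, efficiency, and cost.
    """
    return w_sec * security + w_eff * efficiency - w_cost * cost


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random.

    Random sampling is what breaks the correlation between
    consecutive transitions.
    """

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest item when full
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Copy to a list so random.sample gets a plain sequence
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau=0.005):
    """Soft (Polyak) update: target <- tau * online + (1 - tau) * target.

    Parameters are plain lists of floats here; tau is an assumed value.
    """
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]
```

In an actual DDPG training loop, each mini-batch drawn with `sample` would feed one gradient step on the Critic and one on the Actor, followed by a `soft_update` of both target networks. A small `tau` keeps the targets changing slowly, which is how the oscillation mentioned in the abstract is reduced.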
