Geographical Distribution Characteristics of Tourism Consumption Behavior and Network Security Strategies Based on Big Data Analysis
Abstract
As the tourism industry's use of big data accelerates the flow of tourist information, identifying network risks has become an urgent problem, yet existing protection measures rarely account for regional differences. This study formulates regional data protection and encryption strategies based on the geographical distribution of tourist consumption behavior. Large-scale consumption and network security data from various regions are collected, integrated into independent samples, and uploaded to HDFS (Hadoop Distributed File System). Under the Spark framework, data proximity is analyzed with k-distance graphs to determine the neighborhood radius and MinPts (minimum points), and clusters are then generated using DBSCAN (Density-Based Spatial Clustering of Applications with Noise). The clusters are divided by consumption behavior into high-consumption, high-frequency consumption, and low-consumption groups. Network information is compiled for each cluster, and differentiated security strategies are designed: the high-consumption cluster uses RSA (Rivest-Shamir-Adleman)-4096 and quantum key distribution to strengthen data encryption, combined with a SIEM (Security Information and Event Management) deployment to detect threats in real time; the high-frequency consumption cluster applies Isolation Forest anomaly detection together with multi-factor authentication; the low-consumption cluster uses lightweight firewalls and dynamic scorecard models to provide basic protection and flexible risk management. The results show that the system achieves a 93.96% success rate in defending against DDoS (Distributed Denial of Service) attacks, and that when the data volume reaches 2 TB, the rates of illegal public data leakage for the high-consumption, high-frequency consumption, and low-consumption clusters are 1%, 2.5%, and 3.2%, respectively.
The proposed method can effectively formulate security strategies from the geographical distribution characteristics of consumers, improving both network protection and privacy protection.
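The clustering step described in the abstract (reading the neighborhood radius off a k-distance graph, then running DBSCAN) can be illustrated with a minimal single-machine sketch. This is not the study's Spark implementation. The two features (average spend and purchase frequency), the synthetic data, and the chosen eps and min_pts values are all assumptions made for illustration, and real consumption features would normally be scaled before clustering.

```python
import math
import random


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted ascending.

    Plotting this list gives the k-distance graph; the "knee" of the curve
    is a common heuristic for choosing the DBSCAN neighbourhood radius.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical synthetic data: (average spend, purchases per month), arbitrary units.
rng = random.Random(0)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
    for cx, cy in centres
    for _ in range(30)
]
points.append((50.0, 50.0))  # an isolated record, expected to be flagged as noise

min_pts = 4  # assumed value
kd = k_distances(points, k=min_pts)  # inspect or plot to pick eps
eps = 1.5  # assumed value, chosen above the flat part of the k-distance curve
labels = dbscan(points, eps, min_pts)
print("clusters found:", len(set(labels) - {-1}))
print("noise points:", labels.count(-1))
```

Each resulting cluster could then be profiled by its consumption features and mapped to one of the three groups named in the abstract. That mapping is left out here because the abstract does not specify its thresholds.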