Traffic Measurement Optimization for Cross-Domain Ad Hoc Networks Based on Meta-Learning and Reinforcement Learning

宋健; 聂来森; 陶醉; 袁奇恩东

doi:10.11993/j.issn.2096-3920.2024-0094

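School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China

Funding: National Natural Science Foundation of China, General Program (62171378).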

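Details

Author biography:
Song Jian (宋健) (1999-), male, PhD candidate. Main research interest: network traffic prediction.

Corresponding author:
Nie Laisen (聂来森) (1985-), male, PhD, associate professor. Main research interests: cross-domain communication networking and network security.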

CLC number: TJ6; U675.7
Publication history
- Received: 2024-05-28
- Revised: 2024-07-05
- Accepted: 2024-07-15
- Published online: 2024-07-16

Traffic Measurement Optimization for Cross-Domain Ad Hoc Networks Based on Meta-Learning and Reinforcement Learning

School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China

Abstract

A cross-domain Ad Hoc network self-organizes nodes that operate in different media and adapts its topology automatically. In cross-domain communication networks, direct measurement yields accurate end-to-end network traffic information. However, some nodes in a cross-domain network have low computing power and limited storage, which makes it impractical to run the traffic measurement process on every node. To address this issue, this paper proposes a network traffic measurement optimization method based on meta-learning and proximal policy optimization (PPO). Using the network operating environment observed in the previous time slot, the method determines the set of nodes that perform traffic measurement in the next time slot. The goal is to obtain as much network traffic information as possible while running the measurement process on as few nodes as possible. The method is validated by simulation on three network datasets. The experimental results show that the proposed algorithm effectively selects the nodes that carry large traffic volumes, converges relatively quickly, and measures efficiently.

Keywords:
- cross-domain Ad Hoc network
- network traffic measurement
- meta-learning
- proximal policy optimization
- reinforcement learning
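The trade-off stated in the abstract (as much traffic information as possible from as few measuring nodes as possible) can be illustrated with a small scoring function. This is a minimal sketch under stated assumptions, not the reward used in the paper: the names `measured_fraction`, `reward`, and `cost_weight`, the linear penalty on the share of measuring nodes, and the toy flows and routes are all hypothetical.

```python
from typing import Dict, List, Set


def measured_fraction(flows: Dict[str, float],
                      routes: Dict[str, List[int]],
                      selected: Set[int]) -> float:
    """Fraction of total traffic carried by flows that cross a selected node."""
    total = sum(flows.values())
    if total == 0:
        return 0.0
    seen = sum(volume for name, volume in flows.items()
               if selected.intersection(routes[name]))
    return seen / total


def reward(flows: Dict[str, float],
           routes: Dict[str, List[int]],
           selected: Set[int],
           num_nodes: int,
           cost_weight: float = 0.5) -> float:
    """Hypothetical reward: coverage minus a penalty on the share of measuring nodes."""
    coverage = measured_fraction(flows, routes, selected)
    return coverage - cost_weight * len(selected) / num_nodes


# Toy example (assumed data): three OD flows routed over five nodes.
flows = {"a->c": 8.0, "a->d": 1.0, "b->d": 1.0}
routes = {"a->c": [0, 1, 2], "a->d": [0, 1, 3], "b->d": [4, 3]}

single = reward(flows, routes, {1}, num_nodes=5)
everyone = reward(flows, routes, {0, 1, 2, 3, 4}, num_nodes=5)
```

In this toy setting, measuring only at node 1 already observes the two flows that carry most of the traffic, so it scores higher than measuring at every node.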


Figure 1. Schematic diagram of the cross-domain Ad Hoc network

Figure 2. Schematic diagram of network traffic measurement optimization

Figure 3. Schematic diagram of reinforcement learning

Figure 4. Flow chart of the PPO algorithm

Note: SGD denotes stochastic gradient descent.

Figure 5. Flow chart of the reinforcement learning MAML algorithm

Figure 6. Flow chart of the network traffic measurement optimization algorithm based on meta-learning and PPO

Figure 7. Schematic diagram of the traffic measurement optimization method based on meta-learning and PPO

Figure 8. Topology of the Abilene network

Figure 9. Topology of the GÉANT network

Figure 10. Node mobility scenario in the NS3 simulation

Figure 11. Comparison of simulation results of three methods in different networks

Figure 12. Comparison of simulation results of two methods in different networks

Figure 13. Comparison of training results with meta-learned and normally distributed initial parameters in the wireless Ad Hoc network

Table 1. Parameters of the NS3-simulated wireless Ad Hoc network

Parameter	Value
Number of nodes	$7 \times 7$
Network running time/s	2700
Mobility model	Gauss-Markov model
Network movement boundary	Confined to a $450 \times 450 \times 20$ cuboid
Single-node movement boundary	Confined to a $50 \times 50 \times 20$ cuboid

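Table 2. Structural parameters of the measurement optimization networks based on meta-learning and PPO

Model	Layer	Parameter value
Policy network	Input layer	Number of neurons: M (M is the total number of OD flows)
	FC-1, FC-2, FC-3: fully connected layers	Number of neurons: 100; activation function: ReLU
	Output layer	Number of neurons: N (N is the number of nodes); activation function: SoftMax
Critic network	Input layer	Number of neurons: M
	FC-1, FC-2, FC-3: fully connected layers	Number of neurons: 100; activation function: ReLU
	Output layer	Number of neurons: 1 (reward value)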
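The layer sizes in Table 2 can be made concrete with a small forward-pass sketch in plain Python. It assumes toy values for M and N, Gaussian random initial weights, and biases stored as the last column of each weight matrix; none of these choices come from the paper, and the helper names (`init_layer`, `build_mlp`, `forward`) are hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """Random weights for one fully connected layer; the last column is the bias."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in + 1)] for _ in range(n_out)]


def dense(weights: Matrix, x: List[float]) -> List[float]:
    """Affine map: dot product with the input weights plus the trailing bias."""
    return [sum(w * v for w, v in zip(row, x)) + row[-1] for row in weights]


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def softmax(x: List[float]) -> List[float]:
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_mlp(sizes: List[int], rng: random.Random) -> List[Matrix]:
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers: List[Matrix], x: List[float], final_softmax: bool) -> List[float]:
    """ReLU on every hidden layer; optional SoftMax on the output layer."""
    for layer in layers[:-1]:
        x = relu(dense(layer, x))
    out = dense(layers[-1], x)
    return softmax(out) if final_softmax else out


M, N = 12, 4  # toy sizes (assumed): 12 OD flows, 4 nodes
rng = random.Random(0)
policy = build_mlp([M, 100, 100, 100, N], rng)  # input M, FC-1..FC-3, output N
critic = build_mlp([M, 100, 100, 100, 1], rng)  # input M, FC-1..FC-3, output 1

state = [rng.random() for _ in range(M)]  # stand-in for the OD-flow observation
probs = forward(policy, state, final_softmax=True)
value = forward(critic, state, final_softmax=False)[0]
```

Here `probs` is a probability distribution over the N nodes and `value` is the critic's scalar output. How the paper turns the policy output into a set of measuring nodes is not stated in this section.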

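Table 3. Key parameters in the simulation process

Parameter	Value
Actor network learning rate	0.0001
Critic network learning rate	0.0001
Reward discount factor $\gamma$	0.9
PPO clipping range parameter	0.2
Advantage function scaling factor	0.95
Experience replay buffer size	200
Inner-loop training epochs	10
Outer-loop training epochs	40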
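To show where three of the Table 3 values typically enter a PPO update, the sketch below computes advantage estimates and the standard clipped surrogate objective. It rests on one assumption that the table does not confirm: the advantage function scaling factor (0.95) is read as the $\lambda$ of generalized advantage estimation. The function names are hypothetical, and this is the textbook PPO formulation rather than the authors' code.

```python
import math
from typing import List

GAMMA = 0.9     # reward discount factor (Table 3)
LAMBDA = 0.95   # advantage scaling factor (Table 3); assumed here to be the GAE lambda
CLIP_EPS = 0.2  # PPO clipping range parameter (Table 3)


def gae(rewards: List[float], values: List[float],
        gamma: float = GAMMA, lam: float = LAMBDA) -> List[float]:
    """Generalized advantage estimates; `values` has one extra bootstrap entry."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(new_logp: List[float], old_logp: List[float],
                      advantages: List[float], eps: float = CLIP_EPS) -> float:
    """Mean PPO clipped objective, to be maximized with respect to the new policy."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio new/old
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Toy trajectory (assumed data): two steps, zero value estimates.
adv = gae([1.0, 1.0], [0.0, 0.0, 0.0])
objective = clipped_surrogate([math.log(2.0), 0.0], [0.0, 0.0], adv)
```

With a probability ratio of 2 and a positive advantage, the clip caps that step's contribution at 1.2 times the advantage, which is what keeps each policy update close to the policy that collected the data.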
