When you run a network simulation in ns-3, the traffic workload is key to evaluating network performance. Keddah lets you replay network traffic extracted from a Hadoop cluster. For details, please refer to the paper:
@article{deng2019keddah,
title={Keddah: network evaluation powered by simulating distributed application traffic},
author={Deng, Jie and Tyson, Gareth and Cuadrado, Felix and Uhlig, Steve},
journal={ACM Transactions on Modeling and Computer Simulation (TOMACS)},
volume={29},
number={3},
pages={1--25},
year={2019},
publisher={ACM New York, NY, USA}
}
To use Keddah:
1, Install ns-3 (see https://www.nsnam.org/wiki/Installation). The paths below assume ns-allinone-3.23.
2, Create an empty module named dctest (without the Keddah code yet): ns-allinone-3.23/ns-3.23/src$ ./create-module.py dctest
3, Configure the module: ./waf configure
4, Download the Keddah code and replace the generated folder ns-allinone-3.23/ns-3.23/src/dctest with the dctest folder from Keddah.
5, Configure and compile the module again: ./waf configure && ./waf
6, Download the experiment file DisCompSimulator.cc into ns-allinone-3.23/ns-3.23/scratch
7, Compile again: ./waf
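
Steps 2 to 7 can be collected into a single shell helper. The following is a minimal sketch, not a tested installer. It assumes ns-3.23 is unpacked as ns-allinone-3.23 in the current directory, and that KEDDAH_DIR (a hypothetical path, not part of Keddah) points to a download containing the dctest folder and DisCompSimulator.cc. By default it only prints the commands (DRY_RUN=1), and it does not handle paths containing spaces.

```shell
#!/bin/sh
# Sketch of steps 2-7. NS3_DIR and KEDDAH_DIR are assumptions; adjust to your layout.
NS3_DIR="${NS3_DIR:-ns-allinone-3.23/ns-3.23}"
KEDDAH_DIR="${KEDDAH_DIR:-$HOME/keddah}"   # hypothetical download location
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands only, 0 = execute

# Print a command line; execute it in a subshell only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "0" ] || return 0
    sh -c "$*"
}

# Run the steps in order and stop at the first failure.
keddah_setup() {
    run "cd $NS3_DIR/src && ./create-module.py dctest" &&
    run "cd $NS3_DIR && ./waf configure" &&
    run "rm -rf $NS3_DIR/src/dctest && cp -r $KEDDAH_DIR/dctest $NS3_DIR/src/" &&
    run "cd $NS3_DIR && ./waf configure && ./waf" &&
    run "cp $KEDDAH_DIR/DisCompSimulator.cc $NS3_DIR/scratch/" &&
    run "cd $NS3_DIR && ./waf"
}
```

Call keddah_setup first with the default DRY_RUN=1 to review the printed commands, then set DRY_RUN=0 and call it again to execute them.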
Run:
NS_LOG='TeraSort:Kmeans' ./waf --run "scratch/DisCompSimulator --tracejob=500 --stop_time=1300 --network=1 --job_rate=6 --ftnodnum=20"
Options (default values in brackets):
--filesize: Size of the data file to be sorted. [3000000000]
--tracejob: Which job to trace. [100]
--network: The type of network topology: (0 star) (1 fattree) (2 dcell) (3 camcube) [0]
--stop_time: Number of seconds to run the simulation. [1200]
--job_rate: Mean of the exponentially distributed job submission interval. [20]
--linktrad: The trade-off rate between link and bandwidth in CamCube. [1]
--ftnodnum: Number of nodes per pod in FatTree. [8]
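
To compare topologies, the same experiment can be repeated once per --network value. The sketch below only prints the command lines so they can be reviewed or pasted into a shell inside ns-allinone-3.23/ns-3.23. The function names keddah_cmd and keddah_sweep are hypothetical helpers, not part of Keddah; the fallback values 20 and 1200 are the bracketed values listed above for --job_rate and --stop_time.

```shell
#!/bin/sh
# Print one run command for a given topology.
# Arguments: network type, then optional job_rate and stop_time.
keddah_cmd() {
    network="$1"
    job_rate="${2:-20}"
    stop_time="${3:-1200}"
    printf '%s\n' "NS_LOG='TeraSort:Kmeans' ./waf --run \"scratch/DisCompSimulator --network=$network --job_rate=$job_rate --stop_time=$stop_time\""
}

# Print one command per topology: 0 star, 1 fattree, 2 dcell, 3 camcube.
# Any extra arguments (job_rate, stop_time) are passed through unchanged.
keddah_sweep() {
    for n in 0 1 2 3; do
        keddah_cmd "$n" "$@"
    done
}
```

For example, keddah_sweep 6 1300 prints four commands that differ only in --network, each with --job_rate=6 and --stop_time=1300.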