Send an alarm email when your GPUs are idle for a long time. This may be useful if you rent GPUs from cloud services to train AI models.
- Install dependencies with `pip install click nvidia-ml-py pyyaml` or `pip install -r requirements.txt`.
- Prepare an email account, generate an authorization token and fill these values in `smtp.yaml`.
- Run `python alarm.py`. For more configurations, run `python alarm.py --help`.
- By default, the program will check your GPU utilization rates every 10 seconds and send an alarm email if some of the rates are below 20% for more than 30 minutes.
The program only supports tracking utilization rates of NVIDIA GPUs.
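
The default behaviour described above (poll every 10 seconds, alarm when a rate stays below 20% for more than 30 minutes) can be sketched roughly as follows. This is an illustrative sketch, not necessarily how `alarm.py` is implemented: the `IdleTracker` class and the `read_utilization` helper are hypothetical names, and only the `pynvml` calls (from the `nvidia-ml-py` package) are real library functions.

```python
import time
from typing import Dict, List, Optional


class IdleTracker:
    """Hypothetical helper: tracks how long each GPU has stayed below a threshold."""

    def __init__(self, threshold: float = 20.0, max_idle_seconds: float = 30 * 60):
        self.threshold = threshold                # utilization in percent
        self.max_idle_seconds = max_idle_seconds  # allowed idle time before an alarm
        self._idle_since: Dict[int, float] = {}   # GPU index -> time it first looked idle

    def update(self, rates: List[float], now: Optional[float] = None) -> List[int]:
        """Record one poll and return indices of GPUs idle for longer than the limit."""
        now = time.monotonic() if now is None else now
        overdue = []
        for index, rate in enumerate(rates):
            if rate >= self.threshold:
                # The GPU is busy again, so forget any earlier idle start time.
                self._idle_since.pop(index, None)
                continue
            start = self._idle_since.setdefault(index, now)
            if now - start > self.max_idle_seconds:
                overdue.append(index)
        return overdue


def read_utilization() -> List[float]:
    """Read the current utilization (percent) of every NVIDIA GPU via NVML."""
    import pynvml  # provided by the nvidia-ml-py package

    pynvml.nvmlInit()
    try:
        rates = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            rates.append(float(pynvml.nvmlDeviceGetUtilizationRates(handle).gpu))
        return rates
    finally:
        pynvml.nvmlShutdown()
```

A polling loop would call `tracker.update(read_utilization())` and then `time.sleep(10)`, sending the email whenever the returned list is non-empty. Keeping the timing logic separate from the NVML calls makes it possible to test the thresholds on a machine without a GPU.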