Submitted by nateharada t3_10do40p in MachineLearning
nateharada OP t1_j4ne979 wrote
Reply to comment by Zealousideal_Low1287 in [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
Nice! Right now you can use the end_process trigger to just return 0 from the process when the trigger is hit, but it should be fairly straightforward to externalize the API a little more. That would let you do something like this in your script:
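import time

from gpu_sentinel import Sentinel, get_gpu_usage

# my_callback_fn is a function you define yourself; it runs when the kill trigger fires.
sentinel = Sentinel(
    arm_duration=10,
    arm_threshold=0.7,
    kill_duration=60,
    kill_threshold=0.7,
    kill_fn=my_callback_fn,
)

# Poll GPU usage once per second and feed each reading to the sentinel.
while True:
    gpu_usage = get_gpu_usage(device_ids=[0, 1, 2, 3])
    sentinel.tick(gpu_usage)
    time.sleep(1)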
Is that something that would be useful? You define the callback function yourself, so you could have it trigger an alert, for example.
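For anyone wondering how an arm/kill trigger with a user-supplied callback might behave, here is a rough, self-contained sketch. It is not gpu_sentinel's actual implementation: the SimpleSentinel class is hypothetical, durations are counted in ticks rather than seconds so the example runs instantly, and usage is assumed to be a single float between 0 and 1. The assumed semantics are that the sentinel arms after usage stays at or above arm_threshold for arm_duration consecutive ticks, then calls kill_fn once usage stays below kill_threshold for kill_duration consecutive ticks.

```python
from typing import Callable


class SimpleSentinel:
    """Hypothetical stand-in for illustration; not gpu_sentinel's real code."""

    def __init__(self, arm_duration: int, arm_threshold: float,
                 kill_duration: int, kill_threshold: float,
                 kill_fn: Callable[[], None]) -> None:
        # Assumption: durations are measured in ticks, not seconds.
        self.arm_duration = arm_duration
        self.arm_threshold = arm_threshold
        self.kill_duration = kill_duration
        self.kill_threshold = kill_threshold
        self.kill_fn = kill_fn
        self.armed = False
        self._streak = 0  # consecutive ticks meeting the current condition

    def tick(self, usage: float) -> None:
        if not self.armed:
            # Arming phase: wait for sustained high utilization.
            self._streak = self._streak + 1 if usage >= self.arm_threshold else 0
            if self._streak >= self.arm_duration:
                self.armed = True
                self._streak = 0
        else:
            # Armed phase: watch for sustained low utilization.
            self._streak = self._streak + 1 if usage < self.kill_threshold else 0
            if self._streak >= self.kill_duration:
                self.kill_fn()
                self.armed = False
                self._streak = 0


# The callback here just records an alert; it could send a message instead.
alerts = []
sentinel = SimpleSentinel(arm_duration=3, arm_threshold=0.7,
                          kill_duration=2, kill_threshold=0.7,
                          kill_fn=lambda: alerts.append("GPU idle"))

# Three busy ticks arm the sentinel; two idle ticks after that fire the callback.
for usage in [0.9, 0.9, 0.9, 0.8, 0.1, 0.1]:
    sentinel.tick(usage)
```

Because the callback is just a callable, swapping the alert for a shutdown command or a notification would not require touching the trigger logic.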
Zealousideal_Low1287 t1_j4neybb wrote
Yeah, that’s something which would be useful indeed. Don’t worry yourself though, I can put in a PR.
nateharada OP t1_j4ngy65 wrote
It's actually almost entirely ready now; I just need to alter a few things. I'll go ahead and push it soon! Need to do some final tests.
EDIT: The above code should work! See the README on GitHub for a complete example.