Model Analyzer GPU Memory Usage Differences #847
Version: nvcr.io/nvidia/tritonserver:24.01-py3-sdk

For a profiled model, the GPU Memory Usage (MB) shown in results/metrics-model-gpu.csv differs from the value shown in the model's result_summary.pdf. In my case, metrics-model-gpu.csv shows 1592.8 while the pdf report shows 1031. It could be my misunderstanding, but do these two metrics represent the same thing? I am looking for the maximum GPU memory usage for a given model, so which would be the more accurate result?

Comments

Additional Context: I am using an instance with two GPUs, though the model is limited to a single instance. I have noticed that if I add up the GPU memory of both GPUs from the csv and then divide by 2, (470.8 + 1592.8) / 2 = 1031.8, I get a value close to the pdf result. Could that be a coincidence?

Hi @KimiJL, sorry for the slow response; I just returned from vacation. I suspect that your observation is not a coincidence and that there is a bug. We will have to investigate further. May I ask, were you running in local mode? Or docker or remote?

Hi @tgerdesnv, thanks for the response. I was running in

@KimiJL I have confirmed that the values in the pdfs are in fact the averages across the GPUs. The values in metrics-model-gpu.csv are the raw per-GPU values. So, in your case, the total maximum memory usage by the model on your machine would be 470.8 + 1592.8 = 2063.6 MB. I will fix Model Analyzer to show total memory usage, or clarify the labels to indicate that it is average memory usage.

@tgerdesnv great, thank you for the clarification, that makes sense!
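To make the relationship between the two reports concrete, the aggregation described above can be reproduced directly from the csv. The sketch below is a minimal illustration, not Model Analyzer code: it assumes metrics-model-gpu.csv has one row per model/GPU pair and that the relevant columns are named "Model" and "GPU Memory Usage (MB)"; check the header row of your own export and adjust the names if they differ.

```python
# Minimal sketch: aggregate the per-GPU memory rows from metrics-model-gpu.csv
# into a per-model total and average.
# Assumed column names: "Model" and "GPU Memory Usage (MB)"; adjust to match
# the header row of your own csv.
import csv
from collections import defaultdict

per_model = defaultdict(list)

with open("results/metrics-model-gpu.csv", newline="") as f:
    for row in csv.DictReader(f):
        per_model[row["Model"]].append(float(row["GPU Memory Usage (MB)"]))

for model, values in per_model.items():
    total = sum(values)            # footprint summed across GPUs (e.g. 470.8 + 1592.8 = 2063.6)
    average = total / len(values)  # per-GPU average, which the pdf summary reportedly shows (e.g. 1031.8)
    print(f"{model}: total={total:.1f} MB, average={average:.1f} MB over {len(values)} GPUs")
```

With the numbers from this thread, it would report a total of 2063.6 MB (the per-GPU values summed) and an average of 1031.8 MB, which matches the figure shown in the pdf summary.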