Problem:
When using model-analyzer with --triton-launch-mode=remote, I encounter a connectivity error.
Context:
I have successfully started Triton Inference Server on the same machine, loaded the model add, and verified it works by sending inference requests and querying the monitoring endpoints from inside the Triton SDK container. However, when I attempt to run performance analysis with model-analyzer, it fails with an error saying it cannot connect to Triton Server's GPU metrics monitor.
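Before walking through the steps, it is worth confirming that the metrics endpoint is reachable, and actually carries GPU metrics, from the same environment model-analyzer runs in. Below is a minimal check, assuming the nv_gpu_* prefix Triton uses for its GPU metric names; curl and grep are standard tools:

# Run inside the SDK container, i.e. the same network namespace that
# model-analyzer will use. An endpoint that answers but exposes zero
# nv_gpu_* series would still give the GPU metrics monitor nothing to read.
curl -sf --max-time 5 http://localhost:8002/metrics | grep -c '^nv_gpu_' \
  && echo "GPU metrics visible" \
  || echo "no nv_gpu_* metrics (or endpoint unreachable)"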
Steps to Reproduce:
1. Start Triton Server:
Version: 23.10
Loaded Model: add
docker run -it --gpus all --privileged --rm --shm-size=1G \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    --ulimit memlock=-1 --ulimit stack=67108864 \
    -v /data/ti-platform/xury/triton_docker_test_file/model_analyzer_test/model_analyzer-main/examples/bak:/models \
    nvcr.io/nvidia/tritonserver:23.10-vllm-python-py3 /bin/bash
tritonserver --model-repository=/models --model-control-mode explicit --load-model add
(A variant of this launch with the metrics flags made explicit is sketched after the traceback below.)
2. Start Triton SDK container:
docker run --gpus all -ti --net=host --privileged --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /data/reports:/data/reports \
    nvcr.io/nvidia/tritonserver:23.10-py3-sdk bash
3. Test inference request in SDK container:
curl -X POST http://localhost:8000/v2/models/add/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [
        {"name": "INPUT0", "datatype": "FP32", "shape": [4], "data": [1.0, 2.0, 3.0, 4.0]},
        {"name": "INPUT1", "datatype": "FP32", "shape": [4], "data": [5.0, 6.0, 7.0, 8.0]}
      ]}'
Successful response received.
{"model_name":"add","model_version":"1","outputs":[{"name":"OUTPUT","datatype":"FP32","shape":[4],"data":[6.0,8.0,10.0,12.0]}]}r
4. Test Triton Server metrics endpoint in SDK container:
curl http://localhost:8002/metrics
Successful response received.
5. Attempt to run model-analyzer for performance profiling:
model-analyzer profile --profile-models add --triton-launch-mode=remote \
    --output-model-repository-path /data/reports/add --export-path profile_results \
    --triton-http-endpoint localhost:8000 --triton-metrics-url http://localhost:8002/metrics \
    --run-config-search-max-concurrency 2 --run-config-search-max-model-batch-size 2 \
    --run-config-search-max-instance-count 2 --override-output-model-repository
Error encountered:
Traceback (most recent call last):
  File "/usr/local/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 123, in profile
    self._get_server_only_metrics(client, gpus)
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 224, in _get_server_only_metrics
    self._metrics_manager.profile_server()
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/record/metrics_manager.py", line 188, in profile_server
    self._start_monitors(capture_gpu_metrics=capture_gpu_metrics)
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/record/metrics_manager.py", line 488, in _start_monitors
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Failed to connect to Tritonserver's GPU metrics monitor. Please check that the `triton_metrics_url` value is set correctly: http://localhost:8002/metrics.
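For what it is worth, the remote-mode GPU monitor in the traceback boils down to an HTTP poll of the triton_metrics_url value. A rough equivalent of that poll is sketched below, useful for ruling out a slow or intermittently failing endpoint; the 5-second timeout and three attempts are arbitrary choices for this sketch, not model-analyzer's actual settings:

# Poll the same URL the monitor uses and report status code and latency.
# Repeated timeouts or HTTP 000 point at networking; fast 200s suggest the
# problem is the metrics content (e.g. missing GPU series), not reachability.
for i in 1 2 3; do
  curl -s -o /dev/null -w "attempt $i: HTTP %{http_code} in %{time_total}s\n" \
       --max-time 5 http://localhost:8002/metrics
done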
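Also, the launch command in step 1 leaves Triton's metrics settings at their defaults. The variant below makes them explicit; --allow-metrics, --allow-gpu-metrics, and --metrics-port are standard tritonserver options, though whether pinning them changes the outcome here is only a guess:

# Same launch as step 1, but with the metrics endpoint and GPU metrics
# explicitly enabled and the metrics port pinned to match --triton-metrics-url.
tritonserver --model-repository=/models \
    --model-control-mode explicit --load-model add \
    --allow-metrics=true --allow-gpu-metrics=true --metrics-port=8002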
@RheaRia Thank you for the detailed steps; however, I cannot reproduce this failure. All steps are successful for me, and I am getting a response from the metrics monitor.