Add IB verbs logging and enable traces through install.sh #1511
Details
This PR adds logging of IB Verbs calls to support network-level tracing of work requests at the IB Verbs level.
Work item: Internal.
What were the changes?
- Added `VERBS` as a type of `NCCL_DEBUG_SUBSYS`. Note that the `NET` subsystem level is broader in its coverage and adds a lot of unneeded info if used just to view IB verbs call traces.
- Updated `struct ncclIbDevInfo` so that it is possible to show both the sender and receiver NIC index on the sender side.
Why were the changes made?
Many users want to understand how RCCL interacts with the network fabric at a fine granularity. This change also makes it possible to add logging for the remaining IB verbs interactions incrementally.
How was the outcome achieved?
By adding TRACE calls for both the FIFO and data `ibv_post_send` calls, and by adding a flag for enabling traces (`--log-trace`) to the install script.
Additional Documentation:
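The gating idea described above can be sketched as follows. This is only an illustration of how a dedicated `VERBS` subsystem bit keeps send traces separate from the broader `NET` output; it is not the code in this PR. The enum values, `parseSubsys`, `MockSendWr`, `traceSend`, and the trace line format are all hypothetical names chosen for the example, and the mock struct stands in for a real `ibv_send_wr` so the snippet builds without libibverbs.

```cpp
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical subsystem bits; the actual values used by RCCL may differ.
enum DebugSubsys : uint64_t {
  SUBSYS_NET   = 1ull << 0,
  SUBSYS_VERBS = 1ull << 1,
};

// Parse a comma-separated subsystem list such as "NET,VERBS" into a bitmask.
// Unknown tokens are ignored in this sketch.
uint64_t parseSubsys(const std::string& list) {
  uint64_t mask = 0;
  std::stringstream ss(list);
  std::string tok;
  while (std::getline(ss, tok, ',')) {
    if (tok == "NET") mask |= SUBSYS_NET;
    else if (tok == "VERBS") mask |= SUBSYS_VERBS;
  }
  return mask;
}

// Hypothetical stand-in for the fields of a send work request worth tracing.
struct MockSendWr {
  uint64_t wrId;
  int localNic;    // sender NIC index
  int remoteNic;   // receiver NIC index
  uint32_t bytes;
  bool isFifo;     // true for a FIFO post, false for a data post
};

// Build the trace line for one post, or return an empty string when the
// VERBS subsystem is not enabled in the mask.
std::string traceSend(uint64_t enabledMask, const MockSendWr& wr) {
  if ((enabledMask & SUBSYS_VERBS) == 0) return "";
  char buf[160];
  std::snprintf(buf, sizeof(buf),
                "VERBS post_send %s wr_id=%llu nic %d->%d bytes=%u",
                wr.isFifo ? "fifo" : "data",
                static_cast<unsigned long long>(wr.wrId),
                wr.localNic, wr.remoteNic, wr.bytes);
  return buf;
}
```

With a mask parsed from `"NET"` alone, `traceSend` returns nothing, which mirrors the motivation for a separate subsystem: verbs traces can be viewed without the rest of the `NET` output. In an actual run, the traces would presumably be enabled by building with `--log-trace` and setting `NCCL_DEBUG_SUBSYS=VERBS`.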
What else should the reviewer know?
Approval Checklist
Do not approve until these items are satisfied.