[Issue]: ibgda over RoCE v2 #27

@GeofferyGeng

How is this issue impacting you?

Lower performance than expected

Share Your Debug Logs

Hi developers,

I noticed that in IBGDA, NVSHMEM uses ah_attr.dlid, which is set to 49152 for RoCE v2, as the UDP source port (udp_sport) of each QP.

However, if all QPs use the same source port, traffic is poorly load-balanced across the network, because ECMP-style hashing commonly relies on the UDP source port to tell RoCE v2 flows apart.

What is the reasoning behind fixing udp_sport to 49152? Is this necessary?

DEVX_SET(qpc, qpc, primary_address_path.udp_sport, ah_attr.dlid);
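
To illustrate the kind of alternative I have in mind, here is a minimal sketch of deriving a per-QP source port from the QP numbers instead of using one constant. The helper name `pick_udp_sport`, the mixing constant, and the idea of keying on the local and remote QPN are my own assumptions for illustration, not anything taken from NVSHMEM. I am also assuming the port should stay within 49152..65535 (0xC000..0xFFFF), which is why the result is OR-ed with 0xC000.

```c
#include <stdint.h>

/* Lower bound of the assumed valid RoCE v2 source-port range (49152). */
#define ROCE_V2_SPORT_MIN 0xC000u

/*
 * Hypothetical helper (not an NVSHMEM API): mix the local and remote QP
 * numbers and fold the result into 14 bits, so the returned port always
 * lies in 0xC000..0xFFFF while differing between most QP pairs.
 */
static uint16_t pick_udp_sport(uint32_t local_qpn, uint32_t remote_qpn)
{
    /* Multiplicative mix so that consecutive QPNs spread out. */
    uint32_t h = (local_qpn * 0x9E3779B1u) ^ remote_qpn;

    /* Fold the high half into the low half before truncating. */
    h ^= h >> 16;

    return (uint16_t)(ROCE_V2_SPORT_MIN | (h & 0x3FFFu));
}
```

If nothing in IBGDA depends on the fixed value, the line above could hypothetically pass `pick_udp_sport(local_qpn, remote_qpn)` in place of `ah_attr.dlid`. Whether that is safe is exactly what I would like to confirm.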

Steps to Reproduce the Issue

No response

NVSHMEM Version

3.3.24+cuda12.6

Your platform details

No response

Error Message & Behavior

No response
