Question
__device__ NVSHMEMI_STATIC NVSHMEMI_DEVICE_ALWAYS_INLINE int ibgda_poll_cq(
    nvshmemi_ibgda_device_cq_t *cq, uint64_t idx, int *error) {
    int status = 0;
    struct mlx5_cqe64 *cqe64 = (struct mlx5_cqe64 *)cq->cqe;
    CONSTANT_ADDRESS_SPACE nvshmemi_ibgda_device_state_t *state = ibgda_get_state();
    const uint32_t ncqes = cq->ncqes;
In the ibgda_poll_cq function, cqe64 always points to the start of the CQ buffer (cq->cqe), so only the first CQE is ever read. However, the CQ is created with more than one CQE (ncqes > 1).
Does this mean that the rest of the CQ buffer is wasted?
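
To make the question concrete, here is a minimal host-side sketch (not the NVSHMEM code) contrasting the two access patterns: a conventional poller that indexes the CQ buffer as a ring, and a poller that, as I read ibgda_poll_cq, always looks at slot 0. The names toy_cqe, toy_cq, ring_slot and first_slot are hypothetical and exist only for this illustration; ncqes is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a completion queue entry.
// This is NOT the real mlx5_cqe64 layout.
struct toy_cqe {
    uint16_t wqe_counter;
};

// Hypothetical stand-in for the CQ descriptor.
struct toy_cq {
    toy_cqe *cqe;    // base address of the CQ buffer
    uint32_t ncqes;  // number of entries, assumed to be a power of two
};

// Conventional ring access: consumer index idx maps to slot (idx mod ncqes).
static inline uint32_t ring_slot(const toy_cq *cq, uint64_t idx) {
    assert(cq->ncqes != 0 && (cq->ncqes & (cq->ncqes - 1)) == 0);
    return static_cast<uint32_t>(idx & (cq->ncqes - 1));
}

// Access pattern as I understand it from the snippet above:
// the slot that is read does not depend on idx at all.
static inline uint32_t first_slot(const toy_cq *, uint64_t) {
    return 0;
}
```

With ncqes = 4, ring_slot visits slots 0, 1, 2, 3, 0, 1, ... as idx grows, while first_slot stays at 0, which is why the remaining entries look unused to me.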