-
Notifications
You must be signed in to change notification settings - Fork 57
Fix for redundant thread contention in CsdNextMessage #3916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
I had also noticed a lot of time spend in locking the Node Queue. I implemented a similar fix: index 9581f234f..718bb9cc0 100644
--- a/src/conv-core/convcore.C
+++ b/src/conv-core/convcore.C
@@ -1766,7 +1766,7 @@ void *CsdNextMessage(CsdSchedulerState_t *s) {
/*#warning "CsdNextMessage: CMK_NODE_QUEUE_AVAILABLE" */
if (NULL!=(msg=CmiGetNonLocalNodeQ())) return msg;
#if !CMK_NO_MSG_PRIOS
- if(CmiTryLock(s->nodeLock) == 0) {
+ if(!CqsEmpty(s->nodeQ) && (CmiTryLock(s->nodeLock) == 0)) {
if (!CqsEmpty(s->nodeQ)
&& CqsPrioGT(CqsGetPriority(s->schedQ),
CqsGetPriority(s->nodeQ))) {but I never got around to benchmarking the change. Neverless I think this is a good idea for machines with a high core count. |
yeah, I had also started with one-line fix but then decided to add one more basic block to make the commentary more understandable. |
|
Thank you for approving changes! But I do not see "merge" button, is this because of CI stall on darwin? Or it will be manually merged by reviewers? |
|
We're working through a fix to the CI issue with darwin. When we have that
in place, I'll rebase your change to get it merged.
…On Wed, Jan 7, 2026 at 11:21 AM Vladimir Polin ***@***.***> wrote:
*vladimir-polin* left a comment (charmplusplus/charm#3916)
<#3916 (comment)>
Thank you for approving changes! But I do not see "merge" button, is this
because of CI stall on darwin? Or it will be manually merged by reviewers?
—
Reply to this email directly, view it on GitHub
<#3916 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB3HFH35NQO4PRHSXHH4DJL4FU6CFAVCNFSM6AAAAACP5CKIBSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTOMJZHEYTONJQGE>
.
You are receiving this because you were assigned.Message ID:
***@***.***>
|
|
So nothing is needed from my side yet, thanks for letting me know. |
Fix for redundant thread contention in CsdNextMessage (CmiTryLock/pthread_mutex_trylock) for node-level threading parallelism with high number of threads.
Instead of simple try_lock call this patch checks lock-free first that the queue is not empty and if it is not empty then there is try_lock and the queue is checked again inside the critical section to avoid a potential data race.