Avoid Stripe Mutex lock contention for RWW #12794
base: master
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -28,9 +28,12 @@ | |||||
| #include "PreservationTable.h" | ||||||
| #include "Stripe.h" | ||||||
|
|
||||||
| #include "iocore/eventsystem/Event.h" | ||||||
| #include "tscore/hugepages.h" | ||||||
| #include "tscore/Random.h" | ||||||
| #include "ts/ats_probe.h" | ||||||
| #include "tsutil/Bravo.h" | ||||||
| #include <mutex> | ||||||
|
|
||||||
| #ifdef LOOP_CHECK_MODE | ||||||
| #define DIR_LOOP_THRESHOLD 1000 | ||||||
|
|
@@ -46,6 +49,7 @@ DbgCtl dbg_ctl_dir_clean{"dir_clean"}; | |||||
| #ifdef DEBUG | ||||||
|
|
||||||
| DbgCtl dbg_ctl_cache_stats{"cache_stats"}; | ||||||
| DbgCtl dbg_ctl_cache_open_dir{"cache_open_dir"}; | ||||||
|
||||||
| DbgCtl dbg_ctl_cache_open_dir{"cache_open_dir"}; |
Copilot
AI
Jan 12, 2026
The comment formatting uses "ATS UNUSED" with a space, but elsewhere in the codebase it's typically "ATS_UNUSED" with an underscore. Consider maintaining consistency with the existing pattern.
| OpenDir::signal_readers(int event, Event * /* ATS UNUSED*/) | |
| OpenDir::signal_readers(int event, Event * /* ATS_UNUSED */) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,6 +28,7 @@ | |
| #include "iocore/eventsystem/Continuation.h" | ||
| #include "iocore/aio/AIO.h" | ||
| #include "tscore/Version.h" | ||
| #include "tsutil/Bravo.h" | ||
|
|
||
| #include <cstdint> | ||
| #include <ctime> | ||
|
|
@@ -225,16 +226,23 @@ struct OpenDirEntry { | |
| } | ||
| }; | ||
|
|
||
| struct OpenDir : public Continuation { | ||
| Queue<CacheVC, Link_CacheVC_opendir_link> delayed_readers; | ||
| DLL<OpenDirEntry> bucket[OPEN_DIR_BUCKETS]; | ||
| class OpenDir : public Continuation | ||
| { | ||
| public: | ||
| OpenDir(); | ||
|
|
||
| int open_write(CacheVC *c, int allow_if_writers, int max_writers); | ||
| int close_write(CacheVC *c); | ||
| OpenDirEntry *open_read(const CryptoHash *key) const; | ||
| int signal_readers(int event, Event *e); | ||
|
|
||
| OpenDir(); | ||
| // event handler | ||
| int signal_readers(int event, Event *e); | ||
|
|
||
| private: | ||
| mutable ts::bravo::shared_mutex _shared_mutex; | ||
|
|
||
| Queue<CacheVC, Link_CacheVC_opendir_link> _delayed_readers; | ||
| DLL<OpenDirEntry> _bucket[OPEN_DIR_BUCKETS]; | ||
| }; | ||
|
Comment on lines +229 to 246
|
||
|
|
||
| struct CacheSync : public Continuation { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -101,14 +101,15 @@ class StripeSM : public Continuation, public Stripe | |
|
|
||
| int recover_data(); | ||
|
|
||
| int open_write(CacheVC *cont, int allow_if_writers, int max_writers); | ||
| int open_write_lock(CacheVC *cont, int allow_if_writers, int max_writers); | ||
| int close_write(CacheVC *cont); | ||
| int begin_read(CacheVC *cont) const; | ||
| // unused read-write interlock code | ||
| // currently http handles a write-lock failure by retrying the read | ||
| // OpenDir API | ||
| int open_write(CacheVC *cont, int allow_if_writers, int max_writers); | ||
| int open_write_lock(CacheVC *cont, int allow_if_writers, int max_writers); | ||
| int close_write(CacheVC *cont); | ||
| OpenDirEntry *open_read(const CryptoHash *key) const; | ||
| int close_read(CacheVC *cont) const; | ||
|
|
||
| // PreservationTable API | ||
| int begin_read(CacheVC *cont) const; | ||
| int close_read(CacheVC *cont) const; | ||
|
Comment on lines +104 to +112
|
||
|
|
||
| int clear_dir_aio(); | ||
| int clear_dir(); | ||
|
|
||
Potential lock ordering issue: This code calls stripe->open_read() (which now acquires _shared_mutex as a reader) while holding stripe->mutex. In other places like CacheRead.cc line 212, the same pattern exists. However, in Cache.cc line 570, open_read() is called without holding stripe->mutex. While this change is intentional to reduce contention, ensure that this mixed lock ordering (sometimes stripe->mutex then _shared_mutex, sometimes just _shared_mutex) doesn't introduce potential deadlock scenarios. The Bravo reader-writer lock should handle this safely since multiple readers can coexist, but this should be verified through testing under high concurrency.
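To make the two call paths in this comment concrete, here is a minimal sketch. It is not the PR's code: `OpenDirSketch` and `StripeSketch` are hypothetical stand-ins, and `std::shared_mutex` is used in place of `ts::bravo::shared_mutex`. Under these assumptions, one path takes the stripe mutex and then the reader lock, the other takes only the reader lock, and no path takes the stripe mutex while already holding the reader-writer lock, so the acquisition order is never inverted.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for OpenDir: lookups take a shared (reader) lock,
// mutations take an exclusive (writer) lock.
class OpenDirSketch
{
public:
  // Returns true if the key was newly inserted.
  bool
  open_write(const std::string &key)
  {
    std::unique_lock<std::shared_mutex> lock(_shared_mutex); // exclusive: mutates the table
    return _entries.emplace(key, 1).second;
  }

  // Returns true if the key is present.
  bool
  open_read(const std::string &key) const
  {
    std::shared_lock<std::shared_mutex> lock(_shared_mutex); // shared: readers may coexist
    return _entries.count(key) != 0;
  }

private:
  mutable std::shared_mutex  _shared_mutex;
  std::map<std::string, int> _entries;
};

// Hypothetical stand-in for a stripe: a coarse mutex plus an OpenDir.
struct StripeSketch {
  std::mutex    mutex;
  OpenDirSketch open_dir;
};

// Path A: stripe mutex first, then the reader lock inside open_read().
bool
read_with_stripe_lock(StripeSketch &s, const std::string &key)
{
  std::lock_guard<std::mutex> guard(s.mutex);
  return s.open_dir.open_read(key);
}

// Path B: reader lock only, without the stripe mutex.
bool
read_without_stripe_lock(StripeSketch &s, const std::string &key)
{
  return s.open_dir.open_read(key);
}

// Usage: both paths observe the same table state.
inline void
demo_paths()
{
  StripeSketch s;
  s.open_dir.open_write("key");
  assert(read_with_stripe_lock(s, "key") == read_without_stripe_lock(s, "key"));
}
```

A deadlock would need some path to hold the reader-writer lock and then wait on the stripe mutex; the sketch has no such path, but whether the real code has one still needs to be checked against the actual call sites.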
I guess this is fine, because prior to the change, when we jump to CacheVC::openReadFromWriter from the Lwriter label, it's out of scope of the StripeSM mutex lock. CacheVC::openReadFromWriter has another check of OpenDir.
trafficserver/src/iocore/cache/CacheRead.cc
Line 212 in a5363d2
Ah, the reader lock of OpenDir might need to be held while this OpenDirEntry is used in
CacheVC::openReadFromWriter. Let me consider that case.
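The concern about entry lifetime can be illustrated with a small sketch. Everything here is hypothetical (`EntryModel`, `OpenDirModel`, `with_entry`), and `std::shared_mutex` again stands in for `ts::bravo::shared_mutex`. It contrasts a lookup that lets a pointer escape the reader lock with one that runs the caller's code while the lock is still held. This is one possible approach, not necessarily what the PR ends up doing.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for OpenDirEntry.
struct EntryModel {
  int num_writers = 0;
};

// Hypothetical stand-in for OpenDir.
class OpenDirModel
{
public:
  void
  open_write(const std::string &key)
  {
    std::unique_lock<std::shared_mutex> lock(_shared_mutex);
    ++_entries[key].num_writers;
  }

  // Removes the entry once the last writer closes.
  void
  close_write(const std::string &key)
  {
    std::unique_lock<std::shared_mutex> lock(_shared_mutex);
    auto                                it = _entries.find(key);
    if (it != _entries.end() && --it->second.num_writers == 0) {
      _entries.erase(it);
    }
  }

  // Risky shape: the returned pointer outlives the reader lock, so a
  // concurrent close_write() could erase the entry while it is in use.
  const EntryModel *
  open_read(const std::string &key) const
  {
    std::shared_lock<std::shared_mutex> lock(_shared_mutex);
    auto                                it = _entries.find(key);
    return it == _entries.end() ? nullptr : &it->second;
  }

  // Alternative shape: run fn while the reader lock is held, so the entry
  // cannot be erased until fn returns. Returns false if the key is absent.
  template <typename F>
  bool
  with_entry(const std::string &key, F &&fn) const
  {
    std::shared_lock<std::shared_mutex> lock(_shared_mutex);
    auto                                it = _entries.find(key);
    if (it == _entries.end()) {
      return false;
    }
    fn(it->second);
    return true;
  }

private:
  mutable std::shared_mutex         _shared_mutex;
  std::map<std::string, EntryModel> _entries;
};
```

The callback form keeps the critical section explicit, at the cost of holding the reader lock for as long as the callback runs, which matters if the callback is slow or tries to take the writer lock itself.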