Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix deadlock in Key_Shared PIP-379 implementation #23854

Merged
merged 9 commits into from
Jan 16, 2025

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Jan 15, 2025

Fixes #23848

Motivation

The Pulsar 4.0 Key_Shared PIP-379 implementation could deadlock and this problem was first captured in test environment heapdump where the thread stack information included in the heapdump showed multiple threads in blocked state, hinting a deadlock.
With the help of @pdolif, the problem was reproduced in isolation by simply adding this type of test code in a KeySharedSubscriptionTest test case.

// test code that triggered the deadlock issue
new Thread(() -> {
    while(true) { pulsar.getBrokerService().updateRates(); }
}).start();

In the test case, this triggered a similar pattern of blocked threads and therefore could be considered a reproduction of the previously faced dead lock issue in a test environment.

threaddump with deadlock:

Modifications

  • add test case to reproduce the dead lock case which will prevent future regressions in the same area
  • fix the dead lock issue by replacing synchronization with read-write locks
    • locks are for minimum "scope"
    • callbacks are called outside of a lock scope to reduce the chances of potential other deadlocks
  • use volatile fields for counters in DrainingHashEntry so that entry methods can be used without separate locks

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari lhotari added this to the 4.1.0 milestone Jan 15, 2025
@lhotari lhotari self-assigned this Jan 15, 2025
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 15, 2025
@lhotari lhotari requested a review from heesung-sn January 15, 2025 22:15
@lhotari lhotari requested a review from heesung-sn January 16, 2025 07:14
@lhotari lhotari merged commit 8f04945 into apache:master Jan 16, 2025
51 of 52 checks passed
lhotari added a commit that referenced this pull request Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Deadlock in DrainingHashesTracker due to synchronized methods
2 participants