Use actual thread local queues instead of using a RwLock #93
Whoa, that Miri failure is a big red flag. Not sure how that's happening. Looks like a soundness bug with EDIT: This happens regardless of whether we're chaining the iterators or not; iterating over elements always seems to trigger Miri.
Filed an issue regarding the UB: Amanieu/thread_local-rs#70
The Miri issue seems to be fixed with Amanieu/thread_local-rs#72, so this is likely going to be blocked on that being merged and released.
My main issue here is that thread_local's MSRV might exceed ours at some point. I've considered this in the past. However, the maintainer of thread_local has stated that its MSRV might exceed Rust v1.63 in the future.
The issue is that, if thread_local's MSRV is bumped, we would be forced to depend on an older version of thread_local. For our dependents with higher MSRVs this would cause duplicate dependencies in the tree.
In the past we encountered this issue with once_cell; see smol-rs/async-io#93. This is why I avoid the use of once_cell throughout smol, preferring async_lock::OnceCell instead.
Co-authored-by: John Nunley <[email protected]>
Is there a location where the MSRV policy for crates under this organization is documented? All things considered, I think the update frequency for thread_local is so low that this should be pretty low risk. That said, the once_cell dependency is in a similar situation.
The MSRV policy is here. I should really add it explicitly to all crates. Even if the risk is low it's not zero. It was a headache the first time and I'd like to avoid a repeat if I can.
I'm fine with merging it for now; from my perspective it doesn't look like
Did a quick benchmark comparison after all of the aforementioned changes, not sure what to make of these results:
The impact to
Cross-checking against the benchmark results for #37, it seems like the results are to be expected.
Overall looks good to me
// Pick a random starting point in the iterator list and rotate the list.
let n = local_queues.len();
let n = local_queues.iter().count();
Is this a cold operation? It seems like this would take a while.
This is one part I'm not so sure about. Generally this shouldn't be under contention, since the cost to spin up new threads is going to be higher than it is to scan over the entire container, unless you have literally thousands of threads. It otherwise is just a scan through fairly small buckets.
We could use an atomic counter to track how many there are, but since you can't remove items from the ThreadLocal, residual thread locals from currently unused threads would remain (thread IDs are reused), so the counter may get out of sync.
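For context, here is a minimal std-only sketch of what the count is used for: visiting every queue once from a rotated starting point. It assumes a plain slice in place of the ThreadLocal container and a fixed `start` in place of an RNG draw. The `rotated` helper is a hypothetical name for illustration, not the executor's actual code or part of the thread_local API.

```rust
/// Visit every queue exactly once, starting at index `start` and wrapping around.
/// `queues` is a plain slice standing in for the thread-local container.
fn rotated<'a, T>(queues: &'a [T], start: usize) -> impl Iterator<Item = &'a T> + 'a {
    // Counting by iteration mirrors `local_queues.iter().count()`: it walks
    // every entry, because the container is assumed to have no cheap `len()`.
    let n = queues.iter().count();
    // Chain the list with itself, skip to the starting point, and take one full lap.
    queues.iter().chain(queues.iter()).skip(start).take(n)
}

fn main() {
    let queues = ["q0", "q1", "q2", "q3"];
    // `start` would come from a random number generator in practice.
    let order: Vec<&str> = rotated(&queues, 2).copied().collect();
    println!("{:?}", order); // prints ["q2", "q3", "q0", "q1"]
}
```

The scan for `n` happens once per steal attempt, so its cost grows with the number of entries the container has ever held, not with the number of threads currently running.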
Co-authored-by: John Nunley <[email protected]>
Currently, runner local queues rely on a RwLock<Vec<Arc<ConcurrentQueue<Runnable>>>> to store the queues instead of using actual thread-local storage.
This adds thread_local as a dependency, but this should allow the executor to work steal without needing to hold a lock, as well as allow tasks to schedule onto the local queue directly, where possible, instead of always relying on the global injector queue.
Fixes #62.
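To illustrate the scheduling idea described above, here is a rough sketch under stated assumptions. It uses the standard library's `thread_local!` macro and a `VecDeque` of integers rather than the thread_local crate's `ThreadLocal` type or a concurrent queue of tasks. `Executor`, `Placement`, `enter_runner`, and `schedule` are hypothetical names for this example only and do not reflect the code in this PR.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::sync::Mutex;

thread_local! {
    // Hypothetical per-thread queue; `None` until this thread acts as a runner.
    static LOCAL: RefCell<Option<VecDeque<u32>>> = RefCell::new(None);
}

/// Where a scheduled task ended up.
#[derive(Debug, PartialEq)]
enum Placement {
    Local,
    Global,
}

struct Executor {
    // Global injector queue; the only lock taken in this sketch.
    global: Mutex<VecDeque<u32>>,
}

impl Executor {
    fn new() -> Self {
        Executor { global: Mutex::new(VecDeque::new()) }
    }

    /// Push onto the current thread's queue if it has one,
    /// otherwise fall back to the global queue.
    fn schedule(&self, task: u32) -> Placement {
        let pushed = LOCAL.with(|local| {
            let mut slot = local.borrow_mut();
            match slot.as_mut() {
                Some(queue) => {
                    queue.push_back(task);
                    true
                }
                None => false,
            }
        });
        if pushed {
            Placement::Local
        } else {
            self.global.lock().unwrap().push_back(task);
            Placement::Global
        }
    }
}

/// Mark the current thread as a runner by giving it a local queue.
fn enter_runner() {
    LOCAL.with(|local| *local.borrow_mut() = Some(VecDeque::new()));
}

fn main() {
    let ex = Executor::new();
    println!("{:?}", ex.schedule(1)); // no local queue yet, goes to the global queue
    enter_runner();
    println!("{:?}", ex.schedule(2)); // local queue exists, no lock needed
}
```

The local path never touches the mutex, which is the property the description is after. Work stealing from other threads' queues is left out because it needs a container that other threads can iterate.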