Merge in 1.4.1 #1

Open
wants to merge 98 commits into
base: spsc
nullchinchilla
Member

No description provided.

taiki-e and others added 30 commits April 18, 2021 22:39
Replace vec-arena with slab
Since Rust 1.64, Clippy respects `rust-version` field in Cargo.toml.
rust-lang/rust@b776fb8
* m: Migrate to criterion

* Update CI
async-lock 2.7.0 requires Rust 1.48.

```
error[E0658]: use of unstable library feature 'future_readiness_fns'
   --> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/async-lock-2.7.0/src/once_cell.rs:430:45
    |
430 |             self.initialize_or_wait(move || std::future::ready(closure()), &mut Blocking),
    |                                             ^^^^^^^^^^^^^^^^^^
    |
```
notgull and others added 30 commits March 12, 2024 20:38
This should catch the errors from earlier.

Signed-off-by: John Nunley <[email protected]>
It turns out that with the current strategy it is possible for tasks to
be stuck in the local queue without any hope of being picked back up.
In practice this seems to happen when the only entities polling the
system are tickers, as opposed to runners. Since tickers don't steal
tasks, it is possible for tasks to be left over in the local queue that
don't filter out.

One possible solution is to make it so tickers steal tasks, but this
kind of defeats the point of tickers. So I've instead elected to replace
the current strategy with one that accounts for the corner cases with
local queues.

The main difference is that I replace the Sleepers struct with two
`event_listener::Event`s: one that handles tickers subscribed to the
global queue and one that handles tickers subscribed to the local queue.
The other main difference is that each local queue now has a reference
counter. If this count reaches zero, no tasks will be pushed to this
queue. Only runners increment or decrement this counter.
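The reference-counting rule can be sketched with a toy model. Everything here is hypothetical (`ToyQueues`, `runner_attach`, `runner_detach`, the fallback to the global queue) and is not the crate's actual code; a real implementation would also have to close the race between checking the counter and pushing.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Toy model (hypothetical): one global queue and one local queue whose
/// use is gated by a count of attached runners.
pub struct ToyQueues {
    global: Mutex<VecDeque<u32>>,
    local: Mutex<VecDeque<u32>>,
    local_runners: AtomicUsize,
}

impl ToyQueues {
    pub fn new() -> Self {
        Self {
            global: Mutex::new(VecDeque::new()),
            local: Mutex::new(VecDeque::new()),
            local_runners: AtomicUsize::new(0),
        }
    }

    /// Only runners touch the counter; tickers never call these.
    pub fn runner_attach(&self) {
        self.local_runners.fetch_add(1, Ordering::SeqCst);
    }

    pub fn runner_detach(&self) {
        self.local_runners.fetch_sub(1, Ordering::SeqCst);
    }

    /// Pushes to the local queue only while a runner is attached.
    /// Returns `true` if the task went to the local queue.
    /// Assumption: tasks fall back to the global queue otherwise.
    pub fn push(&self, task: u32) -> bool {
        if self.local_runners.load(Ordering::SeqCst) > 0 {
            self.local.lock().unwrap().push_back(task);
            true
        } else {
            self.global.lock().unwrap().push_back(task);
            false
        }
    }

    pub fn global_len(&self) -> usize {
        self.global.lock().unwrap().len()
    }

    pub fn local_len(&self) -> usize {
        self.local.lock().unwrap().len()
    }
}
```

With no runner attached, nothing can strand in the local queue, which is the corner case described above.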

This makes the previously instituted tests pass, so hopefully this works
for most use cases.

Signed-off-by: John Nunley <[email protected]>
Signed-off-by: John Nunley <[email protected]>
For some workloads many tasks are spawned at a time. This requires
locking and unlocking the executor's inner lock every time you spawn a
task. If you spawn many tasks this can be expensive.

This commit exposes a new "spawn_batch" method on both executor types. This
method allows the user to spawn an entire set of tasks at a time.
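To illustrate why batching helps, here is a minimal sketch using a toy queue rather than the real executor. `ToyQueue` and its lock counter are hypothetical; the only point is that a batch takes the lock once instead of once per task.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Toy stand-in for an executor's shared state (hypothetical).
pub struct ToyQueue {
    inner: Mutex<VecDeque<u32>>,
    /// Counts lock acquisitions made by the spawn paths.
    locks_taken: AtomicUsize,
}

impl ToyQueue {
    pub fn new() -> Self {
        Self {
            inner: Mutex::new(VecDeque::new()),
            locks_taken: AtomicUsize::new(0),
        }
    }

    /// Takes the inner lock and records that it did so.
    fn lock(&self) -> MutexGuard<'_, VecDeque<u32>> {
        self.locks_taken.fetch_add(1, Ordering::Relaxed);
        self.inner.lock().unwrap()
    }

    /// One lock/unlock cycle per task.
    pub fn spawn(&self, task: u32) {
        self.lock().push_back(task);
    }

    /// One lock/unlock cycle for the whole set of tasks.
    pub fn spawn_batch(&self, tasks: impl IntoIterator<Item = u32>) {
        let mut guard = self.lock();
        guard.extend(tasks);
    }

    pub fn locks_taken(&self) -> usize {
        self.locks_taken.load(Ordering::Relaxed)
    }

    pub fn len(&self) -> usize {
        self.inner.lock().unwrap().len()
    }
}
```

Spawning 100 tasks through `spawn` takes the lock 100 times, while `spawn_batch(0..100)` takes it once.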

Closes #91

Signed-off-by: John Nunley <[email protected]>
Signed-off-by: John Nunley <[email protected]>
Signed-off-by: John Nunley <[email protected]>
Fixes #89. Uses @notgull's suggestion of using an `AtomicPtr` with a racy initialization instead of a `OnceCell`.

Because this adds more `unsafe`, I added the `clippy::undocumented_unsafe_blocks` lint at warn level and fixed a few of the remaining open clippy issues (e.g. `Waker::clone_from` already handles the case where the wakers are equal).

Removing `async_lock` as a dependency shouldn't be a SemVer breaking change.
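
For readers unfamiliar with the pattern, racy initialization through an `AtomicPtr` can be sketched as below. `LazyState` and its `String` payload are hypothetical stand-ins, not the crate's types: every thread that sees a null pointer allocates a candidate, one wins the `compare_exchange`, and the losers free their candidate and use the winner's.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Lazily initialized state behind an `AtomicPtr` (hypothetical type).
pub struct LazyState {
    ptr: AtomicPtr<String>,
}

impl LazyState {
    pub const fn new() -> Self {
        Self { ptr: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Returns the shared state, allocating it on first use.
    pub fn get(&self) -> &String {
        let mut p = self.ptr.load(Ordering::Acquire);
        if p.is_null() {
            // Several threads may reach this point and each allocate a candidate.
            let fresh = Box::into_raw(Box::new(String::from("state")));
            match self.ptr.compare_exchange(
                ptr::null_mut(),
                fresh,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => p = fresh,
                Err(winner) => {
                    // SAFETY: `fresh` came from `Box::into_raw` above and was
                    // never published, so this thread still owns it.
                    drop(unsafe { Box::from_raw(fresh) });
                    p = winner;
                }
            }
        }
        // SAFETY: `p` is non-null and points to a boxed value that is only
        // freed in `Drop`, which requires exclusive access to `self`.
        unsafe { &*p }
    }
}

impl Drop for LazyState {
    fn drop(&mut self) {
        let p = *self.ptr.get_mut();
        if !p.is_null() {
            // SAFETY: a non-null `p` was published by `get` from `Box::into_raw`.
            drop(unsafe { Box::from_raw(p) });
        }
    }
}
```

The cost of the race is at most one wasted allocation per losing thread, in exchange for not needing a lock or a `OnceCell`.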
Motivation: FallibleTask is part of the public interface of this crate, in that Task::fallible returns FallibleTask. However, in order to name that type, users need to add a direct dependency on async_task and ensure the crate versions are compatible. Reexporting allows crate users to name the type directly.
Signed-off-by: John Nunley <[email protected]>
This commit aims to add benchmarks that more realistically reflect
workloads that might happen in the real world.

These benchmarks are as follows:

- "channels", which sets up TASKS tasks, where each task uses a channel
  to wake up the next one.
- "server", which tries to simulate a web server-type scenario.
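
The shape of the "channels" workload can be sketched as a chain where each stage waits on a channel and then wakes the next one. This sketch assumes OS threads and `std::sync::mpsc` in place of executor tasks and async channels, and `channel_chain` is a hypothetical name; it is not the benchmark code itself.

```rust
use std::sync::mpsc;
use std::thread;

/// Builds a chain of `tasks` stages; each stage waits for a message from the
/// previous stage and forwards it, incremented, to the next.
/// Returns the number of hops the message made.
pub fn channel_chain(tasks: usize) -> usize {
    let (first_tx, mut prev_rx) = mpsc::channel::<usize>();
    let mut handles = Vec::with_capacity(tasks);

    for _ in 0..tasks {
        let (tx, rx) = mpsc::channel::<usize>();
        // This stage reads from the previous receiver; the new receiver is
        // kept for the next stage (or for the final read below).
        let inbox = std::mem::replace(&mut prev_rx, rx);
        handles.push(thread::spawn(move || {
            let hops = inbox.recv().expect("previous stage hung up");
            tx.send(hops + 1).expect("next stage hung up");
        }));
    }

    // Kick off the chain, then wait for the message to reach the end.
    first_tx.send(0).expect("first stage hung up");
    let total = prev_rx.recv().expect("last stage hung up");
    for handle in handles {
        handle.join().expect("stage panicked");
    }
    total
}
```

Each stage does almost no work of its own, so the measured time is dominated by wake-up latency along the chain.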

Signed-off-by: John Nunley <[email protected]>
Resolves #111. Creates a `StaticExecutor` type under a feature flag and allows 
constructing it from an `Executor` via `Executor::leak`. Unlike the executor 
it came from, it's a wrapper around a `State` and omits all changes to 
`active`.

Note, unlike the API proposed in #111, this PR also includes an unsafe 
`StaticExecutor::spawn_scoped` for spawning non-'static tasks, where the 
caller is responsible for ensuring that the task doesn't outlive the borrowed 
state. This would be required for Bevy to migrate to this type, where we're 
currently using lifetime transmutation on `Executor` to enable 
`Thread::scope`-like APIs for working with borrowed state. `StaticExecutor` 
does not have an external lifetime parameter so this approach is infeasible 
without such an API.
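
The idea behind leaking can be sketched with toy types. `ToyExecutor` and `ToyStaticExecutor` are hypothetical and only mirror the shape described above: the state is moved into a `Box` that is never freed, which yields a `&'static` reference with no lifetime parameter to track.

```rust
use std::sync::Mutex;

/// Toy executor that owns its state (hypothetical).
pub struct ToyExecutor {
    queue: Mutex<Vec<u32>>,
}

/// Toy leaked variant: a thin wrapper around the same state (hypothetical).
pub struct ToyStaticExecutor {
    queue: Mutex<Vec<u32>>,
}

impl ToyExecutor {
    pub fn new() -> Self {
        Self { queue: Mutex::new(Vec::new()) }
    }

    pub fn spawn(&self, task: u32) {
        self.queue.lock().unwrap().push(task);
    }

    /// Moves the state into a leaked allocation. The memory is never
    /// reclaimed, which is the price of the `'static` lifetime.
    pub fn leak(self) -> &'static ToyStaticExecutor {
        Box::leak(Box::new(ToyStaticExecutor { queue: self.queue }))
    }
}

impl ToyStaticExecutor {
    pub fn spawn(&self, task: u32) {
        self.queue.lock().unwrap().push(task);
    }

    pub fn len(&self) -> usize {
        self.queue.lock().unwrap().len()
    }
}
```

Because the reference is `'static`, it can be handed to `std::thread::spawn` directly, with no scoped threads or reference counting.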

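The performance gains while using the type are substantial for the spawn-heavy benchmarks (multi-threaded `spawn_many_local` drops from about 27.7 ms to about 4.2 ms, and `spawn_recursively` from about 166 ms to about 43 ms), while `spawn_one` and `yield_now` stay roughly the same: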

```
single_thread/executor::spawn_one
                        time:   [1.6157 µs 1.6238 µs 1.6362 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_batch
                        time:   [28.169 µs 29.650 µs 32.196 µs]
Found 19 outliers among 100 measurements (19.00%)
  10 (10.00%) low severe
  3 (3.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_many_local
                        time:   [6.1952 ms 6.2230 ms 6.2578 ms]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_recursively
                        time:   [50.202 ms 50.479 ms 50.774 ms]
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
single_thread/executor::yield_now
                        time:   [5.8795 ms 5.8883 ms 5.8977 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

multi_thread/executor::spawn_one
                        time:   [1.2565 µs 1.2979 µs 1.3470 µs]
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe
multi_thread/executor::spawn_batch
                        time:   [38.009 µs 43.693 µs 52.882 µs]
Found 22 outliers among 100 measurements (22.00%)
  21 (21.00%) high mild
  1 (1.00%) high severe
Benchmarking multi_thread/executor::spawn_many_local: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 386.6s, or reduce sample count to 10.
multi_thread/executor::spawn_many_local
                        time:   [27.492 ms 27.652 ms 27.814 ms]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
Benchmarking multi_thread/executor::spawn_recursively: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 16.6s, or reduce sample count to 30.
multi_thread/executor::spawn_recursively
                        time:   [165.82 ms 166.04 ms 166.26 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
multi_thread/executor::yield_now
                        time:   [22.469 ms 22.649 ms 22.798 ms]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) low severe
  3 (3.00%) low mild

single_thread/leaked_executor::spawn_one
                        time:   [1.4717 µs 1.4778 µs 1.4832 µs]
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe
single_thread/leaked_executor::spawn_many_local
                        time:   [4.2622 ms 4.3065 ms 4.3489 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
single_thread/leaked_executor::spawn_recursively
                        time:   [26.566 ms 26.899 ms 27.228 ms]
single_thread/leaked_executor::yield_now
                        time:   [5.7200 ms 5.7270 ms 5.7342 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

multi_thread/leaked_executor::spawn_one
                        time:   [1.3755 µs 1.4321 µs 1.4892 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
multi_thread/leaked_executor::spawn_many_local
                        time:   [4.1838 ms 4.2394 ms 4.2989 ms]
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild
multi_thread/leaked_executor::spawn_recursively
                        time:   [43.074 ms 43.159 ms 43.241 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
multi_thread/leaked_executor::yield_now
                        time:   [23.210 ms 23.257 ms 23.302 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
```
Signed-off-by: John Nunley <[email protected]>
For example:

warning: unresolved link to `tick`
   --> src/static_executors.rs:306:47
    |
306 | /// to consistently drive the executor with [`tick`] or [`run`] will cause the all spawned
    |                                               ^^^^ no item named `tick` in scope
    |
    = help: to escape `[` and `]` characters, add '\' before them like `\[` or `\]`
    = note: `#[warn(rustdoc::broken_intra_doc_links)]` on by default
Signed-off-by: John Nunley <[email protected]>
Signed-off-by: John Nunley <[email protected]>
Closes #135.

This enables the executor to be used in presence of panics in user
callbacks, such as the iterator and `impl Extend` in `spawn_many`.

Mutex poisoning is more of a lint than a safety requirement, as
containers (such as `Slab`) and wakers have to be sound in presence of
panics anyway. In this particular case, the exact behavior of `active`
is not relied upon for soundness.
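
Ignoring poisoning with the standard library looks roughly like this. The helper name `lock_ignoring_poison` is hypothetical; `PoisonError::into_inner` is the standard way to recover the guard.

```rust
use std::sync::{Mutex, MutexGuard, PoisonError};

/// Locks `m`, recovering the guard even if another thread panicked while
/// holding the lock. Callers must tolerate whatever state that thread left.
pub fn lock_ignoring_poison<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(PoisonError::into_inner)
}
```

After a user callback panics while the lock is held, `m.lock().unwrap()` would panic on every later call, whereas this helper keeps returning a usable guard.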
Closes #131

Signed-off-by: John Nunley <[email protected]>
By creating the future manually instead of relying on `async { .. }`, we
work around rustc's inefficient future layout. On
[a simple benchmark](https://github.com/hez2010/async-runtimes-benchmarks-2024)
spawning 1M tasks, this reduces memory use from about 512 bytes per
future to about 340 bytes per future.

More context: hez2010/async-runtimes-benchmarks-2024#1
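
As a rough illustration of "creating the future manually", the sketch below wraps an inner future in a plain struct with a drop guard instead of an `async` block. All names (`CallOnDrop`, `GuardedFuture`, `poll_once`) are hypothetical, the `Unpin` bounds are a simplification that avoids pin projection, and the sketch does not attempt to reproduce the memory numbers above.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Runs the closure when dropped.
pub struct CallOnDrop<D: FnMut()>(pub D);

impl<D: FnMut()> Drop for CallOnDrop<D> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Hand-written equivalent of `async move { let _guard = ...; inner.await }`.
/// Fields drop in declaration order, so `inner` is dropped before the guard runs.
pub struct GuardedFuture<F, D: FnMut()> {
    inner: F,
    _guard: CallOnDrop<D>,
}

impl<F, D: FnMut()> GuardedFuture<F, D> {
    pub fn new(inner: F, on_drop: D) -> Self {
        Self { inner, _guard: CallOnDrop(on_drop) }
    }
}

// Simplifying assumption: both parts are `Unpin`, so no unsafe projection is needed.
impl<F: Future + Unpin, D: FnMut() + Unpin> Future for GuardedFuture<F, D> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Forward the poll straight to the wrapped future.
        Pin::new(&mut self.inner).poll(cx)
    }
}

/// Waker that does nothing, used only to drive a single poll.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once with a no-op waker.
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

The struct holds exactly the inner future and the guard, so its layout is explicit rather than left to the compiler-generated state machine of an `async` block.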