-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CHERI-RISC-V] Report true for __atomic_always_lock_free(sizeof(__intcap)) #721
base: dev
Are you sure you want to change the base?
Conversation
060240c
to
d488825
Compare
These projects now check for __atomic_always_lock_free on uintptr_t which returns false for purecap until the pull request #721 is merged.
These projects now check for __atomic_always_lock_free on uintptr_t which returns false for purecap until the pull request #721 is merged.
056a8a4
to
a541ac1
Compare
unsigned FailureOrdering : 4; // enum AtomicOrdering | ||
unsigned FailureOrdering : 3; // enum AtomicOrdering | ||
/// Whether capability comparisons are exact for cmpxchg atomic operations. | ||
unsigned ExactCompare : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered adding two new ISD:: operations for the exact CMPXCHG, but adding it here seems less invasive.
This will be used in the follow-up commits to determine whether to lower cmpxchg with an exact capability comparison or just the address.
This will be used to support exact comparisons in cmxpchg lowering.
This also allows dropping one test that is now redundant since only the pointee type differed.
This should use an exact capability comparison instead of an address compare.
If the exact flag is set on the cmpxchg instruction, we use an exact comparison in the LL/SC expansion. This will allow use of capability cmpxchg for capability-sized integers (since we have to compare all bits in that case) and will also make it possible to correctly handle the exact-equals semantics in the future.
In order to report true for __atomic_always_lock_free(sizeof(uintptr_t), 0) we will have to also expand i128/i64 atomics using capability ops.
We can perform an atomic capability load and extract the raw bits. As can be seen from the test case, the IR-level conversion to i128 using shift+or is converted to a direct read into the register holding the high part of an i128.
We can perform an atomic capability store from the raw bits. As can be seen from the test case, the IR-level i128 shift is elided and directly replaced with a csethigh.
Now that we have an exact bit on cmpxchg we can lower 2*XLen integer cmpxchg using a capability operation with the raw bits. While this also compares the tag bit in addition to the data bits this should not matter. IMO writing a valid capability with the same bit pattern but a different tag bit to the location should be treated as a conflicting store.
This completes the set of atomic operations on i128/i64 and will allow use to return true for `__atomic_always_lock_free(sizeof(__intcap))`.
This is a lot more efficient that a CMPXCHG loop.
We were just missing the necessary tablegen patterns to support this.
With the latest commit we can lower i64/i128 loads stores as an atomic capability store with the appropriate fences.
We don't yet return true here for purecap CHERI since right now clang will emit libcalls for non-capability sizeof(uintptr_t) operations such as __int128 for 64-bit RISC-V.
…cap)) Now that we can expand all 2*XLen atomics inline (at least for purecap), we can report true for this builtin. This fixes problems such as std::atomic<void*>::is_lock_free reporting false in C++14 mode as well as a compilation error in compiler-rt atomic.c.
We are assuming there can only ever be one memory operand on the pseudo instructions but there could be more than one if the branch-folder pass merges identical blocks. Found while building CheriBSD.
Using separate pseudos for exact and inexact comparsions ensures that our lowering does not depend on the MachineMemOperand (after SDAG) since passes could drop it (which means use the most conservative approach). This adds a bit of boilerplate but it's not as bad as I expected and is less fragile than the previous approach.
a541ac1
to
0eb4a79
Compare
// This can happen when expanding a RMW to a cmxchg that also needs | ||
// expanding. | ||
// TODO: assert that we don't recurse more than once or use a worklist? | ||
runOnFunction(F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fixed to just run the expansion when needed. Otherwise this PR should be ready for review.
This fixes problems such as std::atomic<void*>::is_lock_free reporting
false in C++14 mode as well as a compilation error in compiler-rt
atomic.c.
Ideally the builtin would take a type rather than a size but unfortunately this cannot be changed any more. So instead we just ensure that all capability-sized atomics are lock free.
Making this possible involves quite a lot of changes:
Note: we do not yet report true here for hybrid mode since we don't yet support all atomic ops with capability pointers: see #490
This is a rather large change but each commit should be small enough to review.