-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ISA Specification bug: Bounds check during taken branches and jumps is either redundant or wrong #18
Comments
Thanks Murali, this is an interesting one. I think the current check will precisely identify the instruction responsible for an out of bounds PC in most cases (e.g. a computed jump gone wrong, possibly due to an attack). I expect the case of jumping to the last instruction in PCC (whether 2 or 4-bytes) would be quite rare, but the example you raise is an interesting one. I think @davidchisnall or @nwf-msr are better qualified to comment there. In terms of the potential solutions: a) At first glance this looks bad because it means you can't jump to a compressed instruction at the end of PCC but this could be solved by padding with a Other options: d) retain status quo and accept current limitation of the check, even if it is a bit inconsistent I'd be interested to hear from @kliuMsft re. the relative hw cost of each of these, especially since the current check is potentially a little costly. |
I think the pain involved here depends a lot on the use case. I think the ABI requires functions to be aligned to 4-byte boundaries and so you almost certainly need padding after the end of anything where PCC ends on a 2-byte boundary, and adding a c.nop and strongly aligning the end is easier for statically linked code. For JIT'd code, it depends a bit on what your memory management looks like. I think the trampoline example is very unlikely to exist on embedded code: on-device generation of fine-grained code like this is typically limited to the kinds of VMs that require 8+ MiBs of RAM to run, so I don't object to the current version (if you really want to track trampolines like this, allocate 4 bytes). If moving to a 4-byte check in the cj[al][r] instruction would simplify the implementation (@kliuMsft ?) then I don't think it will affect and sensible code and shouldn't hurt our ABI in any way. |
I dont think this will be a lot of hardware complexity - at least not in the critical path. The check will happen during instruction fetch, and you just need to add a mux when writing the EPC in the case of a taken previous instruction and out-of-bounds exception.
Right, this would give the software freedom to do whatever it wants - it can completely ignore the previous PC (making it really simple). Not sure why s/w gets more complex.
Given how early you are in the spec process, I don't think it's a good idea.
This is also bad when I am writing an automatic jitter. I will need to fix both the caller code and callee code. First I will fix the callee code, and then on encountering the out-of-bounds exception, I will fix the caller code. |
I think there's one more issue - representability checks. Previously, the bounds check at jumps/branches for the target+2 ensured the cap is representable. But if one removes the check during jump/branch execution, then one still needs to do the representability check during jump/branch execution. Alternatively, the bounds check during fetch can be done using the previous pc bounds and current pc address for taken branches/jumps. This makes the overall solution less than ideal. But I still contend that it's an improvement. |
I think #37 addresses this. Feel free to comment further if you disagree. |
It looks like the bounds check uses "minimum-inst-bytes" when checking if a taken address is in-bounds. This means that if the jump/branch target is a 32-bit instruction, where the last 16 bits are out-of-bounds, we will get an out-of-bounds error after fetching the target instruction, which kind of defeats the purpose w.r.t. triaging why the branch or the jump failed; we might as well not do out-of-bounds checks during jumps/taken-branches and wait for the instruction fetch to perform the bounds check.
A potential scenario where this exception (as it is spec-ed currently) becomes useless is as follows:
We have a compressed trampoline initially, allowing us to jump correctly to the trampoline and out of it with just a 2-byte bounds. Now the trampoline is removed and the code is inlined (perhaps some JIT-optimization). When we are debugging/trap handling an out-of-bounds exception because of this, we will still want access to the old PC, so that we can fix the PC bounds there. The way the spec is, we will lose the previous PC information in this scenario, which is not ideal.
Potential fixes:
a) check for 4-bytes in-bounds on taken-branch/jump. This has the problem if the target is a single instruction compressed trampoline with precisely set bounds - we will cause a spurious exception while executing the previous branch/jump.
b) store the previous PC and taken-ness in some micro-architectural state, and if the new instruction's fetch fails check for boundedness, set the EPC with the previous PC if the taken-ness is set. If taken-ness is not set, set EPC to current PC.
c) store the previous PC and taken-ness in some architectural state, and every boundedness failure during instruction fetch stores the previous PC and current PC. This could be easier for writing the trap handlers - the return address for the trap handler is still the current PC, but the fix of the bounds will be done on the previous PC as opposed to (b) where the trap handler has to compute the PC to return to.
The text was updated successfully, but these errors were encountered: