split_byte_interval and join_byte_intervals break jump tables in instrumented binaries
#15
Accidentally created empty issue, updated now with information.
I should note that even if |
Thanks for filing this. The current behavior is definitely wrong in two ways:
I don't have any cycles to work on this currently, but a temporary workaround is to pre-process the IR before rewriting and add explicit alignment for those blocks in advance (aligned to a 4-byte boundary). That should prevent rewriting from inserting the problematic padding bytes.
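The reasoning behind the 4-byte suggestion can be sketched in plain Python. This is an illustration only: an ordinary dict stands in for the IR's alignment table, pin_alignment and paddings are hypothetical helpers rather than gtirb-rewriting functions, and the jump table entries are assumed to be 4 bytes wide.

```python
from typing import Dict, Iterable, List


def pin_alignment(table: Dict[str, int], block_ids: Iterable[str],
                  boundary: int = 4) -> Dict[str, int]:
    """Give every block without an explicit alignment the given boundary.

    Blocks that already have an explicit entry are left alone.
    """
    for block_id in block_ids:
        table.setdefault(block_id, boundary)
    return table


def paddings(start: int, sizes: List[int], boundary: int) -> List[int]:
    """Padding needed before each block when all share one alignment."""
    result = []
    cursor = start
    for size in sizes:
        pad = -cursor % boundary  # bytes needed to reach the next boundary
        result.append(pad)
        cursor += pad + size
    return result


# Stand-in block names; a real pre-processing pass would use the IR's blocks.
table = pin_alignment({}, ["entry0", "entry1", "entry2", "entry3"])

# Even if instrumentation shifts the table to an odd address, only the
# first entry needs padding; the entries themselves stay contiguous.
pads = paddings(0x140001003, [4, 4, 4, 4], boundary=4)
print(table, pads)
```

With every entry pinned to the same boundary as its size, any padding lands in front of the table rather than between entries, so the relative offsets inside the table are preserved.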
Just checking: it appears that commit dcc4624 solves this issue?
Mostly, yep. gtirb-rewriting no longer infers block alignments based on the input alignment, but it still will insert padding bytes to preserve the explicit alignment. It probably shouldn't be doing that either, but I don't think it will affect this issue.
Instrumentation I have been writing for x64 Windows binaries breaks on any complex program because of faulty indirect jumps, which come from broken switch jump tables or C++ jump tables.
I believe this is because the split_byte_interval function, called before rewriting, applies an alignment directive to each DataBlock before separating each DataBlock into its own ByteInterval:
gtirb-rewriting/gtirb_rewriting/intervalutils.py
Lines 94 to 98 in 1e116e6
After rewriting, join_byte_intervals may add null bytes to a DataBlock to implement this alignment:
gtirb-rewriting/gtirb_rewriting/intervalutils.py
Lines 229 to 231 in 1e116e6
This means that, depending on how the added instrumentation affects alignment, padding bytes may be added between DataBlocks in the instrumented binary. This breaks any kind of jump table or structure in memory that depends on items having a fixed relative offset from each other.
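The effect can be reproduced with a small, self-contained model. This is a sketch under stated assumptions, not the gtirb-rewriting code: inferred_alignment is a hypothetical stand-in that picks the largest power of two (capped at 16) dividing a block's original address, and layout places blocks in order while padding each one up to its alignment. The addresses are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    size: int
    align: int  # required alignment in bytes


def inferred_alignment(address: int, cap: int = 16) -> int:
    """Assumed inference rule: largest power of two (<= cap) dividing address."""
    align = 1
    while align < cap and address % (align * 2) == 0:
        align *= 2
    return align


def layout(start: int, blocks: List[Block]) -> List[int]:
    """Place blocks in order, padding each up to its alignment."""
    addresses = []
    cursor = start
    for block in blocks:
        cursor += -cursor % block.align  # padding bytes before this block
        addresses.append(cursor)
        cursor += block.size
    return addresses


# A jump table of four 4-byte entries that originally started at 0x1004.
original_start = 0x1004
table = [Block(4, inferred_alignment(original_start + 4 * i)) for i in range(4)]

before = layout(original_start, table)
# Instrumentation adds 4 bytes of code ahead of the table.
after = layout(original_start + 4, table)

offsets_before = [a - before[0] for a in before]
offsets_after = [a - after[0] for a in after]
print(offsets_before)  # [0, 4, 8, 12]
print(offsets_after)   # [0, 8, 12, 24]
```

In this model the entries keep their per-address alignments (4, 8, 4, 16) after the table moves, so padding appears between entries and an index into the table no longer lands on the intended entry.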
Below is a minimal example. I compiled the program below using x64 MSVC as follows:
cl /Zi .\main.c /Feswitch64.exe
In the compiled binary, the generated switch table is positioned immediately after the execute_case function. I added the following instrumentation to the last block of this function. The binary I compiled and used for this test is here: switch64.instrumented.exe.zip
This is what the generated (broken) binary looks like:
Compared to the original:
This is the switch table in the generated assembly, which contains the padding bytes that break the jump table:
I have found that a temporary workaround is to comment out the lines in split_byte_interval that apply alignment. As my binaries don't appear to have any alignment information in their IR to begin with, this doesn't cause any issues and fixes the problem.
A proper fix might involve modifying split_byte_interval to never split contiguous DataBlocks into different ByteIntervals. This documentation: https://grammatech.github.io/gtirb/python/gtirb.byteinterval.html states: "If two blocks are in two different ByteIntervals, then it should be considered safe (that is, preserving of program semantics) to move one block relative to the other in memory". I think it is a fair assumption that contiguous DataBlocks should maintain the same relative offsets to each other post-instrumentation, and with that documentation in mind this would mean they shouldn't be split into separate ByteIntervals.
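A minimal sketch of the grouping step such a fix would need, written against plain (offset, size) pairs rather than real gtirb objects. group_contiguous is a hypothetical helper, not part of gtirb-rewriting; it only shows how runs of back-to-back blocks could be identified so that each run stays in a single ByteInterval.

```python
from typing import List, Tuple

BlockSpan = Tuple[int, int]  # (offset within the interval, size in bytes)


def group_contiguous(blocks: List[BlockSpan]) -> List[List[BlockSpan]]:
    """Group blocks into runs where each block starts where the previous ends."""
    groups: List[List[BlockSpan]] = []
    for offset, size in sorted(blocks):
        if groups:
            prev_offset, prev_size = groups[-1][-1]
            if prev_offset + prev_size == offset:
                groups[-1].append((offset, size))
                continue
        groups.append([(offset, size)])
    return groups


# Three back-to-back 4-byte entries, then a gap, then an unrelated block.
spans = [(0, 4), (4, 4), (8, 4), (16, 8)]
runs = group_contiguous(spans)
print(runs)  # [[(0, 4), (4, 4), (8, 4)], [(16, 8)]]
```

Each run would then be moved as a unit, so the relative offsets inside a jump table stay fixed while unrelated blocks remain free to move.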