Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split_byte_interval and join_byte_intervals break jump tables in instrumented binaries #15

Open
avncharlie opened this issue Jul 4, 2024 · 5 comments

Comments

@avncharlie
Copy link

avncharlie commented Jul 4, 2024

Instrumentation I have been writing for x64 Windows binaries has been breaking on any complex program due to faulty indirect jumps, coming from broken switch jump tables or C++ jump tables.

I believe this is because the split_byte_interval function called before rewriting separates applies an alignment directive to each DataBlock before separating each DataBlock to its own ByteInterval:

if block.address is None:
# Align the offset, since we don't know the actual address
alignment[block] = effective_alignment(block.offset)
else:
alignment[block] = effective_alignment(block.address)

After rewriting, join_byte_intervals may add null bytes to a DataBlock to implement this alignment:

else:
BlockType = gtirb.DataBlock
pad_bytes = b"\x00"

This means that depending on how added instrumentation effects alignment, padding bytes may be added between DataBlocks in the instrumented binary. This breaks any kind of jump table or structure in memory that depends on items having a fixed relative offset from each other.

Below is a minimal example. I compiled the below program using x64 MSVC as such: cl /Zi .\main.c /Feswitch64.exe

#include <stdio.h>

void execute_case(int case_num) {
    switch(case_num) {
        case 0:
            printf("Case 0\n");
            break;
        case 1:
            printf("Case 1\n");
            break;
        case 2:
            printf("Case 2\n");
            break;
        case 3:
            printf("Case 3\n");
            break;
        case 4:
            printf("Case 4\n");
            break;
        case 5:
            printf("Case 5\n");
            break;
        case 6:
            printf("Case 6\n");
            break;
        case 7:
            printf("Case 7\n");
            break;
        case 8:
            printf("Case 8\n");
            break;
        case 9:
            printf("Case 9\n");
            break;
        default:
            printf("Default case\n");
            break;
    }
}

int main() {
    for (int i = 0; i < 11; i++) {
        execute_case(i);
    }
    return 0;
}

In the compiled binary, the generated switch table is positioned immediately after the execute_case function. I added the following instrumentation to last block of this function. The binary I compiled and used for this test is here: switch64.instrumented.exe.zip

class BreakSwitchPass(Pass):
    def begin_module(self, module, functions, rewriting_ctx):
        for x in module.code_blocks:
            if x.address == 0x14000721c:
                rewriting_ctx.register_insert(
                    SingleBlockScope(x, BlockPosition.ENTRY),
                    Patch.from_function(lambda _:f'''
                    nop
                    nop
                    nop
                    nop
                    ''', Constraints(x86_syntax=X86Syntax.INTEL))
                )

This is what the generated (broken) binary looks like:
image
Compared to the original:
image

This is the switch table in the generated assembly, that contains the padding bytes breaking the jump table:
image

I have found a temporary workaround is commenting out the lines in split_byte_interval that apply alignment. As my binaries don't appear to have any alignment information in their IR to begin with, this doesn't cause any issues and fixes the problem.

A proper fix might involve modifying split_byte_interval to never split contiguous DataBlocks into different ByteIntervals. This documentation: https://grammatech.github.io/gtirb/python/gtirb.byteinterval.html states: "If two blocks are in two different ByteIntervals, then it should be considered safe (that is, preserving of program semantics) to move one block relative to the other in memory". I think it is a fair assumption that contiguous DataBlocks should maintain the same relative offsets to each other post instrumentation, and with the previous documentation in mind this would mean they shouldn't be split into seperate ByteIntervals.

@avncharlie avncharlie changed the title split_byte_interval split_byte_interval and join_byte_intervals break jump tables in instrumented binaries Jul 4, 2024
@avncharlie
Copy link
Author

Accidentally created empty issue, updated now with information.

@avncharlie
Copy link
Author

I should note that even if join_byte_intervals doesn't add null bytes, the jump table will still be aligned in such a way that a gap will be inserted between jump table entries and still break the jump table

@jranieri-grammatech
Copy link
Collaborator

jranieri-grammatech commented Jul 8, 2024

Thanks for filing this. The current behavior is definitely wrong in two ways:

  • gtirb-rewriting should stop generating padding bytes ever and let gtirb-pprinter do that, since gtirb-rewriting does not know what the final layout of the module will be
  • gtirb-rewriting's heuristics for inferring block alignment should be re-evaluated due to situations like this and the AVX issue you previously filed

I don't have any cycles to work on this currently, but a temporary workaround is to pre-process the IR before rewriting and add explicit alignment for those blocks in advance (aligned to a 4-byte boundary). That should prevent rewriting from inserting the problematic padding bytes.

@avncharlie
Copy link
Author

Just checking, it appears this commit solves this issue: dcc4624 ?

@jranieri-grammatech
Copy link
Collaborator

Mostly, yep. gtirb-rewriting no longer infers block alignments based on the input alignment, but it still will insert padding bytes to preserve the explicit alignment. It probably shouldn't be doing that either but I don't think it will affect this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants