Crysis 2 Maximum edition: Audio thread completely saturated with x87, breaks audio #4252
Comments
Interesting - this is a good example to extract potential blocks for optimization in the x87 stack optimization pass.
An easy win I noticed here is that vector stores aren't handling the small store ranges that are possible in arm64. Loads seem to handle it, though.
"Test": {
"x86InstructionCount": 4,
"ExpectedInstructionCount": 15,
"x86Insts": [
"fld dword [ebp+16380]",
"fstp dword [eax-0x4]",
"fld dword [ebp-0x8]",
"fstp dword [eax+16370]"
],
"ExpectedArm64ASM": [
"ldr s2, [x9, #16380]",
"sub w20, w4, #0x4 (4)",
"str s2, [x20]",
"ldur s2, [x9, #-8]",
"mov w20, #0x3ff2",
"add w20, w4, w20",
"str s2, [x20]",
"ldrb w20, [x28, #1019]",
"add w20, w20, #0x7 (7)",
"and w20, w20, #0x7",
"ldrb w21, [x28, #1298]",
"mov w22, #0x1",
"lsl w20, w22, w20",
"bic w20, w21, w20",
"strb w20, [x28, #1298]"
]
}
Happens for both 64-bit mode with 64-bit addressing, and 32-bit mode with 32-bit addressing. Should be an easy win to fold those immediates into str and stur.
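To make the "small store ranges" point concrete, here is a minimal sketch of the arm64 immediate-offset rules that decide whether a displacement can be folded into the memory instruction itself. The function names are hypothetical and are not taken from FEX; the sketch only models the encoding ranges (a signed 9-bit byte offset for ldur/stur, and an unsigned 12-bit offset scaled by the access size for ldr/str) and ignores anything else a JIT would need to consider, such as 32-bit address wraparound.

```python
def fits_unscaled(offset: int) -> bool:
    """ldur/stur form: signed 9-bit byte offset, no alignment requirement."""
    return -256 <= offset <= 255


def fits_scaled(offset: int, size: int) -> bool:
    """ldr/str unsigned-offset form: 12-bit immediate scaled by the access size."""
    return offset >= 0 and offset % size == 0 and offset // size <= 4095


def addressing_form(offset: int, size: int) -> str:
    """Pick a form for a given displacement (hypothetical helper, not FEX code)."""
    if fits_scaled(offset, size):
        return "scaled"        # e.g. ldr/str s2, [base, #offset]
    if fits_unscaled(offset):
        return "unscaled"      # e.g. ldur/stur s2, [base, #offset]
    return "materialize"       # address must be computed separately


if __name__ == "__main__":
    # Displacements from the test block above, all 4-byte (dword) accesses.
    for off in (16380, -4, -8, 16370):
        print(off, addressing_form(off, 4))
```

Under this model, the -4 store fits the unscaled form, just as the -8 load already uses ldur in the expected output. The +16370 store is not a multiple of 4 and lies outside the unscaled range, so in this simplified view it would still need its address computed separately.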
Interesting - I am a bit confused about this. Are you on a private branch or sitting on uncommitted patches? I get 22 insts. This is the full json I am testing:
Maybe because I had a few options enabled.
The difference was not actually those options but the bitness. 32-bit results in 15 insts, 64-bit results in 22 insts. I have so many questions... :)
Oh right, the example block is from a 32-bit game, so yeah, it would need a 32-bit option enabled.
Yeah, I am fixing this. I made an incorrect assumption about memory optimization when I implemented the x87 stack optimization pass.
Mentioned initially in FEX-Emu#4252.
Includes tests and instcountci files. When the x87 optimizations were implemented, we missed optimizing different addressing modes. This commit addresses this issue. Discussed in FEX-Emu#4252.
The first level of the game completely saturates its audio thread, causing it to drop samples, even in reduced precision mode. Pretty much all of the CPU time is spent in its fmodex.dll, and most of the blocks are just a ton of x87.
fmodex.dll base address at 0x2e20000 in this capture.
First block: