x86_64: Implement integer saturating left shifting codegen #22529

xtexChooser
Contributor

@xtexChooser xtexChooser commented Jan 18, 2025

Similarly to shl_with_overflow, we first SHL/SAL the integer, then SHR/SAR it back and compare the result with the original value to check whether overflow happened.
If it did, the result is set to the upper limit so that the shift saturates.
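
The check described above can be sketched in plain Zig for the unsigned case. This is only a reference model of the idea, not the backend code from this PR; the function name `shlSatUnsigned` is hypothetical.

```zig
const std = @import("std");

/// Hypothetical reference model (not the backend code): saturating left shift
/// for an unsigned integer type, using the shift-and-shift-back check.
fn shlSatUnsigned(comptime T: type, lhs: T, rhs: std.math.Log2Int(T)) T {
    const shifted = lhs << rhs; // plain `<<` discards bits shifted out of T
    const restored = shifted >> rhs; // logical shift right for unsigned T
    // If shifting back does not reproduce the original value, bits were lost.
    if (restored != lhs) return std.math.maxInt(T);
    return shifted;
}

pub fn main() void {
    // 0b0101 << 2 loses a set bit in a u4, so the result saturates to 15.
    std.debug.print("{d}\n", .{shlSatUnsigned(u4, 5, 2)});
    // 0b0011 << 2 = 0b1100 fits in a u4, so the result is 12.
    std.debug.print("{d}\n", .{shlSatUnsigned(u4, 3, 2)});
}
```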

Theoretically, if the left shift is lowered to a single SHL/SAL instruction and the left operand fits in a register (so no truncation is needed), the CF flag can be used to check for overflow. However, that optimization is not implemented right now (out of laziness on my part).

Bug: #17645

@xtexChooser xtexChooser changed the title x86_64: Implement integer saturing left shifting codegen x86_64: Implement integer saturating left shifting codegen Jan 18, 2025
@xtexChooser
Contributor Author

What's the CI error? It seems that I didn't touch that region of code.

@andrewrk andrewrk requested a review from jacobly0 January 19, 2025 03:29
@mlugg
Member

mlugg commented Jan 19, 2025

Don't mind that failure -- it's an inconsistent issue which is being debugged as I type this. Once it's fixed, a rebase should solve that error.

@jacobly0
Member

jacobly0 commented Jan 19, 2025

I don't like that this introduces a miscomp. I would rather the backend error on unsupported types than silently produce incorrect code.

@xtexChooser
Contributor Author

@jacobly0 Which unsupported types? There is already a compiler error when an unsupported type is hit.

https://github.com/ziglang/zig/blob/ec2df79cbf25e53c79d53dd6d461bd9d7457bd85/src/arch/x86_64/CodeGen.zig#L12944-L12946

@jacobly0
Member

$ cat check.zig
const std = @import("std");

fn shlSat(x: anytype, y: std.math.Log2Int(@TypeOf(x))) @TypeOf(x) {
    return x <<| y;
}

fn testType(comptime T: type) bool {
    var ok = true;
    comptime var rhs: std.math.Log2Int(T) = 0;
    inline while (true) : (rhs += 1) {
        comptime var lhs: T = std.math.minInt(T);
        inline while (true) : (lhs += 1) {
            ok = shlSat(lhs, rhs) == lhs <<| rhs and ok;
            if (lhs == std.math.maxInt(T)) break;
        }
        if (rhs == @bitSizeOf(T) - 1) break;
    }
    return ok;
}

pub fn main() void {
    var ok = true;
    ok = testType(i2) and ok;
    ok = testType(u2) and ok;
    ok = testType(i3) and ok;
    ok = testType(u3) and ok;
    ok = testType(i4) and ok;
    ok = testType(u4) and ok;
    std.debug.print("{s}\n", .{if (ok) "ok" else "bad"});
}
$ zig run -fllvm check.zig
ok
$ zig run -fno-llvm check.zig
bad

@xtexChooser
Contributor Author

@jacobly0 ahh! Sorry for my mistake and thanks for your careful review!

I forgot that negative values need to saturate to the minimum value, and it seems that the saturating_arithmetic.zig behavior test is not strong enough. The force-push above should fix the error, and the check you gave now passes.
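
The signed rule mentioned here can be illustrated with a small reference model. It is a sketch under the assumption that overflow is detected by shifting back arithmetically. The name `shlSatSigned` is hypothetical and this is not the code emitted by the backend.

```zig
const std = @import("std");

/// Hypothetical reference model (not the backend code): saturating left shift
/// for a signed integer type. On overflow, negative inputs saturate to the
/// minimum value and non-negative inputs saturate to the maximum value.
fn shlSatSigned(comptime T: type, lhs: T, rhs: std.math.Log2Int(T)) T {
    const shifted = lhs << rhs; // bits shifted out of T are discarded
    const restored = shifted >> rhs; // arithmetic shift right for signed T
    if (restored == lhs) return shifted; // nothing was lost
    if (lhs < 0) return std.math.minInt(T);
    return std.math.maxInt(T);
}

pub fn main() void {
    // -3 * 4 = -12 does not fit in an i4, so the result saturates to -8.
    std.debug.print("{d}\n", .{shlSatSigned(i4, -3, 2)});
    // 3 * 4 = 12 does not fit in an i4, so the result saturates to 7.
    std.debug.print("{d}\n", .{shlSatSigned(i4, 3, 2)});
    // -2 * 2 = -4 fits, so the result is -4.
    std.debug.print("{d}\n", .{shlSatSigned(i4, -2, 1)});
}
```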

However, x86 does not have a saturating shift instruction, so we have to emit up to two shifts and two conditional branches, roughly 10 MIR instructions in total. I doubt whether the shl_sat syntax and IR are useful, as it seems to be a pretty rare case.

Member

@jacobly0 jacobly0 left a comment

Looks good enough to get in the next release, if I don't manage to finish the rewrite before then.

.signed => {
    // check the sign of lhs
    try self.genBinOpMir(.{ ._, .cmp }, lhs_ty, lhs_mcv, try self.genTypedValue(try self.pt.intValue(lhs_ty, 0)));
    const sign_reloc_condbr = try self.genCondBrMir(lhs_ty, .{ .eflags = Condition.g });
Member

The ZF is not reliable after genBinOpMir(.{ ._, .cmp }, ..), please use a condition that does not depend on it instead.

Contributor Author

I am going to use one more SHR to get the sign bit and pass it directly as condition. Thank you for the catch (:

It seems that airCmp is also using genBinOpMir with cmp and returning eflags conditions. Is this intended?
https://github.com/ziglang/zig/blob/5cfcb015033864c769235726975ab9919c217ef9/src/arch/x86_64/CodeGen.zig#L20662-L20663

Member

@jacobly0 jacobly0 Jan 20, 2025

As you can see here, all conditions depending on ZF are not possible in that code path.

Contributor Author

eq and neq are also ZF conditions, or am I misunderstanding? If they are okay to use, then the current code should work (the gt has been replaced by sign bit == 1); otherwise the equality condition above this piece of code also needs to be replaced.

Member

Indeed, airShlWithOverflow is also wrong.

Contributor Author

How about hacking genBinOpMir to emit JCC after each asmRegisterRegister?

Member

@jacobly0 jacobly0 Jan 21, 2025

If only it were that simple... I'm sure there exists a jcc that would work for each desired condition, but we don't know the desired condition there.

Either extract the contents of airCmp into a separate function, or fail if the int is > 64 bits.
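
One plausible reading of why ZF cannot be trusted for integers wider than 64 bits is sketched below. It assumes that a wide compare is lowered limb by limb as `cmp` on the low limb followed by `sbb` on the high limb; that lowering is an assumption here, not something confirmed in this thread. Under that assumption, ZF after the final `sbb` reflects only the high limb, while CF still reflects the full-width comparison. The names `Flags` and `cmpLimbs` are hypothetical.

```zig
const std = @import("std");

/// Flags as they would look after the last instruction of a two-limb compare.
const Flags = struct { zero: bool, carry: bool };

/// Hypothetical model of comparing two 128-bit values held as 64-bit limbs:
/// `cmp a_lo, b_lo` followed by `sbb a_hi, b_hi`.
fn cmpLimbs(a_lo: u64, a_hi: u64, b_lo: u64, b_hi: u64) Flags {
    const borrow_lo = a_lo < b_lo; // CF after `cmp a_lo, b_lo`
    const borrow_in: u64 = if (borrow_lo) 1 else 0;
    const hi_result = a_hi -% b_hi -% borrow_in; // value computed by `sbb`
    // CF after `sbb`: set when the full-width subtraction borrows.
    const carry = a_hi < b_hi or (a_hi == b_hi and borrow_lo);
    // ZF after `sbb`: set when the high-limb result alone is zero.
    return .{ .zero = hi_result == 0, .carry = carry };
}

pub fn main() void {
    // a = (hi 1, lo 5) and b = (hi 1, lo 3) differ, yet ZF ends up set.
    const flags = cmpLimbs(5, 1, 3, 1);
    std.debug.print("zero={any} carry={any}\n", .{ flags.zero, flags.carry });
}
```

In this model the operands are unequal but `zero` is true, so any condition that reads ZF (`e`, `ne`, `g`, `le`, `a`, `be`) would be wrong, while `carry` is false, which correctly says that a is not below b.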
