Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange split types coercion #585

Closed
hummy123 opened this issue Dec 18, 2024 · 3 comments · Fixed by #586
Closed

Strange split types coercion #585

hummy123 opened this issue Dec 18, 2024 · 3 comments · Fixed by #586

Comments

@hummy123
Copy link

hummy123 commented Dec 18, 2024

Hi there.

I'm having fun making a little game with SML and MLton ((link)[https://github.com/hummy123/quad-tree-sml/]) but I encountered a compiler bug to do with strange type coercion with MLton (freshly built from the latest commit on GitHub so not a reemergence of the bug I reported in May).

To trigger the issue on aarch64-linux (only platform I have tested on):

  • Install the GLFW C library ((link)[https://www.glfw.org/docs/latest/quick.html]
  • Clone the game repository (link)[https://github.com/hummy123/quad-tree-sml]
  • Run the build-unix.sh script
  • The error should have occurred

I did look into the cause for some time and found what part of my code was triggering the error. There is a comment on line 78 of (fcore/player.sml)[https://github.com/hummy123/quad-tree-sml/blob/main/fcore/player.sml] describing what I think the issue is.

I'm having a lot of fun using SML and MLton for side projects (a small GUI app and a Vim clone) and am happy they run on my obscure platform, and appreciate the work that has gone into it.

@MatthewFluet
Copy link
Member

Confirmed (on amd64-linux); since it occurs during compilation of the SML code, one doesn't even need the link argument or .c files.

The bug is actually from the Useless pass; typically, the first thing to do with an internal compiler error is to compile with -type-check true, which instructs the compiler to run the IR type checkers after each transformation. Although not every optimization bug manifests as a type error, many of them do. I get the following:

$ mlton -type-check true -verbose 2 oms.mlb
MLton 20241212.104417-gf862c4112 starting
   Compile SML starting
...
         splitTypes1 starting
            splitTypes1:typeCheck starting
            splitTypes1:typeCheck finished in 0.00 + 0.00 (0% GC)
         splitTypes1 finished in 0.00 + 0.00 (0% GC)
         useless starting
            useless:typeCheck starting
               useless:typeCheck raised: Fail: TypeError (SSA): Ssa.TypeCheck.coerce ({from = (t_0, word64, (bool, word32, word32, player_y_axis_0) tuple, t_0, ((word32, word32, word32, word32, word32) tuple) vector) tuple, to = (t_0, unit, (bool, word32, word32, player_y_axis_0) tuple, t_0, ((word32, word32, word32, word32, word32) tuple) vector) tuple}) in goto loop_0 (x_5, x_4, x_3, x_2, x_1, x_0) in L_0 in helpLoop_0
            useless:typeCheck raised in 0.00 + 0.00 (0% GC)
         useless raised in 0.01 + 0.00 (0% GC)
      ssaSimplify raised in 0.07 + 0.02 (20% GC)
   Compile SML raised in 1.84 + 0.55 (23% GC)
MLton 20241212.104417-gf862c4112 raised in 1.84 + 0.55 (23% GC)

Possibly introduced by #569, though the type error concerns a component of a tuple (from type: (t_0, [word64], ... tuple, t_0, ... vector) tuple, to type (t_0, [unit], ... tuple, t_0, ... vector) tuple).

@MatthewFluet
Copy link
Member

Actually, on second thought and looking at your comment in player.sml, the bug-triggering code seems to be that which manipulates the platforms: platform vector component of the game_type type. #569 allowed certain sequences (arrays and vectors) whose components weren't useful to be simplified to word64 (if the length was useful, but not the contents) or to unit (if neither the length nor the contents were useful). And, looking at the oms.useless.pre.ssa and oms.useless.post.ssa, it does seem that a ((word32, word32, word32, word32) tuple) vector is being optimized to word64 in some places, but to a unit in other places.

@hummy123
Copy link
Author

I double-checked (built and installed the compiler version with the commit hash d688f0e which is from March this year, before the linked issue), and can confirm you're right about the cause being #569 .

I do see that the code I wrote also has a bug/some regressions now that I can compile and test it (collision detection broke) and appreciate the hint that allowed me to do that.

MatthewFluet added a commit to MatthewFluet/mlton that referenced this issue Dec 19, 2024
In the `Value.t` representation of the Useless optimization, sequence and tuple
elements are "slots", which combine a `Useful.t` lattice element with an
`Exists.t` lattice element.  When a value flows, vectors and tuples coerce the
`Useful.t` component of slots but unify the `Exists.t` component.  This ensures
that agree on whether or not the element exists even if they disagree that the
element is useful.  If the "from"'s element is useful (and necessarily existing)
but the "to"'s element is not useful, then forcing the "to"'s element to exist
avoids a potentially expensive runtime coercion (e.g., to drop a component of a
tuple or, worse, to drop a component of a tuple that is the contents of a
vector).

Previously, the length of a sequence was represented simply as a `Useful.t`;
this allowed a vector (whose contents were not useful) with a useful length to
flow to a vector (whose contents are necessarily not useful) with a useless
length.  1aad5fe (Optimize representation of sequences in Useless pass;
MLton#569) allowed the Useless optimization to change the representation
of sequence with useless contents.  In particular, a vector with useless
contents but useless length becomes a `word64` and a vector with useless
contents and useless length becomes a `unit`.

However, when such vectors are themselves components of a tuple, then the
program may have a flow of tuples, where the source tuple is changed from
`(..., ?? vector, ...)` to `(..., word64, ...)` but the destination tuple is
changed from `(..., ?? vector, ...)` to `(..., unit, ...)`.  (Note that the
unification of the `Exist.t` components of the corresponding tuple elements is
what makes the destination tuple have a `unit` element.)

This commit treats the length of arrays and vectors as a "slot", so that they
track both usefulness and existence.  Now, a vector with useless contents but a
length that must exist becomes a `word64` (even if the length is useless) and a
vector with useless contents and a length that need not exist (and is
necessarily useless) becomes a `unit`.

Fixes MLton#585.
MatthewFluet added a commit to MatthewFluet/mlton that referenced this issue Dec 20, 2024
In the `Value.t` representation of the Useless optimization, sequence and tuple
elements are "slots", which combine a `Useful.t` lattice element with an
`Exists.t` lattice element.  When a value flows, vectors and tuples coerce the
`Useful.t` component of slots but unify the `Exists.t` component.  This ensures
that agree on whether or not the element exists even if they disagree that the
element is useful.  If the "from"'s element is useful (and necessarily existing)
but the "to"'s element is not useful, then forcing the "to"'s element to exist
avoids a potentially expensive runtime coercion (e.g., to drop a component of a
tuple or, worse, to drop a component of a tuple that is the contents of a
vector).

Previously, the length of a sequence was represented simply as a `Useful.t`;
this allowed a vector (whose contents were not useful) with a useful length to
flow to a vector (whose contents are necessarily not useful) with a useless
length.  1aad5fe (Optimize representation of sequences in Useless pass;
MLton#569) allowed the Useless optimization to change the representation
of sequence with useless contents.  In particular, a vector with useless
contents but useless length becomes a `word64` and a vector with useless
contents and useless length becomes a `unit`.

However, when such vectors are themselves components of a tuple, then the
program may have a flow of tuples, where the source tuple is changed from
`(..., ?? vector, ...)` to `(..., word64, ...)` but the destination tuple is
changed from `(..., ?? vector, ...)` to `(..., unit, ...)`.  (Note that the
unification of the `Exist.t` components of the corresponding tuple elements is
what makes the destination tuple have a `unit` element.)

This commit treats the length of arrays and vectors as a "slot", so that they
track both usefulness and existence.  Now, a vector with useless contents but a
length that must exist becomes a `word64` (even if the length is useless) and a
vector with useless contents and a length that need not exist (and is
necessarily useless) becomes a `unit`.

Fixes MLton#585.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants