Optimize representation of sequences in Useless pass #569

Merged: 8 commits, May 22, 2024

MatthewFluet
Member

No description provided.

Rather than always forcing a `Value.Array`'s length to be useful whenever an
array element exists, only force the `Array_alloc`'s argument to be useful
whenever an array element exists.
The `Useless` optimization pass may determine that the contents of a
sequence (array or vector) are useless; similarly, it may determine that the
length of a sequence is useless; finally, it may determine that the identity of
an array is useless.

However, the transformation performed by the `Useless` optimization pass would,
at most, change a `t array` or `t vector` to `unit array` or `unit vector`.

This commit allows the `Useless` optimization pass to change sequences with
useless contents to simpler types.  In particular:

 * a `t array` with useless contents, useless length, and useless identity
   becomes a `unit`
 * a `t array` with useless contents, useless length, and useful identity
   becomes a `unit ref`
 * a `t array` with useless contents, useful length, and useless identity
   becomes a `word64` (or a `word32` on 32-bit platforms)
 * a `t array` with useless contents, useful length, and useful identity
   becomes a `word64 ref` (or a `word32 ref` on 32-bit platforms)
 * a `t vector` with useless contents and useless length
   becomes a `unit`
 * a `t vector` with useless contents and useful length
   becomes a `word64` (or a `word32` on 32-bit platforms)

Such optimizations do not happen frequently, but there are a few instances in a
self-compile.
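
The case analysis above can be summarized as a small lookup. The following is a toy Python model, not MLton code: the function name `simplified_sequence_type` and its parameters are hypothetical, and it only restates the listed cases for sequences whose contents are already known to be useless.

```python
def simplified_sequence_type(kind, length_useful, identity_useful=False, word_bits=64):
    """Return the type that replaces a `t array` / `t vector` with useless contents.

    Hypothetical helper that mirrors the bullet list in the commit message.
    `word_bits` is 64, or 32 on 32-bit platforms.
    """
    if kind not in ("array", "vector"):
        raise ValueError("kind must be 'array' or 'vector'")
    # A useful length survives as a machine word; a useless one collapses to unit.
    base = f"word{word_bits}" if length_useful else "unit"
    # Only arrays have identity; a useful identity is preserved by a ref cell.
    if kind == "array" and identity_useful:
        return base + " ref"
    return base


# One line per bullet in the list above.
cases = [
    ("array", False, False),
    ("array", False, True),
    ("array", True, False),
    ("array", True, True),
    ("vector", False, False),
    ("vector", True, False),
]
for kind, length_useful, identity_useful in cases:
    print(kind, length_useful, identity_useful, "->",
          simplified_sequence_type(kind, length_useful, identity_useful))
```

Vectors ignore `identity_useful` in this sketch because the commit message lists identity only for arrays.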
@MatthewFluet MatthewFluet merged commit 08b85af into MLton:master May 22, 2024
12 checks passed
@MatthewFluet MatthewFluet deleted the useless-update branch May 22, 2024 20:56
MatthewFluet added a commit to MatthewFluet/mlton that referenced this pull request Dec 19, 2024
In the `Value.t` representation of the Useless optimization, sequence and tuple
elements are "slots", which combine a `Useful.t` lattice element with an
`Exists.t` lattice element.  When a value flows, vectors and tuples coerce the
`Useful.t` component of slots but unify the `Exists.t` component.  This ensures
that the "from" and "to" values agree on whether or not the element exists even
if they disagree on whether the element is useful.  If the "from"'s element is
useful (and necessarily existing)
but the "to"'s element is not useful, then forcing the "to"'s element to exist
avoids a potentially expensive runtime coercion (e.g., to drop a component of a
tuple or, worse, to drop a component of a tuple that is the contents of a
vector).
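
The coerce-versus-unify behavior of slots described above can be illustrated with a toy Python model. This is a sketch under assumptions, not the MLton implementation: the classes `Exists` and `Slot` and the function `coerce_slot` are hypothetical, the lattices are reduced to booleans, and the rule "a useful destination makes the source useful" is an assumed direction of coercion that is applied only once, at the time of the flow.

```python
class Exists:
    """Union-find cell holding a 'must exist' flag (toy model)."""

    def __init__(self):
        self.parent = self
        self.flag = False

    def find(self):
        cell = self
        while cell.parent is not cell:
            cell = cell.parent
        return cell

    def force(self):
        self.find().flag = True

    def get(self):
        return self.find().flag

    def unify(self, other):
        a, b = self.find(), other.find()
        if a is not b:
            # After unification both sides share one answer.
            b.flag = a.flag or b.flag
            a.parent = b


class Slot:
    """A usefulness flag paired with an existence cell."""

    def __init__(self):
        self.useful = False
        self.exists = Exists()

    def make_useful(self):
        # Useful implies existing, as the commit message notes.
        self.useful = True
        self.exists.force()


def coerce_slot(src, dst):
    """Model a flow from `src` to `dst` (hypothetical helper)."""
    # Assumption: usefulness is coerced, so the two sides may still differ.
    if dst.useful:
        src.make_useful()
    # Existence is unified, so the two sides always agree on it.
    src.exists.unify(dst.exists)


src, dst = Slot(), Slot()
src.make_useful()
coerce_slot(src, dst)
print(dst.useful, dst.exists.get())  # False True
```

In this model the destination element stays useless but is forced to exist, so both sides keep the same shape and no runtime coercion is needed to drop the element.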

Previously, the length of a sequence was represented simply as a `Useful.t`;
this allowed a vector (whose contents were not useful) with a useful length to
flow to a vector (whose contents are necessarily not useful) with a useless
length.  1aad5fe (Optimize representation of sequences in Useless pass;
MLton#569) allowed the Useless optimization to change the representation
of sequences with useless contents.  In particular, a vector with useless
contents but useful length becomes a `word64` and a vector with useless
contents and useless length becomes a `unit`.

However, when such vectors are themselves components of a tuple, then the
program may have a flow of tuples, where the source tuple is changed from
`(..., ?? vector, ...)` to `(..., word64, ...)` but the destination tuple is
changed from `(..., ?? vector, ...)` to `(..., unit, ...)`.  (Note that the
unification of the `Exists.t` components of the corresponding tuple elements is
what makes the destination tuple have a `unit` element.)

This commit treats the length of arrays and vectors as a "slot", so that they
track both usefulness and existence.  Now, a vector with useless contents but a
length that must exist becomes a `word64` (even if the length is useless) and a
vector with useless contents and a length that need not exist (and is
necessarily useless) becomes a `unit`.
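
The mismatch and its fix can be reproduced in a toy Python model. This is a sketch under assumptions rather than MLton code: `Length`, `flow`, `repr_by_useful`, and `repr_by_exists` are hypothetical names, usefulness and existence are plain booleans, and `word64` stands for the 64-bit case.

```python
from dataclasses import dataclass


@dataclass
class Length:
    """Length of a vector with useless contents, modeled as a slot."""
    useful: bool
    exists: bool


def flow(src, dst):
    """Model a flow: existence is unified, usefulness is left as it is."""
    exists = src.exists or dst.exists
    src.exists = exists
    dst.exists = exists


def repr_by_useful(length):
    # Old rule: the representation depends on usefulness alone.
    return "word64" if length.useful else "unit"


def repr_by_exists(length):
    # New rule: the representation depends on existence.
    return "word64" if length.exists else "unit"


# Source length is useful (hence exists); destination length is useless.
src = Length(useful=True, exists=True)
dst = Length(useful=False, exists=False)

old = (repr_by_useful(src), repr_by_useful(dst))
flow(src, dst)
new = (repr_by_exists(src), repr_by_exists(dst))

print(old)  # ('word64', 'unit')
print(new)  # ('word64', 'word64')
```

Under the old rule the two ends of the flow disagree on the representation; once existence is unified and drives the choice, both ends become `word64` even though the destination's length remains useless.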

Fixes MLton#585.
MatthewFluet added a commit to MatthewFluet/mlton that referenced this pull request Dec 20, 2024
In the `Value.t` representation of the Useless optimization, sequence and tuple
elements are "slots", which combine a `Useful.t` lattice element with an
`Exists.t` lattice element.  When a value flows, vectors and tuples coerce the
`Useful.t` component of slots but unify the `Exists.t` component.  This ensures
that the "from" and "to" values agree on whether or not the element exists even
if they disagree on whether the element is useful.  If the "from"'s element is
useful (and necessarily existing)
but the "to"'s element is not useful, then forcing the "to"'s element to exist
avoids a potentially expensive runtime coercion (e.g., to drop a component of a
tuple or, worse, to drop a component of a tuple that is the contents of a
vector).

Previously, the length of a sequence was represented simply as a `Useful.t`;
this allowed a vector (whose contents were not useful) with a useful length to
flow to a vector (whose contents are necessarily not useful) with a useless
length.  1aad5fe (Optimize representation of sequences in Useless pass;
MLton#569) allowed the Useless optimization to change the representation
of sequences with useless contents.  In particular, a vector with useless
contents but useful length becomes a `word64` and a vector with useless contents
and useless length becomes a `unit`.

However, when such vectors are themselves components of a tuple, then the
program may have a flow of tuples, where the source tuple is changed from
`(..., ?? vector, ...)` to `(..., word64, ...)` but the destination tuple is
changed from `(..., ?? vector, ...)` to `(..., unit, ...)`.  (Note that the
unification of the `Exists.t` components of the corresponding tuple elements is
what makes the destination tuple have a `unit` element.)

This commit treats the length of arrays and vectors as a "slot", so that they
track both usefulness and existence.  Now, a vector with useless contents but a
length that must exist becomes a `word64` (even if the length is useless) and a
vector with useless contents and a length that need not exist (and is
necessarily useless) becomes a `unit`.

Fixes MLton#585.