-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matching block params to allocations #24
Comments
You can get this information by adding a dummy instruction at the start of the block which uses all block parameters with the |
I thought of that, but doesn't this also constrain the register allocator? For example, it might want to put some block parameters in spills, but the dummy instruction would force it to load them into registers first. In fact the parameters might even be dead, so the dummy instruction might increase register pressure by making the values live longer than necesary. EDIT: I didn't know about the |
In my own compiler I have a liveness pass that eliminates dead block parameters when lowering, so I haven't considered this. I don't think regalloc2 will eliminate dead block parameters on its own. However there is another situation where a block parameter could be without an allocation: if the value is rematerializable then it could have an allocation of |
(I also have such a pass, but since there are micro optimizations all over the place I would not be too surprised if regalloc found some dead stuff anyway.) I wasn't expecting it to eliminate dead block parameters per se, only set them to |
This is fixed by #27 . |
Currently,
checker.rs
has to perform a dataflow analysis to derive the mapping from block params to allocations, but the block params are an input and conceivably the register allocator could provide this information as part of its output. In my compiler, I need this information as part of regular compilation, and while I could do a similar dataflow analysis, I imagine that the register allocator could provide this information more efficiently than if I were to reconstruct it.The API I am thinking of is something like:
Output
contains a fieldblock_params: Vec<Allocation>
andblock_param_offsets: Vec<u32>
, encoding a list of slices similar to instructions[Allocation]
slice for a given blockbl
is parallel tof.block_params(bl)
Allocation
for a given block parameter is eitherNone
, or aReg
orStack
value according to where it happens to be stored at the moment.This would enable an alternative non-fixpoint implementation of
checker.rs
, where the checker state is initialized directly from the block params: each mapping ofvreg -> alloc
(wherealloc
is notNone
) implies an entryalloc -> CheckerValue::Reg(vreg, reftyped(vreg))
in the initial state. There is no distinction betweenUnknown
andConflicted
here, but the constraint is that at the bottom of each basic block, the state at the successor blocks is obtained from this state by setting some allocations toNone
. Unlike the fixpoint algorithm this does not require that the output of the checker is a least fixpoint, only a fixpoint, but it is a witness of correctness either way.Thoughts? I imagine that some compilers don't want this information and/or don't want to pay for constructing this output data, so it can be behind a flag (a runtime flag would be useful for applications like the checker, and the cost should be minimal if the flag is off, since it's just two more empty vectors in
Output
, but it could also be a compile time flag). But for some applications this data is quite relevant for connecting pre-regalloc data to post-regalloc data when generating auxiliary information about the code (or even the code itself, if the ISA requires some kind of capabilities for the used registers, although this seems unlikely).The text was updated successfully, but these errors were encountered: