Don't copy if `should_copy` is false #251

jumerckx · 2024-10-07T12:26:10Z

No description provided.

gkronber · 2024-10-07T13:02:05Z

Copying the nodes for the parent list is on purpose. (see #239)

jumerckx · 2024-10-07T15:08:42Z

Ah sorry, should've checked the blame.
My use-case is that I'm storing additional data linked to enodes outside of the egraph. This copy makes it such that the enodes stored in an eclass don't have the same objectid as the keys of memo.
I'm not yet seeing what problems this can cause. Alternatively, I should probably stop using an IdDict{VecExpr, T} and change how I store this external data.
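
To make the object-identity point concrete, here is a minimal sketch in Python rather than Julia. It is not Metatheory code: the list-based enode, the `external` table, and the use of `id()` to emulate an identity-keyed dictionary are all assumptions made for illustration.

```python
import copy

# Toy enode: an operator followed by child eclass ids (hypothetical layout).
node_a = ["+", 1, 2]
# A shallow copy, standing in for the copy this PR wanted to skip.
node_b = copy.copy(node_a)

# Identity-keyed side table, emulating an IdDict by keying on id().
# Both objects stay alive for the whole example, so their ids are distinct.
external = {id(node_b): "my metadata"}

same_content = node_a == node_b          # structural equality
same_object = node_a is node_b           # object identity
lookup_hit = id(node_a) in external      # identity-keyed lookup with the other object
```

The two objects compare equal but are distinct, so data stored under one is not found through the other. That is the mismatch described above between the enodes held by an eclass and the keys of memo.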

gkronber · 2024-10-08T06:34:19Z

Storing enodes outside of the egraph is problematic as enodes are canonicalized (mutated) after each rebuilding phase and equal enodes after canonicalization are deleted from the internal datastructures.

Probably you can change your implementation to use semantic analysis values (for eclasses)?

jumerckx · 2024-10-09T11:22:28Z

> Storing enodes outside of the egraph is problematic as enodes are canonicalized (mutated) after each rebuilding phase and equal enodes after canonicalization are deleted from the internal datastructures.

In my case that's okay since canonicalized enodes are supposed to have the same external data as well.

> Probably you can change your implementation to use semantic analysis values (for eclasses)?

The reason I didn't go this route from the start is because I also have enodes that are never added to the egraph for which I also want to store this data. But on second thought, it makes more sense to use an eclass analysis and handle these external enodes separately. Thanks for the advice!

I still don't really get what can go wrong with the egraph invariants when the copy is omitted, but feel free to close this pr if this is something that definitely shouldn't happen.

0x0f0f0f · 2024-10-10T21:37:48Z

@jumerckx

> I still don't really get what can go wrong with the egraph invariants when the copy is omitted, but feel free to close this pr if this is something that definitely shouldn't happen.

discussion was in #239 :)

jumerckx · 2024-10-11T07:16:56Z

@0x0f0f0f

> discussion was in #239 :)

In that discussion it is said that enodes in memo and in nodes need to be different because in nodes they can change by canonicalization. However, every access to memo in the codebase is preceded by a call to canonicalize so I believe there's no real issue there? The keys in memo will simply also be canonicalized.

gkronber · 2024-10-11T07:52:46Z

@jumerckx, you might be right that the copy is not necessary. Could you please check whether the unit tests pass with check_memo and check_analysis set to true in SaturationParams?

The code is a bit tricky because canonicalizing nodes that are contained in memo can make haskey(g.memo, n) return false (because the hash values change) even though memo still contains the key.
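
The effect can be reproduced outside Metatheory. The following is a minimal Python sketch, not the library's code: the `ENode` class and its deliberately simple hash are assumptions chosen so the behaviour is deterministic on CPython.

```python
class ENode:
    """Toy stand-in for an enode: an operator plus child eclass ids."""

    def __init__(self, op, children):
        self.op = op
        self.children = list(children)

    def __hash__(self):
        # Simple hash over the children only, so that mutating
        # the children visibly changes the hash value.
        h = 0
        for c in self.children:
            h = 10 * h + c
        return h

    def __eq__(self, other):
        return (isinstance(other, ENode)
                and self.op == other.op
                and self.children == other.children)


memo = {}
node = ENode("+", [1, 2])
memo[node] = 10                      # enode -> eclass id

# "Canonicalize" the key in place: child eclass 2 was merged into eclass 1.
node.children[1] = 1

found_same_object = node in memo                     # lookup uses the new hash
found_fresh_canonical = ENode("+", [1, 1]) in memo   # equal to the mutated key
found_old_form = ENode("+", [1, 2]) in memo          # old hash, but no longer equal
still_stored = any(k is node for k in memo)          # iteration does not rehash
```

After the mutation none of the three lookups finds the entry, yet iterating over the dictionary shows that the key is still stored. The entry was filed under the old hash and is now unreachable by lookup.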

jumerckx · 2024-10-11T11:12:02Z

> Could you please check whether the unit tests pass with check_memo and check_analysis set to true in SaturationParams?

The tests pass with these flags set to true.

However, I now understand the issue: dictionary keys shouldn't be modified. So this PR would indeed introduce a bug 😅. I believe it might be possible to have a dict-like data structure specifically for vecexprs that doesn't include the eclasses in its hash and instead verifies them on lookup.
But the copy here is clearly much more pragmatic and since I've been able to solve my problem much more nicely with eclass Analysis I don't see a reason to further pursue this.
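
For illustration only, such a structure could look roughly like the Python sketch below. Nothing here exists in Metatheory: `ShapeKeyedMemo`, the list-based enode layout, and the `find` helper are hypothetical. Enodes are bucketed by operator and arity, which canonicalization never changes, and the child eclasses are compared only at lookup time.

```python
class ShapeKeyedMemo:
    """Toy memo that buckets enodes by (op, arity) and checks children on lookup."""

    def __init__(self, find):
        self.find = find      # maps an eclass id to its representative
        self.buckets = {}     # (op, arity) -> list of (enode, eclass id)

    def _shape(self, node):
        return (node[0], len(node) - 1)

    def _canonical_children(self, node):
        return [self.find(c) for c in node[1:]]

    def lookup(self, node):
        wanted = self._canonical_children(node)
        for stored, eclass in self.buckets.get(self._shape(node), []):
            if self._canonical_children(stored) == wanted:
                return self.find(eclass)
        return None

    def insert(self, node, eclass):
        existing = self.lookup(node)
        if existing is not None:
            return existing
        self.buckets.setdefault(self._shape(node), []).append((node, eclass))
        return eclass


parent = {1: 1, 2: 2, 3: 3}          # toy union-find: eclass id -> parent id

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

memo = ShapeKeyedMemo(find)
node = ["+", 1, 2]                    # enode layout: [op, child eclass ids...]
memo.insert(node, 3)

parent[2] = 1                         # merge eclass 2 into eclass 1
node[2] = 1                           # mutate the stored enode in place

hit_canonical = memo.lookup(["+", 1, 1])
hit_stale = memo.lookup(["+", 1, 2])
miss = memo.lookup(["*", 1, 1])
```

The stored enode can be mutated safely because its bucket key does not depend on the children. The price is a linear scan within each bucket, which is part of why the copy is the more pragmatic choice.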

Thanks a lot for the feedback, learned a lot about Metatheory's internals!

gkronber · 2024-10-11T12:59:38Z

It's a bit of a mess. memo is used as a deduplication mechanism for enodes. The pseudo-code in the egg paper removes enodes from memo before canonicalization and then adds the canonical enodes again. In the current egg implementation, however, they seem to just ignore this and mutate memo keys.

I'm currently working on this again in #253, trying to reduce the amount of enode copying.
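
The remove-then-re-add approach attributed to the egg paper above can be sketched as follows. This is a simplified Python illustration, not code from egg or Metatheory: enodes are immutable `(op, children)` tuples, `parent` is a toy union-find, and `repair_memo` is a hypothetical name.

```python
def find(parent, x):
    """Follow parent links to the representative eclass id."""
    while parent[x] != x:
        x = parent[x]
    return x


def canonicalize(node, parent):
    """Return a new enode whose children are representative eclass ids."""
    op, children = node
    return (op, tuple(find(parent, c) for c in children))


def repair_memo(memo, parent):
    """Remove each key before canonicalizing it, then add the canonical form."""
    for node in list(memo.keys()):
        eclass = memo.pop(node)                  # remove the possibly stale key
        canon = canonicalize(node, parent)
        # Enodes that become equal collapse into one entry (deduplication).
        # A full rebuild would also merge their eclasses; that is omitted here.
        memo.setdefault(canon, find(parent, eclass))
    return memo


# Eclasses 0..3, where eclass 2 has been merged into eclass 1.
parent = {0: 0, 1: 1, 2: 1, 3: 3}
memo = {("a", ()): 0, ("+", (1, 2)): 3, ("+", (1, 1)): 3}
repair_memo(memo, parent)
```

Because no key is ever mutated while it sits in the dictionary, every entry remains reachable by lookup, at the cost of one removal and one insertion per enode.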

don't copy if should_copy is false

2debe68

0x0f0f0f closed this Oct 10, 2024

jumerckx deleted the patch-1 branch October 11, 2024 11:12
