Make topological sort fully general. #1138

porcuquine · 2024-02-15T21:13:45Z

This PR generalizes the topological sort used when computing the provenances. The previous code relied on limitations on possible DAG shape in the singly self-recursive queries implemented so far. These changes should be fully general and support any dependency structure recorded by the queries.

huitseeker

Please see inline comments.

huitseeker · 2024-02-16T14:01:30Z

src/coroutine/memoset/mod.rs

+                if dep_count == 0 {
+                    // Queries are ready if they have no missing dependencies.
+                    // Initially, this will be the base cases -- which have no dependencies.
+                    ready.insert(key);
                }
+                (key, dep_count)


When a query is ready here, we still include (key, 0) in missing_dependency_counts...

Yep, a ready query has no missing dependencies. This also means that no other queries (its formerly missing dependencies) will trigger modification of that stat in the future.

huitseeker · 2024-02-16T14:03:39Z

src/coroutine/memoset/mod.rs

                for dependent in dependents {
-                    if provenances.get(dependent).is_none() {
-                        next.insert(dependent);
-                    }
+                    missing_dependency_counts
+                        .entry(dependent)
+                        .and_modify(|missing_count| {
+                            *missing_count -= 1;
+                            if *missing_count == 0 {
+                                next.insert(dependent);
+                            }
+                        });


... which means IIUC that here we may receive a missing_dependency_counts with some (key, 0) entry.
This would then underflow

.and_modify(|missing_count| { *missing_count -= 1; if *missing_count == 0 { next.insert(dependent); } }

Is this the behavior we want?

No, we are looking up the missing dependency counts for the dependent of the query that is ready. In other words, after we process this query, then everything that depends on it is a little more ready.

Whenever missing_count actually reaches zero, that query becomes immediately ready. The situation you are worried about could (maybe?) happen only if a query depends on itself. However, this is forbidden and already leads to unprovability. Moreover, it can't actually happen because (besides being illegal and unprovable via the very provenances we are assembling here) such a query would never have become ready in the first place.

It's possible there is some other bug in this code, but I don't think your concern here represents a problem. The whole point is that we are constructing the provenances bottom-up as we topologically sort. This process also implies that each query will only ever be have its missing_count decremented before it is processed. Once that count reaches 0, it will never be affected again, because all the queries on which it depends have already been processed previously.

This does point out that there is a way for a query to not have its provenance created. Any cyclic computation will never become ready. This is good, since we would never be able to generate a proof. Therefore, we could add an extra check after computing all provenances. If any provenances are missing then we have detected a cyclic computation and can eagerly abort for clear reason. cc: @gabriel-barrett

In other words, the impossibility of generating a provenance for a cyclic computation is exactly the reason we force the prover to manifest provenance in the first place — to demonstrate the lack of such cycles. The benefit of the bottom-up topological sorting vs top-down is that it lets us detect the error condition explicitly, rather than enter into an infinite loop or wild-goose-chase that exhausts the stack.

I added the check here: 3f30a85. If an application manages to allow for and actually try to prove a cyclic computation, it will be caught. If this happens, the Lang supplying a Query enabling it is noncompliant.

What you're saying is you're unlikely to look up the query that would here provoke an underflow, especially since you end up adding an additional check for cyclic dependencies. In the current form of the code, that's probably right.

In order to prevent a future situation where an errant developer might edit this code without understanding all its implications, simply replacing

*missing_count -=1

by

*missing_count = missing_count.checked_sub(1).expect("ready queries should not be looked up at this stage")

¹ could provide the local context you just offered, and avoids us having to one day look into a query never being processed because its missing_count has just become enormous.

Footnotes

or some more enlightening message ↩

porcuquine requested a review from a team as a code owner February 15, 2024 21:13

Make topological sort fully general.

34644c3

porcuquine force-pushed the fix-topo branch from 028d0f1 to 34644c3 Compare February 15, 2024 21:57

porcuquine enabled auto-merge February 15, 2024 21:57

porcuquine requested a review from gabriel-barrett February 15, 2024 21:57

porcuquine mentioned this pull request Feb 16, 2024

Provenance refactor #1143

Merged

huitseeker requested changes Feb 16, 2024

View reviewed changes

porcuquine added 2 commits February 16, 2024 15:46

Detect missing provenance.

3f30a85

Never gonna underflow.

23fa452

huitseeker approved these changes Feb 17, 2024

View reviewed changes

porcuquine added this pull request to the merge queue Feb 17, 2024

Merged via the queue into main with commit f2d70f4 Feb 17, 2024
11 checks passed

porcuquine deleted the fix-topo branch February 17, 2024 01:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make topological sort fully general. #1138

Make topological sort fully general. #1138

porcuquine commented Feb 15, 2024

huitseeker left a comment •

edited

Loading

huitseeker Feb 16, 2024

porcuquine Feb 16, 2024

huitseeker Feb 16, 2024

porcuquine Feb 16, 2024

porcuquine Feb 16, 2024

porcuquine Feb 16, 2024 •

edited

Loading

porcuquine Feb 16, 2024

huitseeker Feb 17, 2024 •

edited

Loading

porcuquine Feb 17, 2024

Make topological sort fully general. #1138

Make topological sort fully general. #1138

Conversation

porcuquine commented Feb 15, 2024

huitseeker left a comment • edited Loading

Choose a reason for hiding this comment

huitseeker Feb 16, 2024

Choose a reason for hiding this comment

porcuquine Feb 16, 2024

Choose a reason for hiding this comment

huitseeker Feb 16, 2024

Choose a reason for hiding this comment

porcuquine Feb 16, 2024

Choose a reason for hiding this comment

porcuquine Feb 16, 2024

Choose a reason for hiding this comment

porcuquine Feb 16, 2024 • edited Loading

Choose a reason for hiding this comment

porcuquine Feb 16, 2024

Choose a reason for hiding this comment

huitseeker Feb 17, 2024 • edited Loading

Choose a reason for hiding this comment

Footnotes

porcuquine Feb 17, 2024

Choose a reason for hiding this comment

huitseeker left a comment •

edited

Loading

porcuquine Feb 16, 2024 •

edited

Loading

huitseeker Feb 17, 2024 •

edited

Loading