-
-
Notifications
You must be signed in to change notification settings - Fork 320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
line numbers in the AST re #732 #750
base: master
Are you sure you want to change the base?
Conversation
I expect some copy-paste, that's to be expected.
Don't worry about this at the moment. I have some structural changes which are coming and I don't know if this will break anything that happens there yet, so TBD.
Just a POC is fine! |
PS: Note there is also the %error stuff in parser-- that works well enough. Line numbers in the code is the goal =D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some random quick thoughts.
In the latest commit 913c100 the code essentially works. It will store start and end positions in nodes types that have received the TextArea embed. This does not survive interpolation though, unless I even have an idea for a type unifier that could allow us to drill down to the offensive expressions. EDIT: after some shy attempts to implement a prototype, I want to say this idea did not pan out. Refactored unification sounds much more promising 👇🏻 |
FYI: I'm currently refactoring the type unification stuff, so after that's done it will be great to revisit this and get it merged in some form. Sneak peak: The new unification has a pointer to the Expr in question so as long as the Expr has the ROW/COL information, then we'll be able to show that without doing any additional plumbing. |
I'm not moving this in any direction until that change lands then. |
Yeah, sorry for the delay. The start of that is here: https://github.com/purpleidea/mgmt/tree/feat/unification in case you're knowledgeable in this area and want to help. I'm kind of a terrible algorithmist and a few things are up in the air, but if you or anyone is an expert in this area, please chime in =D |
Boy has time flown by! I guess you know all the things I was busy with. In any case, unification has merged, so if you'd like to rebase this, that would be awesome! It might also be very easy for you to plug things into unification error messages too (or I can help with that) LMK thanks! |
Hey, nice work on the resolver, feels good so far. I can build this branch and
The error message will be this:
The trouble is that the error is reported at the definition of the Note that I did change the String function for FuncCalls to give the location. But it is not used at the moment. I have a commit in here that makes mgmt dump a string representation of the AST and it includes
(Never mind that this position is also wrong, but right now it seems more important that the resolver can report it at all.) |
(Wrong text position now also fixed) |
What's a resolver?
All type unification errors that are found during unification are reported at the same place... We can likely improve errors a bit though. You asked me to comment and look at this here, I need a bit more info what you're asking, sorry! |
Okay trying to clear it up. I meant solver of course, not resolver. The current issue is that we get this error:
That is correct, but more helpful to the user will be the information that the call of this operator is In essence, I see a need to show a "call stack". Of course, this is at type checking time, so nothing is called (is it?) Maybe what's rather needed is a few levels of AST parents. Rather than just saying "this unnamed internal operator was given bad operands", we should include that "its parent is this expression |
Duh now! I don't know why my mind missed that, lol
Yeah I agree, we probably want to hide the "ExprTopLevel" and "ExprSingleton" expressions. To do so, you'd want to wrap this x.Expr call:
With Line 330 in 3d06605
I expect this will cause an import cycle, so I'd have to ponder what the best move is... I'd like to avoid needing to add a new method on Expr to get the "TrueCallee" for everyone. Does that help? |
Yes the cycle prevented doing it from the solver. Spot on. I pushed a silly little hack to do this recursion from the GAPI scope, but it does not work. This is the output now:
So the result from trueCallee still does not show information. This makes sense to me. As said, it's true that it's the + operator that ultimately is affected. But we don't care. We are interested in its AST parent(s). |
You want to unwrap the x.Expr part... If you see this as your error:
Then TrueCallee(_) would return:
If that's not what you want, I am lost. But I don't see how this error is not correct. What do you expect it to print? |
I may be barking up the wrong tree, going on about the AST. But the issue (in terms of AST nodes) is that this type error is reported on the ExprFunc (i.e. the declaration) of the + operator, which will never be helfpul. I need it reported on the ExprCall representing (in this case) Remember, ultimately we want to report the line and column number of the code that caused the unification issue. |
I agree. I don't know the correct way to achieve this at the moment, but I will dig into it if I can.
If it reports the definition site instead of the call site for now, that's okay =D |
I don't believe I agree, at least until I see a typing issue that is not reported on this kind of builtin element. Why have line numbers when we cannot report errors that use them? I had the idea to include the pointer to the entire AST node in the error, rather than just the Expr within. Then, we could run a graph search through the AST, so that we can basically "zoom in" on the problematic expression(s). The problem is that I cannot very easily run a search through the AST. That's why I created #773 in order to have some discussion around that. (It might be good to talk about this in person a little.) |
ping @purpleidea |
Just had a look. TBQH, I'm not the expert here, so I don't know how to proceed, sorry. |
Maybe if we did a pair programming we might be able to come up with some ideas? LMK |
I'm traveling atm but maybe we can make it happen in November. |
Sure LMK when ready. |
b8072b2
to
380004b
Compare
First complete PoC implementation.
(note the final log message before the goodbye) This breaks some very brittle unit tests. I did not touch those yet, it might be worth refactoring them rather than doing the trivial fix. One big missing piece is that we cannot easily determine file names (i.e., from which file was a given parser token read), so in a complex code base with multiple mcl files, the message will still be quite ambiguous. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ffrank HNY!
This looks really promising. Have a look at some of my comments if you can, I had a few questions, and a likely answer for the filename bit.
But I think I could merge this fairly soon...
I do have a few nit's but I can fix those up unless you have any objections to that.
Cheers!
@@ -221,7 +221,7 @@ stmt: | |||
// `func name(<arg>, <arg>) { <expr> }` | |||
| FUNC_IDENTIFIER IDENTIFIER OPEN_PAREN args CLOSE_PAREN OPEN_CURLY expr CLOSE_CURLY | |||
{ | |||
posLast(yylex, yyDollar) // our pos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
qq: Why are some of the changes for posLast => locate done in the same space, and others done after the $$... definition bit? Does it matter? (See other comment for an example.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No the inconsistencies were just a result of early prototyping. I cleaned this up now, the new lines should always just replace the old ones (unless I am forgetting something, but don't believe so).
$$.stmt = $1.stmt | ||
locate(yylex, $1, yyDollar[len(yyDollar)-1], $$.stmt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an example of one that moves to after the $$. = bit...
} | ||
| IF expr OPEN_CURLY prog CLOSE_CURLY ELSE OPEN_CURLY prog CLOSE_CURLY | ||
{ | ||
posLast(yylex, yyDollar) // our pos | ||
locate(yylex, $1, yyDollar[len(yyDollar)-1], $$.stmt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an example where it stays in the same place.
@@ -393,3 +393,23 @@ func lambdaScopeFeedback(scope *interfaces.Scope, logf func(format string, v ... | |||
logf("$%s(...)", name) | |||
} | |||
} | |||
|
|||
func AreaParentOf(needle, haystack interfaces.Node) LocalNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some explainer comment doc would be very helpful here please.
// Expect is one of the two types to unify. | ||
Expect *types.Type | ||
|
||
// Actual is one of the two types to unify. | ||
Actual *types.Type | ||
|
||
// An error string to pass along with this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cute idea to make this implement the error interface... But why do we want to do this? Do we want to change the signature somewhere to just include a Node
or error context struct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this one is a really silly hack to allow me to pass the node back alongside the error. Better to add a new struct that will hold the error message and a reference to the node, and return that, yes?
cause := unifyErr.(*interfaces.UnificationInvariant).Node | ||
parent := ast.AreaParentOf(cause, iast) | ||
line, col := parent.GetPosition() | ||
logf("possible type issue found at line %d column %d", line, col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On each Node, we can get the filename from obj.data.Metadata.Main IIRC. And the path of that from obj.data.Base again IIRC.
If that doesn't work, I can surely fix it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a new commit using these hints. Still not able to pinpoint the original file. It can do this for now:
$ ./mgmt run lang examples/lang-errors/including.mcl
...
01:08:54 cli: lang: possible type issue found at line 4 column 30
01:08:54 cli: lang: it's in one of these files: [/home/felix/mgmt/examples/lang-errors/includable.mcl /home/felix/mgmt/examples/lang-errors/including.mcl]
01:08:54 main: goodbye!
cli parse error: could not unify types: unify error with: topLevel(func() { <built-in:_operator> }): type error: int != str
I will clean it up in the next few days. This was mainly a proof of concept implementation. Since you agree with the approach, I will rework it into passable code. Thanks for the hints and pointers. |
Do you have any feedback on the issue of linking AST nodes to source file names? The data structures that you pointed me to appear to be too unspecific. This is the best I can produce right now:
I suspect we need to do more work in order to make this information available. |
I wrote a suggestion of how to store the mcl code coordinates of notable AST nodes in the parser. (It's not done for each last node.)
The approach is possibly flawed; the amount of copy/paste makes me sad.
This was still the easier part though. Next I was trying to figure out how to generate an error message during type unification, that indicates the == operator in the following snippet:
But the unification approach that collects all invariants, and then loops over them, makes this very hard, maybe impossible.
Will we need to keep references to the AST nodes in the invariants? So that we can trace where problematic invariants originate?