implement char escapes #20

aaronjanse · 2019-07-01T01:53:52Z

fix #11

This is based on the Elm Parser example code (which has a bug, btw). I wish I had found it earlier.

I wrote the PR such that we can later reuse the escape-handling code for literalString.

Janiczek · 2019-07-03T23:22:18Z

README.md

-8. <span id="f8"></span> Comprehensive tests missing; not tracked yet
-9. <span id="f9"></span> Multiline strings (and maybe more) missing; not tracked yet
+6. <span id="f6"></span> Comprehensive tests missing; not tracked yet
+7. <span id="f7"></span> Multiline strings (and maybe more) missing; not tracked yet


Next time please don't bother shifting the numbers, life's too short for that :)

Janiczek · 2019-07-03T23:22:32Z

README.md

+        src="https://avatars0.githubusercontent.com/u/16829510">
+        </br>
+        <a href="https://github.com/aaronjanse">Aaron Janse</a>
+      </td>


Janiczek · 2019-07-03T23:23:17Z

src/compiler/Stage/Parse/Parser.elm

-TODO Unicode escapes
-}
+-- for literalChar and, in the future, literalString
+stringHelp = 


I'd probably rather name this character or insideQuotes or something. 🤔

character is great because it parses one character, even if that character is represented by multiple bytes (e.g. \u{123}) 👍

Fixed by 483bf8a

Janiczek · 2019-07-03T23:23:45Z

src/compiler/Stage/Parse/Parser.elm

+-- for literalChar and, in the future, literalString
+stringHelp = 
+    P.oneOf
+    [ P.succeed (identity)


probably unneeded parens for the identity

Fixed by 657fa28

Janiczek · 2019-07-03T23:27:59Z

src/compiler/Stage/Parse/Parser.elm

+            , P.map (\_ -> '\'') (P.token (P.Token "'"  ExpectingEscapeCharacter))
+            , P.map (\_ -> '\n') (P.token (P.Token "n"  ExpectingEscapeCharacter))
+            , P.map (\_ -> '\t') (P.token (P.Token "t"  ExpectingEscapeCharacter))
+            , P.map (\_ -> '\r') (P.token (P.Token "r"  ExpectingEscapeCharacter))


I'm still thinking about these ParserProblems. The problem with the current solution is that the error message can only show the same thing five times ("I expected escape character, escape character, ...").

Please parameterize this one (| ExpectingEscapeCharacter Char) and give these usages of it the chars they're really about. That should be enough and not bloat the ParserProblem type so much. WDYT?

How about something like below?

The above behavior would interpret \x as a backslash followed by a separate x.
The below would throw an error if it sees \x.

         |= P.oneOf
-            [ P.map (\_ -> '\"') (P.token (P.Token "\"" ExpectingEscapeCharacter))
-            , P.map (\_ -> '\'') (P.token (P.Token "'" ExpectingEscapeCharacter))
-            , P.map (\_ -> '\n') (P.token (P.Token "n" ExpectingEscapeCharacter))
-            , P.map (\_ -> '\t') (P.token (P.Token "t" ExpectingEscapeCharacter))
-            , P.map (\_ -> '\r') (P.token (P.Token "r" ExpectingEscapeCharacter))
+            [ P.map (\_ -> '\"') (Parser.token "\""))
+            , P.map (\_ -> '\'') (Parser.token "'"))
+            , P.map (\_ -> '\n') (Parser.token "n"))
+            , P.map (\_ -> '\t') (Parser.token "t"))
+            , P.map (\_ -> '\r') (Parser.token "r"))
             , P.succeed identity
                 |. P.token (P.Token "u" ExpectingEscapeCharacter)
                 |. P.token (P.Token "{" ExpectingUnicodeEscapeLeftBrace)
                 |= unicode
                 |. P.token (P.Token "}" ExpectingUnicodeEscapeRightBrace)
+            , P.problem ExpectingValidEscapeChar
             ]

The problem is that I don't think either solution (the current one in the PR or the diff above) would make the compiler print one of those token errors. Rather, it would either skip the oneOf (current PR code) or print ExpectingValidEscapeChar (diff above).

current PR state: https://ellie-app.com/5Z9tkd2hGSQa1
my suggestion: https://ellie-app.com/5Z9vStSJv72a1
The changes are on lines 31-37, and 77
I think that's enough! The ParserProblems will tell you which characters it expected ➡️ we can write a nice error message.

If you think this still overlooks some corner case, please shout! :)

Oops, my bad!
Fixed by 7c495f7
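
For readers following the thread, here is a minimal sketch of the parameterized problem idea, not the code from 7c495f7. It assumes a trimmed-down `Problem` type (the real ParserProblem has more variants), and the names `ExpectingBackslash`, `escape`, and `escapeChar` are hypothetical. Each alternative carries the character it was looking for, so a failed `oneOf` should leave one dead end per expected character instead of five identical ones.

```elm
module EscapeSketch exposing (Problem(..), escape)

import Parser.Advanced as P exposing ((|.), (|=))


{-| Hypothetical, trimmed-down problem type for this sketch only.
-}
type Problem
    = ExpectingEscapeCharacter Char
    | ExpectingBackslash


type alias Parser a =
    P.Parser Never Problem a


{-| Parse one simple escape sequence: a backslash followed by one of
the known escape characters. Unicode escapes are left out for brevity.
-}
escape : Parser Char
escape =
    P.succeed identity
        |. P.token (P.Token "\\" ExpectingBackslash)
        |= P.oneOf
            [ escapeChar '"' '"'
            , escapeChar '\'' '\''
            , escapeChar 'n' '\n'
            , escapeChar 't' '\t'
            , escapeChar 'r' '\u{000D}'
            ]


{-| Match `source` in the input and produce `result`. On failure the
problem names the character that was expected.
-}
escapeChar : Char -> Char -> Parser Char
escapeChar source result =
    P.map (\_ -> result)
        (P.token
            (P.Token (String.fromChar source) (ExpectingEscapeCharacter source))
        )
```

With this shape, `P.run escape "\\x"` fails after the backslash rather than treating `\x` as two separate characters, and the list of dead ends can be turned into a message naming each acceptable character.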

Janiczek · 2019-07-03T23:30:17Z

src/compiler/Stage/Parse/Parser.elm

+                    string
+                        |> String.uncons
+                        |> Maybe.map (Tuple.first >> P.succeed)
+                        |> Maybe.withDefault (P.problem (CompilerBug "Multiple characters chomped in `literalChar`"))


This message is not entirely correct, since we're not inside literalChar. If we want to reuse this in strings too, it would mislead the user seeing the message.

Fixed by 483bf8a

Janiczek · 2019-07-03T23:31:19Z

src/compiler/Stage/Parse/Parser.elm

-                    |> Maybe.map (Tuple.first >> Char >> P.succeed)
-                    |> Maybe.withDefault (P.problem (CompilerBug "Multiple characters chomped in `literalChar`"))
-            )
+    |> P.map (\n -> Char n)


|> P.map Char

Fixed by b7a3cf4

Thank you, @Janiczek!

Janiczek · 2019-07-03T23:33:32Z

src/compiler/Stage/Parse/Parser.elm

+
+
+addHex : Char -> Int -> Int
+addHex char total =


Let's use Hex.fromString instead of this. Also, I think it is buggy for characters outside the hex range.

Fixed by 09fa46c

Janiczek · 2019-07-03T23:35:32Z

src/compiler/Stage/Parse/Parser.elm

+codeToChar str =
+  let
+    length = String.length str 
+    code = String.foldl addHex 0 str


Checking the length is great, but let's use Hex.fromString for the actual "is this a hex string" and "what int value does it represent" functionality.

Fixed by 09fa46c
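
A rough sketch of what the two suggestions above could look like together; this is not the code from 09fa46c. It assumes the `Hex` module from rtfeldman/elm-hex. The digit bounds `minDigits` and `maxDigits` are hypothetical placeholders, since the thread does not say which lengths the PR accepts. The input is lowercased first on the assumption that `Hex.fromString` only accepts lowercase digits.

```elm
module UnicodeSketch exposing (codeToChar)

-- Hex comes from the rtfeldman/elm-hex package.

import Hex


{-| Hypothetical bounds on the number of hex digits inside `\u{...}`.
-}
minDigits : Int
minDigits =
    1


maxDigits : Int
maxDigits =
    6


{-| Turn the digits of a unicode escape into a Char.
Returns Nothing when the length is out of bounds, when the string is
not valid hex, or when the value is not a valid code point.
-}
codeToChar : String -> Maybe Char
codeToChar str =
    let
        length =
            String.length str
    in
    if length < minDigits || length > maxDigits then
        Nothing

    else
        str
            |> String.toLower
            |> Hex.fromString
            |> Result.toMaybe
            |> Maybe.andThen
                (\code ->
                    -- Hex.fromString also accepts a leading minus sign,
                    -- so negative values are rejected here.
                    if code >= 0 && code <= 0x0010FFFF then
                        Just (Char.fromCode code)

                    else
                        Nothing
                )
```

Delegating the digit handling to `Hex.fromString` means non-hex characters surface as a failure instead of being folded silently into the total, which was the concern with the hand-written `addHex`.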

Janiczek · 2019-07-03T23:36:41Z

tests/ParserTest.elm

+                  , "'\\\"'"
+                  , Ok (Literal (Char '"')) -- "
+                  )                         -- ^ workaround for official elm
+                                            --   vscode syntax highlighter


If you want, I'm happy to remove it.
Without it, however, the coloring is wrong for the next ~17 lines and a "parser error" prevents vscode from showing more helpful errors :-/

https://github.com/Krzysztof-Cieslak/vscode-elm/issues/244

Janiczek · 2019-07-03T23:40:04Z

Sorry for so many comments! The parsers are great, I'm just nitpicking a lot. Great job nevertheless! 🎉
Let's work through the comments and I can merge this.

(BTW we added elm-format on the master branch so at the end run it with make format please, otherwise Travis will yell)

aaronjanse · 2019-07-04T00:09:16Z

Thank you for the comments. It's a nice codebase, and, like you, I want to keep it that way :-)

I'm busy at the moment but I'll try to go through the comments soon (maybe tonight).

aaronjanse · 2019-07-04T04:24:16Z

(whoa the new GitHub merge conflict UI is awesome!)

aaronjanse · 2019-07-04T04:34:01Z

I think this is ready for another review @Janiczek 😉

Janiczek · 2019-07-04T09:08:16Z

Just one comment re ParserProblems, otherwise this is ready for merge! :)

aaronjanse · 2019-07-04T18:16:03Z

Great. Fixed by 7c495f7

Janiczek · 2019-07-04T19:32:57Z

Thanks @aaronjanse!

aaronjanse added 2 commits June 30, 2019 18:45

implement fancier char escaping

af6f489

add aaronjanse to README

458a999

Janiczek reviewed Jul 3, 2019

aaronjanse added 4 commits July 3, 2019 21:01

remove unnecessary parenthesis

657fa28

rename stringHelp to character

483bf8a

remove unnecessary lambda

b7a3cf4

Merge branch 'master' into char-escapes

3163b05

aaronjanse added 2 commits July 3, 2019 21:30

use Hex.fromString

09fa46c

make elm-format happy

48e9ae4

rework ExpectingEscapeCharacter

7c495f7

Janiczek merged commit 016b286 into elm-in-elm:master Jul 4, 2019

aaronjanse deleted the char-escapes branch July 4, 2019 20:30

Warry pushed a commit to Warry/compiler that referenced this pull request Jul 8, 2019

implement char escapes (elm-in-elm#20)

c2c45b7
