Implement raw string sigils for CLI field customization. #2068
^ to use literal string without interpretation
@ to read string from file
% to read string from environment variable
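As a rough illustration of the three sigils listed above, here is a minimal Rust sketch. It is not the code from this PR: `FieldValue`, `classify` and `resolve` are hypothetical names, and the sketch only looks at the value part of an assignment, assuming it has already been separated from the field path.

```rust
use std::{env, fs, io};

/// Hypothetical classification of the right-hand side of a CLI field assignment.
#[derive(Debug, PartialEq, Eq)]
pub enum FieldValue<'a> {
    /// `^text`: use `text` verbatim, without interpreting it as Nickel.
    Literal(&'a str),
    /// `@path`: read the string from the file at `path`.
    File(&'a str),
    /// `%NAME`: read the string from the environment variable `NAME`.
    Env(&'a str),
    /// No sigil: leave the value to the regular Nickel parser.
    Expr(&'a str),
}

/// Classify a raw value by its first character only; no stack is needed.
pub fn classify(raw: &str) -> FieldValue<'_> {
    if let Some(rest) = raw.strip_prefix('^') {
        FieldValue::Literal(rest)
    } else if let Some(rest) = raw.strip_prefix('@') {
        FieldValue::File(rest)
    } else if let Some(rest) = raw.strip_prefix('%') {
        FieldValue::Env(rest)
    } else {
        FieldValue::Expr(raw)
    }
}

/// Resolve a sigil value to the string it denotes; `Expr` yields `Ok(None)`.
pub fn resolve(value: &FieldValue<'_>) -> io::Result<Option<String>> {
    match value {
        FieldValue::Literal(s) => Ok(Some((*s).to_owned())),
        FieldValue::File(path) => fs::read_to_string(path).map(Some),
        FieldValue::Env(name) => env::var(name)
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::NotFound, e)),
        FieldValue::Expr(_) => Ok(None),
    }
}
```

Only the first sigil is consumed, so `^@not-a-file` stays the literal string `@not-a-file` rather than triggering a file read.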
I personally think sigils are a good way forward; they're simple, easy to add, and there's precedent for their use on the CLI. My personal preference would be to not duplicate them too much, at the risk of having a cryptic CLI that resembles a Perl expression: for example we could just use
We could have a sensible but simplistic default: if nothing is specified, it's a file, and we use the extension to deduce the format. We probably also want a way to specify the format explicitly if needed, so that we can write
This isn't required, if the parsing is really trivial. I think field paths on the CLI as in
This is not the case here though, as the syntax we propose isn't valid Nickel. So as long as we know exactly what characters we expect and we don't have any kind of stack to maintain, as for well-bracketed expressions, parsing the sigil manually is most probably OK. For the rest of your questions, I think
What is the security standpoint of Nickel in general? Should it, in general, protect against mixing code and data? Shortcuts like
Should Nickel strive to make the secure way the easiest one and avoid pitfalls that are easy to code and superficially work, but are unsound? Filename extension sniffing with a fallback from passive JSON/YAML/text to active Nickel code is already a step in the wrong direction if it does.
It would be tricky to just transform
The proper way also opens a path to supporting non-Unicode filenames, so that any file can be specified, e.g. on Linux, like
Currently, the implementation does reuse the proper parser for the part on the left side of
It does assume that
Do I correctly assume that the current way of specifying fields on the CLI is not stable, so we can break backwards compatibility and e.g. require some additional prefix to specify nontrivial Nickel code as a field value?
It's important to note that Nickel is entirely pure at this time, and can't dynamically load files. So all an evil file or Nickel snippet could do is... produce a value (well, or loop indefinitely, there's that). While it's a valid point, code injection isn't the same for Nickel as for, say, Python. On one hand I'm tempted to say that it's the user's job to use
I'm not sure I understand this part. Are you talking about the filename that can have embedded
Ah, this is a good point. I was thinking about parsing just the sigil part (the value) but we still have to find the value part and it can be non-trivial: indeed fields can contain
In theory, yes. In practice, this means we need to prefix stuff like
It can include an arbitrary file from the filesystem, which may be unsandboxed. That value may unexpectedly contain secrets.
The filename. Only
What if the filename or a literal string contains a
Maybe
If we go the
Hmm. Parsing arbitrary stuff is annoying to do in the current parser infrastructure, because the lexer still produces tokens that are tailored to the Nickel language. It's not impossible to do, but might prove annoying, requiring something like a new lexer mode that is toggled by the parser. Maybe the simplest is to add lexer rules for stuff like
In general I don't see how a prefix for complex expressions is going to help though, because if you do
Maybe what we should do is the following: by default, restrict everything that can be given as a field value on the CLI. That is, we accept string literals, numbers, bools etc. (all literals, and maybe arrays and records of literals) and that's it. It's easy to check after parsing: we can just validate the right-hand side. Then for anything more involved, like
For the sigils, maybe not having shortcuts is the way forward: you always need to write
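The "validate the right-hand side after parsing" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `Value` is a toy tree invented for the example, not Nickel's actual AST, and `is_plain_data` is a hypothetical helper name.

```rust
/// Toy value tree standing in for a parsed right-hand side
/// (hypothetical, not Nickel's AST).
#[derive(Debug)]
pub enum Value {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Array(Vec<Value>),
    Record(Vec<(String, Value)>),
    /// Anything that needs evaluation: imports, function calls, operators...
    Other(String),
}

/// Return `true` when `v` is a literal, or an array/record built only from
/// literals (recursively). Everything else is rejected.
pub fn is_plain_data(v: &Value) -> bool {
    match v {
        Value::Null | Value::Bool(_) | Value::Num(_) | Value::Str(_) => true,
        Value::Array(items) => items.iter().all(is_plain_data),
        Value::Record(fields) => fields.iter().all(|(_, field)| is_plain_data(field)),
        Value::Other(_) => false,
    }
}
```

With a check of this shape, an untrusted value that parses as an import or any other computation is rejected up front instead of being evaluated.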
How important are those field names containing
I don't expect that to happen a lot. But I would still be a bit disappointed if an implementation detail (it's annoying to parse) leaked into the interface and someone, somewhere, one day got a failure because they use a
Thinking about this again, in fact we could find the right
I also thought about it.
Can LALRPOP parse a prefix of a string, i.e. not expect EOF and just stop parsing when a specified token is found? This could even be used to handle non-Unicode filenames that cannot be put in a
Not that I know of. LALRPOP just pops tokens from the lexer until it can't parse stuff anymore, to the best of my knowledge. But honestly the more I think about it, the more I think the lexer is the easiest: it's cheap (lexing should be pretty fast), it's simple (take until
And if we think that the sigils are the exception and that normal field assignment should be the fast path, we can do the converse: try to parse; if there is an error, restart a lexer and try to sigil-parse; and if both fail, report the original parse error. But I'm not sure this will make any difference, honestly.
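The "try to parse, then sigil-parse, then report the original error" order described above could look roughly like this. It is a sketch only: `parse_regular` and `parse_sigil` are stand-ins invented for the example (the first accepts only integers, the second only an `@` prefix) and do not reflect Nickel's parser or lexer.

```rust
#[derive(Debug, PartialEq)]
pub enum Assignment {
    Parsed(i64),
    Sigil(String),
}

/// Stand-in for the regular parser (fast path): accepts only integers here.
pub fn parse_regular(input: &str) -> Result<Assignment, String> {
    input
        .trim()
        .parse::<i64>()
        .map(Assignment::Parsed)
        .map_err(|e| format!("parse error: {e}"))
}

/// Stand-in for the sigil path: accepts only values starting with `@`.
pub fn parse_sigil(input: &str) -> Option<Assignment> {
    input
        .strip_prefix('@')
        .map(|path| Assignment::Sigil(path.to_owned()))
}

/// Try `primary` first; on failure fall back to `fallback`. If both fail,
/// the error reported is the one from `primary`.
pub fn parse_with_fallback<T, E>(
    input: &str,
    primary: impl Fn(&str) -> Result<T, E>,
    fallback: impl Fn(&str) -> Option<T>,
) -> Result<T, E> {
    match primary(input) {
        Ok(value) => Ok(value),
        Err(original) => fallback(input).ok_or(original),
    }
}
```

Because the fallback returns an `Option`, its own failure carries no message, and the user only ever sees the error from the regular parser.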
This issue was discussed in the last weekly. We settled on using only one sigil
@vi are you willing/able to take this PR through, or should we take over?
Should there also be "@literal:" to embed an arbitrary uninterpreted string? Is the list of those prefixes expected to grow in further pull requests?
What will happen with
Will it still default to a Nickel file or just fail? Will explicit formats be handled by alternative prefixes like
Maybe someday (I was doing other, non-Nickel-related things in the meantime; my attention allotment ran dry for this one), but probably not this month or next. If needed, it can be taken over (with some comment here).
We can discuss this further afterwards, but right now I just don't see the use case. You can already include literals by either enclosing the whole assignment within single quotes, as in
I would say this should behave like an import. So try to import it as Nickel by default. Another solution is to consider it text by default on the CLI, which is maybe the more common case, but then this might be surprising for users (there are two different default interpretations for files instead of one, depending on if they come from an import or a
Yes, I think we want that. We can implement something like that first and then bikeshed the actual format specification, keeping in mind the injection possibility (that is: the default without an explicit format must NOT be a proper prefix of the version with an explicit format specified). Final syntax can be bikeshedded later, such as
Got it. I think the whole
^ to use literal string without interpretation
@ to read string from file
% to read string from environment variable
Example:
Related to #2043.
Open questions:
nickel-lang-cli export ... -- the_number=123 may lead a careless user to the_number=$UNTRUSTED_USER_INPUT, where the user would assume that it would just fail if it is not a number.
grammar.lalrpop, just like regular assignments? If so, I am uncertain how to design it properly.
FieldOverride (that may gain something like a FieldOverrideValue enum)?
@ go through the cache?
IOError or extend the main Error type?