Merge InputIter, InputTakeAtPosition, InputLength and InputTake #1612
The new Input trait assembles methods from these 4 traits
@@ -15,6 +15,208 @@ use crate::lib::std::string::String;
#[cfg(feature = "alloc")]
use crate::lib::std::vec::Vec;

/// Parser input types must implement this trait
pub trait Input: Clone + Sized {
Could you help me understand what the needs of custom Input authors are?
Before, every parser declared the capabilities that it needed, theoretically allowing for a wide variety of Inputs. In practice, I'm assuming only slices of tokens are really supported, and that this is what the change reflects.
I guess the other advantage of the old approach was that you could add new capabilities without a breaking change. Now, if we need a new function, there either has to be a safe inherent implementation or we create a weird offshoot trait until the next breaking release.
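To make the compatibility concern concrete, here is a minimal sketch of how a method can be added to a merged trait without breaking downstream implementations, as long as it has a default body built from the existing methods. The trait below is a simplified, hypothetical stand-in and not nom's actual `Input` definition; `Bytes` is an invented downstream type.

```rust
// Hypothetical, simplified stand-in for a merged input trait.
pub trait Input: Clone + Sized {
    fn input_len(&self) -> usize;
    fn take(&self, index: usize) -> Self;
    fn take_from(&self, index: usize) -> Self;

    // Imagine this method was added in a later minor release. Because it has
    // a default body expressed through the required methods, implementations
    // written before it existed keep compiling.
    fn take_split(&self, index: usize) -> (Self, Self) {
        (self.take_from(index), self.take(index))
    }
}

// A downstream input type (invented for illustration) that only implements
// the required methods.
#[derive(Clone, Debug, PartialEq)]
pub struct Bytes<'a>(pub &'a [u8]);

impl<'a> Input for Bytes<'a> {
    fn input_len(&self) -> usize {
        self.0.len()
    }
    fn take(&self, index: usize) -> Self {
        Bytes(&self.0[..index])
    }
    fn take_from(&self, index: usize) -> Self {
        Bytes(&self.0[index..])
    }
}
```

A new method that cannot be expressed through the existing ones has no such default, which is the case where a separate trait or a breaking release would be needed.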
That's as you said. Having the multiple traits allowed adding more features without too much impact, at the cost of more complex type signatures. Now they have not moved much for some time, and there are some redundant parts, like `InputTake` vs `Slice`, so I am merging them together. An existing input type implementation would mainly have to copy old method implementations to the new trait and add the new `take_from` method.
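As a rough illustration of that migration, the sketch below shows a single trait carrying methods that used to live in separate traits, plus `take_from`. It is a simplified assumption of the shape, not the trait as defined in this PR: the real one has more methods, and `split_prefix` is a hypothetical helper added only to show the shorter bound.

```rust
// Hypothetical, simplified version of a merged trait.
pub trait Input: Clone + Sized {
    /// Element type of the sequence, e.g. `char` for `&str`.
    type Item;

    /// Number of units left in the input (previously `InputLength`).
    fn input_len(&self) -> usize;
    /// The first `index` units (previously `InputTake::take`).
    fn take(&self, index: usize) -> Self;
    /// Everything after the first `index` units (the new method).
    fn take_from(&self, index: usize) -> Self;
    /// Split at `index`, returning `(rest, prefix)`
    /// (previously `InputTake::take_split`).
    fn take_split(&self, index: usize) -> (Self, Self);
}

impl<'a> Input for &'a str {
    type Item = char;

    fn input_len(&self) -> usize {
        self.len()
    }
    // Indices are byte offsets; slicing panics off a char boundary.
    fn take(&self, index: usize) -> Self {
        &self[..index]
    }
    fn take_from(&self, index: usize) -> Self {
        &self[index..]
    }
    fn take_split(&self, index: usize) -> (Self, Self) {
        let (prefix, rest) = self.split_at(index);
        (rest, prefix)
    }
}

/// Hypothetical helper: one `Input` bound instead of several capability traits.
pub fn split_prefix<I: Input>(input: I, n: usize) -> Option<(I, I)> {
    if n <= input.input_len() {
        Some(input.take_split(n))
    } else {
        None
    }
}
```

The generic function needs one bound where the old design would have listed each capability trait separately, which is the signature simplification being traded against extensibility.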
Something else this PR doesn't acknowledge that would be good to document is what traits were not included (e.g. `Offset`) and why, or why `InputLength` still exists despite being merged.
/// The current input type is a sequence of that `Item` type.
///
/// Example: `u8` for `&[u8]` or `char` for `&str`
type Item;
imo as a user, calling this `Token` would make the intent clearer
I am not sure that would be the right term. The input type produces raw data; tokens are what would be parsed from this data.
That depends on the level at which the parser operates. Some parsers produce tokens, others consume tokens to produce an AST.
Looking around, I see `combine` calls it `Token`. I'm not seeing an equivalent trait in chumsky yet.
the Input trait now has everything needed with the take, take_from and take_split methods
Force-pushed from 71bd90c to dcb252b
}
}

impl<'a> Input for &'a [u8] {
Could this implementation be generalised for all `&'a [T]`? I can't see anything in the trait implementation that is `u8` specific and this would close #1482
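A minimal sketch of what such a generalisation could look like follows. The trait here is a cut-down, hypothetical version with only three methods, and `Tok` is an invented token type; whether the full trait can be implemented without extra bounds such as `T: Clone` (for example for methods that yield items by value) is not something this sketch settles.

```rust
// Hypothetical, cut-down version of the trait; not the definition from this PR.
pub trait Input: Clone + Sized {
    type Item;
    fn input_len(&self) -> usize;
    fn take(&self, index: usize) -> Self;
    fn take_from(&self, index: usize) -> Self;
}

// Nothing below depends on the element type, so the impl can be generic
// over `T` instead of being written for `u8` only.
impl<'a, T> Input for &'a [T] {
    type Item = T;

    fn input_len(&self) -> usize {
        self.len()
    }
    fn take(&self, index: usize) -> Self {
        &self[..index]
    }
    fn take_from(&self, index: usize) -> Self {
        &self[index..]
    }
}

// An invented token type, standing in for the output of a lexer.
#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Num(i64),
    Plus,
}
```

With this impl, both `&[u8]` and a slice of lexer tokens such as `&[Tok]` satisfy the same bound, so a second parsing stage can run over the output of the first.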
This is modeled off of rust-bakery/nom#1612 with a mix of `combine::Stream`.
- Chose `Token`, like `combine`, as that is the term I've been generally standardizing on and is clearer in intent than `Item`
- Baking in `Slice` support as that is important for wrapper types. I chose `Slice` over `combine`'s `Range` as this is more focused on Rust integration, rather than grammar integration, and we are generally using `slice`.
- `next_token` is really what some of nom's existing traits are trying to do but awkwardly. `combine` does this through `StreamOnce::uncons`
- `iter_offsets` instead of `iter_tokens`, as whenever you care about the tokens you should also be looking at the offsets, unless using something like `AsBytes`.
- `next_slice` is a parallel to `next_token` but the return type differs as the lookup is done in a different function call. I'm tempted to add an `unchecked` variant but I want to benchmark first.

I considered following `combine` and doing separate `InputToken` and `InputSlice` traits but figured we'd see how this goes first.
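The trait described in that commit message might look roughly like the sketch below. The signatures are assumptions reconstructed from the prose, not the referenced commit's actual code. `eof_offset` and `take_while` are hypothetical additions used only to show how `iter_offsets` and `next_slice` fit together.

```rust
// Hypothetical sketch of a `Stream`-style trait; signatures are assumed.
pub trait Stream: Clone {
    type Token;
    type Slice;
    type IterOffsets: Iterator<Item = (usize, Self::Token)>;

    /// Pop one token; `None` at end of input, so the result is an `Option`.
    fn next_token(&self) -> Option<(Self, Self::Token)>;
    /// Iterate over tokens together with their offsets.
    fn iter_offsets(&self) -> Self::IterOffsets;
    /// Offset of the end of input (assumed helper, not named in the message).
    fn eof_offset(&self) -> usize;
    /// Split at an offset that was already looked up elsewhere, so no `Option`.
    fn next_slice(&self, offset: usize) -> (Self, Self::Slice);
}

impl<'a> Stream for &'a str {
    type Token = char;
    type Slice = &'a str;
    type IterOffsets = core::str::CharIndices<'a>;

    fn next_token(&self) -> Option<(Self, char)> {
        let c = self.chars().next()?;
        Some((&self[c.len_utf8()..], c))
    }
    fn iter_offsets(&self) -> Self::IterOffsets {
        self.char_indices()
    }
    fn eof_offset(&self) -> usize {
        self.len()
    }
    fn next_slice(&self, offset: usize) -> (Self, &'a str) {
        let (slice, rest) = self.split_at(offset);
        (rest, slice)
    }
}

/// Hypothetical helper: find the offset first, then slice once.
pub fn take_while<S: Stream>(input: S, pred: impl Fn(&S::Token) -> bool) -> (S, S::Slice) {
    let offset = input
        .iter_offsets()
        .find(|(_, tok)| !pred(tok))
        .map(|(i, _)| i)
        .unwrap_or_else(|| input.eof_offset());
    input.next_slice(offset)
}
```

The split between a fallible `next_token` and an infallible `next_slice` mirrors the point in the message: the offset passed to `next_slice` comes from an earlier lookup, so the call itself has nothing left to check.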