feat: Add output_stream/with_output_stream to Parser #233

Draft
wants to merge 1 commit into
base: main
92 changes: 91 additions & 1 deletion src/combinator/mod.rs
@@ -159,7 +159,7 @@
use crate::error::{ContextError, ErrMode, ErrorKind, FromExternalError, Needed, ParseError};
use crate::lib::std::borrow::Borrow;
use crate::lib::std::ops::Range;
use crate::stream::{Location, Stream};
use crate::stream::{Location, Stream, UpdateSlice};
use crate::stream::{Offset, StreamIsPartial};
use crate::trace::trace;
use crate::trace::trace_result;
@@ -972,6 +972,96 @@ where
}
}

/// Implementation of [`Parser::output_stream`]
#[cfg_attr(nightly, warn(rustdoc::missing_doc_code_examples))]
pub struct OutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
parser: F,
i: core::marker::PhantomData<I>,
o: core::marker::PhantomData<O>,
e: core::marker::PhantomData<E>,
}

impl<F, I, O, E> OutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
pub(crate) fn new(parser: F) -> Self {
Self {
parser,
i: Default::default(),
o: Default::default(),
e: Default::default(),
}
}
}

impl<I, O, E, F> Parser<I, I, E> for OutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
fn parse_next(&mut self, input: I) -> IResult<I, I, E> {
match self.parser.parse_next(input.clone()) {
Ok((remaining, _)) => {
let offset = input.offset_to(&remaining);
let (remaining, slice) = input.next_slice(offset);
Ok((remaining, input.update_slice(slice)))
}
Err(e) => Err(e),
Comment on lines +1009 to +1015
Collaborator

Should the streams be marked complete?

Contributor Author

Maybe I'm missing something, but how would I mark a stream as complete?

}
}
}
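The review question above (whether the recognized stream should be marked complete) can be illustrated with a small std-only model. This is only a sketch: `Partial`, `Step`, `alpha1`, and `recognize_complete` are hypothetical stand-ins written for this note, not winnow's types or API, and the thread itself does not settle how winnow would expose such a hook. The idea is that a recognized range can never grow, so a parser re-run on it should not ask for more input.

```rust
/// Hypothetical stream wrapper that tracks whether more input may still arrive.
#[derive(Clone, Debug, PartialEq)]
struct Partial<'a> {
    input: &'a str,
    partial: bool,
}

/// Outcome of a toy streaming parser.
#[derive(Debug, PartialEq)]
enum Step<'a> {
    Done(Partial<'a>, &'a str),
    Incomplete,
    Error,
}

/// Toy streaming parser: one or more ASCII letters. On a partial stream,
/// hitting the end of the buffer is ambiguous, so it asks for more input.
fn alpha1<'a>(input: Partial<'a>) -> Step<'a> {
    match input.input.find(|c: char| !c.is_ascii_alphabetic()) {
        Some(0) => Step::Error,
        Some(end) => {
            let (slice, rest) = input.input.split_at(end);
            Step::Done(Partial { input: rest, partial: input.partial }, slice)
        }
        None if input.partial => Step::Incomplete,
        None if input.input.is_empty() => Step::Error,
        None => Step::Done(Partial { input: "", partial: false }, input.input),
    }
}

/// Recognize the input consumed by `parser` and return it as a stream that is
/// marked complete (assumption: this is what "marked complete" would mean).
fn recognize_complete<'a>(
    mut parser: impl FnMut(Partial<'a>) -> Step<'a>,
    input: Partial<'a>,
) -> Option<(Partial<'a>, Partial<'a>)> {
    match parser(input.clone()) {
        Step::Done(remaining, _) => {
            // The remaining input is a suffix of the original, so the length
            // difference is the number of consumed bytes.
            let whole: &'a str = input.input;
            let offset = whole.len() - remaining.input.len();
            let consumed = Partial { input: &whole[..offset], partial: false };
            Some((remaining, consumed))
        }
        _ => None,
    }
}
```

In this model, feeding the recognized stream back into `alpha1` finishes normally, whereas the same text in a stream still flagged as partial would report `Step::Incomplete`.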

/// Implementation of [`Parser::with_output_stream`]
#[cfg_attr(nightly, warn(rustdoc::missing_doc_code_examples))]
pub struct WithOutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
parser: F,
i: core::marker::PhantomData<I>,
o: core::marker::PhantomData<O>,
e: core::marker::PhantomData<E>,
}

impl<F, I, O, E> WithOutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
pub(crate) fn new(parser: F) -> Self {
Self {
parser,
i: Default::default(),
o: Default::default(),
e: Default::default(),
}
}
}

impl<F, I, O, E> Parser<I, (O, I), E> for WithOutputStream<F, I, O, E>
where
F: Parser<I, O, E>,
I: UpdateSlice,
{
fn parse_next(&mut self, input: I) -> IResult<I, (O, I), E> {
match self.parser.parse_next(input.clone()) {
Ok((remaining, output)) => {
let offset = input.offset_to(&remaining);
let (remaining, slice) = input.next_slice(offset);
Ok((remaining, (output, input.update_slice(slice))))
}
Err(e) => Err(e),
}
}
}
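Both `parse_next` bodies above follow the same three steps: run the inner parser on a clone of the input, measure how far it advanced, then re-slice the original input and wrap the slice back into the stream type. A minimal std-only sketch of that pattern follows. `Stateful`, its `depth` field, and the free functions are hypothetical models made up for illustration; the methods only imitate the roles that `offset_to`, `next_slice`, and `update_slice` play in the diff and are not winnow's implementations.

```rust
/// Hypothetical stream type that carries extra state next to its slice.
#[derive(Clone, Debug, PartialEq)]
struct Stateful<'a> {
    input: &'a str,
    depth: u32,
}

impl<'a> Stateful<'a> {
    /// Models `offset_to`: bytes consumed between `self` and `later`.
    /// Assumes `later` is a suffix of `self`.
    fn offset_to(&self, later: &Stateful<'a>) -> usize {
        self.input.len() - later.input.len()
    }

    /// Models `next_slice`: split off the first `offset` bytes.
    fn next_slice(&self, offset: usize) -> (Stateful<'a>, &'a str) {
        let (slice, rest) = self.input.split_at(offset);
        (Stateful { input: rest, depth: self.depth }, slice)
    }

    /// Models `update_slice`: keep the state, swap in a new slice.
    fn update_slice(self, inner: &'a str) -> Stateful<'a> {
        Stateful { input: inner, ..self }
    }
}

/// Toy parser: consume one or more ASCII letters.
fn alpha1<'a>(input: Stateful<'a>) -> Option<(Stateful<'a>, &'a str)> {
    let end = input
        .input
        .find(|c: char| !c.is_ascii_alphabetic())
        .unwrap_or(input.input.len());
    if end == 0 {
        return None;
    }
    let (rest, slice) = input.next_slice(end);
    Some((rest, slice))
}

/// Return the parser output together with the consumed input as a stream.
fn with_recognized_stream<'a, O>(
    mut parser: impl FnMut(Stateful<'a>) -> Option<(Stateful<'a>, O)>,
    input: Stateful<'a>,
) -> Option<(Stateful<'a>, (O, Stateful<'a>))> {
    let (remaining, output) = parser(input.clone())?;
    let offset = input.offset_to(&remaining);
    let (remaining, slice) = input.next_slice(offset);
    Some((remaining, (output, input.update_slice(slice))))
}

/// Same as above, but discard the parser output.
fn recognize_stream<'a, O>(
    parser: impl FnMut(Stateful<'a>) -> Option<(Stateful<'a>, O)>,
    input: Stateful<'a>,
) -> Option<(Stateful<'a>, Stateful<'a>)> {
    let (remaining, (_, consumed)) = with_recognized_stream(parser, input)?;
    Some((remaining, consumed))
}
```

The point of returning `Stateful` rather than `&str` is that the consumed range keeps its `depth` value, so a follow-up parser that expects the stream type can run on it directly.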

/// Implementation of [`Parser::span`]
#[cfg_attr(nightly, warn(rustdoc::missing_doc_code_examples))]
pub struct Span<F, I, O, E>
124 changes: 110 additions & 14 deletions src/parser.rs
@@ -2,7 +2,9 @@

use crate::combinator::*;
use crate::error::{ContextError, FromExternalError, IResult, ParseError};
use crate::stream::{AsChar, Compare, Location, Offset, ParseSlice, Stream, StreamIsPartial};
use crate::stream::{
AsChar, Compare, Location, Offset, ParseSlice, Stream, StreamIsPartial, UpdateSlice,
};

/// Core trait for parsing
///
@@ -187,19 +189,21 @@ pub trait Parser<I, O, E> {

/// Produce the consumed input as produced value.
///
/// The produced value is of type `Stream::Slice`. If you're looking for an alternative that
/// returns the original input `Stream`'s type, use [`output_stream`](Parser::output_stream)
/// instead.
///
/// # Example
///
/// ```rust
/// # use winnow::{error::ErrMode,error::ErrorKind, error::Error, IResult, Parser};
/// use winnow::character::{alpha1};
/// # use winnow::{error::{ErrMode, Error, ErrorKind}, IResult, Parser};
/// use winnow::character::alpha1;
/// use winnow::sequence::separated_pair;
/// # fn main() {
///
/// let mut parser = separated_pair(alpha1, ',', alpha1).recognize();
///
/// assert_eq!(parser.parse_next("abcd,efgh"), Ok(("", "abcd,efgh")));
/// assert_eq!(parser.parse_next("abcd;"),Err(ErrMode::Backtrack(Error::new(";", ErrorKind::Verify))));
/// # }
/// ```
#[doc(alias = "concat")]
fn recognize(self) -> Recognize<Self, I, O, E>
@@ -212,29 +216,29 @@ pub trait Parser<I, O, E> {

/// Produce the consumed input with the output
///
/// Functions similarly to [recognize][Parser::recognize] except it
/// returns the parser output as well.
/// Functions similarly to [`recognize`][Parser::recognize] except it returns the parser output
Collaborator

Unrelated changes will need to be reverted

Contributor Author

Can do that, but I thought it would make sense to add the missing backticks around recognize as part of this PR to align it with the docs of the new methods since they are related.

Collaborator

If so, that should be a separate commit

/// as well.
///
/// This can be useful especially in cases where the output is not the same type
/// as the input, or the input is a user defined type.
///
/// The consumed input's value is of type `Stream::Slice`. If you're looking for an alternative
/// that returns the original input `Stream`'s type, use
/// [`with_output_stream`](Parser::with_output_stream) instead.
///
/// Returned tuple is of the format `(produced output, consumed input)`.
///
/// # Example
///
/// ```rust
/// # use winnow::prelude::*;
/// # use winnow::{error::ErrMode,error::ErrorKind, error::Error, IResult};
/// use winnow::character::{alpha1};
/// use winnow::bytes::tag;
/// # use winnow::{error::{ErrMode, Error, ErrorKind}, IResult, Parser};
/// use winnow::character::alpha1;
/// use winnow::sequence::separated_pair;
///
/// fn inner_parser(input: &str) -> IResult<&str, bool> {
/// "1234".value(true).parse_next(input)
/// }
///
/// # fn main() {
///
/// let mut consumed_parser = separated_pair(alpha1, ',', alpha1).value(true).with_recognized();
///
/// assert_eq!(consumed_parser.parse_next("abcd,efgh1"), Ok(("1", (true, "abcd,efgh"))));
@@ -247,7 +251,6 @@ pub trait Parser<I, O, E> {
///
/// assert_eq!(recognize_parser.parse_next("1234"), consumed_parser.parse_next("1234"));
/// assert_eq!(recognize_parser.parse_next("abcd"), consumed_parser.parse_next("abcd"));
/// # }
/// ```
#[doc(alias = "consumed")]
fn with_recognized(self) -> WithRecognized<Self, I, O, E>
@@ -258,6 +261,99 @@ pub trait Parser<I, O, E> {
WithRecognized::new(self)
}

/// Produce the consumed input as produced value.
///
/// The produced value is of the same type as the input `Stream`. If you're looking for an
/// alternative that returns `Stream::Slice`, use [`recognize`](Parser::recognize) instead.
///
/// # Example
///
/// ```rust
/// # use winnow::{error::{ErrMode, Error, ErrorKind}, IResult, Parser};
/// use winnow::character::alpha1;
/// use winnow::sequence::separated_pair;
/// use winnow::stream::BStr;
///
/// let mut parser = separated_pair(alpha1, ',', alpha1).output_stream();
///
/// assert_eq!(
/// parser.parse_next(BStr::new("abcd,efgh")),
/// Ok((BStr::new(""), BStr::new("abcd,efgh"))),
/// );
/// assert_eq!(
/// parser.parse_next(BStr::new("abcd;")),
/// Err(ErrMode::Backtrack(Error::new(BStr::new(";"), ErrorKind::Verify))),
/// );
/// ```
fn output_stream(self) -> OutputStream<Self, I, O, E>
Collaborator

I think I prefer `recognize`, rather than `output` as that does clarify what the stream's range is

Contributor Author

Yep, `recognize_stream` also feels more fitting to me. Will rename it when I polish the PR.

Collaborator

Could `length_value` be updated to use this?

Contributor Author

I'll have a look.

where
Self: core::marker::Sized,
I: UpdateSlice,
{
OutputStream::new(self)
}

/// Produce the consumed input with the output.
///
/// Functions similarly to [`output_stream`][Parser::output_stream] except it returns the
/// parser output as well.
///
/// This can be useful especially in cases where the output is not the same type as the input.
///
/// The consumed input's value is of the same type as the input `Stream`. If you're looking for
/// an alternative that returns `Stream::Slice`, use
/// [`with_recognized`](Parser::with_recognized) instead.
///
/// Returned tuple is of the format `(produced output, consumed input)`.
///
/// # Example
///
/// ```rust
/// # use winnow::{error::{ErrMode, Error, ErrorKind}, IResult, Parser};
/// use winnow::character::alpha1;
/// use winnow::sequence::separated_pair;
/// use winnow::stream::BStr;
///
/// fn inner_parser(input: &BStr) -> IResult<&BStr, bool> {
/// "1234".value(true).parse_next(input)
/// }
///
/// let mut consumed_parser = separated_pair(alpha1, ',', alpha1)
/// .value(true)
/// .with_output_stream();
///
/// assert_eq!(
/// consumed_parser.parse_next(BStr::new("abcd,efgh1")),
/// Ok((BStr::new("1"), (true, BStr::new("abcd,efgh")))),
/// );
/// assert_eq!(
/// consumed_parser.parse_next(BStr::new("abcd;")),
/// Err(ErrMode::Backtrack(Error::new(BStr::new(";"), ErrorKind::Verify))),
/// );
///
/// // The second output (representing the consumed input) should be the same as that of the
/// // `output_stream` parser.
/// let mut output_stream_parser = inner_parser.output_stream();
/// let mut consumed_parser = inner_parser.with_output_stream()
/// .map(|(output, output_stream)| output_stream);
///
/// assert_eq!(
/// output_stream_parser.parse_next(BStr::new("1234")),
/// consumed_parser.parse_next(BStr::new("1234")),
/// );
/// assert_eq!(
/// output_stream_parser.parse_next(BStr::new("abcd")),
/// consumed_parser.parse_next(BStr::new("abcd")),
/// );
/// ```
fn with_output_stream(self) -> WithOutputStream<Self, I, O, E>
Collaborator

Have you considered trying these out on an extension trait in your crate first to explore their usage?

I am considering doing 0.5 in the next couple of months, so we do have room to change things soon if we aren't thrilled with it. I'm just exploring (and reminding myself) what the options are for these.

Contributor Author (@martinohmann, Apr 29, 2023)

I didn't consider that yet, no. But it's a great idea, and I'll do this to see whether the additional methods are really as useful as I imagine. Will report back once I've done that.

I already played around with `output_stream` in my hcl-edit parser by pointing it at my local winnow fork, and it actually helped simplify one of my use cases quite a bit. I haven't yet checked how (or whether) `with_output_stream` would help simplify the parsers that receive a state parameter, but I did a small PoC in a bigger unit test, not yet bundled with the PR, to verify that state updates within a map operation actually work the way I expect.

where
Self: core::marker::Sized,
I: UpdateSlice,
{
WithOutputStream::new(self)
}
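The extension-trait route suggested in the review can be sketched without touching the upstream crate: declare a trait in your own crate whose supertrait is the upstream parser trait, give the experimental method a default body, and add a blanket impl. The sketch below is std-only and uses made-up names (`Parser` here is a simplified stand-in over `&str`, not winnow's trait; `ParserExt`, `WithConsumed`, and `digits` are hypothetical), so treat it as an illustration of the pattern rather than code that drops into winnow.

```rust
/// Simplified stand-in for an upstream parser trait you do not own.
trait Parser<'a, O> {
    fn parse_next(&mut self, input: &'a str) -> Option<(&'a str, O)>;
}

/// Any function or closure with the matching signature is a parser.
impl<'a, O, F> Parser<'a, O> for F
where
    F: FnMut(&'a str) -> Option<(&'a str, O)>,
{
    fn parse_next(&mut self, input: &'a str) -> Option<(&'a str, O)> {
        self(input)
    }
}

/// Combinator returned by the experimental method.
struct WithConsumed<P> {
    parser: P,
}

impl<'a, O, P: Parser<'a, O>> Parser<'a, (O, &'a str)> for WithConsumed<P> {
    fn parse_next(&mut self, input: &'a str) -> Option<(&'a str, (O, &'a str))> {
        let (remaining, output) = self.parser.parse_next(input)?;
        // The remaining input is a suffix of `input`, so the length
        // difference is the number of consumed bytes.
        let consumed = &input[..input.len() - remaining.len()];
        Some((remaining, (output, consumed)))
    }
}

/// Extension trait living in the downstream crate: it adds the experimental
/// method to every existing parser without changing the upstream trait.
trait ParserExt<'a, O>: Parser<'a, O> + Sized {
    fn with_consumed(self) -> WithConsumed<Self> {
        WithConsumed { parser: self }
    }
}

/// Blanket impl: every parser gets the extension method.
impl<'a, O, P: Parser<'a, O>> ParserExt<'a, O> for P {}

/// Toy parser: a run of ASCII digits parsed as `u32`.
fn digits(input: &str) -> Option<(&str, u32)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let value = input[..end].parse().ok()?;
    Some((&input[end..], value))
}
```

With `ParserExt` in scope, `digits.with_consumed()` yields a parser that returns both the number and the text it was parsed from. If the method proves useful, it can later move onto the upstream trait with the same name and signature.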

/// Produce the location of the consumed input as produced value.
///
/// # Example