Add XML renderer documentation
satabin committed Jan 23, 2024
1 parent d8a1610 commit b8354b9
Showing 4 changed files with 49 additions and 17 deletions.
35 changes: 31 additions & 4 deletions site/documentation/xml/index.md
@@ -8,7 +8,7 @@ The `fs2-data-xml` module provides tools to parse XML data in a streaming manner

To create a stream of XML events from an input stream, use the `events` pipe in the `fs2.data.xml` package.

```scala mdoc
```scala mdoc:height=500
import cats.effect._
import cats.effect.unsafe.implicits.global

@@ -33,14 +33,14 @@ The pipe validates the XML structure while parsing. It reads all the XML element

Namespaces can be resolved by using the `namespaceResolver` pipe.

```scala mdoc
```scala mdoc:height=500
val nsResolved = stream.through(namespaceResolver[IO])
nsResolved.compile.toList.unsafeRunSync()
```

Using the `referenceResolver` pipe, entity and character references can be resolved. By default the standard `xmlEntities` mapping is used, but it can be replaced by any mapping you see fit.

```scala mdoc
```scala mdoc:height=500
val entityResolved = stream.through(referenceResolver[IO]())
entityResolved.compile.toList.unsafeRunSync()
```
@@ -49,7 +49,7 @@ entityResolved.compile.toList.unsafeRunSync()

Once entities and namespaces are resolved, the events might be numerous and can be normalized to avoid emitting too many of them. For instance, after reference resolution, consecutive text events can be merged. This is achieved by using the `normalize` pipe.

```scala mdoc
```scala mdoc:height=500
val normalized = entityResolved.through(normalize)
normalized.compile.toList.unsafeRunSync()
```
@@ -82,3 +82,30 @@ implicit val eventifier: DocumentEventifier[SomeDocType] = ???
stream.through(documents[IO, SomeDocType])
.through(eventify[IO, SomeDocType])
```

## XML Renderers

Once you have an XML event stream and have selected and transformed what you need from it, you can write the resulting event stream to some storage. This can be achieved using renderers.

For instance, to write the resulting XML stream to a file in raw form (i.e. without trying to format the nested tags and text), you can do:

```scala mdoc:compile-only
import fs2.io.file.{Files, Flags, Path}

stream
  .through(render.raw())
  .through(text.utf8.encode)
  .through(Files[IO].writeAll(Path("/some/path/to/file.xml"), Flags.Write))
  .compile
  .drain
```

There is also a `pretty()` renderer, which indents inner tags and text by the given indent string.
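
As a rough illustration of the pretty renderer with a custom indent string, here is a minimal sketch. It assumes that `render.pretty` accepts a named `indent` parameter in the same way `collector.pretty` does; the input document and the `PrettyRenderSketch` object are made up for the example.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import fs2.data.xml._

object PrettyRenderSketch {
  // Illustrative input, not taken from the documentation.
  val input = "<a><b>text</b></a>"

  // Assumption: `indent` is a named parameter of `render.pretty`,
  // mirroring the signature of `collector.pretty`.
  val rendered: String =
    Stream
      .emit(input)
      .covary[IO]
      .through(events[IO, String]())          // parse the text into XML events
      .through(render.pretty(indent = "    ")) // render with a four-space indent
      .compile
      .string                                  // concatenate the rendered chunks
      .unsafeRunSync()

  def main(args: Array[String]): Unit = println(rendered)
}
```

The rendered chunks are concatenated with `compile.string` here only to keep the sketch short; in a real pipeline they would more likely be encoded and written out as in the file example above.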

If you want the rendered `String` as a value, the library also provides `Collector`s:

```scala mdoc
stream.compile.to(collector.raw()).unsafeRunSync()

stream.compile.to(collector.pretty()).unsafeRunSync()
```
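
Getting a `String` back from `compile.to(...)` relies on the collector's output type being visible to the caller, which is what the `Collector.Aux[XmlEvent, String]` return types in this commit provide. The following library-free sketch illustrates the idea; `Coll`, `opaque`, and `precise` are hypothetical names that only mimic the shape of a collector with an abstract `Out` type member and are not fs2 APIs.

```scala
object AuxSketch {
  // Hypothetical stand-in for a collector with an abstract output type.
  trait Coll[A] {
    type Out
    def collect(as: List[A]): Out
  }

  object Coll {
    // The Aux alias exposes the output type as a type parameter.
    type Aux[A, O] = Coll[A] { type Out = O }
  }

  // Declared without the refinement: callers cannot see that Out is String.
  def opaque: Coll[Int] =
    new Coll[Int] {
      type Out = String
      def collect(as: List[Int]): String = as.mkString(",")
    }

  // Declared with Aux: callers know that Out is String.
  def precise: Coll.Aux[Int, String] =
    new Coll[Int] {
      type Out = String
      def collect(as: List[Int]): String = as.mkString(",")
    }

  def main(args: Array[String]): Unit = {
    // Compiles because the result type is known to be String.
    val s: String = precise.collect(List(1, 2, 3))
    // The same ascription on `opaque.collect(...)` would be rejected by the compiler.
    assert(s == "1,2,3")
    println(s)
  }
}
```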
7 changes: 3 additions & 4 deletions site/documentation/xml/xpath.md
@@ -84,14 +84,13 @@ The `filter.raw` emits a stream of all matches.
Each match is represented as a nested stream of XML events which must be consumed.

```scala mdoc
import cats.Show
import cats.effect._
import cats.effect.unsafe.implicits.global

stream
.lift[IO]
.through(filter.raw(path))
.parEvalMapUnbounded(_.map(Show[XmlEvent].show(_)).compile.foldMonoid)
.parEvalMapUnbounded(_.through(render.raw()).compile.foldMonoid)
.compile
.toList
.unsafeRunSync()
@@ -105,7 +104,7 @@ The library offers `filter.collect` to collect each match for any collector.
```scala mdoc
stream
.lift[IO]
.through(filter.collect(path, collector.show))
.through(filter.collect(path, collector.raw()))
.compile
.toList
.unsafeRunSync()
@@ -116,7 +115,7 @@ If you want to have results emitted as early as possible instead of in order, yo
```scala mdoc
stream
.lift[IO]
.through(filter.collect(path, collector.show, deterministic = false))
.through(filter.collect(path, collector.raw(), deterministic = false))
.compile
.toList
.unsafeRunSync()
18 changes: 11 additions & 7 deletions xml/src/main/scala/fs2/data/xml/internals/Renderer.scala
@@ -96,9 +96,9 @@ private[xml] class Renderer(pretty: Boolean,
newline = true

case XmlEvent.EndTag(name) =>
level -= 1
newline = true
if (!skipClose) {
level -= 1
indentation()
builder ++= show"</$name>"
}
@@ -111,12 +111,16 @@ private[xml] class Renderer(pretty: Boolean,

case XmlEvent.XmlString(content, false) if pretty =>
content.linesIterator.foreach { line =>
indentation()
if (newline)
builder ++= line.stripLeading()
else
builder ++= line
newline = true
if (line.matches("\\s*")) {
// empty line, ignore it
} else {
indentation()
if (newline)
builder ++= line.stripLeading()
else
builder ++= line
newline = true
}
}
newline = content.matches("^.*\n\\s*$")

6 changes: 4 additions & 2 deletions xml/src/main/scala/fs2/data/xml/package.scala
@@ -167,7 +167,7 @@ package object xml {
}

/** Renders all events without extra formatting. */
def raw(collapseEmpty: Boolean = true): Collector[XmlEvent] =
def raw(collapseEmpty: Boolean = true): Collector.Aux[XmlEvent, String] =
new Collector[XmlEvent] {
type Out = String
def newBuilder: Collector.Builder[XmlEvent, Out] =
@@ -182,7 +182,9 @@ package object xml {
* @param indent The indentation string
* @param attributeThreshold Number of attributes above which each attribute is rendered on a new line
*/
def pretty(collapseEmpty: Boolean = true, indent: String = " ", attributeThreshold: Int = 3): Collector[XmlEvent] =
def pretty(collapseEmpty: Boolean = true,
indent: String = " ",
attributeThreshold: Int = 3): Collector.Aux[XmlEvent, String] =
new Collector[XmlEvent] {
type Out = String
def newBuilder: Collector.Builder[XmlEvent, Out] =
