Move away from DocBook for documentation #1588

Closed
rdmark opened this issue Oct 3, 2024 · 4 comments · Fixed by #1905
Member

rdmark commented Oct 3, 2024

The DocBook toolchain relies on XML parsing and XSL stylesheets, which are heavy, complex, and partly on the verge of obsolescence.

For one, we use traditional non-namespaced DocBook XSL style sheets which are two generations behind where DocBook is today. The docbook-xsl package is orphaned in Debian, for instance.

Another concrete drawback is that Debian tries to proactively strip XML-related dependencies from architecture-dependent builds, to save computing resources and speed up builds. This can’t be done for netatalk because we generate the troff man pages from XML.

The goal of this issue is to look into converting the documentation sources to a markup format that pandoc can parse, such as CommonMark. This would allow us to transcode to human-readable formats with the lightweight cmark.

@rdmark rdmark added this to the release-4.1 milestone Oct 12, 2024
@rdmark rdmark removed this from the release-4.1 milestone Jan 5, 2025
Member Author

rdmark commented Jan 18, 2025

It has been suggested to me that CommonMark can be localized with po4a. It would be a big plus to have a more efficient way to manage the Japanese localization.

@rdmark rdmark linked a pull request Jan 19, 2025 that will close this issue
@rdmark rdmark self-assigned this Jan 19, 2025
Member Author

rdmark commented Jan 19, 2025

@NJRoadfan Heads-up: My current thinking here is to add a pandoc dependency for the documentation, in lieu of docbook-xsl and xsltproc.

I have found a workflow for automatically converting DocBook to Markdown fairly painlessly. Only minor manual cleanup is needed in the header and synopsis.

Here's how the aecho man page looks in markdown format:

aecho.1.md

% aecho(1) Netatalk @netatalk_version@ | Netatalk AFP Fileserver Manual

# Name

aecho - send AppleTalk Echo Protocol packets to network hosts

# Synopsis

`aecho [-c count] [ address | nbpname ]`

# Description

`aecho` repeatedly sends an Apple Echo Protocol (AEP) packet to the host
specified by the given AppleTalk `address` or `nbpname` and reports
whether a reply was received. Requests are sent at the rate of one per
second.

`address` is parsed by `atalk_aton(3)`{.citerefentry}. `nbpname` is
parsed by `nbp_name(3)`{.citerefentry}. The nbp type defaults to
\`*Workstation*\'.

When `aecho` is terminated, it reports the number of packets sent, the
number of responses received, and the percentage of packets lost. If any
responses were received, the minimum, average, and maximum round trip
times are reported.

# Example

Check to see if a particular host is up and responding to AEP packets:

    example% aecho bloodsport
    11 bytes from 8195.13: aep_seq=0. time=10. ms
    11 bytes from 8195.13: aep_seq=1. time=10. ms
    11 bytes from 8195.13: aep_seq=2. time=10. ms
    11 bytes from 8195.13: aep_seq=3. time=10. ms
    11 bytes from 8195.13: aep_seq=4. time=10. ms
    11 bytes from 8195.13: aep_seq=5. time=9. ms
    ^C
    ----8195.13 AEP Statistics----
    6 packets sent, 6 packets received, 0% packet loss
    round-trip (ms)  min/avg/max = 9/9/10

# Options

`-c` \<count\>

:   Stop after *count* packets.

# See Also

`ping(1)`{.citerefentry}, `atalk_aton(3)`{.citerefentry},
`nbp_name(3)`{.citerefentry}, `aep(4)`{.citerefentry},
`atalkd(8)`{.citerefentry}.

# Author

See [CONTRIBUTORS](https://netatalk.io/contributors)

Then turn this into a troff man page with: `pandoc --to man -o aecho.1 aecho.1.md`

Note that the `%` and `:` notations are pandoc extensions, for man page headers and definition lists respectively, and not standard CommonMark. This is why I'm opting to use pandoc instead of the more lightweight cmark.
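For anyone who wants to script the conversion step, here is a minimal Python sketch that wraps the pandoc invocation. The function names are illustrative and not part of netatalk. One assumption worth flagging: as far as I understand pandoc, the `.TH` header derived from the `%` title line is only emitted with `--standalone`, so the sketch adds that flag to the command quoted above.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_man_command(source: Path, output: Path) -> list[str]:
    """Build the pandoc command line that renders a Markdown page as roff."""
    return [
        "pandoc",
        "--standalone",  # assumption: needed for the header from the "%" line
        "--from", "markdown",
        "--to", "man",
        "--output", str(output),
        str(source),
    ]


def build_man_page(source: Path, output: Path) -> None:
    """Run pandoc, failing loudly if it is missing or exits non-zero."""
    if shutil.which("pandoc") is None:
        raise FileNotFoundError("pandoc not found on PATH")
    subprocess.run(pandoc_man_command(source, output), check=True)


if __name__ == "__main__":
    # Only prints the command; call build_man_page() to actually run it.
    print(" ".join(pandoc_man_command(Path("aecho.1.md"), Path("aecho.1"))))
```

Keeping the command construction separate from the `subprocess` call makes it easy to inspect or log the exact invocation before running it.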

By using markdown, web publishing and localization will become much easier and more flexible.

Member Author

rdmark commented Jan 19, 2025

Linux distros and BSDs are all distributing pandoc packages. But the situation in Solaris land is dire. Oracle Solaris does not ship a package AFAICT, while SmartOS pkgsrc (via OmniOS) does. However, it takes over 5 minutes just to install all Haskell dependencies, and after that Meson still cannot detect the pandoc program...

I think the reasonable option here would be to go back to distributing pre-generated troff man pages. It is a bit of a step backwards, but I think the upsides of pandoc and Markdown for documentation make up for it.
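The fallback could be sketched roughly like this in Python: generate from Markdown when pandoc is available, otherwise use a pre-generated page. This is only an illustration of the decision, not how the Meson build does it; the directory layout and the `plan_man_page` name are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def plan_man_page(
    name: str,
    source_dir: Path,
    pregenerated_dir: Path,
    find_tool: Callable[[str], Optional[str]] = shutil.which,
) -> tuple[str, Path]:
    """Return ("generate", markdown_source) or ("copy", pregenerated_page)."""
    if find_tool("pandoc") is not None:
        # pandoc is on PATH: build the page from its Markdown source.
        return ("generate", source_dir / f"{name}.md")
    # No pandoc (e.g. on Solaris): fall back to the shipped troff page.
    return ("copy", pregenerated_dir / name)
```

The `find_tool` parameter defaults to `shutil.which` and exists only so the decision can be exercised without pandoc installed.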

Member Author

rdmark commented Jan 19, 2025

This is what a plain CommonMark version of the above would look like. The one additional manual adjustment needed is to indent items in the definition list, which can be done with a global search and replace.

The biggest drawback is that we won't get any man page headers, so the output looks a little empty.

On the plus side, this is entirely standards-compliant Markdown, without pandoc extensions. And we can use the extremely fast and portable cmark app to generate the troff pages on the fly. So no need for an intermediate step, or a heavy Haskell stack...

# Name

aecho - send AppleTalk Echo Protocol packets to network hosts

# Synopsis

`aecho [-c count] [ address | nbpname ]`

# Description

`aecho` repeatedly sends an Apple Echo Protocol (AEP) packet to the host
specified by the given AppleTalk `address` or `nbpname` and reports
whether a reply was received. Requests are sent at the rate of one per
second.

`address` is parsed by `atalk_aton(3)`. `nbpname` is parsed by
`nbp_name(3)`. The nbp type defaults to \`*Workstation*'.

When `aecho` is terminated, it reports the number of packets sent, the
number of responses received, and the percentage of packets lost. If any
responses were received, the minimum, average, and maximum round trip
times are reported.

# Example

Check to see if a particular host is up and responding to AEP packets:

    example% aecho bloodsport
    11 bytes from 8195.13: aep_seq=0. time=10. ms
    11 bytes from 8195.13: aep_seq=1. time=10. ms
    11 bytes from 8195.13: aep_seq=2. time=10. ms
    11 bytes from 8195.13: aep_seq=3. time=10. ms
    11 bytes from 8195.13: aep_seq=4. time=10. ms
    11 bytes from 8195.13: aep_seq=5. time=9. ms
    ^C
    ----8195.13 AEP Statistics----
    6 packets sent, 6 packets received, 0% packet loss
    round-trip (ms)  min/avg/max = 9/9/10

# Options

`-c` \<count\>

> Stop after *count* packets.

# See Also

`ping(1)`, `atalk_aton(3)`, `nbp_name(3)`, `aep(4)`, `atalkd(8)`.

# Author

See [CONTRIBUTORS](https://netatalk.io/contributors)
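The differences between the two versions of the page are mechanical: the `%` title line is dropped, the `:   ` definition bodies become `> ` block quotes, and the `{.citerefentry}` attributes disappear. A rough Python sketch of that search and replace follows. The regular expressions are my own guess at a sufficient pattern, not the ones actually used, and they do not skip code blocks or handle multi-paragraph definitions.

```python
import re

# Pandoc title block: a leading line that starts with "%".
TITLE_LINE = re.compile(r"\A%[^\n]*\n+")
# Pandoc definition-list body: a colon plus up to three spaces at line start.
DEFINITION_BODY = re.compile(r"^:[ ]{1,3}", flags=re.MULTILINE)
# Attribute block right after inline code, e.g. `ping(1)`{.citerefentry}.
ATTRIBUTE_BLOCK = re.compile(r"(?<=`)\{\.[A-Za-z][\w-]*\}")


def to_plain_commonmark(text: str) -> str:
    """Rewrite the pandoc-only constructs into plain CommonMark."""
    text = TITLE_LINE.sub("", text)
    text = DEFINITION_BODY.sub("> ", text)
    return ATTRIBUTE_BLOCK.sub("", text)


if __name__ == "__main__":
    sample = (
        "% aecho(1) Netatalk | Manual\n\n"
        "`-c`\n\n"
        ":   Stop after *count* packets.\n\n"
        "`ping(1)`{.citerefentry}, `aep(4)`{.citerefentry}.\n"
    )
    print(to_plain_commonmark(sample))
```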

rdmark added a commit that referenced this issue Jan 23, 2025
Leveraged pandoc's docbook filter to convert to CommonMark,
then cmark-gfm (or cmark for man pages only) to transcode
to roff or html.

Introduces with-docs, with-docs-l10n, and with-website Meson options.
rdmark added a commit that referenced this issue Jan 23, 2025
Leveraged pandoc's docbook filter to convert to CommonMark,
then cmark-gfm (or cmark for man pages only) to transcode
to roff or html.

This gives us a much, much leaner toolchain for generating documentation
through the build system. No need to provide the heavy XML and Docbook suites.

Localization is done with po4a, which uses gettext in the backend.

Introduces with-docs, with-docs-l10n, and with-website Meson options.
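The two-stage pipeline described in the commit message can be sketched in Python as follows. This is an approximation under assumptions: the commit's actual Meson targets are not shown here, the file names are hypothetical, and only the `--from`/`--to` options of pandoc and the `--to` option of cmark and cmark-gfm are relied on.

```python
import subprocess
from pathlib import Path


def docbook_to_commonmark(xml_source: Path, markdown_out: Path) -> list[str]:
    """Stage 1: convert a DocBook XML source to CommonMark with pandoc."""
    return ["pandoc", "--from", "docbook", "--to", "commonmark",
            "--output", str(markdown_out), str(xml_source)]


def commonmark_to_man(markdown_source: Path) -> list[str]:
    """Stage 2a: render a man page with cmark (roff goes to stdout)."""
    return ["cmark", "--to", "man", str(markdown_source)]


def commonmark_to_html(markdown_source: Path) -> list[str]:
    """Stage 2b: render HTML with cmark-gfm (HTML goes to stdout)."""
    return ["cmark-gfm", "--to", "html", str(markdown_source)]


def render(command: list[str], output: Path) -> None:
    """Run a stage-2 command and capture its standard output in a file."""
    with output.open("wb") as handle:
        subprocess.run(command, stdout=handle, check=True)


if __name__ == "__main__":
    page = Path("aecho.1.md")  # hypothetical file name
    print(" ".join(docbook_to_commonmark(Path("aecho.1.xml"), page)))
    print(" ".join(commonmark_to_man(page)), ">", "aecho.1")
```

Only the second stage has to run at build time, which is what keeps pandoc and its Haskell dependencies out of the regular build.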
@rdmark rdmark added this to the release-4.2 milestone Jan 24, 2025