Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️ REFACTOR: Move to markdown-it + mdformat renderer #18

Merged
merged 30 commits into from
Jun 25, 2021
Merged

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented Jun 18, 2021

Hey @hukkin I imagine you'll be interested in this (and hopefully want to get involved 😄)

So currently what this package does is convert RST source text -> docutils AST, via a modified RST parser that most importantly does not "resolve" all directives or roles, so that we can work out how to convert them back to MyST Markdown.
Then, in rst_to_myst/renderer.py, I walk through the AST and convert it "manually" to Markdown.

What would obviously be good though, and what I have started in rst_to_myst/markdownit.py, is to convert the docutils AST -> Markdown-it tokens, then use mdformat/mdformat-myst to convert that to Markdown.

Most of this should be quite straight-forward 🤞, but the only tricky part is directives that have nested parses, e.g.

.. admonition:: Abc *d*
   :class: xyz
   :name: df

   A *b*

   .. note::

     lmn

is currently converted to AST (see tests/fixtures/ast.txt):

<document source="source">
    <DirectiveNode arg_block="['Abc *d*']" delimiter="::::" name="admonition" options_list="('class',\ 'xyz') ('name',\ 'df')" type="argument_content_colon">
        <ArgumentNode>
            Abc 
            <emphasis>
                d
        <ContentNode>
            <paragraph>
                A 
                <emphasis>
                    b
            <DirectiveNode arg_block="[]" delimiter=":::" name="note" options_list="" type="content_only_colon">
                <ContentNode>
                    <paragraph>
                        lmn

and I don't believe mdformat/mdformat-myst has anything specific to handle this.

@chrisjsewell chrisjsewell changed the title ♻️ REFACTOR: Move to markdown-it renderer ♻️ REFACTOR: Move to markdown-it + mdformat renderer Jun 18, 2021
@chrisjsewell
Copy link
Member Author

Note this PR also addresses some of the feedback in #11

@hukkin
Copy link

hukkin commented Jun 18, 2021

I imagine you'll be interested in this (and hopefully want to get involved smile)

Haha, yeah this is good stuff.

but the only tricky part is directives that have nested parses

oh yeah this is a bit nasty, I think mdformat-myst probably does unexpected stuff here. This kind of conflicts the nesting assumptions of CommonMark and markdown-it so a case like this didn't even cross my mind.

Does MyST syntax support nested directives btw (outside eval-rst)?

This is not the first time someone needs to render Markdown from tokens, so I think we may want to work on making something like build_mdit public to alleviate the various gotchas involved (regarding plugins and default conf options for example).

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jun 18, 2021

Does MyST syntax support nested directives btw (outside eval-rst)

Oh absolutely: https://myst-parser.readthedocs.io/en/latest/syntax/syntax.html#nesting-directives

For myst-parser though, we do not implement this in the markdown parsing and so do not "encode" it in the markdown-it tokens; leaving it to the (docutils) directives to perform nested parses (see https://github.com/executablebooks/MyST-Parser/blob/649791694ba83c9b9295030c8cae9a6eee8fc767/myst_parser/mocking.py#L126).
This is because there is no real way to (programmatically) introspect directives code (like https://github.com/chrisjsewell/docutils/blob/8adab0660b2097b4f3c32cef7e5ff4cb3c72b084/docutils/docutils/parsers/rst/directives/admonitions.py#L28) and "translate" this to "markdown-it parsing code"

In this package, the default is indeed to just "wrap" all the directive text in eval-rst.
But obviously, just doing this would not make for a particularly great converter, and so with https://github.com/executablebooks/rst-to-myst/blob/main/rst_to_myst/data/directives.yml, I have essentially manually decided how each of the core directives should be parsed, and which ones require nested parsing.

It is also of note that in e.g. jupyter-book/mystmd#31, on the javascript end, we do not have the existing docutils directives, and so one would look to do everything within markdown-it

TLDR: mdformat-myst may not fully solve the use case here, and we may need to look to "extend" it in some fashion to achieve this

so I think we may want to work on making something like build_mdit public

sounds good to me 👍

@codecov
Copy link

codecov bot commented Jun 19, 2021

Codecov Report

Merging #18 (7523168) into main (1c0dee8) will increase coverage by 2.81%.
The diff coverage is 90.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #18      +/-   ##
==========================================
+ Coverage   81.05%   83.86%   +2.81%     
==========================================
  Files           8       10       +2     
  Lines        1341     1587     +246     
==========================================
+ Hits         1087     1331     +244     
- Misses        254      256       +2     
Flag Coverage Δ
pytests 83.86% <90.45%> (+2.81%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
rst_to_myst/inliner.py 69.85% <ø> (+1.47%) ⬆️
rst_to_myst/cli.py 83.85% <80.86%> (-1.26%) ⬇️
rst_to_myst/markdownit.py 91.46% <91.46%> (ø)
rst_to_myst/states.py 87.93% <91.89%> (+2.72%) ⬆️
rst_to_myst/parser.py 94.44% <94.11%> (-0.91%) ⬇️
rst_to_myst/mdformat_render.py 95.32% <95.32%> (ø)
rst_to_myst/__init__.py 100.00% <100.00%> (ø)
rst_to_myst/nodes.py 100.00% <100.00%> (ø)
rst_to_myst/utils.py 100.00% <100.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1c0dee8...7523168. Read the comment docs.

@chrisjsewell chrisjsewell linked an issue Jun 25, 2021 that may be closed by this pull request
@chrisjsewell chrisjsewell marked this pull request as ready for review June 25, 2021 04:26
@chrisjsewell chrisjsewell merged commit d40e96f into main Jun 25, 2021
@chrisjsewell chrisjsewell deleted the updates branch June 25, 2021 04:29
This was referenced Jun 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants