diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index ab56651..217c7fc 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -25,7 +25,7 @@ jobs: strategy: fail-fast: false matrix: - python-version: [3.6, 3.7, 3.8, 3.9] + python-version: [3.7, 3.8, 3.9] os: [ubuntu-latest, windows-latest] steps: @@ -48,6 +48,7 @@ jobs: pytest --cov=rst_to_myst --cov-report=xml --cov-report=term-missing - name: Upload to Codecov + if: matrix.os == 'ubuntu-latest' uses: codecov/codecov-action@v1 with: name: pytests diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 5527938..09cbae5 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,6 +1,6 @@ repos: - repo: https://github.com/pre-commit/pre-commit-hooks - rev: v3.4.0 + rev: v4.0.1 hooks: - id: end-of-file-fixer - id: mixed-line-ending @@ -12,19 +12,19 @@ repos: - id: check-yaml - id: check-toml - repo: https://github.com/pre-commit/pygrep-hooks - rev: v1.7.0 + rev: v1.9.0 hooks: - id: python-check-blanket-noqa - repo: https://github.com/timothycrosley/isort - rev: 5.6.4 + rev: 5.8.0 hooks: - id: isort - repo: https://github.com/psf/black - rev: 20.8b1 + rev: 21.6b0 hooks: - id: black - repo: https://gitlab.com/pycqa/flake8 - rev: 3.8.4 + rev: 3.9.2 hooks: - id: flake8 additional_dependencies: diff --git a/.readthedocs.yml b/.readthedocs.yml new file mode 100644 index 0000000..1863741 --- /dev/null +++ b/.readthedocs.yml @@ -0,0 +1,13 @@ +version: 2 + +python: + version: 3 + install: + - method: pip + path: . 
+ extra_requirements: + - docs + +sphinx: + builder: html + fail_on_warning: true diff --git a/README.md b/README.md index c6c4af2..255651a 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# rst-to-myst [UNDER-DEVELOPMENT] +# rst-to-myst [![Build Status][ci-badge]][ci-link] [![codecov.io][cov-badge]][cov-link] @@ -21,179 +21,34 @@ or with sphinx: pip install rst-to-myst[sphinx] ``` -## Basic Usage - -### Command-Line Interface (CLI) - -For all commands see: - -```bash -rst2myst --help -``` - -Parse *via* stdin: - -```console -$ echo ":role:`content`" | rst2myst parse -{role}`content` -``` - -Parse *via* file: - -```console -$ rst2myst parse -f path/to/file.rst -... -``` - -Warnings are written to `stderr` and converted text to `stdout`. - -List available directives/roles: - -```console -$ rst2myst directives list -acks admonition ... - -$ rst2myst roles list -abbr abbreviation ... -``` - -Show details of a specific directive/role: - -```console -$ rst2myst directives show admonition -class: docutils.parsers.rst.directives.admonitions.Admonition -description: '' -has_content: true -name: admonition -optional_arguments: 0 -options: - class: class_option - name: unchanged -required_arguments: 1 - -$ rst2myst roles show abbreviation -description: |- - Generic interpreted text role, where the interpreted text is simply - wrapped with the provided node class. -module: docutils.parsers.rst.roles -name: abbreviation -``` - -### Python Interface (API) - -```python -from rst_to_myst import convert - -text, stderr_stream = convert(""" -Some RST -======== - -To **convert** -""") -``` - -## Advanced Usage - -You can select a language to translate directive/role names: +To then run a basic conversion of a whole project: ```console -$ rst2myst parse -l fr -f path/to/file.rst -... 
+$ rst2myst convert docs/**/*.rst ``` -You can select whether sphinx directives/roles are loaded: +For greater control, you can pass configuration with CLI options, or via a YAML configuration file: ```console -$ rst2myst parse --no-sphinx -f path/to/file.rst -... +$ rst2myst convert --config config.yaml docs/**/*.rst ``` -You can load directives/roles from extensions: - -```console -$ rst2myst parse -e sphinx.ext.autodoc -e sphinx_panels -f path/to/file.rst -... +`config.yaml`: + +```yaml +language: en +sphinx: true +extensions: +- sphinx_panels +default_domain: py +consecutive_numbering: true +colon_fences: true +dollar_math: true +conversions: + sphinx_panels.dropdpwn.DropdownDirective: parse_all ``` -Directives are converted according to [rst_to_myst/data/directives.yml](rst_to_myst/data/directives.yml), which can also be updated with an external YAML file, using the `-c/--conversions` option. -This is a mapping of directive import paths to a conversion type: - -- "eval_rst" (the default): no conversion, wrap in MyST eval_rst directive - ```` - ```{eval_rst} - .. name:: argument `link`_ - :option: value - - content `link`_ - ``` - ```` -- "direct": convert directly to MyST directive, keeping original argument/content - ```` - ```{name} argument `link`_ - :option: value - - content `link`_ - ``` - ```` -- "argument_only": convert to MyST directive and convert the argument to Markdown - ```` - ```{name} argument [link](link) - :option: value - - content `link`_ - ``` - ```` -- "content_only": convert to MyST directive and convert the content to Markdown - ```` - ```{name} argument `link`_ - :option: value - - content [link](link) - ``` - ```` -- "argument_content": convert to MyST directive and convert the content to Markdown - ```` - ```{name} argument [link](link) - :option: value - - content [link](link) - ``` - ```` - -If a conversion type is prepended by "_colon", use `:::` delimiters instad of ```` ``` ````, -e.g. 
"argument_content_colon" - -```` -:::{name} argument [link](link) -:option: value - -content [link](link) -::: -```` - -## Conversion Notes - -The conversion is designed to be fault tolerant, -i.e. it will not check if referenced targets, roles, directives, etc exist nor fail if they do not. - -The only syntax where some checks are required is matching anonymous references and auto-number/symbol footnotes with their definitions; these definitions must be available. - -- enumerated lists with roman numerals or alphabetic prefixes will be converted to numbers -- only one kind of footnote (i.e. no symbol prefixes) -- citation are turned into footnotes, with label prepended by `cite_prefix` -- inline targets are not convertible (and so ignored) -- If tables are not compatible with Markdown (single header row, no merged cells, etc), then they will be wrapped in an `eval_rst` -- Markdown blockquotes do not have an attribution syntax, so it is converted instead to `
<p class="attribution">—text</p>
` (the standard HTML render) - -## TODO - -The conversion covers almost all syntaxes (see—text
` (the standard HTML render) + +## Converting text snippets + +Either use the `stream` CLI command, parsing in `stdin`: + +```console +$ echo ":role:`content`" | rst2myst stream - +{role}`content` +``` + +or use the API: + +```python +from rst_to_myst import rst_to_myst +output = rst_to_myst(":role:`content`") +print(output.text) +``` + +## Converting multiple files + +Use the `convert` CLI command, with standard file globbing. +The `--dry-run` option will run without actually writing any files: + +```console +$ rst2myst convert --dry-run docs/**/*.rst +docs/source/api.rst -> docs/source/api.md +CONVERTED (extensions: []) +docs/source/cli.rst -> docs/source/cli.md +CONVERTED (extensions: ['deflist']) + +FINISHED ALL! (extensions: ['deflist']) +``` + +Extensions specify which MyST optional extensions are required to reparse the Markdown text. + +## Configuring the conversion + +The [CLI](./cli.rst) and [API](./api.rst) documentation list all the available configurations. + +For the CLI, you can directly use the option flags, or you can provide all the options in a YAML configuration file, with the `--config` option: + +```console +$ rst2myst convert --config config.yaml docs/**/*.rst +``` + +YAML config options mirror the CLI options, except using `_` instead of `-`, e.g. + +```yaml +language: en +sphinx: true +extensions: + - sphinx_panels +default_domain: py +consecutive_numbering: true +colon_fences: true +dollar_math: true +conversions: + sphinx_panels.dropdpwn.DropdownDirective: parse_all +``` + +### Directive conversion + +Directives are converted according to a mapping of the directive module path to a conversion type: + +- "eval_rst" (the default): no conversion, wrap in MyST `eval_rst` directive + + ```` + ```{eval_rst} + .. 
name:: argument `link`_ + :option: value + + content `link`_ + ``` + ```` + +- "direct": convert directly to MyST directive, keeping original argument/content + + ```` + ```{name} argument `link`_ + :option: value + + content `link`_ + ``` + ```` + +- "parse_argument": convert to MyST directive and convert the argument to Markdown + + ```` + ```{name} argument [link](link) + :option: value + + content `link`_ + ``` + ```` + +- "parse_content": convert to MyST directive and convert the content to Markdown + + ```` + ```{name} argument `link`_ + :option: value + + content [link](link) + ``` + ```` + +- "parse_all": convert to MyST directive and convert the content to Markdown + + ```` + ```{name} argument [link](link) + :option: value + + content [link](link) + ``` + ```` + +The default conversions are listed below, or you can use the `conversions` options to update these conversions. +Also use the `colon_fence` option to control whether directives with Markdown content are delimited by `:::`. + +````{dropdown} **Directive conversion defaults** + +```{literalinclude} ../../rst_to_myst/data/directives.yml +:language: yaml +``` + +```` + +## Additional Functionality + +### Listing available directives/roles + +List available directives/roles: + +```console +$ rst2myst directives list +acks admonition ... + +$ rst2myst roles list +abbr abbreviation ... +``` + +Show details of a specific directive/role: + +```console +$ rst2myst directives show admonition +class: docutils.parsers.rst.directives.admonitions.Admonition +description: '' +has_content: true +name: admonition +optional_arguments: 0 +options: + class: class_option + name: unchanged +required_arguments: 1 + +$ rst2myst roles show abbreviation +description: |- + Generic interpreted text role, where the interpreted text is simply + wrapped with the provided node class. 
+module: docutils.parsers.rst.roles +name: abbreviation +``` diff --git a/pyproject.toml b/pyproject.toml index 2bb0a3f..3753d48 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -17,18 +17,34 @@ classifiers = [ description-file = "README.md" keywords = "restructuredtext,markdown,myst" -requires-python=">=3.6" -requires=["docutils==0.15", "importlib_resources~=3.1", "pyyaml", "click~=7.1"] +requires-python=">=3.7" +requires=[ + "docutils>=0.15,<0.18", + "importlib_resources~=3.1;python_version<'3.9'", + "pyyaml", + "markdown-it-py~=1.0", + "mdformat~=0.7.6", + "mdformat-myst~=0.1.4", + "mdformat-deflist~=0.1.0", + "click~=7.1" +] [tool.flit.entrypoints."console_scripts"] rst2myst = "rst_to_myst.cli:main" [tool.flit.metadata.requires-extra] -sphinx = ["sphinx~=3.2"] +sphinx = ["sphinx>=3.2,<5"] test = [ "pytest~=6.0", "coverage", "pytest-cov", + "pytest-regressions" +] +docs = [ + "myst-parser~=0.15.0", + "sphinx-book-theme", + "sphinx-click~=2.6", + "sphinx-panels", ] [tool.flit.sdist] diff --git a/rst_to_myst/__init__.py b/rst_to_myst/__init__.py index 0fca28e..f81ffa4 100644 --- a/rst_to_myst/__init__.py +++ b/rst_to_myst/__init__.py @@ -1,6 +1,6 @@ """Convert RST to MyST-Markdown.""" +from .mdformat_render import rst_to_myst # noqa: F401 from .namespace import compile_namespace # noqa: F401 -from .parser import to_ast # noqa: F401 -from .renderer import convert, render # noqa: F401 +from .parser import to_docutils_ast # noqa: F401 -__version__ = "0.1.2" +__version__ = "0.2.0" diff --git a/rst_to_myst/cli.py b/rst_to_myst/cli.py index 22ed405..52f9d0b 100644 --- a/rst_to_myst/cli.py +++ b/rst_to_myst/cli.py @@ -1,17 +1,45 @@ from io import TextIOWrapper from pathlib import Path +from typing import List, Mapping, Optional import click import yaml -from . import compile_namespace, convert, to_ast +from . 
import compile_namespace, rst_to_myst, to_docutils_ast from .utils import yaml_dump @click.group(context_settings={"help_option_names": ["-h", "--help"]}) @click.version_option() def main(): - """CLI for rst-to-myst""" + """CLI for converting ReStructuredText to MyST Markdown.""" + + +def read_config(ctx, param, value): + if not value: + return + try: + with open(value, encoding="utf8") as handle: + data = yaml.safe_load(handle) + except Exception as exc: + raise click.BadOptionUsage( + "--config", f"Error reading configuration file: {exc}", ctx + ) + + ctx.default_map = ctx.default_map or {} + ctx.default_map.update(data or {}) + + return value + + +OPT_CONFIG = click.option( + "--config", + help="YAML file to read default configuration from", + is_eager=True, + expose_value=False, + type=click.Path(exists=True, file_okay=True, dir_okay=False, readable=True), + callback=read_config, +) OPT_LANGUAGE = click.option( @@ -23,24 +51,42 @@ def main(): help="Language code for directive names", ) -# TODO don't hang when when no stdin provided (test file.isatty()?) 
-OPT_READ = click.option( - "--file", - "-f", - type=click.File("r"), - default="-", - help="Input file [default: stdin]", + +ARG_STREAM = click.argument("stream", type=click.File("r"), metavar="PATH_OR_STDIN") + + +ARG_PATHS = click.argument( + "paths", + type=click.Path(exists=True, file_okay=True, dir_okay=True), + nargs=-1, +) + +OPT_ENCODING = click.option( + "--encoding", default="utf8", show_default=True, help="Encoding for read/write" ) def read_conversions(ctx, param, value): if not value: return {} - path = Path(value) - if not path.exists(): - raise click.BadParameter(f"Path does not exist: {value}") - with path.open("r") as handle: - data = yaml.safe_load(handle) + if isinstance(value, Mapping): + # read from config file + data = value + else: + path = Path(str(value)) + if not path.exists(): + raise click.BadOptionUsage( + "--conversions", f"Path does not exist: {value}", ctx + ) + try: + with path.open("r") as handle: + data = yaml.safe_load(handle) + except Exception as exc: + raise click.BadOptionUsage( + "--conversions", f"Error reading conversions file: {exc}", ctx + ) + if not isinstance(value, Mapping): + raise click.BadOptionUsage("--conversions", f"Not a mapping: {value!r}", ctx) return data @@ -50,7 +96,7 @@ def read_conversions(ctx, param, value): default=None, callback=read_conversions, metavar="PATH", - help="YAML file containing directive conversions", + help="YAML file mapping directives -> conversions", ) @@ -68,27 +114,82 @@ def check_sphinx(ctx, param, value): OPT_SPHINX = click.option( "--sphinx/--no-sphinx", - "-s/-ns", is_flag=True, default=True, + show_default=True, callback=check_sphinx, help="Load sphinx.", ) + + +def split_extension(ctx, param, value): + if isinstance(value, list): + # if reading from config + return value + return [ext.strip() for ext in value.split(",")] if value else [] + + OPT_EXTENSIONS = click.option( - "--extensions", "-e", multiple=True, help="Load sphinx extensions." 
+ "--extensions", + "-e", + callback=split_extension, + help="A comma-separated list of sphinx extensions to load.", +) + +OPT_DEFAULT_DOMAIN = click.option( + "--default-domain", + "-dd", + default="py", + show_default=True, + help="Default sphinx domain", +) +OPT_DEFAULT_ROLE = click.option( + "--default-role", + "-dr", + default=None, + help="Default sphinx role [default: convert to literal]", +) +OPT_CITE_PREFIX = click.option( + "--cite-prefix", + "-cp", + default="cite", + show_default=True, + help="Prefix to add to citation references", +) +OPT_RAISE_ON_WARNING = click.option( + "--raise-on-warning", "-W", is_flag=True, help="Raise exception on parsing warning" +) +OPT_CONSECUTIVE_NUMBERING = click.option( + "--consecutive-numbering/--no-consecutive-numbering", + default=True, + show_default=True, + help="Apply consecutive numbering to ordered lists", +) +OPT_COLON_FENCES = click.option( + "--colon-fences/--no-colon-fences", + default=True, + show_default=True, + help="Use colon fences for directives with parsed content", +) +OPT_DOLLAR_MATH = click.option( + "--dollar-math/--no-dollar-math", + default=True, + show_default=True, + help="Convert math roles to dollar delimited math", ) @main.command("ast") -@OPT_READ +@ARG_STREAM @OPT_LANGUAGE @OPT_SPHINX @OPT_EXTENSIONS @OPT_CONVERSIONS -def ast(file: TextIOWrapper, language: str, sphinx: bool, extensions, conversions): - """Convert ReStructuredText to an Abstract Syntax Tree.""" - text = file.read() - document, _ = to_ast( +@OPT_CONFIG +def ast(stream: TextIOWrapper, language: str, sphinx: bool, extensions, conversions): + """Parse file / stdin (-) and print RST Abstract Syntax Tree.""" + text = stream.read() + document, _ = to_docutils_ast( text, warning_stream=click.get_text_stream("stderr"), language_code=language, @@ -100,24 +201,167 @@ def ast(file: TextIOWrapper, language: str, sphinx: bool, extensions, conversion click.echo(output) -@main.command("parse") -@OPT_READ +@main.command("tokens") 
+@ARG_STREAM @OPT_LANGUAGE @OPT_SPHINX @OPT_EXTENSIONS +@OPT_DEFAULT_DOMAIN +@OPT_DEFAULT_ROLE +@OPT_CITE_PREFIX +@OPT_COLON_FENCES +@OPT_DOLLAR_MATH @OPT_CONVERSIONS -def parse(file: TextIOWrapper, language: str, sphinx: bool, extensions, conversions): - """Convert ReStructuredText to MyST Markdown.""" - text = file.read() - output, _ = convert( +@OPT_CONFIG +def tokens( + stream: TextIOWrapper, + language: str, + sphinx: bool, + extensions: List[str], + default_domain: str, + default_role: Optional[str], + cite_prefix: str, + colon_fences: bool, + dollar_math: bool, + conversions, +): + """Parse file / stdin (-) and print Markdown-It tokens.""" + text = stream.read() + output = rst_to_myst( text, - click.get_text_stream("stderr"), + warning_stream=click.get_text_stream("stderr"), language_code=language, use_sphinx=sphinx, extensions=extensions, conversions=conversions, + default_domain=default_domain, + default_role=default_role, + cite_prefix=cite_prefix + "_", + colon_fences=colon_fences, + dollar_math=dollar_math, ) - click.echo(output) + click.echo(yaml_dump([token.as_dict() for token in output.tokens])) + + +@main.command("stream") +@ARG_STREAM +@OPT_LANGUAGE +@OPT_SPHINX +@OPT_EXTENSIONS +@OPT_DEFAULT_DOMAIN +@OPT_DEFAULT_ROLE +@OPT_CITE_PREFIX +@OPT_CONSECUTIVE_NUMBERING +@OPT_COLON_FENCES +@OPT_DOLLAR_MATH +@OPT_CONVERSIONS +@OPT_CONFIG +def stream( + stream: TextIOWrapper, + language: str, + sphinx: bool, + extensions: List[str], + default_domain: str, + default_role: Optional[str], + cite_prefix: str, + consecutive_numbering: bool, + colon_fences: bool, + dollar_math: bool, + conversions, +): + """Parse file / stdin (-) and print Markdown text.""" + text = stream.read() + output = rst_to_myst( + text, + warning_stream=click.get_text_stream("stderr"), + language_code=language, + use_sphinx=sphinx, + extensions=extensions, + conversions=conversions, + default_domain=default_domain, + default_role=default_role, + cite_prefix=cite_prefix + "_", + 
consecutive_numbering=consecutive_numbering, + colon_fences=colon_fences, + dollar_math=dollar_math, + ) + click.echo(output.text) + + +@main.command("convert") +@ARG_PATHS +@click.option("--dry-run", "-d", is_flag=True, help="Do not write/remove any files") +@click.option("--replace-files", "-R", is_flag=True, help="Remove parsed files") +@click.option("--stop-on-fail", "-S", is_flag=True, help="Stop on first failure") +@OPT_RAISE_ON_WARNING +@OPT_LANGUAGE +@OPT_SPHINX +@OPT_EXTENSIONS +@OPT_DEFAULT_DOMAIN +@OPT_DEFAULT_ROLE +@OPT_CITE_PREFIX +@OPT_CONSECUTIVE_NUMBERING +@OPT_COLON_FENCES +@OPT_DOLLAR_MATH +@OPT_CONVERSIONS +@OPT_ENCODING +@OPT_CONFIG +def convert( + paths: List[str], + dry_run: bool, + replace_files: bool, + raise_on_warning: bool, + stop_on_fail: bool, + language: str, + sphinx: bool, + extensions: List[str], + default_domain: str, + default_role: Optional[str], + cite_prefix: str, + consecutive_numbering: bool, + colon_fences: bool, + dollar_math: bool, + conversions, + encoding: str, +): + """Convert one or more files.""" + myst_extensions = set() + for path in paths: + path = Path(path) + output_path = path.parent / (path.stem + ".md") + click.secho(f"{path} -> {output_path}", fg="blue") + input_text = path.read_text(encoding) + try: + output = rst_to_myst( + input_text, + warning_stream=click.get_text_stream("stderr"), + raise_on_warning=raise_on_warning, + language_code=language, + use_sphinx=sphinx, + extensions=extensions, + conversions=conversions, + default_domain=default_domain, + default_role=default_role, + cite_prefix=cite_prefix + "_", + consecutive_numbering=consecutive_numbering, + colon_fences=colon_fences, + dollar_math=dollar_math, + ) + except Exception as exc: + click.secho(f"FAILED:\n{exc}", fg="red") + if stop_on_fail: + raise SystemExit(1) + continue + + click.secho(f"CONVERTED (extensions: {list(output.extensions)!r})", fg="green") + myst_extensions.update(output.extensions) + if dry_run: + continue + 
output_path.write_text(output.text, encoding=encoding) + if replace_files and output_path != path: + path.unlink() + click.echo("") + click.secho(f"FINISHED ALL! (extensions: {list(myst_extensions)!r})", fg="green") @main.group("directives") @@ -140,7 +384,7 @@ def directives_list(sphinx, extensions): @OPT_EXTENSIONS @OPT_LANGUAGE def directives_show(name, sphinx, extensions, language): - """List available directives.""" + """Show information about a single role.""" namespace = compile_namespace( extensions=extensions, use_sphinx=sphinx, language_code=language ) @@ -160,7 +404,7 @@ def roles(): @OPT_SPHINX @OPT_EXTENSIONS def roles_list(sphinx, extensions): - """List available directives.""" + """List available roles.""" namespace = compile_namespace(extensions=extensions, use_sphinx=sphinx) click.echo(" ".join(namespace.list_roles())) @@ -171,7 +415,7 @@ def roles_list(sphinx, extensions): @OPT_EXTENSIONS @OPT_LANGUAGE def roles_show(name, sphinx, extensions, language): - """List available directives.""" + """Show information about a single role.""" namespace = compile_namespace( extensions=extensions, use_sphinx=sphinx, language_code=language ) diff --git a/rst_to_myst/data/directives.yml b/rst_to_myst/data/directives.yml index 61aa00f..a8464e2 100644 --- a/rst_to_myst/data/directives.yml +++ b/rst_to_myst/data/directives.yml @@ -1,53 +1,51 @@ # value one of: # - "eval_rst": no conversion, wrap in MyST eval_rst directive # - "direct": convert directly to MyST directive, keeping original argument/content -# - "argument_only": convert to MyST directive and convert the argument to Markdown -# - "content_only": convert to MyST directive and convert the content to Markdown -# - "argument_content": convert to MyST directive and convert the content to Markdown - -# if prepended by "_colon", use ::: delimiters instad of ``` +# - "parse_argument": convert to MyST directive and convert the argument to Markdown +# - "parse_content": convert to MyST directive and convert the 
content to Markdown +# - "parse_all": convert to MyST directive and convert the content to Markdown # admonitions (docutils) -docutils.parsers.rst.directives.admonitions.Admonition: argument_content_colon -docutils.parsers.rst.directives.admonitions.Attention: content_only_colon -docutils.parsers.rst.directives.admonitions.Caution: content_only_colon -docutils.parsers.rst.directives.admonitions.Danger: content_only_colon -docutils.parsers.rst.directives.admonitions.Error: content_only_colon -docutils.parsers.rst.directives.admonitions.Hint: content_only_colon -docutils.parsers.rst.directives.admonitions.Important: content_only_colon -docutils.parsers.rst.directives.admonitions.Note: content_only_colon -docutils.parsers.rst.directives.admonitions.Tip: content_only_colon -docutils.parsers.rst.directives.admonitions.Warning: content_only_colon +docutils.parsers.rst.directives.admonitions.Admonition: parse_all +docutils.parsers.rst.directives.admonitions.Attention: parse_content +docutils.parsers.rst.directives.admonitions.Caution: parse_content +docutils.parsers.rst.directives.admonitions.Danger: parse_content +docutils.parsers.rst.directives.admonitions.Error: parse_content +docutils.parsers.rst.directives.admonitions.Hint: parse_content +docutils.parsers.rst.directives.admonitions.Important: parse_content +docutils.parsers.rst.directives.admonitions.Note: parse_content +docutils.parsers.rst.directives.admonitions.Tip: parse_content +docutils.parsers.rst.directives.admonitions.Warning: parse_content # docutils other (see https://docutils.sourceforge.io/docs/ref/rst/directives.html#figure) docutils.parsers.rst.directives.body.CodeBlock: direct docutils.parsers.rst.directives.body.Compound: eval_rst -docutils.parsers.rst.directives.body.Container: content_only_colon +docutils.parsers.rst.directives.body.Container: parse_content docutils.parsers.rst.directives.body.Epigraph: eval_rst docutils.parsers.rst.directives.body.Highlights: eval_rst 
docutils.parsers.rst.directives.body.LineBlock: eval_rst docutils.parsers.rst.directives.body.MathBlock: eval_rst docutils.parsers.rst.directives.body.ParsedLiteral: eval_rst docutils.parsers.rst.directives.body.PullQuote: eval_rst -docutils.parsers.rst.directives.body.Rubric: argument_only_colon -docutils.parsers.rst.directives.body.Sidebar: argument_content_colon -docutils.parsers.rst.directives.body.Topic: argument_content_colon +docutils.parsers.rst.directives.body.Rubric: parse_argument +docutils.parsers.rst.directives.body.Sidebar: parse_all +docutils.parsers.rst.directives.body.Topic: parse_all docutils.parsers.rst.directives.html.Meta: eval_rst -docutils.parsers.rst.directives.images.Figure: content_only_colon +docutils.parsers.rst.directives.images.Figure: parse_content docutils.parsers.rst.directives.images.Image: direct docutils.parsers.rst.directives.misc.Class: eval_rst -docutils.parsers.rst.directives.misc.Date: eval_rst +docutils.parsers.rst.directives.misc.Date: direct docutils.parsers.rst.directives.misc.DefaultRole: eval_rst docutils.parsers.rst.directives.misc.Include: direct docutils.parsers.rst.directives.misc.Raw: direct -docutils.parsers.rst.directives.misc.Replace: content_only +docutils.parsers.rst.directives.misc.Replace: parse_content docutils.parsers.rst.directives.misc.Role: eval_rst docutils.parsers.rst.directives.misc.TestDirective: eval_rst docutils.parsers.rst.directives.misc.Title: direct docutils.parsers.rst.directives.misc.Unicode: eval_rst -docutils.parsers.rst.directives.parts.Contents: argument_only_colon -docutils.parsers.rst.directives.parts.Footer: content_only_colon -docutils.parsers.rst.directives.parts.Header: content_only_colon +docutils.parsers.rst.directives.parts.Contents: parse_argument +docutils.parsers.rst.directives.parts.Footer: parse_content +docutils.parsers.rst.directives.parts.Header: parse_content docutils.parsers.rst.directives.parts.Sectnum: eval_rst docutils.parsers.rst.directives.references.TargetNotes: 
eval_rst docutils.parsers.rst.directives.tables.CSVTable: direct @@ -67,14 +65,14 @@ sphinx.directives.patches.CSVTable: eval_rst # list-table sphinx.directives.patches.ListTable: eval_rst # figure -sphinx.directives.patches.Figure: content_only_colon +sphinx.directives.patches.Figure: parse_content # meta sphinx.directives.patches.Meta: eval_rst # deprecated, versionadded, versionchanged -sphinx.domains.changeset.VersionChange: content_only_colon +sphinx.domains.changeset.VersionChange: parse_content # seealso -sphinx.directives.other.SeeAlso: content_only_colon +sphinx.directives.other.SeeAlso: parse_content # index sphinx.domains.index.IndexDirective: direct # default-domain @@ -105,7 +103,7 @@ sphinx.directives.other.Acks: eval_rst # hlist sphinx.directives.other.HList: eval_rst # only -sphinx.directives.other.Only: content_only_titles +sphinx.directives.other.Only: parse_content_titles # c:member # c:var @@ -207,6 +205,10 @@ sphinx.domains.std.Cmdoption: eval_rst # std:envvar sphinx.domains.std.EnvVar: eval_rst # std:glossary -sphinx.domains.std.Glossary: eval_rst +sphinx.domains.std.Glossary: parse_content # std:productionlist sphinx.domains.std.ProductionList: eval_rst + +# third-party directives +sphinxcontrib.bibtex.directives.BibliographyDirective: direct +sphinx_panels.dropdpwn.DropdownDirective: parse_all diff --git a/rst_to_myst/inliner.py b/rst_to_myst/inliner.py index fbd884b..843b47f 100644 --- a/rst_to_myst/inliner.py +++ b/rst_to_myst/inliner.py @@ -372,8 +372,6 @@ def parse( 4. If not found or invalid, generate a warning and ignore the start-string. 5. Implicit inline markup (e.g. standalone URIs) is found last. """ - # TODO Needs to be refactored for nested inline markup - # (add nested_parse() method?) 
self.reporter = memo.reporter # type: Reporter self.document = memo.document # type: nodes.document self.language = memo.language @@ -506,7 +504,7 @@ def interpreted_or_phrase_ref(self, match: Match, lineno: int) -> DispatchResult def phrase_ref( self, before: str, after: str, rawsource: str, escaped: str, text: str ) -> DispatchResult: - """Handle phrase references e.g. `phrase ref`_, `embedded-{node.astext()}
' + raise nodes.SkipNode + + def visit_reference(self, node): + # we assume all reference names are plain text + text = node.astext() + + if "standalone_uri" in node: + # autolink + token = self.add_token("link_open", "a", 1, markup="autolink", info="auto") + token.attrs["href"] = node["refuri"] + self.add_token("text", "", 0, content=node["refuri"]) + self.add_token("link_close", "a", -1, markup="autolink", info="auto") + elif "refname" in node: + # reference a link definition `[refname]: url`, or a target `(refname)=` + # TODO ensure mdformat does not wrap in <> + token = self.add_token( + "link_open", + "a", + 1, + attrs={"href": node["refname"]}, + # TODO should only add label if target found? + meta={"label": node["refname"]}, + ) + self.add_token("text", "", 0, content=text) + self.add_token("link_close", "a", -1) + elif "refuri" in node: + # external link + # TODO ensure prefixed with http://? + token = self.add_token("link_open", "a", 1, attrs={"href": node["refuri"]}) + self.add_token("text", "", 0, content=text) + self.add_token("link_close", "a", -1) + elif "refid" in node: + # anonymous links, pointing to internal targets + # TODO ensure mdformat does not wrap in <> + token = self.add_token( + "link_open", + "a", + 1, + attrs={"href": node["refid"]}, + ) + self.add_token("text", "", 0, content=text) + self.add_token("link_close", "a", -1) + else: + message = f"unknown reference type: {node.rawsource}" + self.warning(message, node.line) + if self.raise_on_warning: + raise NotImplementedError(message) + + raise nodes.SkipNode + + def visit_target(self, node): + if "inline" in node and node["inline"]: + # TODO inline targets + message = f"inline targets not implemented: {node.rawsource}" + self.warning(message, node.line) + if self.raise_on_warning: + raise NotImplementedError(message) + self.add_token( + "code_inline", "code", 0, markup="`", content=str(node.rawsource) + ) + raise nodes.SkipNode + + if "refuri" in node: + for name in node["names"]: + # 
TODO warn about name starting ^ (clashes with footnotes) + if name not in self._env["references"]: + self._env["references"][name] = { + "title": "", + "href": node["refuri"], + "map": [node.line, node.line], + } + else: + self._env["duplicate_refs"].append( + { + "label": name, + "title": "", + "href": node["refuri"], + "map": [node.line, node.line], + } + ) + elif "names" in node: + for name in node["names"]: + self.add_token( + "myst_target", "", 0, attrs={"class": "myst-target"}, content=name + ) + if "refid" in node: + self.add_token( + "myst_target", + "", + 0, + attrs={"class": "myst-target"}, + content=node["refid"], + ) + + # TODO check for content? + raise nodes.SkipNode + + # Standard CommonMark extensions + + def parse_gfm_table(self, node) -> bool: + """Check whether an RST table can be converted to a GFM one. + + RST tables can have e.g. cells spanning multiple columns/rows, + which the GitHub Flavoured Markdown (GFM) table variant does not support. + """ + # must have one child tgroup + if len(node.children) != 1 or not isinstance(node.children[0], nodes.tgroup): + return False + # tgroup should contain the number of columns + tgroup = node.children[0] + if "cols" not in tgroup: + return False + ncolumns = tgroup["cols"] + # trgoup should contain children: (colspec)*, thead, tbody + if len(tgroup.children) < 2: + return False + if not isinstance(tgroup.children[-2], nodes.thead): + return False + if not isinstance(tgroup.children[-1], nodes.tbody): + return False + thead = tgroup.children[-2] + tbody = tgroup.children[-1] + # the header can only have one row with the full amount of columns + if len(thead.children) != 1 or len(thead.children[0]) != ncolumns: + return False + # each body row should have the full amount of columns + for row in tbody.children: + if len(row.children) != ncolumns: + return False + return True + + def visit_table(self, node): + + if not self.parse_gfm_table(node): + text = node.rawsource + if not text.endswith("\n"): + text 
+= "\n" + self.add_token( + "fence", "code", 0, content=text, markup="```", info="{eval_rst}" + ) + raise nodes.SkipNode + + self.add_token("table_open", "table", 1) + + def depart_table(self, node): + self.add_token("table_close", "table", -1) + + def visit_tgroup(self, node): + pass + + def depart_tgroup(self, node): + pass + + def visit_colspec(self, node): + raise nodes.SkipNode + + def visit_thead(self, node): + self.add_token("thead_open", "thead", 1) + + def depart_thead(self, node): + self.add_token("thead_close", "thead", -1) + + def visit_tbody(self, node): + self.add_token("tbody_open", "tbody", 1) + + def depart_tbody(self, node): + self.add_token("tbody_close", "tbody", -1) + + def visit_row(self, node): + self.add_token("tr_open", "tr", 1) + + def depart_row(self, node): + self.add_token("tr_close", "tr", -1) + + def visit_entry(self, node): + tag = "th" if self.parent_tokens.get("thead") else "td" + self.add_token(f"{tag}_open", tag, 1) + + def depart_entry(self, node): + tag = "th" if self.parent_tokens.get("thead") else "td" + # Markdown cells can not include newlines + # TODO improve or upstream this "fix" + # maybe replace with html_inline—{node.astext()}
']) - self.add_newline(2) - raise nodes.SkipNode - - def visit_table(self, node): - # convert tables to Markdown if possible, e.g. single header row, etc - cells = self.assess_table(node) - if cells: - self.add_lines( - [ - "| " + " | ".join(cells[0]) + " |", - "| " + " | ".join("-" * len(c) for c in cells[0]) + " |", - ] - + ["| " + " | ".join(row) + " |" for row in cells[1:]] - ) - - else: - self.add_lines(["```{eval_rst}"] + node.rawsource.splitlines() + ["```"]) - self.add_newline(2) - raise nodes.SkipNode - - def assess_table(self, node): - if len(node.children) != 1 or not isinstance(node.children[0], nodes.tgroup): - return None - tgroup = node.children[0] - if "cols" not in tgroup: - return None - ncolumns = tgroup["cols"] - if ( - not len(tgroup.children) > 1 - or not isinstance(tgroup.children[-1], nodes.tbody) - or not isinstance(tgroup.children[-2], nodes.thead) - ): - return None - thead = tgroup.children[-2] - tbody = tgroup.children[-1] - if len(thead.children) != 1 or len(thead.children[0]) != ncolumns: - return None - rows = [copy.copy(thead.children[0].children)] - for row in tbody.children: - if len(row.children) != ncolumns: - return None - rows.append(copy.copy(row.children)) - - # render cells - widths = [0 for _ in rows[0]] - for i, row in enumerate(rows): - for j, col in enumerate(row): - if not isinstance(col, nodes.entry): - return None - if len(col.children) != 1 or not isinstance( - col.children[0], nodes.paragraph - ): - return None - rows[i][j] = self.nested_render(col.children[0].children).strip() - widths[j] = max(widths[j], len(rows[i][j])) - - # align columns - for i, _ in enumerate(rows): - for j, _ in enumerate(row): - rows[i][j] = rows[i][j].ljust(widths[j]) - - return rows - - # TODO https://docutils.sourceforge.io/docs/user/rst/quickref.htm - # line block, field list, option list - - -def render( - document: nodes.document, warning_stream: Optional[IO] = None, **kwargs -) -> Tuple[str, IO]: - renderer = MystRenderer(document, 
warning_stream, **kwargs) - document.walkabout(renderer) - # TODO also return or print renderer.extensions_required - # TODO remove double black lines - # TODO remove spaces in blank lines - return renderer.rendered, renderer._warning_stream - - -def convert( - text: str, - warning_stream: Optional[IO] = None, - raise_on_error: bool = False, - cite_prefix: str = "cite_", - language_code="en", - use_sphinx=True, - extensions=(), - default_domain="py", - conversions=None, -) -> Tuple[str, IO]: - document, warning_stream = to_ast( - text, - warning_stream=warning_stream, - language_code=language_code, - use_sphinx=use_sphinx, - extensions=extensions, - default_domain=default_domain, - conversions=conversions, - ) - text, warning_stream = render( - document, warning_stream, cite_prefix=cite_prefix, raise_on_error=raise_on_error - ) - return text, warning_stream diff --git a/rst_to_myst/states.py b/rst_to_myst/states.py index 62319b6..9763053 100644 --- a/rst_to_myst/states.py +++ b/rst_to_myst/states.py @@ -1,8 +1,10 @@ +"""docutils states.""" import re +from typing import List, Optional from docutils import nodes from docutils.nodes import fully_normalize_name as normalize_name -from docutils.parsers.rst import states, tableparser +from docutils.parsers.rst import Directive, states, tableparser from docutils.utils import ( BadOptionDataError, BadOptionError, @@ -10,7 +12,7 @@ extract_options, ) -from .nodes import ArgumentNode, ContentNode, DirectiveNode +from .nodes import ArgumentNode, ContentNode, DirectiveNode, EvalRstNode # Alphanumerics with isolated internal [-._+:] chars (i.e. 
not 2 together): SIMPLENAME_RE = r"(?:(?!_)\w)+(?:[-._+:](?:(?!_)\w)+)*" @@ -109,22 +111,18 @@ def directive(self, match, **option_presets): blank_finish, ) = self.parse_directive_match(match) - directive_node = DirectiveNode( - block_text, - name=type_name, - delimiter="`", - ) - # try to get directive class # directive_class, messages = directives.directive( # type_name, self.memo.language, self.document # ) - directive_class = self.document.settings.namespace.get_directive(type_name) + directive_class: Optional[ + Directive + ] = self.document.settings.namespace.get_directive(type_name) # default to eval rst if directive_class is None: # TODO warning message? - return self.eval_rst(directive_node, indent, indented, blank_finish) + return self.eval_rst(type_name, block_text, indent, indented, blank_finish) # get directive path for lookup directive_path = f"{directive_class.__module__}.{directive_class.__name__}" @@ -132,26 +130,20 @@ def directive(self, match, **option_presets): # lookup directive path conversion = self.document.settings.directive_data.get(directive_path, None) - if (not conversion) or conversion == "eval_rst": - return self.eval_rst(directive_node, indent, indented, blank_finish) - if conversion not in [ "direct", - "argument_only", - "content_only", - "content_only_titles", - "argument_content", - "direct_colon", - "argument_only_colon", - "content_only_colon", - "argument_content_colon", + "parse_argument", + "parse_content", + "parse_content_titles", + "parse_all", ]: - # TODO warning - return self.eval_rst(directive_node, indent, indented, blank_finish) - - directive_node["type"] = conversion - if conversion.endswith("_colon"): - directive_node["delimiter"] = ":" + if conversion and conversion != "eval_rst": + self.reporter.warning( + f'Unknown conversion type "{conversion}"', + nodes.literal_block(block_text, block_text), + line=lineno, + ) + return self.eval_rst(type_name, block_text, indent, indented, blank_finish) try: ( @@ -166,41 
+158,52 @@ def directive(self, match, **option_presets): nodes.literal_block(block_text, block_text), line=lineno, ) - return self.eval_rst(directive_node, indent, indented, blank_finish) - - directive_node["arg_block"] = arg_block - directive_node["options_list"] = options_list + return self.eval_rst(type_name, block_text, indent, indented, blank_finish) - if content and conversion == "direct": - content_node = ContentNode() - content_node += nodes.paragraph("", nodes.Text("\n".join(content))) - directive_node += content_node + directive_node = DirectiveNode( + block_text, + name=type_name, + module=directive_path, + conversion=conversion, + options_list=options_list, + ) - if "argument" in conversion: + if directive_class.required_arguments or directive_class.optional_arguments: argument_node = ArgumentNode() directive_node += argument_node - textnodes, messages = self.inline_text("\n".join(arg_block), lineno) - # TODO report messages? - argument_node.extend(textnodes) + if conversion in ("parse_argument", "parse_all"): + textnodes, messages = self.inline_text(" ".join(arg_block), lineno) + # TODO report messages? 
+ argument_node.extend(textnodes) + else: + argument_node += nodes.Text(" ".join(arg_block)) - if content and "content" in conversion: + if directive_class.has_content: content_node = ContentNode() directive_node += content_node - self.nested_parse( - content, - content_offset, - content_node, - match_titles="titles" in conversion, - ) + if conversion in ("parse_content", "parse_content_titles", "parse_all"): + self.nested_parse( + content, + content_offset, + content_node, + match_titles="titles" in conversion, + ) + else: + content_node += nodes.Text("\n".join(content or [])) return [directive_node], blank_finish @staticmethod - def eval_rst(directive_node, indent, indented, blank_finish): - directive_node["type"] = "eval_rst" - directive_node["indent"] = indent - directive_node["indented"] = indented - return [directive_node], blank_finish + def eval_rst( + name: str, block_text: str, indent: int, indented: List[str], blank_finish: bool + ): + """Return an EvalRstNode.""" + node = EvalRstNode(block_text, name=name, indent=indent) + if not block_text.startswith(".. "): + # substitution definition directives + block_text = ".. " + block_text + node += nodes.Text(block_text) + return [node], blank_finish def parse_directive_match(self, match): lineno = self.state_machine.abs_line_number() @@ -354,7 +357,7 @@ def substitution_def(self, match): return [substitution_node], blank_finish def table(self, isolate_function, parser_class): - """Parse a table.""" + """Parse a table, and record the raw text.""" block, messages, blank_finish = isolate_function() if block: try: @@ -382,6 +385,34 @@ def __init__(self, state_machine, debug=False): "initial_state": "Body", } + def field_marker(self, match, context, next_state): + """Field list item. 
+ + Modified to store full text of field_list in ``rawsource`` + """ + field_list = nodes.field_list() + self.parent += field_list + field, blank_finish = self.field(match) + field_list += field + offset = self.state_machine.line_offset + 1 # next line + newline_offset, blank_finish = self.nested_list_parse( + self.state_machine.input_lines[offset:], + input_offset=self.state_machine.abs_line_offset() + 1, + node=field_list, + initial_state="FieldList", + blank_finish=blank_finish, + ) + self.goto_line(newline_offset) + # TODO this slicing of input_lines seems to work, but I'm not exactly sure why + field_list.rawsource += "\n".join( + self.state_machine.input_lines[ + offset - 1 : offset + (newline_offset - field.line) + ] + ) + if not blank_finish: + self.parent += self.unindent_warning("Field list") + return [], next_state, [] + class Explicit(ExplicitMixin, states.Explicit): def __init__(self, state_machine, debug=False): diff --git a/rst_to_myst/utils.py b/rst_to_myst/utils.py index 070fa26..eb2c0c3 100644 --- a/rst_to_myst/utils.py +++ b/rst_to_myst/utils.py @@ -39,5 +39,5 @@ class YamlDumper(yaml.SafeDumper): YamlDumper.add_representer(str, represent_str) -def yaml_dump(data): - return yaml.dump(data, Dumper=YamlDumper) +def yaml_dump(data, sort_keys: bool = True): + return yaml.dump(data, Dumper=YamlDumper, sort_keys=sort_keys) diff --git a/tests/fixtures/ast.txt b/tests/fixtures/ast.txt index 031dd6d..e0cfd0f 100644 --- a/tests/fixtures/ast.txt +++ b/tests/fixtures/ast.txt @@ -185,6 +185,10 @@ directive-eval-rst: content + .. sdf:: + + other + .. hij:: argument :opt: value1 :opt2: value2 @@ -197,12 +201,29 @@ directive-eval-rst: .-attribution
-para 3 -. +> > nested +> +> - with +> - bullet list +> +> 1. with +> 2. enumerated list -comments: -. -.. This is a comment. -. - +c . transition: . ---- . ---- - +______________________________________________________________________ . -directives: +headings: . -.. image:: images/ball1.gif +heading 1 +========= -.. figure:: images/ball1.gif - :option: value +heading 2-1 +----------- - Content +heading 3 +********* -.. note:: - :class: something - :name: else +heading 2-2 +----------- +. +# heading 1 + +## heading 2-1 - .. admonition:: Some :role:`a` +### heading 3 - Content :role:`a` +## heading 2-2 . -```{image} images/ball1.gif -``` +bullet-list: +. +- a +- b +- c -:::{figure} images/ball1.gif -:option: value +* d -Content + * e + * f -::: +* g +. +- a +- b +- c -::::{note} -:class: something -:name: else +* d -:::{admonition} Some {role}`a` + - e + - f -Content {role}`a` +* g +. -::: +enumerated list: +. +1. a +2. b +3. c -:::: +#. d +#. e +11. f +12. g . +1. a +2. b +3. c +4. d +5. e -lists: +11) f +12) g . -a -- b -- *c* +comment: +. +.. This is a comment. - * x +.. + This whole indented block + is a comment. -1. d -2. e - f + Still in the comment. - g +. +% This is a comment. - 5. x +% This whole indented block +% is a comment. +% +% Still in the comment. . -a -- b -- *c* +autolink: +. +http://a.net/ +. +—attribution
-> > -> - with -> - bullet list - -> 1. with -> 2. enumerated list +| Header row, column 1 (header rows optional) | Header 2 | Header 3 | Header 4 | +| ------------------------------------------- | -------- | -------- | -------- | +| body row 1, column 1 | column 2 | column 3 | column 4 | +| body row 2 | ... | ... | | para . +table, multi-head: +. +== == +A B +X Y +== == +C D +== == +. +```{eval_rst} +== == +A B +X Y +== == +C D +== == +``` +. + front-matter: . :Authors: @@ -323,134 +357,233 @@ Authors: |- Dedication: To my father. Version: 1.0 of 2001/08/08 orphan: true - --- +. +substitution-reference: +. +|sub| +. +{{ sub }} . -substitution-definitions: +substitution-definition: . +:orphan: + .. |name| replace:: replacement `a`_ .. |caution| image:: warning.png :alt: Warning! . --- +orphan: true substitutions: caution: |- ```{image} warning.png :alt: Warning! - ``` - name: replacement [a](a) - + name: replacement [a] --- - - . -tables-simple: +definition list . -===== ===== ======= -`a`_ B A and B -===== ===== ======= -False False False -True False False -False True False -True True `a`_ -===== ===== ======= +term (up to a line of text) + Definition of the term, which must be indented -para + and can even consist of multiple paragraphs + +next term + Description ``a`` `a`_. . -| [a](a) | B | A and B | -| ------ | ----- | ------- | -| False | False | False | -| True | False | False | -| False | True | False | -| True | True | [a](a) | +term (up to a line of text) -para +: Definition of the term, which must be indented + + and can even consist of multiple paragraphs + +next term + +: Description `a` [a]. . -tables-grid: + +directive-eval-rst . 
-+------------------------+------------+----------+----------+ -| Header row, column 1 | Header 2 | Header 3 | Header 4 | -| (header rows optional) | | | | -+========================+============+==========+==========+ -| body row 1, column 1 | column 2 | column 3 | column 4 | -+------------------------+------------+----------+----------+ -| body row 2 | ... | ... | | -+------------------------+------------+----------+----------+ +.. unknown:: argument + +.. xyz:: + :opt: value + +.. lmp:: + + content + + .. sdf:: + + other + +.. hij:: argument + :opt: value1 + :opt2: value2 + +.. hij:: argument + :opt: value1 + :opt2: value2 + + ````content```` -para . -```{eval_rst} -+------------------------+------------+----------+----------+ -| Header row, column 1 | Header 2 | Header 3 | Header 4 | -| (header rows optional) | | | | -+========================+============+==========+==========+ -| body row 1, column 1 | column 2 | column 3 | column 4 | -+------------------------+------------+----------+----------+ -| body row 2 | ... | ... | | -+------------------------+------------+----------+----------+ +```{eval-rst} +.. unknown:: argument ``` -para +```{eval-rst} +.. xyz:: + :opt: value +``` + +```{eval-rst} +.. lmp:: + + content + + .. sdf:: + + other +``` + +```{eval-rst} +.. hij:: argument + :opt: value1 + :opt2: value2 +``` + +`````{eval-rst} +.. hij:: argument + :opt: value1 + :opt2: value2 + + ````content```` +````` . -match_titles: +directive-admonition: . -.. computational-economics documentation master file +initial paragraph + +.. admonition:: Abc *d* `a`_ + :class: xyz + :name: df + + A *b* http://a.net/ + + next paragraph + +.. note:: -.. only:: html + .. tip:: - #### - Home - #### + Content -.. only:: latex + .. note:: - ########################## - Datascience for Economists - ########################## + Content 2 -.. toctree:: - :maxdepth: 2 - :titlesonly: + .. 
unknown:: arg_block - introduction/index - python_fundamentals/index - scientific/index - pandas/index - applications/index + content +final paragraph . - +initial paragraph + +:::{admonition} Abc *d* [a] +:class: xyz +:name: df + +A *b*-attribution
+> +> - with +> - bullet list +> +> 1. with +> 2. enumerated list + +para +. + +match_titles: +. +.. computational-economics documentation master file + +.. only:: html + + #### + Home + #### + +.. only:: latex + + ########################## + Datascience for Economists + ########################## + +.. toctree:: + :maxdepth: 2 + :titlesonly: + + introduction/index + python_fundamentals/index + scientific/index + pandas/index + applications/index + +. +% computational-economics documentation master file + +:::{only} html +# Home +::: + +:::{only} latex +# Datascience for Economists +::: + +```{toctree} +:maxdepth: 2 +:titlesonly: true + +introduction/index +python_fundamentals/index +scientific/index +pandas/index +applications/index +``` +. + +list-indented +. +This is a numbered list! + +#. Step 1 +#. Step 2 +#. Step 3 +#. Step 4 + +This is a numbered list with indentation! + + #. Step 1 + #. Step 2 + #. Step 3 + #. Step 4 + +This is a regular list with indentation! + + * Step 1 + * Step 2 + * Step 3 + * Step 4 +. +This is a numbered list! + +1. Step 1 +2. Step 2 +3. Step 3 +4. Step 4 + +This is a numbered list with indentation! + +> 1. Step 1 +> 2. Step 2 +> 3. Step 3 +> 4. Step 4 + +This is a regular list with indentation! + +> - Step 1 +> - Step 2 +> - Step 3 +> - Step 4 +. + +fields-after-title +. +============================= + reStructuredText Directives +============================= +:Author: David Goodger +:Contact: docutils-develop@lists.sourceforge.net +:Revision: $Revision$ +:Date: $Date$ +:Copyright: This document has been placed in the public domain. +. +--- +Author: David Goodger +Contact:-Random House Webster's College Dictionary, 1991
+ +The "rubric" directive inserts a "rubric" element into the document +tree. A rubric is like an informal heading that doesn't correspond to +the document's structure. + +### Epigraph + +```{eval-rst} + +:Directive Type: "epigraph" +:Doctree Element: block_quote_ +:Directive Arguments: None. +:Directive Options: None. +:Directive Content: Interpreted as the body of the block quote. +``` + +An epigraph is an apposite (suitable, apt, or pertinent) short +inscription, often a quotation or poem, at the beginning of a document +or section. + +The "epigraph" directive produces an "epigraph"-class block quote. +For example, this input: + +``` +.. epigraph:: + + No matter where you go, there you are. + + -- Buckaroo Banzai +``` + +becomes this document tree fragment: + +``` +-Attribution 1
+> +> > Block quote 2. + +[Empty comments] may be used to explicitly terminate preceding +constructs that would otherwise consume a block quote: + +``` +* List item. + +.. + + Block quote 3. +``` + +Empty comments may also be used to separate block quotes: + +``` + Block quote 4. + +.. + + Block quote 5. +``` + +Blank lines are required before and after a block quote, but these +blank lines are not included as part of the block quote. + +Syntax diagram: + +``` ++------------------------------+ +| (current level of | +| indentation) | ++------------------------------+ + +---------------------------+ + | block quote | + | (body elements)+ | + | | + | -- attribution text | + | (optional) | + +---------------------------+ +``` + +#### Doctest Blocks + +Doctree element: doctest_block. + +Doctest blocks are interactive Python sessions cut-and-pasted into +docstrings. They are meant to illustrate usage by example, and +provide an elegant and powerful testing environment via the [doctest +module][doctest module] in the Python standard library. + +Doctest blocks are text blocks which begin with `">>> "`, the Python +interactive interpreter main prompt, and end with a blank line. +Doctest blocks are treated as a special case of literal blocks, +without requiring the literal block syntax. If both are present, the +literal block syntax takes priority over Doctest block syntax: + +``` +This is an ordinary paragraph. + +>>> print 'this is a Doctest block' +this is a Doctest block + +The following is a literal block:: + + >>> This is not recognized as a doctest block by + reStructuredText. It *will* be recognized by the doctest + module, though! +``` + +Indentation is not required for doctest blocks. + +#### Tables + +Doctree elements: table, tgroup, colspec, thead, tbody, row, entry. + +ReStructuredText provides two syntaxes for delineating table cells: +[Grid Tables] and [Simple Tables]. + +As with other body elements, blank lines are required before and after +tables. 
Tables' left edges should align with the left edge of +preceding text blocks; if indented, the table is considered to be part +of a block quote. + +Once isolated, each table cell is treated as a miniature document; the +top and bottom cell boundaries act as delimiting blank lines. Each +cell contains zero or more body elements. Cell contents may include +left and/or right margins, which are removed before processing. + +##### Grid Tables + +Grid tables provide a complete table representation via grid-like +"ASCII art". Grid tables allow arbitrary cell contents (body +elements), and both row and column spans. However, grid tables can be +cumbersome to produce, especially for simple data sets. The [Emacs +table mode][emacs table mode] is a tool that allows easy editing of grid tables, in +Emacs. See [Simple Tables] for a simpler (but limited) +representation. + +Grid tables are described with a visual grid made up of the characters +"-", "=", "|", and "+". The hyphen ("-") is used for horizontal lines +(row separators). The equals sign ("=") may be used to separate +optional header rows from the table body (not supported by the [Emacs +table mode][emacs table mode]). The vertical bar ("|") is used for vertical lines +(column separators). The plus sign ("+") is used for intersections of +horizontal and vertical lines. Example: + +``` ++------------------------+------------+----------+----------+ +| Header row, column 1 | Header 2 | Header 3 | Header 4 | +| (header rows optional) | | | | ++========================+============+==========+==========+ +| body row 1, column 1 | column 2 | column 3 | column 4 | ++------------------------+------------+----------+----------+ +| body row 2 | Cells may span columns. | ++------------------------+------------+---------------------+ +| body row 3 | Cells may | - Table cells | ++------------------------+ span rows. | - contain | +| body row 4 | | - body elements. 
| ++------------------------+------------+---------------------+ +``` + +Some care must be taken with grid tables to avoid undesired +interactions with cell text in rare cases. For example, the following +table contains a cell in row 2 spanning from column 2 to column 4: + +``` ++--------------+----------+-----------+-----------+ +| row 1, col 1 | column 2 | column 3 | column 4 | ++--------------+----------+-----------+-----------+ +| row 2 | | ++--------------+----------+-----------+-----------+ +| row 3 | | | | ++--------------+----------+-----------+-----------+ +``` + +If a vertical bar is used in the text of that cell, it could have +unintended effects if accidentally aligned with column boundaries: + +``` ++--------------+----------+-----------+-----------+ +| row 1, col 1 | column 2 | column 3 | column 4 | ++--------------+----------+-----------+-----------+ +| row 2 | Use the command ``ls | more``. | ++--------------+----------+-----------+-----------+ +| row 3 | | | | ++--------------+----------+-----------+-----------+ +``` + +Several solutions are possible. All that is needed is to break the +continuity of the cell outline rectangle. One possibility is to shift +the text by adding an extra space before: + +``` ++--------------+----------+-----------+-----------+ +| row 1, col 1 | column 2 | column 3 | column 4 | ++--------------+----------+-----------+-----------+ +| row 2 | Use the command ``ls | more``. | ++--------------+----------+-----------+-----------+ +| row 3 | | | | ++--------------+----------+-----------+-----------+ +``` + +Another possibility is to add an extra line to row 2: + +``` ++--------------+----------+-----------+-----------+ +| row 1, col 1 | column 2 | column 3 | column 4 | ++--------------+----------+-----------+-----------+ +| row 2 | Use the command ``ls | more``. 
| +| | | ++--------------+----------+-----------+-----------+ +| row 3 | | | | ++--------------+----------+-----------+-----------+ +``` + +##### Simple Tables + +Simple tables provide a compact and easy to type but limited +row-oriented table representation for simple data sets. Cell contents +are typically single paragraphs, although arbitrary body elements may +be represented in most cells. Simple tables allow multi-line rows (in +all but the first column) and column spans, but not row spans. See +[Grid Tables] above for a complete table representation. + +Simple tables are described with horizontal borders made up of "=" and +"-" characters. The equals sign ("=") is used for top and bottom +table borders, and to separate optional header rows from the table +body. The hyphen ("-") is used to indicate column spans in a single +row by underlining the joined columns, and may optionally be used to +explicitly and/or visually separate rows. + +A simple table begins with a top border of equals signs with one or +more spaces at each column boundary (two or more spaces recommended). +Regardless of spans, the top border *must* fully describe all table +columns. There must be at least two columns in the table (to +differentiate it from section headers). The top border may be +followed by header rows, and the last of the optional header rows is +underlined with '=', again with spaces at column boundaries. There +may not be a blank line below the header row separator; it would be +interpreted as the bottom border of the table. The bottom boundary of +the table consists of '=' underlines, also with spaces at column +boundaries. 
For example, here is a truth table, a three-column table +with one header row and four body rows: + +``` +===== ===== ======= + A B A and B +===== ===== ======= +False False False +True False False +False True False +True True True +===== ===== ======= +``` + +Underlines of '-' may be used to indicate column spans by "filling in" +column margins to join adjacent columns. Column span underlines must +be complete (they must cover all columns) and align with established +column boundaries. Text lines containing column span underlines may +not contain any other text. A column span underline applies only to +one row immediately above it. For example, here is a table with a +column span in the header: + +``` +===== ===== ====== + Inputs Output +------------ ------ + A B A or B +===== ===== ====== +False False False +True False True +False True True +True True True +===== ===== ====== +``` + +Each line of text must contain spaces at column boundaries, except +where cells have been joined by column spans. Each line of text +starts a new row, except when there is a blank cell in the first +column. In that case, that line of text is parsed as a continuation +line. For this reason, cells in the first column of new rows (*not* +continuation lines) *must* contain some text; blank cells would lead +to a misinterpretation (but see the tip below). Also, this mechanism +limits cells in the first column to only one line of text. Use [grid +tables][grid tables] if this limitation is unacceptable. + +:::{Tip} +To start a new row in a simple table without text in the first +column in the processed output, use one of these: + +- an empty comment (".."), which may be omitted from the processed + output (see [Comments] below) +- a backslash escape ("`\`") followed by a space (see [Escaping + Mechanism][escaping mechanism] above) +::: + +Underlines of '-' may also be used to visually separate rows, even if +there are no column spans. 
This is especially useful in long tables, +where rows are many lines long. + +Blank lines are permitted within simple tables. Their interpretation +depends on the context. Blank lines *between* rows are ignored. +Blank lines *within* multi-line rows may separate paragraphs or other +body elements within cells. + +The rightmost column is unbounded; text may continue past the edge of +the table (as indicated by the table borders). However, it is +recommended that borders be made long enough to contain the entire +text. + +The following example illustrates continuation lines (row 2 consists +of two lines of text, and four lines for row 3), a blank line +separating paragraphs (row 3, column 2), text extending past the right +edge of the table, and a new row which will have no text in the first +column in the processed output (row 4): + +``` +===== ===== +col 1 col 2 +===== ===== +1 Second column of row 1. +2 Second column of row 2. + Second line of paragraph. +3 - Second column of row 3. + + - Second item in bullet + list (row 3, column 2). +\ Row 4; column 1 will be empty. +===== ===== +``` + +#### Explicit Markup Blocks + +An explicit markup block is a text block: + +- whose first line begins with ".." followed by whitespace (the + "explicit markup start"), +- whose second and subsequent lines (if any) are indented relative to + the first, and +- which ends before an unindented line. + +Explicit markup blocks are analogous to bullet list items, with ".." +as the bullet. The text on the lines immediately after the explicit +markup start determines the indentation of the block body. The +maximum common indentation is always removed from the second and +subsequent lines of the block body. Therefore if the first construct +fits in one line, and the indentation of the first and second +constructs should differ, the first construct should not begin on the +same line as the explicit markup start. 
+ +Blank lines are required between explicit markup blocks and other +elements, but are optional between explicit markup blocks where +unambiguous. + +The explicit markup syntax is used for footnotes, citations, hyperlink +targets, directives, substitution definitions, and comments. + +##### Footnotes + +Doctree elements: footnote, label. + +Each footnote consists of an explicit markup start (".. "), a left +square bracket, the footnote label, a right square bracket, and +whitespace, followed by indented body elements. A footnote label can +be: + +- a whole decimal number consisting of one or more digits, +- a single "#" (denoting [auto-numbered footnotes]), +- a "#" followed by a simple reference name (an [autonumber label]), + or +- a single "\*" (denoting [auto-symbol footnotes]). + +The footnote content (body elements) must be consistently indented (by +at least 3 spaces) and left-aligned. The first body element within a +footnote may often begin on the same line as the footnote label. +However, if the first element fits on one line and the indentation of +the remaining elements differ, the first element must begin on the +line after the footnote label. Otherwise, the difference in +indentation will not be detected. + +Footnotes may occur anywhere in the document, not only at the end. +Where and how they appear in the processed output depends on the +processing system. + +Here is a manually numbered footnote: + +``` +.. [1] Body elements go here. +``` + +Each footnote automatically generates a hyperlink target pointing to +itself. The text of the hyperlink target name is the same as that of +the footnote label. [Auto-numbered footnotes] generate a number as +their footnote label and reference name. See [Implicit Hyperlink +Targets][implicit hyperlink targets] for a complete description of the mechanism. + +Syntax diagram: + +``` ++-------+-------------------------+ +| ".. 
" | "[" label "]" footnote | ++-------+ | + | (body elements)+ | + +-------------------------+ +``` + +###### Auto-Numbered Footnotes + +A number sign ("#") may be used as the first character of a footnote +label to request automatic numbering of the footnote or footnote +reference. + +The first footnote to request automatic numbering is assigned the +label "1", the second is assigned the label "2", and so on (assuming +there are no manually numbered footnotes present; see [Mixed Manual +and Auto-Numbered Footnotes][mixed manual and auto-numbered footnotes] below). A footnote which has +automatically received a label "1" generates an implicit hyperlink +target with name "1", just as if the label was explicitly specified. + +(autonumber label)= + +A footnote may specify a label explicitly while at the same time +requesting automatic numbering: `[#label]`. These labels are called +`` _`autonumber labels` ``. Autonumber labels do two things: + +- On the footnote itself, they generate a hyperlink target whose name + is the autonumber label (doesn't include the "#"). + +- They allow an automatically numbered footnote to be referred to more + than once, as a footnote reference or hyperlink reference. For + example: + + ``` + If [#note]_ is the first footnote reference, it will show up as + "[1]". We can refer to it again as [#note]_ and again see + "[1]". We can also refer to it as note_ (an ordinary internal + hyperlink reference). + + .. [#note] This is the footnote labeled "note". + ``` + +The numbering is determined by the order of the footnotes, not by the +order of the references. For footnote references without autonumber +labels (`[#]_`), the footnotes and footnote references must be in +the same relative order but need not alternate in lock-step. For +example: + +``` +[#]_ is a reference to footnote 1, and [#]_ is a reference to +footnote 2. + +.. [#] This is footnote 1. +.. [#] This is footnote 2. +.. [#] This is footnote 3. 
+ +[#]_ is a reference to footnote 3. +``` + +Special care must be taken if footnotes themselves contain +auto-numbered footnote references, or if multiple references are made +in close proximity. Footnotes and references are noted in the order +they are encountered in the document, which is not necessarily the +same as the order in which a person would read them. + +###### Auto-Symbol Footnotes + +An asterisk ("\*") may be used for footnote labels to request automatic +symbol generation for footnotes and footnote references. The asterisk +may be the only character in the label. For example: + +``` +Here is a symbolic footnote reference: [*]_. + +.. [*] This is the footnote. +``` + +A transform will insert symbols as labels into corresponding footnotes +and footnote references. The number of references must be equal to +the number of footnotes. One symbol footnote cannot have multiple +references. + +The standard Docutils system uses the following symbols for footnote +marks [^id12]: + +- asterisk/star ("\*") +- dagger (HTML character entity "\&dagger;", Unicode U+02020) +- double dagger ("\&Dagger;"/U+02021) +- section mark ("\&sect;"/U+000A7) +- pilcrow or paragraph mark ("\&para;"/U+000B6) +- number sign ("#") +- spade suit ("\&spades;"/U+02660) +- heart suit ("\&hearts;"/U+02665) +- diamond suit ("\&diams;"/U+02666) +- club suit ("\&clubs;"/U+02663) + +[^id12]: This list was inspired by the list of symbols for "Note + Reference Marks" in The Chicago Manual of Style, 14th edition, + section 12.51. "Parallels" ("||") were given in CMoS instead of + the pilcrow. The last four symbols (the card suits) were added + arbitrarily. + +If more than ten symbols are required, the same sequence will be +reused, doubled and then tripled, and so on ("\*\*" etc.). + +:::{Note} +When using auto-symbol footnotes, the choice of output +encoding is important. Many of the symbols used are not encodable +in certain common text encodings such as Latin-1 (ISO 8859-1). The +use of UTF-8 for the output encoding is recommended. 
An +alternative for HTML and XML output is to use the +"xmlcharrefreplace" [output encoding error handler](../../user/config.html#output-encoding-error-handler). +::: + +###### Mixed Manual and Auto-Numbered Footnotes + +Manual and automatic footnote numbering may both be used within a +single document, although the results may not be expected. Manual +numbering takes priority. Only unused footnote numbers are assigned +to auto-numbered footnotes. The following example should be +illustrative: + +``` +[2]_ will be "2" (manually numbered), +[#]_ will be "3" (anonymous auto-numbered), and +[#label]_ will be "1" (labeled auto-numbered). + +.. [2] This footnote is labeled manually, so its number is fixed. + +.. [#label] This autonumber-labeled footnote will be labeled "1". + It is the first auto-numbered footnote and no other footnote + with label "1" exists. The order of the footnotes is used to + determine numbering, not the order of the footnote references. + +.. [#] This footnote will be labeled "3". It is the second + auto-numbered footnote, but footnote label "2" is already used. +``` + +##### Citations + +Citations are identical to footnotes except that they use only +non-numeric labels such as `[note]` or `[GVR2001]`. Citation +labels are simple [reference names] (case-insensitive single words +consisting of alphanumerics plus internal hyphens, underscores, and +periods; no whitespace). Citations may be rendered separately and +differently from footnotes. For example: + +``` +Here is a citation reference: [CIT2002]_. + +.. [CIT2002] This is the citation. It's just like a footnote, + except the label is textual. +``` + +(hyperlinks)= + +##### Hyperlink Targets + +Doctree element: target. + +These are also called `` _`explicit hyperlink targets` ``, to differentiate +them from [implicit hyperlink targets] defined below. + +Hyperlink targets identify a location within or outside of a document, +which may be linked to by [hyperlink references]. 
+ +Hyperlink targets may be named or anonymous. Named hyperlink targets +consist of an explicit markup start (".. "), an underscore, the +reference name (no trailing underscore), a colon, whitespace, and a +link block: + +``` +.. _hyperlink-name: link-block +``` + +Reference names are whitespace-neutral and case-insensitive. See +[Reference Names] for details and examples. + +Anonymous hyperlink targets consist of an explicit markup start +(".. "), two underscores, a colon, whitespace, and a link block; there +is no reference name: + +``` +.. __: anonymous-hyperlink-target-link-block +``` + +An alternate syntax for anonymous hyperlinks consists of two +underscores, a space, and a link block: + +``` +__ anonymous-hyperlink-target-link-block +``` + +See [Anonymous Hyperlinks] below. + +There are three types of hyperlink targets: internal, external, and +indirect. + +1. `` _`Internal hyperlink targets` `` have empty link blocks. They provide + an end point allowing a hyperlink to connect one place to another + within a document. An internal hyperlink target points to the + element following the target. For example: + + ``` + Clicking on this internal hyperlink will take us to the target_ + below. + + .. _target: + + The hyperlink target above points to this paragraph. + ``` + + Internal hyperlink targets may be "chained". Multiple adjacent + internal hyperlink targets all point to the same element: + + ``` + .. _target1: + .. _target2: + + The targets "target1" and "target2" are synonyms; they both + point to this paragraph. + ``` + + If the element "pointed to" is an external hyperlink target (with a + URI in its link block; see #2 below) the URI from the external + hyperlink target is propagated to the internal hyperlink targets; + they will all "point to" the same URI. There is no need to + duplicate a URI. For example, all three of the following hyperlink + targets refer to the same URI: + + ``` + .. _Python DOC-SIG mailing list archive: + .. _archive: + .. 
_Doc-SIG: http://mail.python.org/pipermail/doc-sig/ + ``` + + An inline form of internal hyperlink target is available; see + [Inline Internal Targets]. + +2. `` _`External hyperlink targets` `` have an absolute or relative URI or + email address in their link blocks. For example, take the + following input: + + ``` + See the Python_ home page for info. + + `Write to me`_ with your questions. + + .. _Python: http://www.python.org + .. _Write to me: jdoe@example.com + ``` + + After processing into HTML, the hyperlinks might be expressed as: + + ``` + See the <a href="http://www.python.org">Python</a> home page + for info. + + <a href="mailto:jdoe@example.com">Write to me</a> with your + questions. + ``` + + An external hyperlink's URI may begin on the same line as the + explicit markup start and target name, or it may begin in an + indented text block immediately following, with no intervening + blank lines. If there are multiple lines in the link block, they + are concatenated. Any whitespace is removed (whitespace is + permitted to allow for line wrapping). The following external + hyperlink targets are equivalent: + + ``` + .. _one-liner: http://docutils.sourceforge.net/rst.html + + .. _starts-on-this-line: http:// + docutils.sourceforge.net/rst.html + + .. _entirely-below: + http://docutils. + sourceforge.net/rst.html + ``` + + If an external hyperlink target's URI contains an underscore as its + last character, it must be escaped to avoid being mistaken for an + indirect hyperlink target: + + ``` + This link_ refers to a file called ``underscore_``. + + .. _link: underscore\_ + ``` + + It is possible (although not generally recommended) to include URIs + directly within hyperlink references. See [Embedded URIs and Aliases] + below. + +3. `` _`Indirect hyperlink targets` `` have a hyperlink reference in their + link blocks. In the following example, target "one" indirectly + references whatever target "two" references, and target "two" + references target "three", an internal hyperlink target. 
In + effect, all three reference the same thing: + + ``` + .. _one: two_ + .. _two: three_ + .. _three: + ``` + + Just as with [hyperlink references] anywhere else in a document, + if a phrase-reference is used in the link block it must be enclosed + in backquotes. As with [external hyperlink targets], the link + block of an indirect hyperlink target may begin on the same line as + the explicit markup start or the next line. It may also be split + over multiple lines, in which case the lines are joined with + whitespace before being normalized. + + For example, the following indirect hyperlink targets are + equivalent: + + ``` + .. _one-liner: `A HYPERLINK`_ + .. _entirely-below: + `a hyperlink`_ + .. _split: `A + Hyperlink`_ + ``` + + It is possible to include an alias directly within hyperlink + references. See [Embedded URIs and Aliases] below. + +If the reference name contains any colons, either: + +- the phrase must be enclosed in backquotes: + + ``` + .. _`FAQTS: Computers: Programming: Languages: Python`: + http://python.faqts.com/ + ``` + +- or the colon(s) must be backslash-escaped in the link target: + + ``` + .. _Chapter One\: "Tadpole Days": + + It's not easy being green... + ``` + +See [Implicit Hyperlink Targets] below for the resolution of +duplicate reference names. + +Syntax diagram: + +``` ++-------+----------------------+ +| ".. " | "_" name ":" link | ++-------+ block | + | | + +----------------------+ +``` + +###### Anonymous Hyperlinks + +The [World Wide Web Consortium] recommends in its [HTML Techniques +for Web Content Accessibility Guidelines][html techniques for web content accessibility guidelines] that authors should +"clearly identify the target of each link." Hyperlink references +should be as verbose as possible, but duplicating a verbose hyperlink +name in the target is onerous and error-prone. Anonymous hyperlinks +are designed to allow convenient verbose hyperlink references, and are +analogous to [Auto-Numbered Footnotes]. 
They are particularly useful +in short or one-off documents. However, this feature is easily abused +and can result in unreadable plaintext and/or unmaintainable +documents. Caution is advised. + +Anonymous [hyperlink references] are specified with two underscores +instead of one: + +``` +See `the web site of my favorite programming language`__. +``` + +Anonymous targets begin with ".. \_\_:"; no reference name is required +or allowed: + +``` +.. __: http://www.python.org +``` + +As a convenient alternative, anonymous targets may begin with "\_\_" +only: + +``` +__ http://www.python.org +``` + +The reference name of the reference is not used to match the reference +to its target. Instead, the order of anonymous hyperlink references +and targets within the document is significant: the first anonymous +reference will link to the first anonymous target. The number of +anonymous hyperlink references in a document must match the number of +anonymous targets. For readability, it is recommended that targets be +kept close to references. Take care when editing text containing +anonymous references; adding, removing, and rearranging references +require attention to the order of corresponding targets. + +##### Directives + +Doctree elements: depend on the directive. + +Directives are an extension mechanism for reStructuredText, a way of +adding support for new constructs without adding new primary syntax +(directives may support additional syntax locally). All standard +directives (those implemented and registered in the reference +reStructuredText parser) are described in the [reStructuredText +Directives][restructuredtext directives] document, and are always available. Any other directives +are domain-specific, and may require special action to make them +available when processing the document. + +For example, here's how an [image] may be placed: + +``` +.. image:: mylogo.jpeg +``` + +A [figure] (a graphic with a caption) may be placed like this: + +``` +.. 
figure:: larch.png + + The larch. +``` + +An [admonition] (note, caution, etc.) contains other body elements: + +``` +.. note:: This is a paragraph + + - Here is a bullet list. +``` + +Directives are indicated by an explicit markup start (".. ") followed +by the directive type, two colons, and whitespace (together called the +"directive marker"). Directive types are case-insensitive single +words (alphanumerics plus isolated internal hyphens, underscores, +plus signs, colons, and periods; no whitespace). Two colons are used +after the directive type for these reasons: + +- Two colons are distinctive, and unlikely to be used in common text. + +- Two colons avoid clashes with common comment text like: + + ``` + .. Danger: modify at your own risk! + ``` + +- If an implementation of reStructuredText does not recognize a + directive (i.e., the directive-handler is not installed), a level-3 + (error) system message is generated, and the entire directive block + (including the directive itself) will be included as a literal + block. Thus "::" is a natural choice. + +The directive block consists of any text on the first line of the +directive after the directive marker, and any subsequent indented +text. The interpretation of the directive block is up to the +directive code. There are three logical parts to the directive block: + +1. Directive arguments. +2. Directive options. +3. Directive content. + +Individual directives can employ any combination of these parts. +Directive arguments can be filesystem paths, URLs, title text, etc. +Directive options are indicated using [field lists]; the field names +and contents are directive-specific. Arguments and options must form +a contiguous block beginning on the first or second line of the +directive; a blank line indicates the beginning of the directive +content block. If arguments and/or options are employed by the +directive, a blank line must separate them from the directive content. 
+The "figure" directive employs all three parts: + +``` +.. figure:: larch.png + :scale: 50 + + The larch. +``` + +Simple directives may not require any content. If a directive that +does not employ a content block is followed by indented text anyway, +it is an error. If a block quote should immediately follow a +directive, use an empty comment in-between (see [Comments] below). + +Actions taken in response to directives and the interpretation of text +in the directive content block or subsequent text block(s) are +directive-dependent. See [reStructuredText Directives] for details. + +Directives are meant for the arbitrary processing of their contents, +which can be transformed into something possibly unrelated to the +original text. It may also be possible for directives to be used as +pragmas, to modify the behavior of the parser, such as to experiment +with alternate syntax. There is no parser support for this +functionality at present; if a reasonable need for pragma directives +is found, they may be supported. + +Directives do not generate "directive" elements; they are a *parser +construct* only, and have no intrinsic meaning outside of +reStructuredText. Instead, the parser will transform recognized +directives into (possibly specialized) document elements. Unknown +directives will trigger level-3 (error) system messages. + +Syntax diagram: + +``` ++-------+-------------------------------+ +| ".. " | directive type "::" directive | ++-------+ block | + | | + +-------------------------------+ +``` + +##### Substitution Definitions + +Doctree element: substitution_definition. + +Substitution definitions are indicated by an explicit markup start +(".. ") followed by a vertical bar, the substitution text, another +vertical bar, whitespace, and the definition block. Substitution text +may not begin or end with whitespace. A substitution definition block +contains an embedded inline-compatible directive (without the leading +".. 
"), such as "[image]" or "[replace]". For example: + +``` +The |biohazard| symbol must be used on containers used to +dispose of medical waste. + +.. |biohazard| image:: biohazard.png +``` + +It is an error for a substitution definition block to directly or +indirectly contain a circular substitution reference. + +[Substitution references] are replaced in-line by the processed +contents of the corresponding definition (linked by matching +substitution text). Matches are case-sensitive but forgiving; if no +exact match is found, a case-insensitive comparison is attempted. + +Substitution definitions allow the power and flexibility of +block-level [directives] to be shared by inline text. They are a way +to include arbitrarily complex inline structures within text, while +keeping the details out of the flow of text. They are the equivalent +of SGML/XML's named entities or programming language macros. + +Without the substitution mechanism, every time someone wants an +application-specific new inline structure, they would have to petition +for a syntax change. In combination with existing directive syntax, +any inline structure can be coded without new syntax (except possibly +a new directive). + +Syntax diagram: + +``` ++-------+-----------------------------------------------------+ +| ".. " | "|" substitution text "| " directive type "::" data | ++-------+ directive block | + | | + +-----------------------------------------------------+ +``` + +Following are some use cases for the substitution mechanism. Please +note that most of the embedded directives shown are examples only and +have not been implemented. + +Objects + +: Substitution references may be used to associate ambiguous text + with a unique object identifier. + + For example, many sites may wish to implement an inline "user" + directive: + + ``` + |Michael| and |Jon| are our widget-wranglers. + + .. |Michael| user:: mjones + .. 
|Jon| user:: jhl + ``` + + Depending on the needs of the site, this may be used to index the + document for later searching, to hyperlink the inline text in + various ways (mailto, homepage, mouseover Javascript with profile + and contact information, etc.), or to customize presentation of + the text (include username in the inline text, include an icon + image with a link next to the text, make the text bold or a + different color, etc.). + + The same approach can be used in documents which frequently refer + to a particular type of objects with unique identifiers but + ambiguous common names. Movies, albums, books, photos, court + cases, and laws are possible. For example: + + ``` + |The Transparent Society| offers a fascinating alternate view + on privacy issues. + + .. |The Transparent Society| book:: isbn=0738201448 + ``` + + Classes or functions, in contexts where the module or class names + are unclear and/or interpreted text cannot be used, are another + possibility: + + ``` + 4XSLT has the convenience method |runString|, so you don't + have to mess with DOM objects if all you want is the + transformed output. + + .. |runString| function:: module=xml.xslt class=Processor + ``` + +Images + +: Images are a common use for substitution references: + + ``` + West led the |H| 3, covered by dummy's |H| Q, East's |H| K, + and trumped in hand with the |S| 2. + + .. |H| image:: /images/heart.png + :height: 11 + :width: 11 + .. |S| image:: /images/spade.png + :height: 11 + :width: 11 + + * |Red light| means stop. + * |Green light| means go. + * |Yellow light| means go really fast. + + .. |Red light| image:: red_light.png + .. |Green light| image:: green_light.png + .. |Yellow light| image:: yellow_light.png + + |-><-| is the official symbol of POEE_. + + .. |-><-| image:: discord.png + .. _POEE: http://www.poee.org/ + ``` + + The "[image]" directive has been implemented. 
+ +Styles [^id15] + +: Substitution references may be used to associate inline text with + an externally defined presentation style: + + ``` + Even |the text in Texas| is big. + + .. |the text in Texas| style:: big + ``` + + The style name may be meaningful in the context of some particular + output format (CSS class name for HTML output, LaTeX style name + for LaTeX, etc.), or may be ignored for other output formats (such + as plaintext). + + % @@@ This needs to be rethought & rewritten or removed: + % + % Interpreted text is unsuitable for this purpose because the set + % of style names cannot be predefined - it is the domain of the + % content author, not the author of the parser and output + % formatter - and there is no way to associate a style name + % argument with an interpreted text style role. Also, it may be + % desirable to use the same mechanism for styling blocks:: + % + % .. style:: motto + % At Bob's Underwear Shop, we'll do anything to get in + % your pants. + % + % .. style:: disclaimer + % All rights reversed. Reprint what you like. + + [^id15]: There may be sufficient need for a "style" mechanism to + warrant simpler syntax such as an extension to the interpreted + text role syntax. The substitution mechanism is cumbersome for + simple text styling. + +Templates + +: Inline markup may be used for later processing by a template + engine. For example, a [Zope] author might write: + + ``` + Welcome back, |name|! + + .. |name| tal:: replace user/getUserName + ``` + + After processing, this ZPT output would result: + + ``` + Welcome back, + <span tal:replace="user/getUserName">name</span>! + ``` + + Zope would then transform this to something like "Welcome back, + David!" during a session with an actual user. + +Replacement text + +: The substitution mechanism may be used for simple macro + substitution. This may be appropriate when the replacement text + is repeated many times throughout one or more documents, + especially if it may need to change later. 
A short example is + unavoidably contrived: + + ``` + |RST|_ is a little annoying to type over and over, especially + when writing about |RST| itself, and spelling out the + bicapitalized word |RST| every time isn't really necessary for + |RST| source readability. + + .. |RST| replace:: reStructuredText + .. _RST: http://docutils.sourceforge.net/rst.html + ``` + + Note the trailing underscore in the first use of a substitution + reference. This indicates a reference to the corresponding + hyperlink target. + + Substitution is also appropriate when the replacement text cannot + be represented using other inline constructs, or is obtrusively + long: + + ``` + But still, that's nothing compared to a name like + |j2ee-cas|__. + + .. |j2ee-cas| replace:: + the Java `TM`:super: 2 Platform, Enterprise Edition Client + Access Services + __ http://developer.java.sun.com/developer/earlyAccess/ + j2eecas/ + ``` + + The "[replace]" directive has been implemented. + +##### Comments + +Doctree element: comment. + +Arbitrary indented text may follow the explicit markup start and will +be processed as a comment element. No further processing is done on +the comment block text; a comment contains a single "text blob". +Depending on the output formatter, comments may be removed from the +processed output. The only restriction on comments is that they not +use the same syntax as any of the other explicit markup constructs: +substitution definitions, directives, footnotes, citations, or +hyperlink targets. To ensure that none of the other explicit markup +constructs is recognized, leave the ".." on a line by itself: + +``` +.. This is a comment +.. + _so: is this! +.. + [and] this! +.. + this:: too! +.. + |even| this:: ! +``` + +(empty-comments)= + +An explicit markup start followed by a blank line and nothing else +(apart from whitespace) is an "`` _`empty comment` ``". It serves to +terminate a preceding construct, and does **not** consume any indented +text following. 
To have a block quote follow a list or any indented +construct, insert an unindented empty comment in-between. + +Syntax diagram: + +``` ++-------+----------------------+ +| ".. " | comment | ++-------+ block | + | | + +----------------------+ +``` + +### Implicit Hyperlink Targets + +Implicit hyperlink targets are generated by section titles, footnotes, +and citations, and may also be generated by extension constructs. +Implicit hyperlink targets otherwise behave identically to explicit +[hyperlink targets]. + +Problems of ambiguity due to conflicting duplicate implicit and +explicit reference names are avoided by following this procedure: + +1. [Explicit hyperlink targets] override any implicit targets having + the same reference name. The implicit hyperlink targets are + removed, and level-1 (info) system messages are inserted. +2. Duplicate implicit hyperlink targets are removed, and level-1 + (info) system messages inserted. For example, if two or more + sections have the same title (such as "Introduction" subsections of + a rigidly-structured document), there will be duplicate implicit + hyperlink targets. +3. Duplicate explicit hyperlink targets are removed, and level-2 + (warning) system messages are inserted. Exception: duplicate + [external hyperlink targets] (identical hyperlink names and + referenced URIs) do not conflict, and are not removed. + +System messages are inserted where target links have been removed. +See "Error Handling" in [PEP 258]. + +The parser must return a set of *unique* hyperlink targets. The +calling software (such as the [Docutils]) can warn of unresolvable +links, giving reasons for the messages. + +### Inline Markup + +In reStructuredText, inline markup applies to words or phrases within +a text block. The same whitespace and punctuation that serves to +delimit words in written text is used to delimit the inline markup +syntax constructs. The text within inline markup may not begin or end +with whitespace. 
Arbitrary [character-level inline markup] is +supported although not encouraged. Inline markup cannot be nested. + +There are nine inline markup constructs. Five of the constructs use +identical start-strings and end-strings to indicate the markup: + +- [emphasis]: "\*" +- [strong emphasis]: "\*\*" +- [interpreted text]: "\`" +- [inline literals]: "\`\`" +- [substitution references]: "|" + +Three constructs use different start-strings and end-strings: + +- [inline internal targets]: "\_\`" and "\`" +- [footnote references]: "\[" and "\]\_" +- [hyperlink references]: "\`" and "\`\_" (phrases), or just a + trailing "\_" (single words) + +[Standalone hyperlinks] are recognized implicitly, and use no extra +markup. + +#### Inline markup recognition rules + +Inline markup start-strings and end-strings are only recognized if all of +the following conditions are met: + +1. Inline markup start-strings must start a text block or be + immediately preceded by + + - whitespace, + - one of the ASCII characters `- : / ' " < ( [ {` or + - a non-ASCII punctuation character with [Unicode category] + `Pd` (Dash), + `Po` (Other), + `Ps` (Open), + `Pi` (Initial quote), or + `Pf` (Final quote) [^pipf]. + +2. Inline markup start-strings must be immediately followed by + non-whitespace. + +3. Inline markup end-strings must be immediately preceded by + non-whitespace. + +4. Inline markup end-strings must end a text block or be immediately + followed by + + - whitespace, + - one of the ASCII characters `- . , : ; ! ? \ / ' " ) ] } >` or + - a non-ASCII punctuation character with [Unicode category] + `Pd` (Dash), + `Po` (Other), + `Pe` (Close), + `Pf` (Final quote), or + `Pi` (Initial quote) [^pipf]. + +5. 
If an inline markup start-string is immediately preceded by one of the + ASCII characters `' " < ( [ {`, or a character with Unicode character + category `Ps`, `Pi`, or `Pf`, it must not be followed by the + corresponding [^corresponding-quotes] closing character from + `' " ) ] } >` or the categories `Pe`, `Pf`, or `Pi`. + +6. An inline markup end-string must be separated by at least one + character from the start-string. + +7. An unescaped backslash preceding a start-string or end-string will + disable markup recognition, except for the end-string of [inline + literals][inline literals]. See [Escaping Mechanism] above for details. + +[^pipf]: `Pi` (Punctuation, Initial quote) characters are "usually + opening, sometimes closing". `Pf` (Punctuation, Final quote) + characters are "usually closing, sometimes opening". + +[^corresponding-quotes]: For quotes, corresponding characters can be + any of the [quotation marks in international usage]. + +The inline markup recognition rules were devised to allow 90% of non-markup +uses of "\*", "\`", "\_", and "|" without escaping. For example, none of the +following terms are recognized as containing inline markup strings: + +- 2\*x a\*\*b O(N\*\*2) e\*\*(x\*y) f(x)\*f(y) a|b file\*.\* (breaks 1) +- 2 * x a \*\* b (\* BOM32\_\* \` \`\` _ \_\_ | (breaks 2) +- "\*" '|' (\*) \[\*\] {\*} \<\*> + ‘\*’ ‚\*‘ ‘\*‚ ’\*’ ‚\*’ + “\*” „\*“ “\*„ ”\*” „\*” + »\*« ›\*‹ «\*» »\*» ›\*› (breaks 5) +- || (breaks 6) +- \_\_init\_\_ \_\_init\_\_() + +No escaping is required inside the following inline markup examples: + +- *2 * x \*a \*\*b \*.txt* (breaks 3) +- *2\*x a\*\*b O(N\*\*2) e\*\*(x\*y) f(x)\*f(y) a\*(1+2)* (breaks 4) + +It may be desirable to use [inline literals] for some of these anyhow, +especially if they represent code snippets. It's a judgment call. 
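To make the recognition rules more concrete, here is a simplified, ASCII-only Python sketch of rules 1 to 5 for a single delimiter. It is not how the Docutils parser is written: the function names `is_start` and `is_end` are hypothetical, the Unicode punctuation categories are ignored, and rules 6 and 7 (minimum separation and backslash escapes) are left out.

```python
# ASCII characters allowed immediately before a start-string (rule 1)
START_PRECEDERS = set("-:/'\"<([{")
# ASCII characters allowed immediately after an end-string (rule 4)
END_FOLLOWERS = set("-.,:;!?\\/'\")]}>")
# Opening character -> corresponding closing character (rule 5)
CLOSERS = {"'": "'", '"': '"', "<": ">", "(": ")", "[": "]", "{": "}"}


def is_start(text: str, i: int, marker: str = "*") -> bool:
    """Check rules 1, 2 and 5 for a candidate start-string at index i."""
    if not text.startswith(marker, i):
        return False
    before = text[i - 1] if i > 0 else ""
    after = text[i + len(marker): i + len(marker) + 1]
    # Rule 1: starts the text block, or follows whitespace/allowed punctuation.
    if before and not (before.isspace() or before in START_PRECEDERS):
        return False
    # Rule 2: must be immediately followed by non-whitespace.
    if not after or after.isspace():
        return False
    # Rule 5: an opener before and its matching closer after, as in "(*)".
    if CLOSERS.get(before) == after:
        return False
    return True


def is_end(text: str, i: int, marker: str = "*") -> bool:
    """Check rules 3 and 4 for a candidate end-string at index i."""
    if not text.startswith(marker, i):
        return False
    before = text[i - 1] if i > 0 else ""
    after = text[i + len(marker): i + len(marker) + 1]
    # Rule 3: must be immediately preceded by non-whitespace.
    if not before or before.isspace():
        return False
    # Rule 4: ends the text block, or precedes whitespace/allowed punctuation.
    if after and not (after.isspace() or after in END_FOLLOWERS):
        return False
    return True
```

Under these assumptions the asterisks in `2*x`, `2 * x`, and `(*)` are rejected as start-strings by rules 1, 2, and 5 respectively, while both asterisks in `*emphasis*` are accepted.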
+ +These cases *do* require either literal-quoting or escaping to avoid +misinterpretation: + +> \*4, class\_, \*args, \*\*kwargs, \`TeX-quoted', \*ML, \*.txt + +In most use cases, [inline literals] or [literal blocks] are the best +choice (by default, this also selects a monospaced font): + +``` +*4, class_, *args, **kwargs, `TeX-quoted', *ML, *.txt +``` + +#### Recognition order + +Inline markup delimiter characters are used for multiple constructs, +so to avoid ambiguity there must be a specific recognition order for +each character. The inline markup recognition order is as follows: + +- Asterisks: [Strong emphasis] ("\*\*") is recognized before [emphasis] + ("\*"). +- Backquotes: [Inline literals] ("\`\`"), [inline internal targets] + (leading "\_\`", trailing "\`"), are mutually independent, and are + recognized before phrase [hyperlink references] (leading "\`", + trailing "\`\_") and [interpreted text] ("\`"). +- Trailing underscores: Footnote references ("\[" + label + "\]\_") and + simple [hyperlink references] (name + trailing "\_") are mutually + independent. +- Vertical bars: [Substitution references] ("|") are independently + recognized. +- [Standalone hyperlinks] are the last to be recognized. + +#### Character-Level Inline Markup + +It is possible to mark up individual characters within a word with +backslash escapes (see [Escaping Mechanism] above). Backslash +escapes can be used to allow arbitrary text to immediately follow +inline markup: + +``` +Python ``list``\s use square bracket syntax. +``` + +The backslash will disappear from the processed document. The word +"list" will appear as inline literal text, and the letter "s" will +immediately follow it as normal text, with no space in-between. + +Arbitrary text may immediately precede inline markup using +backslash-escaped whitespace: + +``` +Possible in *re*\ ``Structured``\ *Text*, though not encouraged. 
+``` + +The backslashes and spaces separating "re", "Structured", and "Text" +above will disappear from the processed document. + +:::{CAUTION} +The use of backslash-escapes for character-level inline markup is +not encouraged. Such use is ugly and detrimental to the +unprocessed document's readability. Please use this feature +sparingly and only where absolutely necessary. +::: + +#### Emphasis + +Doctree element: emphasis. + +Start-string = end-string = "\*". + +Text enclosed by single asterisk characters is emphasized: + +``` +This is *emphasized text*. +``` + +Emphasized text is typically displayed in italics. + +#### Strong Emphasis + +Doctree element: strong. + +Start-string = end-string = "\*\*". + +Text enclosed by double-asterisks is emphasized strongly: + +``` +This is **strong text**. +``` + +Strongly emphasized text is typically displayed in boldface. + +#### Interpreted Text + +Doctree element: depends on the explicit or implicit role and +processing. + +Start-string = end-string = "\`". + +Interpreted text is text that is meant to be related, indexed, linked, +summarized, or otherwise processed, but the text itself is typically +left alone. Interpreted text is enclosed by single backquote +characters: + +``` +This is `interpreted text`. +``` + +The "role" of the interpreted text determines how the text is +interpreted. The role may be inferred implicitly (as above; the +"default role" is used) or indicated explicitly, using a role marker. +A role marker consists of a colon, the role name, and another colon. +A role name is a single word consisting of alphanumerics plus isolated +internal hyphens, underscores, plus signs, colons, and periods; +no whitespace or other characters are allowed. 
A role marker is +either a prefix or a suffix to the interpreted text, whichever reads +better; it's up to the author: + +``` +:role:`interpreted text` + +`interpreted text`:role: +``` + +Interpreted text allows extensions to the available inline descriptive +markup constructs. To [emphasis], [strong emphasis], [inline +literals][inline literals], and [hyperlink references], we can add "title reference", +"index entry", "acronym", "class", "red", "blinking" or anything else +we want. Only pre-determined roles are recognized; unknown roles will +generate errors. A core set of standard roles is implemented in the +reference parser; see [reStructuredText Interpreted Text Roles] for +individual descriptions. The [role] directive can be used to define +custom interpreted text roles. In addition, applications may support +specialized roles. + +#### Inline Literals + +Doctree element: literal. + +Start-string = end-string = "\`\`". + +Text enclosed by double-backquotes is treated as inline literals: + +``` +This text is an example of ``inline literals``. +``` + +Inline literals may contain any characters except two adjacent +backquotes in an end-string context (according to the recognition +rules above). No markup interpretation (including backslash-escape +interpretation) is done within inline literals. + +Line breaks are *not* preserved in inline literals. Although a +reStructuredText parser will preserve runs of spaces in its output, +the final representation of the processed document is dependent on the +output formatter, thus the preservation of whitespace cannot be +guaranteed. If the preservation of line breaks and/or other +whitespace is important, [literal blocks] should be used. + +Inline literals are useful for short code snippets. For example: + +``` +The regular expression ``[+-]?(\d+(\.\d*)?|\.\d+)`` matches +floating-point numbers (without exponents). +``` + +#### Hyperlink References + +Doctree element: reference. 
+ +- Named hyperlink references: + + - Start-string = "" (empty string), end-string = "\_". + - Start-string = "\`", end-string = "\`\_". (Phrase references.) + +- Anonymous hyperlink references: + + - Start-string = "" (empty string), end-string = "\_\_". + - Start-string = "\`", end-string = "\`\_\_". (Phrase references.) + +Hyperlink references are indicated by a trailing underscore, "\_", +except for [standalone hyperlinks] which are recognized +independently. The underscore can be thought of as a right-pointing +arrow. The trailing underscores point away from hyperlink references, +and the leading underscores point toward [hyperlink targets]. + +Hyperlinks consist of two parts. In the text body, there is a source +link, a reference name with a trailing underscore (or two underscores +for [anonymous hyperlinks]): + +``` +See the Python_ home page for info. +``` + +A target link with a matching reference name must exist somewhere else +in the document (see [Hyperlink Targets] for a full description). + +[Anonymous hyperlinks] (which see) do not use reference names to +match references to targets, but otherwise behave similarly to named +hyperlinks. + +##### Embedded URIs and Aliases + +A hyperlink reference may directly embed a target URI or (since +Docutils 0.11) a hyperlink reference within angle brackets ("\<...>") +as follows: + +``` +See the `Python home page