FoLiA-correct: resolve HEMPs using FoLiA::Correction #47

kosloot opened this issue Sep 15, 2020 · 0 comments
kosloot commented Sep 15, 2020

This came up after issue #45.

When resolving a HEMP, FoLiA-correct just adds the resolved text to one of the string/word nodes.
I assume using a real Correction would be better.

For example:

    <p xml:id="mwsel.p.1">
      <t class="OCR">•c c•</t>
      <str xml:id="mwsel.p.1.str.1">
        <t class="OCR">•c</t>
      </str>
      <str xml:id="mwsel.p.1.str.2">
        <t class="OCR">c•</t>
      </str>
    </p>

Assuming •c c• is listed in the PUNCT file as •c c• cc (that is, mapped to cc), this HEMP is currently resolved as:

    <p xml:id="mwsel.p.1">
      <t>cc</t>
      <t class="OCR">•c c•</t>
      <str xml:id="mwsel.p.1.str.1">
        <t class="OCR">•c</t>
      </str>
      <str xml:id="mwsel.p.1.str.2">
        <t offset="0">cc</t>
        <t class="OCR">c•</t>
      </str>
    </p>

IMHO a much better solution would be:

    <p xml:id="mwsel.p.1">
      <t>cc</t>
      <t class="OCR">•c c•</t>
      <correction xml:id="mwsel.p.1.correction.1">
        <new>
          <str xml:id="mwsel.p.1.str.edit.1">
            <t>cc</t>
          </str>
        </new>
        <original>
          <str xml:id="mwsel.p.1.str.1">
            <t class="OCR">•c</t>
          </str>
          <str xml:id="mwsel.p.1.str.2">
            <t class="OCR">c•</t>
          </str>
        </original>
      </correction>
    </p>
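
To make the proposed structure concrete, here is a rough sketch of the transformation in Python, using only the standard library's `xml.etree.ElementTree`. This is not how FoLiA-correct does it (that is C++ on top of libfolia); the function name `resolve_hemp`, the id scheme and the missing FoLiA namespace are all assumptions made for illustration, and the resolved text is simply passed in rather than looked up in a PUNCT file.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined, so ElementTree exposes xml:id under this key.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Unresolved input from the example above (FoLiA namespace omitted for brevity).
SRC = """<p xml:id="mwsel.p.1">
  <t class="OCR">•c c•</t>
  <str xml:id="mwsel.p.1.str.1">
    <t class="OCR">•c</t>
  </str>
  <str xml:id="mwsel.p.1.str.2">
    <t class="OCR">c•</t>
  </str>
</p>"""


def resolve_hemp(p, resolved):
    """Hypothetical helper: wrap the direct <str> children of p in one <correction>."""
    parts = [child for child in p if child.tag == "str"]
    if not parts:
        return p  # nothing to resolve
    pid = p.get(XML_ID)
    pos = list(p).index(parts[0])  # the correction goes where the first <str> was

    corr = ET.Element("correction", {XML_ID: f"{pid}.correction.1"})
    new = ET.SubElement(corr, "new")
    edit = ET.SubElement(new, "str", {XML_ID: f"{pid}.str.edit.1"})
    ET.SubElement(edit, "t").text = resolved

    # Move the original <str> nodes, unchanged, under <original>.
    original = ET.SubElement(corr, "original")
    for s in parts:
        p.remove(s)
        original.append(s)
    p.insert(pos, corr)

    # Resolved text of the paragraph itself, in the default text class.
    t = ET.Element("t")
    t.text = resolved
    p.insert(0, t)
    return p


p = resolve_hemp(ET.fromstring(SRC), "cc")
print(ET.tostring(p, encoding="unicode"))
```

The point of the sketch is that the original `<str>` nodes stay untouched inside `<original>`, so the OCR text and its ids remain available, while the resolved text lives in a new node under `<new>`.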

Interesting point: HEMP resolution is done before the other corrections. I assume that a real correction using the resolved cc will then not be performed.
