values need not be in descriptions #2663

Open
sydb opened this issue Jan 20, 2025 · 0 comments

sydb commented Jan 20, 2025

Back on #2579 @peterstadler pointed out that we do not normally list the legal, suggested, or sample values of an enumerated list in the description (i.e., <desc>) of an attribute (or, I suppose, of a datatype). However, at least in the case listed there (gap/@reason), that happens in at least 3 languages.

There are quite a few cases of this phenomenon, but a human has to look at each to see if it is just being listed (and thus should be corrected) or is being mentioned for some other reason. (Also note that they should be encoded as <val>, but might be encoded as something else.¹) No point in listing them here, as the list may have changed by the time someone gets around to this ticket. I searched for them using the following XSLT.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="#all"
  version="3.0">

  <!--
    Input: p5.xml
    Output: List of cases of phrase-level elements inside <desc> (itself inside <attDef>)
            where the contents of the element match one of the enumerated values of the
            attribute being defined.
            
    The idea is that this is a list of cases of phrase-level encoding that might be
    inappropriate; a human should take a look.
  -->

  <xsl:output method="text"/>
  
  <xsl:template name="xsl:initial-template" match="/">
    <xsl:apply-templates select="//attDef//desc/*[ not(*)  and  string-length(.) gt 1  and  not( contains( normalize-space(.), ' ') ) ]"/>
    <xsl:text>&#x0A;</xsl:text>
  </xsl:template>

  <xsl:template match="*">
    <xsl:variable name="myGI" select="name(.)"/>
    <xsl:variable name="myContent" select="normalize-space(.)"/>
    <xsl:variable name="myLang" select="ancestor-or-self::*[@xml:lang][1]/@xml:lang"/>
    <xsl:variable name="IamIn" select="ancestor::attDef/ancestor::*[@ident][1]/@ident
                                     ||'/@'
                                     ||ancestor::attDef/@ident"/>
    <xsl:variable name="values" select="ancestor::attDef//valItem/@ident"/>
    <xsl:if test="$myContent = $values">
      <xsl:sequence select="'&#x0A;'
                          ||$myGI
                          ||' in '
                          ||$myLang
                          ||' description of '
                          ||$IamIn
                          ||' is &quot;'
                          ||$myContent
                          ||'&quot; which matches possible value.'"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Worth noting that these are in a variety of languages², and thus whoever is assigned this ticket is not likely to be able to fix them all on their own. More likely they will be able to fix some, but will have to reach out to native speakers to fix the others.

¹ The count of matches by enclosing element, as of today, is:

     20 val
     16 att
      3 ident
      1 term

² The count of matches by language of the description, as of today, is:

     11 en
      8 ko
      6 zh-TW
      5 ja
      4 es
      4 fr
      2 it
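
For anyone who wants to regenerate the two tallies in the footnotes after rerunning the stylesheet, here is a minimal Python sketch. It assumes the report lines look exactly the way the `xsl:sequence` in the stylesheet builds them. The `tally` function and the `SAMPLE` text are made up for illustration; the sample is not real output from p5.xml.

```python
import re
from collections import Counter

# Shape of one report line, as built by the xsl:sequence in the stylesheet:
#   GI in LANG description of ELEMENT/@ATTR is "CONTENT" which matches possible value.
LINE = re.compile(
    r'^(?P<gi>\S+) in (?P<lang>\S*) description of (?P<where>\S+) '
    r'is "(?P<content>.*)" which matches possible value\.$'
)

def tally(report):
    """Return (counts by element name, counts by language) for a report."""
    by_gi, by_lang = Counter(), Counter()
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            continue  # blank line, or something that is not a report line
        by_gi[m["gi"]] += 1
        by_lang[m["lang"]] += 1
    return by_gi, by_lang

# Hypothetical sample in the stylesheet's output format (NOT real p5.xml results).
SAMPLE = """
val in en description of gap/@reason is "illegible" which matches possible value.
val in fr description of gap/@reason is "illegible" which matches possible value.
att in en description of someElement/@someAttr is "something" which matches possible value.
"""

by_gi, by_lang = tally(SAMPLE)
for counts in (by_gi, by_lang):
    # Print in roughly the same "count name" layout as the footnotes.
    for name, n in counts.most_common():
        print(f"{n:7d} {name}")
    print()
```

To use it on real output, save the stylesheet's result to a file and pass that file's contents to `tally` instead of `SAMPLE`.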