Commit 2ba28c6 (1 parent: eb5a2bb). Showing 1 changed file with 113 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
<?xml version='1.0' encoding='UTF-8'?> | ||
<collection id="2024.tlt"> | ||
<volume id="1" ingest-date="2025-01-07" type="proceedings"> | ||
<meta> | ||
<booktitle>Proceedings of the 22nd Workshop on Treebanks and Linguistic Theories (TLT 2024)</booktitle> | ||
<editor><first>Daniel</first><last>Dakota</last></editor> | ||
<editor><first>Sarah</first><last>Jablotschkin</last></editor> | ||
<editor><first>Sandra</first><last>Kübler</last></editor> | ||
<editor><first>Heike</first><last>Zinsmeister</last></editor> | ||
<publisher>Association for Computational Linguistics</publisher> | ||
<address>Hamburg, Germany</address> | ||
<month>December</month> | ||
<year>2024</year> | ||
<url hash="3df81f53">2024.tlt-1</url> | ||
<venue>tlt</venue> | ||
</meta> | ||
<frontmatter> | ||
<url hash="e8a963df">2024.tlt-1.0</url> | ||
<bibkey>tlt-2024-1</bibkey> | ||
</frontmatter> | ||
<paper id="1"> | ||
<title>Developing the <fixed-case>E</fixed-case>gyptian-<fixed-case>UJ</fixed-case>aen Treebank</title> | ||
<author><first>Roberto Antonio</first><last>Díaz Hernández</last></author> | ||
<author><first>Marco Carlo</first><last>Passarotti</last></author> | ||
<pages>1-10</pages> | ||
<abstract>This paper presents preliminary results of the development of the Egyptian-UJaen treebank, the first dependency treebank created for pre-Coptic Egyptian in Universal Dependencies. It describes the current state of the treebank, explains the approach adopted for the morphosyntactic annotation and discusses some issues concerning the adoption of the CoNLL-U format for the annotation of Egyptian texts. This treebank will surely become a useful linguistic tool for understanding the synchronic and diachronic use of pre-Coptic Egyptian.</abstract> | ||
<url hash="11f0af5b">2024.tlt-1.1</url> | ||
<bibkey>antonio-diaz-hernandez-carlo-passarotti-2024-developing</bibkey> | ||
</paper> | ||
<paper id="2"> | ||
<title>Symmetric Dependency Structure of Coordination: Crosslinguistic Arguments from Dependency Length Minimization</title> | ||
<author><first>Adam</first><last>Przepiórkowski</last></author> | ||
<author><first>Magdalena</first><last>Borysiak</last></author> | ||
<author><first>Adam</first><last>Okrasiński</last></author> | ||
<author><first>Bartosz</first><last>Pobożniak</last></author> | ||
<author><first>Wojciech</first><last>Stempniak</last></author> | ||
<author><first>Kamil</first><last>Tomaszek</last></author> | ||
<author><first>Adam</first><last>Głowacki</last></author> | ||
<pages>11-17</pages> | ||
<abstract>The aim of this paper is to replicate and extend recent treebank-based considerations regarding the syntactic structure of coordination. Overall, we confirm the previous results that, given the principle of Dependency Length Minimization, corpus data suggest that the structure of coordination is symmetric. While previous work was based on 2 English datasets, we extend the investigation to 3 more English datasets, 3 Polish datasets, and UD corpora for a number of diverse languages. The results confirm the symmetric structure of coordination, but they also make it possible to question some of the previous findings regarding the exact symmetric structure of coordination.</abstract> | ||
<url hash="7cf429d7">2024.tlt-1.2</url> | ||
<bibkey>przepiorkowski-etal-2024-symmetric</bibkey> | ||
</paper> | ||
<paper id="3"> | ||
<title>A First Look at the <fixed-case>U</fixed-case>garitic Poetic Text Corpus</title> | ||
<author><first>Tillmann</first><last>Dönicke</last></author> | ||
<author><first>Clemens</first><last>Steinberger</last></author> | ||
<author><first>Max-Ferdinand</first><last>Zeterberg</last></author> | ||
<author><first>Noah</first><last>Krill</last></author> | ||
<pages>18-26</pages> | ||
<abstract>For the Ugaritic poetic texts there is currently no digital corpus including extensive philological and poetological annotations. Within the research project “Edition des ugaritischen poetischen Textkorpus” (EUPT), these texts are digitised and provided as an online-accessible corpus. This paper briefly introduces the project and outlines the principles of the data model. The focus is on the different annotation levels and their connection with each other.</abstract> | ||
<url hash="df345b5b">2024.tlt-1.3</url> | ||
<bibkey>donicke-etal-2024-first</bibkey> | ||
</paper> | ||
<paper id="4"> | ||
<title><fixed-case>L</fixed-case>ux<fixed-case>B</fixed-case>ank: The First <fixed-case>U</fixed-case>niversal <fixed-case>D</fixed-case>ependency Treebank for <fixed-case>L</fixed-case>uxembourgish</title> | ||
<author><first>Alistair</first><last>Plum</last></author> | ||
<author><first>Caroline</first><last>Döhmer</last></author> | ||
<author><first>Emilia</first><last>Milano</last></author> | ||
<author><first>Anne-Marie</first><last>Lutgen</last></author> | ||
<author><first>Christoph</first><last>Purschke</last></author> | ||
<pages>27-36</pages> | ||
<url hash="a6567ad2">2024.tlt-1.4</url> | ||
<bibkey>plum-etal-2024-luxbank</bibkey> | ||
</paper> | ||
<paper id="5"> | ||
<title>Building a <fixed-case>U</fixed-case>niversal <fixed-case>D</fixed-case>ependencies Treebank for <fixed-case>G</fixed-case>eorgian</title> | ||
<author><first>Irina</first><last>Lobzhanidze</last></author> | ||
<author><first>Erekle</first><last>Magradze</last></author> | ||
<author><first>Svetlana</first><last>Berikashvili</last></author> | ||
<author><first>Anzor</first><last>Gozalishvili</last></author> | ||
<author><first>Tamar</first><last>Jalaghonia</last></author> | ||
<pages>37-47</pages> | ||
<abstract>This paper presents the design and development of the Georgian Syntactic Treebank within the Universal Dependencies (UD) framework, addressing the unique morphosyntactic challenges of Georgian, a Kartvelian language. We describe the methodology for selecting and annotating 3,013 sentences from Wiki, mapping existing tagsets to the UD scheme, and converting data into the CoNLL-U format. The paper also details the training of a UDPipe model using this preliminary treebank.</abstract> | ||
<url hash="c02a95a2">2024.tlt-1.5</url> | ||
<bibkey>lobzhanidze-etal-2024-building</bibkey> | ||
</paper> | ||
<paper id="6"> | ||
<title>Introducing Shallow Syntactic Information within the Graph-based Dependency Parsing</title> | ||
<author><first>Nikolay</first><last>Paev</last></author> | ||
<author><first>Kiril</first><last>Simov</last></author> | ||
<author><first>Petya</first><last>Osenova</last></author> | ||
<pages>48-56</pages> | ||
<abstract>The paper presents a new BERT model, fine-tuned for parsing of Bulgarian texts. This model is extended with a new neural network layer in order to incorporate shallow syntactic information during the training phase. The results show statistically significant improvement over the baseline. Thus, the addition of syntactic knowledge - even partial - makes the model better. Also, some error analysis has been conducted on the results from the parsers. Although the architecture has been designed and tested for Bulgarian, it is also scalable for other languages. This scalability was shown here with some experiments and evaluation on an English treebank with a comparable size.</abstract> | ||
<url hash="df345b5b">2024.tlt-1.6</url> | ||
<bibkey>paev-etal-2024-introducing</bibkey> | ||
</paper> | ||
<paper id="7"> | ||
<title>A Multilingual Parallel Corpus for Coreference Resolution and Information Status in the Literary Domain</title> | ||
<author><first>Andrew</first><last>Dyer</last></author> | ||
<author><first>Ruveyda Betul</first><last>Bahceci</last></author> | ||
<author><first>Maryam</first><last>Rajestari</last></author> | ||
<author><first>Andreas</first><last>Rouvalis</last></author> | ||
<author><first>Aarushi</first><last>Singhal</last></author> | ||
<author><first>Yuliya</first><last>Stodolinska</last></author> | ||
<author><first>Syahidah</first><last>Asma Umniyati</last></author> | ||
<author><first>Helena</first><last>Rodrigues Menezes de Oliveira Vaz</last></author> | ||
<pages>57-66</pages> | ||
<abstract>Information status — the newness or givenness of referents in discourse — is known to affect the production of language at many different levels. At the morphosyntactic level, information status gives rise to special word orders, elisions, and other phenomena that challenge the notion that morphosyntax can be considered independent of discourse context. Though there are many language-specific corpora annotated for information status and its related phenomena, coreference and anaphora resolution, what is not available at present is a cross-lingually consistently annotated corpus or annotation scheme that would allow for comparative study of these phenomena across many diverse languages. In this paper we present our work to build such a resource. We are annotating a parsed, parallel corpus of prose in many languages for information status and coreference resolution, so that like-for-like cross-lingual comparisons can be made at the intersection of discourse and syntax. Our corpus can and will be used bot</abstract> | ||
<url hash="a6567ad2">2024.tlt-1.7</url> | ||
<bibkey>dyer-etal-2024-multilingual</bibkey> | ||
</paper> | ||
<paper id="8"> | ||
<title>Dependency Structure of Coordination in Head-final Languages: a Dependency-Length-Minimization-Based Study</title> | ||
<author><first>Wojciech</first><last>Stempniak</last></author> | ||
<pages>67-77</pages> | ||
<abstract>There is no single accepted model of the dependency structure of coordination. Universal Dependencies (UD, De Marneffe et al. 2021) enforces in its corpora an asymmetrical model privileging the coordination’s first conjunct as a standard. Kanayama et al. (2018) criticize that approach stating that this model is incompatible with the grammatical structure of head-final languages. Recent research (Przepiórkowski and Woźniak 2023, Przepiórkowski et al. 2024a) provides a DLM-based argument for the symmetrical models of the dependency structure of English coordination. This paper shows the result of the analysis of coordinations found in UD corpora of two head-final languages, namely Korean and Turkish. Based on the analysis of coordinations and theoretical arguments, an alternative approach to the dependency structure of coordination in head-final languages is suggested.</abstract> | ||
<url hash="c02a95a2">2024.tlt-1.8</url> | ||
<bibkey>stempniak-2024-dependency</bibkey> | ||
</paper> | ||
</volume> | ||
</collection> |