From 8e6ab25390b3a3d38ea3e33343142dc2a6d8ea7e Mon Sep 17 00:00:00 2001
From: nilinykh
Date: Mon, 27 May 2024 17:49:45 +0200
Subject: [PATCH] workshop abstracts update

---
 .../language-and-vision-workshop.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/events/language-and-vision-workshop/language-and-vision-workshop.md b/events/language-and-vision-workshop/language-and-vision-workshop.md
index dbc10f4..daef7fd 100644
--- a/events/language-and-vision-workshop/language-and-vision-workshop.md
+++ b/events/language-and-vision-workshop/language-and-vision-workshop.md
@@ -64,6 +64,9 @@ The day after the workshop, we will have a doctoral thesis defense that everyone
 
 13:15 - 13:45: [Jörg Tiedemann](https://researchportal.helsinki.fi/en/persons/jörg-tiedemann), University of Helsinki
 
+ * **Title**: Releasing the MAMMOTH - a framework for modular NLP
+ * **Abstract**: Neural language models have grown in size and importance over the past years. We address two challenging aspects of NLP: support for a wide variety of languages and the runtime efficiency of such models. We focus on encoder-decoder models and on modular architectures that balance task-specific components against parameter sharing. In particular, we want to achieve effective cross-lingual transfer learning while keeping language-specific modules that can operate independently. The latter is important for efficient inference, reducing computational costs and energy consumption at runtime, a crucial concern for modern NLP. Our toolkit, MAMMOTH, is a flexible framework for training various types of modular architectures, making it possible to systematically compare different approaches, also beyond machine translation and single modalities.
+
 13:45 - 14:15: [Ece Takmaz](https://ecekt.github.io), Utrecht University (online)
 
  * **Title**: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes
@@ -71,6 +74,10 @@
 
 14:15 - 14:45: [Carina Silberer](https://sites.google.com/view/carinasilberer/home), University of Stuttgart
 
+ * **Title**: Multimodal Knowledge Learning of Actions and Everyday Procedures
+ * **Abstract**: In order for us to instruct and interact with machines in everyday life using natural language, they need to be able to understand and model procedural tasks. This ability is therefore relevant for the fields of NLP, human-computer interaction and robotics, as well as for multimodal machine learning in general. Despite its relevance, multimodal (visual-linguistic, VL) modelling of procedures, i.e. the task of learning and understanding procedures from language and visual data, is still a challenge for current VL models. This talk focuses on crucial aspects that underlie the modelling of everyday procedural tasks from visual-linguistic data. In particular, I will present past and ongoing work on (i) the commonsense types of "events" and "actions", which have proven to be very difficult for current VL systems, (ii) affordance learning, i.e. modelling the actions that an object offers to individuals in a given environment, (iii) predicting the changes in object state caused by performing an action on objects, and (iv) reasoning about the sequential aspect of procedures in terms of the relationships between individual steps towards a task goal, in particular the tasks of visual goal-step inference and identifying optional and interchangeable steps.
+
+
 14:45 - 15:00: Coffee break
 
 15:00 - 15:30: [Desmond Elliott](https://elliottd.github.io), University of Copenhagen
@@ -79,8 +86,14 @@
 
 15:30 - 16:00: [Mario Giulianelli](https://glnmario.github.io), ETH Zürich (online)
 
+ * **Title**: Measuring utterance uncertainty and predictability via simulation of contextually plausible alternatives
+ * **Abstract**: Viewing linguistic communication as information transmission between cognitive agents, successful language production can be understood as an act of reducing the uncertainty over future states that a comprehender may be anticipating. When an individual utters a sentence, they narrow down the comprehender’s expectations, and they do so by an amount proportional to the contextual predictability of the utterance. I will discuss two recent studies that demonstrate how we can empirically estimate utterance uncertainty and predictability by simulating potential upcoming linguistic contributions with neural text generators. The first study introduces a statistical framework that quantifies utterance uncertainty as production variability and evaluates how well language generators align with the production variability observed in humans. We find that different types of production tasks exhibit distinct levels of lexical, syntactic, and semantic variability, and that neural text generators generally achieve satisfactory calibration of uncertainty. In the second study, we use this statistical framework to define a novel measure of utterance predictability, which we term information value. Information value quantifies predictability as the distance from contextually plausible alternatives, and it offers advantages over traditional measures by disentangling different dimensions of uncertainty and being less influenced by surface-form competition. Psycholinguistic experiments demonstrate that information value is a superior predictor of utterance acceptability in written and spoken dialogue compared to aggregates of token-level surprisal, and that it complements surprisal in predicting eye-tracked reading times.
+
 16:00 - 16:30: [Bill Noble](https://winobes.github.io), University of Gothenburg
 
+ * **Title**:
+ * **Abstract**:
+
 16:30: Closing