On the proto-role properties inferred by transformer language models
Abstract
In recent years, Language Models have taken the Computational Linguistics community by storm. Nevertheless, very little is known about the kind of linguistic knowledge these systems are able to infer from the input they receive. In this work we address whether, and to what extent, different architectures of different sizes are able to encode the semantic content of Dowty’s (1989, 1991) semantic proto-roles in the contextual embeddings they generate. Following Lebani & Lenci (2021) and Proietti et al. (2022), we test four different models by learning a linear mapping between the contextualized embeddings they generate and a semantic space built on the basis of the proto-role annotations collected by White et al. (2016). For each model, the embeddings produced by the learned mapping were tested against the manual annotation of a set of previously unseen verbs in context, and qualitatively investigated to assess to what extent they are able to model the semantic properties of the agent of verbs participating in the so-called causative alternation. All in all, our results not only extend to more Transformer Language Models previous findings showing that proto-role information is available in distributional semantic models, but also show that larger models are not necessarily better at modeling proto-role properties, in line with recent psycholinguistic evidence.
Keywords
- thematic proto-roles
- semantic roles
- distributional semantic models
- contextual word embeddings
- argument alternations