Lexical ambiguity in contextualized word embeddings: A case study of nominalizations

Varvara, Rossella; Salvadori, Justine; Huyghe, Richard

doi:10.1418/112743

Abstract

In this paper we investigate the extent to which contextualized word embeddings can encode lexical ambiguity. Specifically, we focus on nominalizations in French, which constitute an interesting case for the study of ambiguity because of their frequent polysemy and their relationship with polyfunctional morphological processes. Given a random sample of occurrences of 90 nouns, we compute for each word the pairwise cosine similarity (SelfSim) among its token embeddings, extracted from the pre-trained model FlauBERT, and we test it as a predictor of the degree of ambiguity of nominalizations. For the evaluation, we use a manual annotation of lexical ambiguity and test different annotation strategies: defining word senses with different semantic classifications and granularities, and annotating lexemes in isolation or based on a sample of tokens. Our findings contribute to the understanding of (i) the lexical semantic component of contextual embeddings, enhancing their interpretability, and (ii) aspects of lexical ambiguity related to derivational semantics and to the contextual variation of meaning.
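As an illustration of the SelfSim measure mentioned above, the following minimal sketch computes a pairwise cosine similarity score over the token embeddings of one word. It assumes that the pairwise similarities are averaged into a single score per word, which the abstract does not state explicitly. The function names `cosine` and `self_sim` and the toy two-dimensional vectors are hypothetical; in practice the vectors would be token embeddings extracted from FlauBERT.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def self_sim(token_embeddings):
    """Mean cosine similarity over all unordered pairs of occurrences.

    Assumption: averaging over pairs is our choice for this sketch,
    not a detail confirmed by the abstract.
    """
    pairs = list(combinations(token_embeddings, 2))
    if not pairs:
        raise ValueError("need at least two occurrences of the word")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy stand-ins for contextual embeddings of two hypothetical nouns.
stable_noun = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]    # same direction in every context
variable_noun = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # direction shifts across contexts

stable_score = self_sim(stable_noun)
variable_score = self_sim(variable_noun)
```

Under this reading, a lower score indicates more contextual variation among a word's occurrences, which is the property tested here as a predictor of ambiguity.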
