Information and abstract
Keywords: Thematic proto-roles, semantic roles, distributional semantic models, word embeddings
Distributional semantics represents words as multidimensional vectors recording their statistical distribution in context. Notwithstanding the wide use of this approach in fields as distant as Natural Language Processing, psycholinguistic modeling and semantic analysis, relatively little work has focused on characterizing the semantic information encoded in these vectors, especially for verbs. Here we investigate whether and to what extent distributional vectors encode the semantic content of Dowty’s semantic proto-roles, which can be characterized as the set of entailment relations that an argument receives by virtue of its role in the event described by a predicate (Dowty 1989, 1991). We created several linear mappings between various kinds of static embeddings and a semantic space built on the proto-role annotations collected by White et al. (2016). Our results show that, to a certain extent, proto-role information is available in distributional models, and that a linear mapping can be used to infer the semantic characteristics of the arguments of novel verbs, thus testing the possibility of developing large-scale models able to extract the semantic properties of a wide inventory of verbs. Finally, we report a qualitative analysis in which we discuss which entailment relations our technique associates with a few semantic verb classes whose semantic roles are notoriously difficult to describe.
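As a rough illustration of what a linear mapping from an embedding space to a proto-role property space can look like, the following minimal sketch fits a ridge-regularized least-squares map and applies it to an unseen vector. It is not the implementation used in this work: the data are synthetic, and the dimensions, the regularization strength, and the names `fit_linear_map` and `predict_properties` are assumptions chosen only for the example.

```python
import numpy as np


def fit_linear_map(X, Y, ridge=0.0):
    """Return W minimizing ||X W - Y||^2 + ridge * ||W||^2 (closed form)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)


def predict_properties(W, x):
    """Map an embedding (or a batch of embeddings) into the property space."""
    return x @ W


# Synthetic stand-ins (assumption): rows of X_train play the role of static
# embeddings, rows of Y_train the role of proto-role property scores.
rng = np.random.default_rng(0)
n_train, emb_dim, n_props = 200, 20, 5
X_train = rng.normal(size=(n_train, emb_dim))
W_true = rng.normal(size=(emb_dim, n_props))
Y_train = X_train @ W_true + 0.01 * rng.normal(size=(n_train, n_props))

# Fit the map on the "seen" items, then apply it to a held-out vector,
# mimicking inference for a verb absent from the training annotations.
W = fit_linear_map(X_train, Y_train, ridge=1e-3)
x_new = rng.normal(size=emb_dim)
y_pred = predict_properties(W, x_new)
```

In this toy setting the target really is a linear function of the input, so the map is recovered almost exactly; with real embeddings and annotations the fit would only be approximate, which is what makes held-out evaluation informative.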