Generative models and generative grammar
Abstract
In this short paper we present the results of four experiments assessing varying degrees of morphosyntactic and semantic competence in three very large language models (LLMs), namely davinci (GPT-3/ChatGPT), davinci-002, and davinci-003 (GPT-3.5 with different training options). We focused on (i) acceptability, (ii) complexity, and (iii) coherence judgments on 7-point Likert scales, and on (iv) syntactic development by means of a forced-choice task. The datasets are drawn from test sets released in NLP shared tasks or from linguistic tests. The results suggest that, despite a rather good overall performance, these LLMs cannot be considered competence models, since they qualify neither as descriptively nor as explanatorily adequate.
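As an illustration of how Likert-scale judgments of this kind can be elicited from a completion model, the following is a minimal sketch, not the authors' actual protocol: the prompt wording, the model identifier text-davinci-003 (OpenAI's public name for what the paper calls davinci-003), and the use of the legacy openai Completion API are all assumptions made here for illustration.

```python
import openai  # assumes the legacy openai-python (<1.0) Completion API

openai.api_key = "YOUR_API_KEY"

# Hypothetical prompt template; the paper's actual instructions may differ.
PROMPT = (
    "On a scale from 1 (completely unacceptable) to 7 (fully acceptable), "
    "rate the grammatical acceptability of the following sentence. "
    "Answer with a single digit.\n\n"
    "Sentence: {sentence}\nRating:"
)

def likert_acceptability(sentence: str, model: str = "text-davinci-003") -> int:
    """Elicit a 7-point Likert acceptability judgment from a completion model."""
    response = openai.Completion.create(
        model=model,
        prompt=PROMPT.format(sentence=sentence),
        max_tokens=1,     # a single digit is enough for the rating
        temperature=0,    # deterministic output for evaluation
    )
    return int(response["choices"][0]["text"].strip())

# Example query on a center-embedded sentence
print(likert_acceptability("The cat the dog chased meowed."))
```

With temperature set to 0, repeated queries return a stable rating, which makes the judgments comparable across models and test items.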
Keywords
- descriptive/explanatory adequacy
- linguistic testing
- very large language models