Information and abstract
Keywords: Readability model, French as a foreign language, Beacco referentials, deep learning, cognitive and pedagogical features
In this paper, we report three experiments that evaluate a variety of feature sets and models, with the aim of developing a new readability model for French as a Foreign Language (FFL) materials. The purpose of this model is to predict the level of a text on a scale widely used in foreign language teaching, namely the six proficiency levels defined in 2001 in the Common European Framework of Reference for Languages (CEFR). Our new model has two main advantages. Firstly, it is the first readability model for FFL based on deep learning, and it shows substantially better accuracy and generalizability than the previous state-of-the-art work. Secondly, it will be freely available through a website for the community of FFL teachers to use. We also investigated new cognitive and pedagogical features, but they failed to outperform existing feature sets. Our best-performing readability model, obtained by fine-tuning BERT, outperforms the state-of-the-art model by 8 percentage points in accuracy and adjacent accuracy.
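To make the two reported metrics concrete, the following minimal sketch computes accuracy and adjacent accuracy over CEFR labels. It assumes the definition of adjacent accuracy commonly used in readability research, namely the proportion of texts whose predicted level is at most one level away from the gold level; the abstract itself does not spell out the definition, and the function name `accuracy_scores` is hypothetical.

```python
# Illustrative sketch only: assumes adjacent accuracy counts a prediction
# as correct when it is at most one CEFR level away from the gold label.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def accuracy_scores(gold, predicted):
    """Return (accuracy, adjacent_accuracy) for two lists of CEFR labels."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    # Map each level to its position on the six-level scale.
    rank = {level: index for index, level in enumerate(CEFR_LEVELS)}
    # Distance in levels between each gold label and its prediction.
    distances = [abs(rank[g] - rank[p]) for g, p in zip(gold, predicted)]
    accuracy = sum(d == 0 for d in distances) / len(distances)
    adjacent_accuracy = sum(d <= 1 for d in distances) / len(distances)
    return accuracy, adjacent_accuracy


if __name__ == "__main__":
    gold = ["A1", "B1", "C2", "B2"]
    predicted = ["A1", "B2", "A2", "B2"]
    print(accuracy_scores(gold, predicted))
```

Under this definition adjacent accuracy is always at least as high as accuracy, since every exact match also lies within one level of the gold label.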