Word embedding models have become commonplace in a wide range of NLP applications. In order to train and use the best possible models, accurate evaluation is needed. For intrinsic evaluation of word embedding models, analogy evaluation sets have been shown to be a good quality estimator. We introduce an Icelandic adaptation of a large analogy dataset, BATS, evaluate it on three different word embedding models, and show that our evaluation set is well suited to measuring the capabilities of such models.
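Analogy evaluation of the kind the abstract describes is commonly run with the 3CosAdd method: for a question "a is to b as c is to ?", the predicted word is the vocabulary entry whose vector is closest, by cosine similarity, to b - a + c. A minimal sketch with a toy embedding table (all words and vector values are hypothetical; a real BATS-style evaluation would use pretrained embeddings and the dataset's question files):

```python
import numpy as np

# Toy embedding table for illustration only -- real evaluations load
# pretrained vectors (e.g. word2vec or fastText) over a full vocabulary.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.5, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def answer_analogy(a, b, c, emb):
    """Solve 'a is to b as c is to ?' with 3CosAdd: return the word
    (excluding the three query words) closest to b - a + c."""
    target = emb[b] - emb[a] + emb[c]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(answer_analogy("man", "king", "woman", emb))  # prints: queen
```

Accuracy on an analogy test set is then simply the fraction of questions for which the predicted word matches the expected answer.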
|Title of host publication||2022 Language Resources and Evaluation Conference, LREC 2022|
|Editors||Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Jan Odijk, Stelios Piperidis|
|Publisher||European Language Resources Association (ELRA)|
|Number of pages||8|
|Publication status||Published - 2022|
|Event||13th Language Resources and Evaluation Conference, LREC 2022 - Marseille, France|
|Duration||20 Jun 2022 → 25 Jun 2022|
|Name||2022 Language Resources and Evaluation Conference, LREC 2022|
|Conference||13th Language Resources and Evaluation Conference, LREC 2022|
|Period||20/06/22 → 25/06/22|
Bibliographical note: Publisher Copyright:
© European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0.
- analogy test set
- word embeddings