Document Type

Journal Article

Publication Title

Neurocomputing

Volume

610

Publisher

Elsevier

School

School of Science

RAS ID

71855

Comments

Śliwiak, P., & Shah, S. A. A. (2024). Text-to-text generative approach for enhanced complex word identification. Neurocomputing, 610. https://doi.org/10.1016/j.neucom.2024.128501

Abstract

This paper presents a novel approach to the Complex Word Identification (CWI) task using a text-to-text generative model. The CWI task involves identifying complex words in text, which is a challenging Natural Language Processing task. To our knowledge, this is the first attempt to frame the CWI problem in a text-to-text context. In this work, we propose a new methodology that leverages the power of the Transformer model to evaluate the complexity of words in binary and probabilistic settings. We also introduce a novel CWI dataset, which consists of 62,200 phrases, both complex and simple. We train and fine-tune the proposed model on this dataset and evaluate its performance on separate test sets across three different domains. Our experimental results demonstrate the effectiveness of the proposed approach compared to state-of-the-art methods.

DOI

10.1016/j.neucom.2024.128501

Creative Commons License

Creative Commons Attribution 4.0 License
