I-Arabic: Computational Attempts and Corpus Issues in Modern Arabic

يونس, نجوى

doi:10.21608/mjoms.2023.299689

Document type: Research and studies.

Author

نجوى يونس

Professor of Linguistics at the Arab Open University

Abstract

Applying computer-based methods such as natural language processing, machine learning, and corpus linguistics to Modern Arabic data raises many challenges. This paper addresses these challenges, the computational attempts made so far, and a proposed model: I-Arabic. One of the main challenges is the lack of large, high-quality language resources, such as text corpora, annotated data, and lexical resources. This scarcity stems from several factors, including the diversity of Arabic dialects and the limited availability of digitized Arabic texts. Another challenge is the complexity of Arabic morphology and syntax, which can pose difficulties for natural language processing algorithms. Arabic is a highly inflected language, with a rich system of prefixes, suffixes, and internal vowel changes that can affect the meaning and function of words. In addition, Arabic has a flexible word order and a complex system of grammatical agreement.
