Evaluation and Calibration of BERT's Handling of the Arabic Language for Measuring Semantic Similarity Between Sentences

doi:10.21608/mjoms.2023.250586.1134

Abstract

This research evaluates and calibrates the sentence similarity tool built on BERT (developed by Google), which is heavily relied upon in natural language processing research, particularly for improving machine translation output. It does so by tracking and analyzing the tool's outputs, then summarizing the results as statistics that show how accurately this essential tool handles the language.

To this end, a parallel Arabic-English corpus was used. A random sample from the English side of the corpus was translated into Arabic using Google Translate, and the translations were aligned with the corresponding sentences in the Arabic corpus. The aligned pairs of Arabic sentences (Patterns) were then fed into BERT to measure their semantic similarity using the Sentence similarity tool.
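
As a rough illustration of how an aligned sentence pair can be scored, the sketch below compares two sentence vectors by cosine similarity. This is an assumption: the abstract does not state which scoring function the Sentence similarity tool uses, and cosine similarity is only a common choice for embedding-based tools. The names `score_pairs`, `toy_embed`, and `TOY_VECTORS` are hypothetical, and the three-dimensional vectors merely stand in for real BERT sentence embeddings.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def score_pairs(pairs, embed):
    """Score each (reference, machine_translation) pair.

    `embed` is any callable that maps a sentence to a vector. In practice it
    would wrap a BERT-based sentence encoder; here it is left abstract.
    """
    return [cosine_similarity(embed(ref), embed(mt)) for ref, mt in pairs]


# Hypothetical toy vectors standing in for sentence embeddings (not real data).
TOY_VECTORS = {
    "ref_1": [1.0, 0.0, 0.0],
    "mt_1": [1.0, 0.0, 0.0],   # same direction as ref_1
    "ref_2": [1.0, 0.0, 0.0],
    "mt_2": [0.0, 1.0, 0.0],   # orthogonal to ref_2
}


def toy_embed(sentence):
    """Look up the toy vector for a sentence key (illustration only)."""
    return TOY_VECTORS[sentence]


scores = score_pairs([("ref_1", "mt_1"), ("ref_2", "mt_2")], toy_embed)
```

Replacing `toy_embed` with a function that returns embeddings from an actual Arabic-capable encoder would leave the pair-scoring logic unchanged.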

Through practical application and analysis of BERT's outputs, the research identified the shortcomings that hinder the tool's performance on Arabic and compared these results with its performance on English. The comparison favored English: the tool's efficiency was about 65% for English, compared with 40% for Arabic.

The research proposes a solution for improving BERT's outputs on Arabic. The solution is based on the analysis of the study sample and on the most significant errors that the tool could not overcome and that reduced its efficiency.

Keywords