EVALUATING LARGE LANGUAGE MODELS FOR MACHINE TRANSLATION ON INDIAN LANGUAGES
Type
thesis
Grantor
University of Wisconsin-Milwaukee
Abstract
This study assesses how well Large Language Models (LLMs) such as LLaMA-v3 and GPT-3.5 perform when translating English into Indian languages. For three Indian languages, translations from English were evaluated by human judges and found to be of fairly good quality. Automated metrics, including BLEU, METEOR, and BERTScore, were then assessed by comparing their scores against the human evaluations. LLM translations from English into eleven Indian languages were subsequently evaluated automatically using the Samanantar dataset. The results show that although LLaMA offers clear advantages in fluency and semantic accuracy, LLMs remain prone to errors involving language-specific conventions. The study also examined the impact of prompt engineering on improving translation quality.
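To make the metric-based evaluation described above concrete, the sketch below implements a simplified single-reference, sentence-level BLEU score (n-gram precision with add-1 smoothing and a brevity penalty). This is an illustrative approximation, not the thesis's actual evaluation pipeline; published results typically use standard tooling such as sacreBLEU or NLTK, and the function names here are ours.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of smoothed
    1..max_n-gram precisions, times a brevity penalty.
    A hypothesis too short to form max_n-grams scores 0 here."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(ref, n))
        hyp_counts = Counter(ngrams(hyp, n))
        # Clipped n-gram matches against the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = len(hyp) - n + 1
        if total <= 0:
            return 0.0
        # Add-1 smoothing keeps the geometric mean nonzero.
        precisions.append((overlap + 1) / (total + 1))
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 1.0, and any n-gram mismatch lowers the score, which is the behavior the human-versus-automatic comparison in the study relies on.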