EVALUATING LARGE LANGUAGE MODELS FOR MACHINE TRANSLATION ON INDIAN LANGUAGES.

Type

thesis

Grantor

University of Wisconsin-Milwaukee

Abstract

This study assesses how well Large Language Models (LLMs), such as LLaMA-v3 and GPT-3.5, perform when translating from English into Indian languages. For three Indian languages, translations from English were evaluated by humans and found to be fairly good. Automated metrics, including BLEU, METEOR, and BERTScore, were then assessed by comparing their scores against the human evaluations. LLM translations from English into eleven Indian languages were subsequently evaluated automatically using the Samanantar dataset. The results show that while LLaMA offers clear strengths in fluency and semantic accuracy, LLMs remain prone to errors involving language-specific conventions. The study also examined the impact of prompt engineering on improving translation quality.
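To illustrate the kind of automated metric the abstract compares against human judgments, the following is a minimal, self-contained sketch of sentence-level BLEU: clipped n-gram precision with add-one smoothing and a brevity penalty. This is a simplified stand-in for the standard corpus-level formulation, not the exact implementation used in the thesis.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(reference, candidate, max_n=4):
    """Simplified sentence-level BLEU (illustrative, not the thesis setup):
    geometric mean of clipped n-gram precisions (n = 1..max_n), with
    add-one smoothing, multiplied by a brevity penalty."""
    ref, cand = reference.split(), candidate.split()
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(ref, n))
        cand_counts = Counter(ngrams(cand, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(len(cand) - n + 1, 0)
        # Add-one smoothing so one empty n-gram order doesn't zero the score.
        precisions.append((overlap + 1) / (total + 1))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean
```

An identical candidate and reference score 1.0, and scores fall as n-gram overlap drops; production evaluations typically use a tokenizer-aware implementation such as sacreBLEU rather than whitespace splitting.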
