I fine-tuned Meta’s LLaMA 3.1-8B model to translate medical text from English to Nepali, using only a free Google Colab GPU (Tesla T4).
The result?
An 8.9× improvement in BLEU score over zero-shot translation, turning an unusable model into something genuinely helpful for 30 million Nepali speakers.
Key highlights:
- BLEU: 11.63 (vs 1.31 zero-shot)
- Trainable params: only 0.52% of the model
- Model size: 5.7GB (runs on a laptop)
- Training time: ~30 hours
- Cost: $0
This is a story about access, efficiency, and why cutting-edge medical AI doesn’t have to be locked behind massive budgets.
The Problem
- Most medical resources are written in English
- 30M+ Nepali speakers face language barriers in healthcare
- Generic translators fail on medical terminology
- Professional medical translation is slow and expensive
The Solution
A domain-specific AI medical translator built with:
- Model: LLaMA 3.1-8B-Instruct
- Technique: LoRA (0.52% trainable parameters)
- Compression: 4-bit NF4 quantization
- Data: 58,682 English–Nepali medical sentence pairs
This approach enables efficient training, low memory usage, and real-world deployability.
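For readers who want to see what that setup looks like in code, here is a minimal sketch of loading the model in 4-bit NF4 and attaching LoRA adapters with Unsloth (the training framework used below). The checkpoint name and hyperparameters are assumptions, not the exact script, though rank-16 adapters on all attention and MLP projections of an 8B Llama model do come out to roughly 0.52% trainable parameters.

```python
# Minimal sketch, not the exact training script. Checkpoint name and
# LoRA hyperparameters are assumptions.
from unsloth import FastLanguageModel

# Load LLaMA 3.1-8B-Instruct with 4-bit NF4 quantization (bitsandbytes under the hood)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed checkpoint
    max_seq_length=2048,
    load_in_4bit=True,  # NF4 quantization keeps the base weights at a few GB
)

# Attach LoRA adapters; only these small low-rank matrices are trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,               # assumed rank; r=16 on all projections ≈ 0.52% trainable params
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",  # helps the 8B model fit on a 16 GB T4
)
```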
How It Was Built
Data
- Health forums (patient Q&A)
- Pregnancy & maternal health datasets
- Professional Nepali health articles
- Cleaned, normalized, and medically filtered
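The exact sources and file formats aren't part of this write-up, so the following is only a sketch of the final preprocessing step: pairing each cleaned English sentence with its Nepali translation in an instruction-style prompt. The file name, field names (`en` / `ne`), length filter, and prompt template are all assumptions.

```python
# Sketch of turning English-Nepali pairs into instruction-style training text.
# The dataset path and column names ("en", "ne") are placeholders.
import json
from datasets import Dataset

PROMPT = (
    "Translate the following medical text from English to Nepali.\n\n"
    "English: {en}\n"
    "Nepali: {ne}"
)

def load_pairs(path):
    """Yield cleaned sentence pairs from a JSONL file with 'en'/'ne' fields."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            en, ne = row["en"].strip(), row["ne"].strip()
            # Basic filtering: drop empty or extremely long segments
            if 0 < len(en) <= 512 and 0 < len(ne) <= 512:
                yield {"text": PROMPT.format(en=en, ne=ne)}

dataset = Dataset.from_generator(lambda: load_pairs("medical_pairs.jsonl"))
```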
Training
- GPU: Tesla T4 (Colab free tier)
- Epochs: 1 (7,335 steps)
- Optimizer: 8-bit AdamW
- Framework: Unsloth (2× faster training)
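A hedged sketch of what the training loop can look like with TRL's SFTTrainer on top of the Unsloth-loaded model (argument names vary slightly across trl versions). The batch size and learning rate below are assumptions, though an effective batch of 8 is consistent with 58,682 pairs yielding 7,335 steps in one epoch.

```python
# Minimal training sketch (assumed hyperparameters; trl API details vary by version).
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,                 # the LoRA-wrapped 4-bit model from above
    tokenizer=tokenizer,
    train_dataset=dataset,       # the formatted English-Nepali pairs
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch of 8 -> ~7,335 steps per epoch
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,                       # the T4 has no bfloat16 support
        optim="adamw_8bit",              # 8-bit AdamW keeps optimizer state small
        logging_steps=50,
        save_steps=500,                  # checkpoint often; free Colab sessions can disconnect
        output_dir="outputs",
    ),
)
trainer.train()
```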
Results
| Metric | Zero-Shot | Fine-Tuned |
|---|---|---|
| BLEU | 1.31 | 11.63 |
| ChrF++ | 16.35 | 34.65 |
Zero-shot translation was unusable; fine-tuning made the model practically useful.
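Both metrics can be computed with the sacrebleu library; a minimal scoring sketch (the hypothesis and reference lists are placeholders):

```python
# Scoring sketch with sacrebleu; hypotheses/references are placeholder lists.
import sacrebleu

hypotheses = ["..."]   # model outputs on the held-out test set
references = ["..."]   # gold Nepali translations, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)  # word_order=2 => chrF++

print(f"BLEU: {bleu.score:.2f}  ChrF++: {chrf.score:.2f}")
```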
Example Translations
EN: Take two tablets after meals three times daily.
NE: दिनमा तीन पटक खाना पछि दुई ट्याब्लेट लिनुहोस्।
✔ Correct dosage
✔ Preserved medical terminology
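For reference, here is a sketch of how a translation like the one above can be generated from the fine-tuned model. The prompt wording and generation settings are assumptions, not the exact inference script.

```python
# Inference sketch with the fine-tuned model (prompt and settings are illustrative).
from unsloth import FastLanguageModel

FastLanguageModel.for_inference(model)  # switch Unsloth into its faster generation mode

messages = [
    {"role": "user",
     "content": "Translate the following medical text from English to Nepali.\n\n"
                "English: Take two tablets after meals three times daily.\n"
                "Nepali:"},
]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids=input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```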
Limitations
- ~3% critical errors on complex medical instructions
- No human clinical validation yet
- Not ready for unsupervised medical use