I fine-tuned Meta’s LLaMA 3.1-8B model to translate medical text from English to Nepali, using only a free Google Colab GPU (Tesla T4).
The result?
An 8.9× improvement in BLEU score over zero-shot translation, turning an unusable model into something genuinely helpful for 30 million Nepali speakers.
Key highlights:
- BLEU: 11.63 (vs 1.31 zero-shot)
- Trainable params: only 0.52% of the model
- Model size: 5.7GB (runs on a laptop)
- Training time: ~30 hours
- Cost: $0
This is a story about access, efficiency, and why cutting-edge medical AI doesn’t have to be locked behind massive budgets.
The Problem
- Most medical resources are written in English
- 30M+ Nepali speakers face language barriers in healthcare
- Generic translators fail on medical terminology
- Professional medical translation is slow and expensive
The Solution
A domain-specific AI medical translator built with:
- Model: LLaMA 3.1-8B-Instruct
- Technique: LoRA (0.52% trainable parameters)
- Compression: 4-bit NF4 quantization
- Data: 58,682 English–Nepali medical sentence pairs
This approach enables efficient training, low memory usage, and real-world deployability.
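For readers who want to see what that setup looks like in code, here is a minimal sketch of loading the model in 4-bit NF4 and attaching LoRA adapters with Unsloth (the training framework used below). The checkpoint name and hyperparameters are assumptions, not the exact script, though rank-16 adapters on all attention and MLP projections of an 8B Llama model do come out to roughly 0.52% trainable parameters.

```python
# Minimal sketch, not the exact training script. Checkpoint name and
# LoRA hyperparameters are assumptions.
from unsloth import FastLanguageModel

# Load LLaMA 3.1-8B-Instruct with 4-bit NF4 quantization (bitsandbytes under the hood)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed checkpoint
    max_seq_length=2048,
    load_in_4bit=True,  # NF4 quantization keeps the base weights at a few GB
)

# Attach LoRA adapters; only these small low-rank matrices are trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,               # assumed rank; r=16 on all projections ≈ 0.52% trainable params
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",  # helps the 8B model fit on a 16 GB T4
)
```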
How It Was Built
Data
- Health forums (patient Q&A)
- Pregnancy & maternal health datasets
- Professional Nepali health articles
- Cleaned, normalized, and medically filtered
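The exact sources and file formats aren't part of this write-up, so the following is only a sketch of the final preprocessing step: pairing each cleaned English sentence with its Nepali translation in an instruction-style prompt. The file name, field names (`en` / `ne`), length filter, and prompt template are all assumptions.

```python
# Sketch of turning English-Nepali pairs into instruction-style training text.
# The dataset path and column names ("en", "ne") are placeholders.
import json
from datasets import Dataset

PROMPT = (
    "Translate the following medical text from English to Nepali.\n\n"
    "English: {en}\n"
    "Nepali: {ne}"
)

def load_pairs(path):
    """Yield cleaned sentence pairs from a JSONL file with 'en'/'ne' fields."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            en, ne = row["en"].strip(), row["ne"].strip()
            # Basic filtering: drop empty or extremely long segments
            if 0 < len(en) <= 512 and 0 < len(ne) <= 512:
                yield {"text": PROMPT.format(en=en, ne=ne)}

dataset = Dataset.from_generator(lambda: load_pairs("medical_pairs.jsonl"))
```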
Training
- GPU: Tesla T4 (Colab free tier)
- Epochs: 1 (7,335 steps)
- Optimizer: 8-bit AdamW
- Framework: Unsloth (2× faster training)
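A hedged sketch of what the training loop can look like with TRL's SFTTrainer on top of the Unsloth-loaded model (argument names vary slightly across trl versions). The batch size and learning rate below are assumptions, though an effective batch of 8 is consistent with 58,682 pairs yielding 7,335 steps in one epoch.

```python
# Minimal training sketch (assumed hyperparameters; trl API details vary by version).
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,                 # the LoRA-wrapped 4-bit model from above
    tokenizer=tokenizer,
    train_dataset=dataset,       # the formatted English-Nepali pairs
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch of 8 -> ~7,335 steps per epoch
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,                       # the T4 has no bfloat16 support
        optim="adamw_8bit",              # 8-bit AdamW keeps optimizer state small
        logging_steps=50,
        save_steps=500,                  # checkpoint often; free Colab sessions can disconnect
        output_dir="outputs",
    ),
)
trainer.train()
```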
Results
| Metric | Zero-Shot | Fine-Tuned |
|---|---|---|
| BLEU | 1.31 | 11.63 |
| ChrF++ | 16.35 | 34.65 |
Zero-shot translation was unusable; fine-tuning made the model practically useful.
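Both metrics can be computed with the sacrebleu library; a minimal scoring sketch (the hypothesis and reference lists are placeholders):

```python
# Scoring sketch with sacrebleu; hypotheses/references are placeholder lists.
import sacrebleu

hypotheses = ["..."]   # model outputs on the held-out test set
references = ["..."]   # gold Nepali translations, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)  # word_order=2 => chrF++

print(f"BLEU: {bleu.score:.2f}  ChrF++: {chrf.score:.2f}")
```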
Example Translations
EN: Take two tablets after meals three times daily.
NE: दिनमा तीन पटक खाना पछि दुई ट्याब्लेट लिनुहोस्।
✔ Correct dosage
✔ Preserved medical terminology
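For reference, here is a sketch of how a translation like the one above can be generated from the fine-tuned model. The prompt wording and generation settings are assumptions, not the exact inference script.

```python
# Inference sketch with the fine-tuned model (prompt and settings are illustrative).
from unsloth import FastLanguageModel

FastLanguageModel.for_inference(model)  # switch Unsloth into its faster generation mode

messages = [
    {"role": "user",
     "content": "Translate the following medical text from English to Nepali.\n\n"
                "English: Take two tablets after meals three times daily.\n"
                "Nepali:"},
]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids=input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```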
Limitations
- ~3% critical errors on complex medical instructions
- No human clinical validation yet
- Not ready for unsupervised medical use