StudyMate is a Bangla extractive question-answering application tailored for students. It compares models trained on Bangla datasets and deploys them to deliver precise answers quickly, supporting academic learning and performance.
- Source: BanglaRQA on Hugging Face
- Source: SQuAD_bn on Hugging Face
A filtered version of the BanglaRQA dataset, containing only factoid and confirmation-type questions, was converted to the SQuAD_bn dataset format.
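The filtering-and-conversion step described above could be sketched as below. The field names (`question_type`, `context`, `answers`, etc.) are illustrative assumptions, not the actual BanglaRQA schema; the output follows the SQuAD-style layout used by Hugging Face QA examples.

```python
# Hypothetical sketch: keep factoid/confirmation questions from a
# BanglaRQA-style record list and emit SQuAD-format entries.
# All field names here are assumptions about the schema.

KEPT_TYPES = {"factoid", "confirmation"}

def to_squad_format(records):
    """Filter by question type and restructure into SQuAD-style dicts."""
    out = []
    for rec in records:
        if rec["question_type"] not in KEPT_TYPES:
            continue
        answers = rec.get("answers", [])
        out.append({
            "context": rec["context"],
            "question": rec["question"],
            "answers": {
                "text": [a["text"] for a in answers],
                "answer_start": [a["start"] for a in answers],
            },
            # SQuAD v2 convention: an empty answer list marks the
            # question as unanswerable from the context.
            "is_impossible": len(answers) == 0,
        })
    return out
```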
Four BERT-based models (XLM-RoBERTa, mBERT, BanglaBERT, and IndicBERT) were fine-tuned on the two datasets, producing eight models in total; their results are reported below.
The code for training the models is provided in the repository.
To run the application, run the notebook here.
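At inference time, an extractive QA model answers by selecting a span of the context: it scores every token as a possible answer start and end, and the highest-scoring valid pair wins. A minimal, self-contained sketch of that span-selection step (the scores below are made-up toy values, not real model output):

```python
# Span selection for extractive QA: pick the (start, end) pair with the
# highest combined score, subject to start <= end and a length cap.

def best_span(start_scores, end_scores, max_len=15):
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score, best = score, (s, e)
    return best

# Toy example: four context tokens with hand-picked scores.
tokens = ["ঢাকা", "বাংলাদেশের", "রাজধানী", "।"]
start = [2.0, 0.1, -1.0, -2.0]
end = [1.5, 0.2, 0.3, -1.0]
s, e = best_span(start, end)
print(" ".join(tokens[s:e + 1]))  # → ঢাকা
```

In the real models these scores come from the QA head's start/end logits over the tokenized context; the selection logic is the same.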
| Model | HasAns Count | HasAns Exact (%) | HasAns F1 (%) | NoAns Count | NoAns Exact (%) | NoAns F1 (%) | Overall Exact (%) | Overall F1 (%) |
|---|---|---|---|---|---|---|---|---|
| BanglaBERT | 625 | 15.52 | 27.29 | 575 | 0.00 | 0.00 | 8.08 | 14.22 |
| IndicBERT | 625 | 12.80 | 27.61 | 575 | 0.17 | 0.17 | 6.75 | 14.46 |
| XLM-RoBERTa | 625 | 45.28 | 60.13 | 575 | 0.00 | 0.00 | 23.58 | 31.31 |
| mBERT | 625 | 46.88 | 61.26 | 575 | 1.21 | 1.21 | 25.00 | 32.49 |

Table: Quantitative Evaluation of Various Models on the SQuAD_bn Dataset
| Model | HasAns Count | HasAns Exact (%) | HasAns F1 (%) | NoAns Count | NoAns Exact (%) | NoAns F1 (%) | Overall Exact (%) | Overall F1 (%) |
|---|---|---|---|---|---|---|---|---|
| BanglaBERT | 868 | 26.84 | 41.91 | 314 | 0.00 | 0.00 | 19.71 | 30.77 |
| IndicBERT | 868 | 13.94 | 33.16 | 314 | 2.23 | 2.23 | 10.83 | 24.94 |
| XLM-RoBERTa | 868 | 64.98 | 81.53 | 314 | 0.64 | 0.64 | 47.89 | 60.04 |
| mBERT | 868 | 63.13 | 80.04 | 314 | 0.00 | 0.00 | 46.36 | 58.78 |

Table: Quantitative Evaluation of Various Models on the BanglaRQA Dataset
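The Exact and F1 columns above follow the standard SQuAD-style metrics: Exact checks for a verbatim match against the gold answer, while F1 measures token-level overlap. A simplified sketch (whitespace tokenization only; the official SQuAD script also normalizes punctuation and articles):

```python
# Simplified SQuAD-style metrics: exact match and token-level F1.

from collections import Counter

def exact(pred, gold):
    """1 if the prediction matches the gold answer verbatim, else 0."""
    return int(pred.strip() == gold.strip())

def f1(pred, gold):
    """Harmonic mean of token precision and recall between pred and gold."""
    p, g = pred.split(), gold.split()
    if not p or not g:
        # Both empty counts as a match (the NoAns case); otherwise 0.
        return float(p == g)
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision = common / len(p)
    recall = common / len(g)
    return 2 * precision * recall / (precision + recall)
```

Per-example scores are averaged over the HasAns and NoAns subsets (and over the full set) to produce the percentages in the tables.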
| Model | Training Loss | Evaluation Loss |
|---|---|---|
| BanglaBERT | 1.82 | 2.26 |
| IndicBERT | 2.79 | 2.74 |
| XLM-RoBERTa | 1.17 | 1.39 |
| mBERT | 0.88 | 1.46 |

Table: Training and Evaluation Loss of Various Models on the SQuAD_bn Dataset
| Model | Training Loss | Evaluation Loss |
|---|---|---|
| BanglaBERT | 1.07 | 1.41 |
| IndicBERT | 1.40 | 1.44 |
| XLM-RoBERTa | 0.59 | 0.73 |
| mBERT | 0.34 | 0.64 |

Table: Training and Evaluation Loss of Various Models on the BanglaRQA Dataset
The loss curves are provided in the folder here.