AsifCantCode/StudyMate

StudyMate

Table of Contents

  1. Introduction
  2. Datasets
  3. Models
  4. Results

Introduction

StudyMate is a Bangla extractive question-answering application for students. It compares several models trained on Bangla datasets and deploys them, with the aim of answering students' questions quickly and accurately.

Datasets

BanglaRQA

SQuAD_bn

Filtered Dataset of BanglaRQA: BanglaRQA_to_SquadBn_fact_confirm

A subset of BanglaRQA containing only factoid and confirmation-type questions, converted to the SQuAD_bn dataset format.

Models

Four BERT-based models (XLM-RoBERTa, mBERT, BanglaBERT, and IndicBERT) were each trained on two datasets, giving eight models in total:

SQuAD_bn Trained Models

XLM-RoBERTa

mBERT

BanglaBERT

IndicBERT

BanglaRQA Trained Models

XLM-RoBERTa

mBERT

BanglaBERT

IndicBERT

Code

The code for training the models is provided in the repository.

Running the application

To run the application, run the notebook here.

Results

F1 Score and Exact Match on SQuAD_bn dataset

| Model | HasAns Total | HasAns Exact | HasAns F1 | NoAns Total | NoAns Exact | NoAns F1 | Exact | F1 |
|---|---|---|---|---|---|---|---|---|
| BanglaBERT | 625 | 15.52 | 27.29 | 575 | 0.00 | 0.00 | 8.08 | 14.22 |
| IndicBERT | 625 | 12.80 | 27.61 | 575 | 0.17 | 0.17 | 6.75 | 14.46 |
| XLM-RoBERTa | 625 | 45.28 | 60.13 | 575 | 0.00 | 0.00 | 23.58 | 31.31 |
| mBERT | 625 | 46.88 | 61.26 | 575 | 1.21 | 1.21 | 25.00 | 32.49 |

Table: Quantitative Evaluation of Various Models on the SQuAD_bn Dataset
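For readers unfamiliar with the two metrics, the following is a minimal sketch of how Exact Match and token-level F1 are commonly computed for extractive QA. It assumes SQuAD-style scoring with plain whitespace tokenization; the function names `exact_match` and `token_f1` are illustrative and are not taken from this repository, whose evaluation code may normalize text differently.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the whitespace-tokenized strings are identical, else 0.0."""
    return float(prediction.split() == reference.split())


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # For unanswerable questions the reference is empty: the score is 1.0
    # only if the prediction is empty as well (assumed SQuAD v2 convention).
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "ঢাকা বাংলাদেশের রাজধানী"
    prediction = "ঢাকা"
    print(exact_match(prediction, reference), token_f1(prediction, reference))
```

Under this convention, a no-answer question scores either 0 or 1 on both metrics, which would be consistent with the NoAns Exact and NoAns F1 columns being identical in every row.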

F1 Score and Exact Match on BanglaRQA dataset

| Model | HasAns Total | HasAns Exact | HasAns F1 | NoAns Total | NoAns Exact | NoAns F1 | Exact | F1 |
|---|---|---|---|---|---|---|---|---|
| BanglaBERT | 868 | 26.84 | 41.91 | 314 | 0.00 | 0.00 | 19.71 | 30.77 |
| IndicBERT | 868 | 13.94 | 33.16 | 314 | 2.23 | 2.23 | 10.83 | 24.94 |
| XLM-RoBERTa | 868 | 64.98 | 81.53 | 314 | 0.64 | 0.64 | 47.89 | 60.04 |
| mBERT | 868 | 63.13 | 80.04 | 314 | 0.00 | 0.00 | 46.36 | 58.78 |

Table: Quantitative Evaluation of Various Models on BanglaRQA Dataset
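The overall Exact and F1 columns appear to be averages of the HasAns and NoAns scores, weighted by the number of questions in each group. The sketch below checks this against the XLM-RoBERTa row of the BanglaRQA table; the helper name `overall_score` is illustrative and not part of the repository.

```python
def overall_score(has_ans_total: int, has_ans_score: float,
                  no_ans_total: int, no_ans_score: float) -> float:
    """Average two group scores, weighting each by its question count."""
    total = has_ans_total + no_ans_total
    return (has_ans_total * has_ans_score + no_ans_total * no_ans_score) / total


if __name__ == "__main__":
    # XLM-RoBERTa on BanglaRQA: 868 answerable and 314 unanswerable questions.
    exact = overall_score(868, 64.98, 314, 0.64)
    f1 = overall_score(868, 81.53, 314, 0.64)
    print(f"Exact: {exact:.2f}, F1: {f1:.2f}")
```

The same calculation reproduces the overall columns of the other rows to within rounding, which shows how strongly the near-zero NoAns scores pull the overall figures down.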

Training and Validation loss

Loss with SQuAD_bn dataset

| Model | Training Loss | Evaluation Loss |
|---|---|---|
| BanglaBERT | 1.82 | 2.26 |
| IndicBERT | 2.79 | 2.74 |
| XLM-RoBERTa | 1.17 | 1.39 |
| mBERT | 0.88 | 1.46 |

Table: Training and Evaluation Loss of Various Models on the SQuAD_bn Dataset

Loss with BanglaRQA dataset

| Model | Training Loss | Evaluation Loss |
|---|---|---|
| BanglaBERT | 1.07 | 1.41 |
| IndicBERT | 1.40 | 1.44 |
| XLM-RoBERTa | 0.59 | 0.73 |
| mBERT | 0.34 | 0.64 |

Table: Training and Evaluation Loss of Various Models on BanglaRQA Dataset

Images

The loss curves are provided in the folder here.
