Hello,
I have tried to use ELMo instead of BERT, as you can see in my fork.
The training works, but the results are very similar to training without any contextual embedding (just GloVe).
Do you have any idea why or how to fix it?
I think I might have forgotten something in my code.
Moreover, I noticed that x_cemb and ques_cemb are never instantiated (they are always None); could this be part of the issue?
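For reference, here is a minimal sketch (the helper name and shapes are hypothetical, not from the actual repo) of what I would expect the input construction to look like. If x_cemb stays None, the model silently falls back to GloVe alone, which would explain the near-identical results:

```python
import numpy as np

def build_input(glove_emb, contextual_emb=None):
    """Concatenate GloVe embeddings with a contextual (e.g. ELMo) embedding.

    If contextual_emb is None, this returns GloVe alone, so the model
    trains exactly as it would without any contextual embedding.
    """
    if contextual_emb is None:
        return glove_emb
    return np.concatenate([glove_emb, contextual_emb], axis=-1)

glove = np.zeros((2, 5, 300))    # (batch, seq_len, glove_dim)
x_cemb = np.zeros((2, 5, 1024))  # (batch, seq_len, elmo_dim), e.g. ELMo output

print(build_input(glove).shape)          # (2, 5, 300)  -> GloVe only
print(build_input(glove, x_cemb).shape)  # (2, 5, 1324) -> GloVe + ELMo
```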
Thanks in advance