The ng-vocab.tsv file is missing. I checked this issue and added this code to generate the file and verify the result, which gave me an accuracy of approximately 55%. When I used this dataset instead of the generated file, the result improved to 59%, so I think the dataset plays an important role here. Could you please post the original file or, if that is not possible, at least the process for obtaining or generating it, so that we can run it and reproduce the result you mentioned (71% accuracy)? I ran this code.