Hi, thanks for the great work. I just tested fairseq-generate on my test set (ZH-EN translation) with both FastSeq and plain Fairseq, and the speedup is quite abnormal compared with the linked example.
My test set has 1,526 sentences of 5 to 150 Chinese characters each, and the experiment runs on an NVIDIA Tesla T4. The translation model is the base transformer architecture in fairseq, with 30 encoder layers.
I tested with the following commands.

For fairseq:

```bash
fairseq-generate ../data-bin --path model_avg.pt --remove-bpe --batch-size 128
```

For fastseq:

```bash
fastseq-generate-for-fairseq ../data-bin --path model_avg.pt --remove-bpe --batch-size 128 --postprocess-workers 5
```
I didn't pass --no-repeat-ngram-size to fastseq; the beam size is the default of 5 and lenpen is 1.
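For reference, the fully explicit form of the fastseq invocation would be the following (--beam and --lenpen are standard fairseq-generate flags; my runs relied on their defaults):

```bash
# Same run with the decoding settings spelled out explicitly
# (--beam 5 and --lenpen 1 match the defaults used above).
fastseq-generate-for-fairseq ../data-bin --path model_avg.pt --remove-bpe \
    --batch-size 128 --beam 5 --lenpen 1 --postprocess-workers 5
```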
My test results are as follows:
| Batch size | default (not set) | 128 | 10 | 5 | 1 |
|---|---|---|---|---|---|
| fairseq-0.10.2 | 65.79 sentences/s | 63.18 sentences/s | 19.06 sentences/s | 11.79 sentences/s | 3.06 sentences/s |
| above + fastseq | 75.55 sentences/s | 74.28 sentences/s | 17.38 sentences/s | 11.47 sentences/s | 2.92 sentences/s |
I found that when the batch size is large (128 and above), fastseq gives a clear speedup, though not the 2x or more shown in the example. But when the batch size is small (I tested this because my deployment scenario uses small batches), fastseq shows no speedup at all and is even slightly slower. This seems quite abnormal to me, so I'm asking for your help. Looking forward to your reply.
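In case it helps with reproduction, the table above can be regenerated with a loop like the one below (a rough sketch: the grep assumes the end-of-run summary line that fairseq prints, whose exact format may vary across versions):

```bash
# Sweep batch sizes and pull out the throughput summary.
# fairseq-generate / fastseq-generate-for-fairseq print a final line such as
#   "Translated 1526 sentences (... tokens) in 24.2s (63.18 sentences/s, ...)"
# (format assumed here; it may differ between versions).
for bs in 128 10 5 1; do
  echo "=== batch size $bs ==="
  fairseq-generate ../data-bin --path model_avg.pt --remove-bpe \
      --batch-size "$bs" 2>&1 | grep "sentences/s"
done
```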