Behavior of learning rate decay #3

@djshen

Description

Currently, the learning rate decay happens after each iteration and the update rule is

lr = config.lr/(1 + args.lr_decay*step)

So the learning rate at steps 0 and 1 will be the same value, config.lr.
Is this the expected behavior? Or is the following correct:

lr = config.lr/(1 + args.lr_decay*(step+1))
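To make the difference concrete, here is a minimal sketch that prints the learning rate seen at each step under both rules. It assumes a bare loop where the decay is applied after each iteration, as described above; base_lr and lr_decay are hypothetical stand-ins for config.lr and args.lr_decay.

```python
base_lr, lr_decay = 1.0, 0.1  # stand-ins for config.lr and args.lr_decay

lr_current = base_lr   # lr used at step 0 under the current rule
lr_proposed = base_lr  # lr used at step 0 under the (step + 1) variant
for step in range(4):
    print(f"step {step}: current lr = {lr_current:.4f}, proposed lr = {lr_proposed:.4f}")
    # Decay is applied after the iteration, as in the repository,
    # so the value computed here is the lr used at step + 1.
    lr_current = base_lr / (1 + lr_decay * step)
    lr_proposed = base_lr / (1 + lr_decay * (step + 1))
```

Under the current rule, steps 0 and 1 both run at base_lr (the step-0 update divides by 1 + lr_decay * 0 = 1); under the (step + 1) variant, decay already takes effect at step 1.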
