Description
Hey,
I was implementing the 1 cycle policy as an exercise, and I have a few observations from my experiments.
My setup:
Model: ResNet18
Batch size for training: 128
Batch size for testing: 100
Optimizer: optim.SGD(net.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
Total number of epochs: 26
1 cycle policy: the learning rate goes from 0.01 to 0.1 and back over the first 24 epochs,
then the model is trained for 2 epochs at a learning rate of 0.001.
No cyclic momentum or AdamW is used.
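For reference, here is a minimal sketch of the schedule I am describing (the `one_cycle_lr` helper and the model setup are illustrative, not my exact training code):

```python
import torch.optim as optim
from torchvision.models import resnet18

net = resnet18(num_classes=10)  # ResNet18, as in the setup above

def one_cycle_lr(epoch, lr_min=0.01, lr_max=0.1, cycle_epochs=24, final_lr=0.001):
    """Learning rate for a 0-indexed epoch: linear ramp up, linear ramp down, then a small constant tail."""
    half = cycle_epochs / 2
    if epoch < half:              # first half of the cycle: 0.01 -> 0.1
        return lr_min + (lr_max - lr_min) * epoch / half
    if epoch < cycle_epochs:      # second half of the cycle: 0.1 -> 0.01
        return lr_max - (lr_max - lr_min) * (epoch - half) / half
    return final_lr               # last 2 epochs at 0.001

optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

for epoch in range(26):
    for group in optimizer.param_groups:
        group['lr'] = one_cycle_lr(epoch)
    # ... run one training epoch and evaluate on the test set here ...
```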
I achieved a test set accuracy of 93.4% in 26 epochs.
This seems like a big difference from the 70 epochs at a batch size of 512 quoted in your blog post.
Am I doing something wrong? Is the number of epochs a good metric to base the results on, given that it depends on the batch size?
The whole point of using super convergence is using high learning rates to converge more quickly, but it seems like training with lower learning rates (0.01-0.1 rather than 0.8-3) is faster.