[Suggestion] Add a note about the training of Bengio et al. MLP #4

Hi @karpathy, thanks for this great repo!

Maybe it would be worth noting in the code that while you train by [minimizing the CE loss](https://github.com/karpathy/makemore/blob/f61811b994280cb12ddae15ef5800baa2e3a1ca4/makemore.py#L392), Bengio et al. actually **maximized** the log-likelihood. I know the two are equivalent in this case (the ground-truth targets are one-hot vectors), but that doesn't hold in general, so a short note might be helpful. Thanks!
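
To illustrate what I mean, here is a minimal sketch in plain Python (no PyTorch, and not taken from `makemore.py`). The helper names `log_softmax`, `cross_entropy`, and `log_likelihood` and the example logits are my own assumptions, made up for illustration. With a one-hot target the CE loss is exactly the negative log-likelihood of the correct class; with a soft target it is not.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target_probs):
    # CE between a target distribution and the model's softmax distribution.
    logp = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target_probs, logp))

def log_likelihood(logits, target_index):
    # Log-probability the model assigns to the observed class.
    return log_softmax(logits)[target_index]

# Hypothetical logits for a 3-class problem; class 0 is the observed label.
logits = [2.0, -1.0, 0.5]
target_index = 0

# One-hot target: CE equals the negative log-likelihood.
one_hot = [1.0, 0.0, 0.0]
ce = cross_entropy(logits, one_hot)
ll = log_likelihood(logits, target_index)

# Soft target: CE is no longer the negative log-likelihood of class 0.
soft = [0.7, 0.2, 0.1]
ce_soft = cross_entropy(logits, soft)

print("CE (one-hot):", ce, " -LL:", -ll)
print("CE (soft):   ", ce_soft)
```

So minimizing `ce` and maximizing `ll` give the same optimum here, which is why a one-line comment in the code would probably be enough.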


