🚀 Describe the improvement or the new tutorial

In the neural networks tutorial for beginners, we have the following:
> Zero the gradient buffers of all parameters and backprops with random gradients:

```python
net.zero_grad()
out.backward(torch.randn(1, 10))
```
What is the purpose of this? It is not part of standard ML workflows and can be confusing to beginners. (As evidence, I am helping some people learn the basics of ML, and I got questions about this line. This is how I found out about it!)
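For reference, here is a minimal runnable version of the cell in question; the small model below is a hypothetical stand-in for the tutorial's `Net`, which also produces a `(1, 10)` output:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the tutorial's Net, which outputs shape (1, 10)
net = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
input = torch.randn(1, 32)
out = net(input)

net.zero_grad()                   # reset the .grad buffers of all parameters
out.backward(torch.randn(1, 10))  # backprop an arbitrary upstream gradient

print(net[0].weight.grad.shape)   # torch.Size([16, 32]) -- grads are populated
```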
If there is no good reason for it, then I suggest:

- dropping these few lines
- changing the wording of other parts of the page if needed, e.g. 'at this point we covered... calling backward'

cc @subramen @albanD
I would agree the random gradient can be confusing if you're not already familiar with how backprop works. `out.sum().backward()` might be less confusing here?
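To make the relationship concrete, a small sketch (the tensors here are hypothetical, standing in for the tutorial's `net`/`out`): for a non-scalar `out`, `backward()` requires a `gradient` argument and computes a vector-Jacobian product, while `out.sum().backward()` is the same as backpropagating all-ones upstream gradients:

```python
import torch

# Hypothetical stand-in for the tutorial's graph: out has shape (1, 10)
x = torch.randn(1, 4)
w = torch.randn(4, 10, requires_grad=True)
out = x @ w

# What the tutorial does: backprop a *random* upstream gradient
out.backward(torch.randn(1, 10), retain_graph=True)

# Passing all-ones upstream gradients instead
w.grad = None
out.backward(torch.ones_like(out), retain_graph=True)
grad_ones = w.grad.clone()

# out.sum().backward() computes exactly the same gradients
w.grad = None
out.sum().backward()
assert torch.allclose(grad_ones, w.grad)
```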
@albanD What is the downside of just dropping the backward call in this cell? I think `out.sum().backward()` is also confusing because it is not part of the standard ML workflow.
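For comparison, the conventional pattern both comments refer to drives backward from a real loss rather than from the raw output; a minimal sketch (the model and shapes are hypothetical):

```python
import torch
import torch.nn as nn

# Hypothetical toy model and data
net = nn.Linear(4, 10)
input = torch.randn(1, 4)
target = torch.randn(1, 10)

out = net(input)
loss = nn.MSELoss()(out, target)  # scalar loss, so no gradient argument needed

net.zero_grad()  # zero the gradient buffers of all parameters
loss.backward()  # backprop from the loss, as in a standard training step
```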