Hi, thank you for sharing the great code base!!
I have one question about the loss calculation. Why is the loss averaged across all segments during training, while only the loss from the last segment is used as the evaluation loss (and perplexity)? Averaging during training makes sense to me, but shouldn't the loss also be averaged over all segments at test time, so that the comparison with methods that do not segment the data is fair?
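To make the question concrete, here is a minimal sketch of the two evaluation conventions I am comparing. It is not taken from this code base: the function names, the per-segment loss values, and the segment lengths are all hypothetical, and I am assuming each segment reports a mean per-token negative log-likelihood.

```python
import math

def perplexity(mean_nll):
    # Perplexity is the exponential of the mean per-token negative log-likelihood.
    return math.exp(mean_nll)

def average_over_segments(segment_losses, segment_lengths):
    # Token-weighted mean loss over every segment of the sequence.
    total_tokens = sum(segment_lengths)
    total_nll = sum(loss * n for loss, n in zip(segment_losses, segment_lengths))
    return total_nll / total_tokens

def last_segment_only(segment_losses):
    # Loss of the final segment alone; earlier segments only provide context.
    return segment_losses[-1]

# Hypothetical numbers: mean per-token loss of three equal-length segments.
segment_losses = [3.0, 2.5, 2.0]
segment_lengths = [512, 512, 512]

avg_loss = average_over_segments(segment_losses, segment_lengths)
last_loss = last_segment_only(segment_losses)

print("all segments: loss", avg_loss, "ppl", perplexity(avg_loss))
print("last segment: loss", last_loss, "ppl", perplexity(last_loss))
```

With made-up numbers like these, where later segments have lower loss, the last-segment figure comes out lower than the all-segment average, which is why I am unsure the two are comparable to a baseline scored on every token.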