PyTorch nn.LayerNorm now takes bias arg - removed custom class #454

Open · wants to merge 1 commit into master
Conversation

calmitchell617

Hi, I noticed that the PyTorch nn.LayerNorm class now takes a bias arg. This PR removes the custom LayerNorm class and replaces it with the built-in.
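For reference, a minimal sketch of the built-in usage (this assumes PyTorch 2.1 or newer, where the bias argument was added; the dimensions are just illustrative):

import torch
import torch.nn as nn

# nn.LayerNorm can now drop the additive bias directly, which is what the
# removed custom class existed to work around.
ln = nn.LayerNorm(768, bias=False)

x = torch.randn(2, 16, 768)
y = ln(x)

print(y.shape)          # torch.Size([2, 16, 768])
print(ln.bias is None)  # True -- no bias parameter is registered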

I tested the qualitative effect of this change by checking out a fresh version of master branch, then running:

python data/shakespeare_char/prepare.py

python train.py config/train_shakespeare_char.py

I ran the same commands after making the code changes, and compared the results after 1000 iters on an RTX 6000 Ada. The eval results were:

step 1000: train loss 1.2743, val loss 1.5198
step 1000: train loss 1.2760, val loss 1.5265

Not identical, but it seems to be working well enough. A sample taken after 2000 iters:

$ python sample.py --out_dir=out-shakespeare-char
Overriding: out_dir = out-shakespeare-char
number of parameters: 10.65M
Loading meta from data/shakespeare_char/meta.pkl...


KING RICHARD III:
The last through thy beauteous graves
Beating her brother uncontrary'd.

DUKE OF YORK:
Prove, my lord, I'll not speak to thy course;
And that might send in this maid overth and the king
And and selfsame to my life, once by a crown,
And why he's not known to-day sweet to woe more.

KING RICHARD III:
Then at this is a gentleman poor great to
The woman's part son; and therefore you are the prince
Of your hands, being, you are advance.

EDWARD:
It is true.


sopotc commented Mar 17, 2024

Just adding that I also noticed the bias arg is there and did a bunch of measurements, including on GPT-2 with OpenWebText (owt); the loss delta is negligible.

Ran experiments on a 2x 4090 setup.

@@ -95,9 +84,9 @@ class Block(nn.Module):

     def __init__(self, config):
         super().__init__()
-        self.ln_1 = LayerNorm(config.n_embd, bias=config.bias)
+        self.ln_1 = nn.LayerNorm(config.n_embd, bias=config.bias)

Nice. To add another confirmation, I wrote my own model and used nn.LayerNorm directly with no issues: https://github.com/vhmth/gpt-2/blob/main/model.py#L74
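To make the equivalence concrete, here is a rough check, not from the thread: CustomLayerNorm below is a hypothetical stand-in for the kind of class this PR removes (a weight plus optional bias wrapped around F.layer_norm), compared against nn.LayerNorm on the same input:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomLayerNorm(nn.Module):
    """Hypothetical stand-in for the removed class: LayerNorm with an optional bias."""
    def __init__(self, ndim, bias):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(ndim))
        self.bias = nn.Parameter(torch.zeros(ndim)) if bias else None

    def forward(self, x):
        return F.layer_norm(x, self.weight.shape, self.weight, self.bias, 1e-5)

torch.manual_seed(0)
x = torch.randn(4, 64)

for bias in (True, False):
    custom = CustomLayerNorm(64, bias=bias)
    builtin = nn.LayerNorm(64, bias=bias)
    # Freshly initialized, both use weight=1 and bias=0 (or no bias at all),
    # so their outputs should match to float precision.
    print(bias, torch.allclose(custom(x), builtin(x)))  # True for both bias settings

Any remaining gap in training losses would then come from run-to-run nondeterminism rather than the module swap itself.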
