- Seq2Seq using Bahdanau attention ("Neural Machine Translation by Jointly Learning to Align and Translate", Bahdanau et al., 2015): https://arxiv.org/abs/1409.0473
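For reference, the additive alignment model from the paper scores each encoder state $h_j$ against the previous decoder state $s_{i-1}$:

$$
e_{ij} = v_a^\top \tanh\left(W_a s_{i-1} + U_a h_j\right), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j
$$

In the printout below, `Wb`, `Wc`, and `Wa_T` presumably implement $W_a$, $U_a$, and $v_a^\top$, respectively.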
```
Seq2Seq(
  (encoder): Encoder(
    (embedding): Embedding(24745, 512)
    (bi_rnn): RNN(512, 128, bidirectional=True)
    (dropout): Dropout(p=0, inplace=False)
  )
  (decoder): Decoder(
    (embedding): Embedding(8854, 256)
    (rnn): RNN(256, 64)
    (fc): Linear(in_features=64, out_features=8854, bias=True)
    (dropout): Dropout(p=0, inplace=False)
  )
  (attention): BahdanauAttention(
    (Wb): Linear(in_features=64, out_features=64, bias=False)
    (Wc): Linear(in_features=256, out_features=64, bias=False)
    (Wa_T): Linear(in_features=64, out_features=1, bias=False)
  )
  (fc): Linear(in_features=256, out_features=64, bias=True)
)
```
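A minimal sketch of what a module with these shapes typically computes; the forward pass below is an assumption based on the standard additive-attention formula, with parameter names and dimensions taken from the printout:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BahdanauAttention(nn.Module):
    """Additive attention: e = Wa_T(tanh(Wb(s) + Wc(h)))."""

    def __init__(self, dec_hid_dim=64, enc_out_dim=256, attn_dim=64):
        super().__init__()
        self.Wb = nn.Linear(dec_hid_dim, attn_dim, bias=False)  # projects decoder state s_{i-1}
        self.Wc = nn.Linear(enc_out_dim, attn_dim, bias=False)  # projects encoder outputs h_j
        self.Wa_T = nn.Linear(attn_dim, 1, bias=False)          # scores each source position

    def forward(self, dec_hidden, enc_outputs):
        # dec_hidden:  (batch, dec_hid_dim)          -- previous decoder state
        # enc_outputs: (batch, src_len, enc_out_dim) -- bidirectional encoder states
        scores = self.Wa_T(torch.tanh(
            self.Wb(dec_hidden).unsqueeze(1) + self.Wc(enc_outputs)
        )).squeeze(-1)                                # (batch, src_len)
        attn_weights = F.softmax(scores, dim=1)       # alpha_{ij}
        # Context vector c_i: attention-weighted sum of encoder states
        context = torch.bmm(attn_weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, attn_weights                  # (batch, enc_out_dim), (batch, src_len)
```

Note that `Wc`'s input size of 256 matches the concatenated forward and backward states of the bidirectional encoder (2 × 128), and the top-level `fc: Linear(256, 64)` projects that concatenation down to the decoder's hidden size.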
The model has 15,733,526 trainable parameters.
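The count can be reproduced with the usual PyTorch one-liner (assuming `model` is the instantiated `Seq2Seq` above):

```python
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"The model has {n_params:,} trainable parameters.")
```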