
Reproduction accuracy anomaly #1

Open
mieya opened this issue Oct 25, 2023 · 7 comments

Comments


mieya commented Oct 25, 2023

Hi, when I ran your PatchMixer reproduction on the Weather dataset with prediction length 96, the MSE was 0.001239, far below the numbers reported in the paper. Did you run into this problem on the weather dataset?

hughxx (Owner) commented Oct 25, 2023

No, I haven't. Here are my results; weather with pred_len 96 comes out slightly better than the paper's numbers:
Args in experiment:
Namespace(activation='gelu', batch_size=128, c_out=7, checkpoints='./checkpoints/', d_ff=128, d_layers=1, d_model=256, data='custom', data_path='weather.csv', dec_in=7, des='test', devices='0,1,2,3', distil=True, do_predict=False, dropout=0.3, e_layers=3, embed='timeF', embed_type=0, enc_in=21, factor=1, features='M', freq='h', gpu=0, is_training=1, itr=1, label_len=0, learning_rate=0.0008, loss='mse', lradj='type3', model='PatchMixer', model_id='test', moving_avg=25, n_heads=4, num_workers=10, output_attention=False, padding_patch='end', patch_len=16, patience=100, pct_start=0.3, pred_len=96, random_seed=2021, res_attention=True, root_path='./dataset/', seq_len=336, stride=8, target='OT', test_flop=False, train_epochs=10, use_amp=False, use_gpu=True, use_multi_gpu=False)
Use GPU: cuda:0

start training : test_PatchMixer_custom_ftM_sl336_pl96_ebtimeF_test_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 36456
val 5175
test 10444
iters: 100, epoch: 1 | loss: 0.5498875
speed: 0.0373s/iter; left time: 102.3395s
iters: 200, epoch: 1 | loss: 0.3939515
speed: 0.0230s/iter; left time: 60.6724s
Epoch: 1 cost time: 7.821044921875
Epoch: 1, Steps: 284 | Train Loss: 0.4880658 Vali Loss: 0.3786965 Test Loss: 0.2063423
Validation loss decreased (inf --> 0.378697). Saving model ...
Updating learning rate to 0.0008
iters: 100, epoch: 2 | loss: 0.3542251
speed: 0.0807s/iter; left time: 198.3336s
iters: 200, epoch: 2 | loss: 0.4140439
speed: 0.0235s/iter; left time: 55.4837s
Epoch: 2 cost time: 7.100579738616943
Epoch: 2, Steps: 284 | Train Loss: 0.4020058 Vali Loss: 0.3337559 Test Loss: 0.1755062
Validation loss decreased (0.378697 --> 0.333756). Saving model ...
Updating learning rate to 0.0008
iters: 100, epoch: 3 | loss: 0.3140059
speed: 0.0792s/iter; left time: 172.2051s
iters: 200, epoch: 3 | loss: 0.4344665
speed: 0.0240s/iter; left time: 49.7391s
Epoch: 3 cost time: 6.923853635787964
Epoch: 3, Steps: 284 | Train Loss: 0.3608040 Vali Loss: 0.3232527 Test Loss: 0.1712785
Validation loss decreased (0.333756 --> 0.323253). Saving model ...
Updating learning rate to 0.0008
iters: 100, epoch: 4 | loss: 0.3087889
speed: 0.0788s/iter; left time: 148.9078s
iters: 200, epoch: 4 | loss: 0.3484375
speed: 0.0231s/iter; left time: 41.3357s
Epoch: 4 cost time: 7.2653889656066895
Epoch: 4, Steps: 284 | Train Loss: 0.3451175 Vali Loss: 0.3195348 Test Loss: 0.1698665
Validation loss decreased (0.323253 --> 0.319535). Saving model ...
Updating learning rate to 0.00072
iters: 100, epoch: 5 | loss: 0.2818712
speed: 0.0795s/iter; left time: 127.5898s
iters: 200, epoch: 5 | loss: 0.2930407
speed: 0.0276s/iter; left time: 41.5472s
Epoch: 5 cost time: 7.726243734359741
Epoch: 5, Steps: 284 | Train Loss: 0.3340447 Vali Loss: 0.3206680 Test Loss: 0.1697336
EarlyStopping counter: 1 out of 100
Updating learning rate to 0.000648
iters: 100, epoch: 6 | loss: 0.2909068
speed: 0.0763s/iter; left time: 100.8577s
iters: 200, epoch: 6 | loss: 0.4027271
speed: 0.0221s/iter; left time: 26.9298s
Epoch: 6 cost time: 6.811914682388306
Epoch: 6, Steps: 284 | Train Loss: 0.3277687 Vali Loss: 0.3194360 Test Loss: 0.1682749
Validation loss decreased (0.319535 --> 0.319436). Saving model ...
Updating learning rate to 0.0005832000000000001
iters: 100, epoch: 7 | loss: 0.4234127
speed: 0.0790s/iter; left time: 81.9223s
iters: 200, epoch: 7 | loss: 0.2571840
speed: 0.0229s/iter; left time: 21.5006s
Epoch: 7 cost time: 6.880347490310669
Epoch: 7, Steps: 284 | Train Loss: 0.3215254 Vali Loss: 0.3212784 Test Loss: 0.1697771
EarlyStopping counter: 1 out of 100
Updating learning rate to 0.00052488
iters: 100, epoch: 8 | loss: 0.3818762
speed: 0.0755s/iter; left time: 56.8354s
iters: 200, epoch: 8 | loss: 0.2794770
speed: 0.0221s/iter; left time: 14.4045s
Epoch: 8 cost time: 6.708326101303101
Epoch: 8, Steps: 284 | Train Loss: 0.3179038 Vali Loss: 0.3181768 Test Loss: 0.1702140
Validation loss decreased (0.319436 --> 0.318177). Saving model ...
Updating learning rate to 0.0004723920000000001
iters: 100, epoch: 9 | loss: 0.3956600
speed: 0.0757s/iter; left time: 35.5220s
iters: 200, epoch: 9 | loss: 0.3242335
speed: 0.0223s/iter; left time: 8.2403s
Epoch: 9 cost time: 6.609601020812988
Epoch: 9, Steps: 284 | Train Loss: 0.3134815 Vali Loss: 0.3215825 Test Loss: 0.1711207
EarlyStopping counter: 1 out of 100
Updating learning rate to 0.00042515280000000004
iters: 100, epoch: 10 | loss: 0.2710871
speed: 0.0734s/iter; left time: 13.5724s
iters: 200, epoch: 10 | loss: 0.2887837
speed: 0.0226s/iter; left time: 1.9207s
Epoch: 10 cost time: 6.79282283782959
Epoch: 10, Steps: 284 | Train Loss: 0.3103537 Vali Loss: 0.3229509 Test Loss: 0.1704234
EarlyStopping counter: 2 out of 100
Updating learning rate to 0.0003826375200000001
testing : test_PatchMixer_custom_ftM_sl336_pl96_ebtimeF_test_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 10444
mse:0.14862294495105743, mae:0.19180501997470856, rse:0.5078704953193665

Process finished with exit code 0
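As a side note on the log above: the run uses `lradj='type3'`, and the printed values show the base rate of 0.0008 held for the first three epochs and then multiplied by 0.9 after each further epoch. The following is a minimal sketch inferred from those printed values, not the repository's exact `adjust_learning_rate` code:

```python
# Sketch of the "type3" learning-rate schedule, reconstructed from the
# values printed in the log above (base rate held for 3 epochs, then a
# 0.9 decay per epoch). Inferred, not the repo's exact implementation.

def type3_lr(base_lr: float, epoch: int) -> float:
    """Learning rate in effect after `epoch` (1-indexed) under 'type3'."""
    if epoch <= 3:
        return base_lr
    return base_lr * 0.9 ** (epoch - 3)

for epoch in range(1, 11):
    print(f"after epoch {epoch}: lr = {type3_lr(0.0008, epoch)}")
```

Epochs 4 through 10 reproduce the logged sequence 0.00072, 0.000648, ..., 0.00038263752.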

mieya (Author) commented Oct 26, 2023

Have you tried the ablation experiments mentioned in the paper, such as replacing the convolution with attention? Is it just a matter of replacing the two convolution operators with an encoder? I'd like to ask your advice.

hughxx (Owner) commented Oct 26, 2023

I haven't tried it. It looks like you would only need to replace the first depth_wise convolution (line 112).
Also, on many datasets this code can't match the results in the paper, so feel free to compare notes.
The one useful conclusion I've reached so far: the loss function he proposes works fairly well; the rest seems unremarkable.
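The loss function hughxx refers to is defined in the PatchMixer paper, not in this thread. Purely as a hedged illustration of the kind of combined objective used in this line of work, here is a generic weighted MSE+MAE blend; `alpha` is a hypothetical mixing weight and the function is not the paper's actual loss:

```python
# Hedged sketch: a generic weighted combination of MSE and MAE.
# NOT necessarily the loss proposed in the PatchMixer paper -- the
# thread does not reproduce it; `alpha` is a hypothetical mixing weight.

def mse_mae_loss(pred, true, alpha=0.5):
    """Return alpha * MSE + (1 - alpha) * MAE over equal-length sequences."""
    n = len(pred)
    mse = sum((p - t) ** 2 for p, t in zip(pred, true)) / n
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    return alpha * mse + (1 - alpha) * mae
```

With `alpha=0.5` this averages the two error measures, trading MSE's sensitivity to large errors against MAE's robustness to outliers.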

mieya (Author) commented Nov 2, 2023

After I replaced the first depth_wise convolution with attention, it performed better than pure convolution on the exchange dataset, which seems somewhat at odds with the paper's ablation results.
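To make the swap being discussed concrete, here is a minimal, dependency-free sketch of single-head self-attention over patch embeddings, the kind of token-mixing operator that could stand in for the first depth-wise convolution. It uses identity Q/K/V projections (no learned weights) for illustration only and is not the repository's implementation:

```python
import math

# Sketch: single-head self-attention over patch embeddings, with identity
# Q/K/V projections (no learned weights). Illustration of the ablation
# swap only, not PatchMixer's or the repo's actual code.

def self_attention(x):
    """x: list of patch embeddings (equal-length lists of floats).
    Returns one attention-mixed embedding per input patch."""
    d = len(x[0])
    scale = math.sqrt(d)
    out = []
    for q in x:
        # scaled dot-product scores of this patch against every patch
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted sum of value vectors (values == inputs here)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out
```

Unlike a depth-wise convolution, which mixes only a fixed local window of patches per channel, this mixes every patch with every other patch via the softmax weights.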

LiuHao-THU commented:
Thanks, OP: I reproduced weather and the results basically match the paper's.
weather_336_96_PatchMixer_custom_ftM_sl336_pl96_ebtimeF_test_0
mse:0.1495588719844818, mae:0.19227443635463715, rse:0.5090857744216919

weather_336_192_PatchMixer_custom_ftM_sl336_pl192_ebtimeF_test_0
mse:0.1922815442085266, mae:0.2341286838054657, rse:0.5768788456916809

weather_336_336_PatchMixer_custom_ftM_sl336_pl336_ebtimeF_test_0
mse:0.22178223729133606, mae:0.2645467221736908, rse:0.6364506483078003

weather_336_720_PatchMixer_custom_ftM_sl336_pl720_ebtimeF_test_0
mse:0.29902246594429016, mae:0.3186098337173462, rse:0.7336626052856445

LiuHao-THU commented:

OP, have you tried using this model for classification?

hughxx (Owner) commented Jan 27, 2024

No, I'm mainly just trying to graduate, haha.

3 participants