FQE support .d3 file model? #381
Labels: enhancement (New feature or request)
@CastleImitation Hi, thanks for the issue. FQE does support .d3 files as well. Could you share minimal code that lets me reproduce your issue?
Dear Takuseno, since I wasn't sure whether FQE supports .d3 files, I have saved my model as a .pt file instead. Haha.
Dear Takuseno
Thanks for your kind reply.
I have checked the official d3rlpy documentation and found how to save the model as a .pt file.
Regards!
Dear Takuseno,
By the way, may I ask one additional question?
I have trained my model.
But when I tried to load it and build it with the test dataset, it reported the following error:
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModuleList:
size mismatch for 0._fc.weight: copying a param with shape torch.Size([343, 256]) from checkpoint, the shape in current model is torch.Size([340, 256]).
size mismatch for 0._fc.bias: copying a param with shape torch.Size([343]) from checkpoint, the shape in current model is torch.Size([340]).
I also tried building the model with the training dataset, and that did not report any error.
Here is my code:
from d3rlpy.algos import DiscreteBCQConfig, DiscreteSACConfig, DoubleDQNConfig

config_BCQ = DiscreteBCQConfig()
config_SAC = DiscreteSACConfig()
config_DDQN = DoubleDQNConfig()
# model_BCQ = DiscreteBCQ(algo=self.policy,config=fqe_config,device="cuda:0")
device = "cuda"
model_BCQ = config_BCQ.create(device=device)
model_SAC = config_SAC.create(device=device)
model_DDQN = config_DDQN.create(device=device)
POLICIES = {
'BCQ':[
[f'{FINAL_POLICIES_PATH}/raw_intermediate/BCQ/run_0/DiscreteBCQ_train.pt',model_BCQ],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/BCQ/run_1/DiscreteBCQ_train.pt',model_BCQ],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/BCQ/run_2/DiscreteBCQ_train.pt',model_BCQ],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/BCQ/run_3/DiscreteBCQ_train.pt',model_BCQ],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/BCQ/run_4/DiscreteBCQ_train.pt',model_BCQ]
],
'SAC': [
[f'{FINAL_POLICIES_PATH}/raw_intermediate/SAC/run_0/DiscreteSAC_train.pt', model_SAC],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/SAC/run_1/DiscreteSAC_train.pt', model_SAC],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/SAC/run_2/DiscreteSAC_train.pt', model_SAC],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/SAC/run_3/DiscreteSAC_train.pt', model_SAC],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/SAC/run_4/DiscreteSAC_train.pt', model_SAC]
],
'DDQN': [
[f'{FINAL_POLICIES_PATH}/raw_intermediate/DQN/run_0/DoubleDQN_train.pt', model_DDQN],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/DQN/run_1/DoubleDQN_train.pt', model_DDQN],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/DQN/run_2/DoubleDQN_train.pt', model_DDQN],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/DQN/run_3/DoubleDQN_train.pt', model_DDQN],
[f'{FINAL_POLICIES_PATH}/raw_intermediate/DQN/run_4/DoubleDQN_train.pt', model_DDQN]
]
}
for P in POLICIES.keys():
    for i in range(len(POLICIES[P])):  # 5 entries per algorithm
        # Test-set data; on the last round of the last algorithm, data holds the last split's data
        data = load_data(states='raw', rewards='intermediate', index_of_split=i)[1]
        # The model was already instantiated when it was added to the dict, even without an explicit variable name
        POLICIES[P][i][1].build_with_dataset(data)
        POLICIES[P][i][1].load_model(POLICIES[P][i][0])
This is how I load my dataset: data = load_data(states='raw', rewards='intermediate', index_of_split=i)[1]. The index [1] selects the test data. In my experiment there are 7*7*7 = 343 possible actions in total.
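One possible explanation for the 343 vs. 340 mismatch, sketched in plain Python: if the action size is inferred from the largest action index that appears in the dataset passed to build_with_dataset (an assumption about the library's behavior, not something confirmed in this thread), then a test split that never contains the highest indices would produce a smaller output layer than the one stored in the checkpoint. The encode helper and the particular split below are hypothetical and only illustrate the arithmetic.

```python
from itertools import product

N_LEVELS = 7  # each of the three action components takes 7 values (7*7*7 = 343)


def encode(a, b, c, n=N_LEVELS):
    """Flatten a 3-component action into a single discrete index (hypothetical encoding)."""
    return (a * n + b) * n + c


def inferred_action_size(actions):
    """Action size as it would be inferred from the data alone: largest index + 1 (assumption)."""
    return max(actions) + 1


# Every possible action, as a training split that covers the whole action space would contain.
train_actions = [encode(a, b, c) for a, b, c in product(range(N_LEVELS), repeat=3)]

# Hypothetical test split in which the three highest action indices never occur.
test_actions = [x for x in train_actions if x < 340]

print(inferred_action_size(train_actions))  # 343, matches the checkpoint
print(inferred_action_size(test_actions))   # 340, matches the freshly built model
```

If this is what is happening, building the model with a dataset that covers all 343 actions, or stating the action size explicitly when the dataset is constructed (d3rlpy 2.x appears to accept an action_size argument on MDPDataset, but please check this against the installed version), should make the shapes agree.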
By the way, since you have so willingly helped me a lot, would you mind if I added your name to the author list when I publish my paper?
Awaiting your kind reply.
Dear Sir,
Does the FQE evaluation method support models saved as .d3 files?
When I tried to run my trained DDQN .d3 model with FQE, an error occurred: "RuntimeError: Invalid magic number; corrupt file?"
It seems that FQE does not support .d3 model files.
Or should I save the model as a .pt file instead?
Looking forward to your kind reply!
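For context, a minimal sketch of how the two file types might be loaded differently. In d3rlpy 2.x, a .d3 file written by save() is normally restored with d3rlpy.load_learnable, while load_model expects the weights-only .pt file written by save_model. The "Invalid magic number" error is likely what happens when a .d3 file is handed to load_model, though that is an assumption, since the failing code is not shown in this issue. The helper names pick_loader and load_policy are hypothetical.

```python
from pathlib import Path


def pick_loader(path):
    """Return the name of the d3rlpy loading call that matches the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".d3":
        return "load_learnable"  # full snapshot: config plus weights
    if suffix == ".pt":
        return "load_model"      # weights only: needs an already built algorithm
    raise ValueError(f"unsupported model file: {path}")


def load_policy(path, algo=None, device=None):
    """Load a trained policy from a .d3 or .pt file (hypothetical helper)."""
    if pick_loader(path) == "load_learnable":
        import d3rlpy  # imported lazily so pick_loader works without d3rlpy installed
        return d3rlpy.load_learnable(path, device=device)
    if algo is None:
        raise ValueError("a built algorithm object is required to load .pt weights")
    algo.load_model(path)
    return algo
```

The object returned by load_policy could then be passed as the algo argument when constructing the FQE evaluator, for example d3rlpy.ope.DiscreteFQE for a discrete-action policy such as DDQN.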