
[QUESTION] difference between evaluators and OPE ? #380

Open
ericyue opened this issue Feb 19, 2024 · 2 comments

Comments


ericyue commented Feb 19, 2024

I'm building an offline RL model from custom collected log data.
I'm not sure how to assess the trained model's performance. One way is to add evaluators such as TDErrorEvaluator; the other is to train a separate d3rlpy.ope.FQE on the test dataset and look at soft_opc or other metrics.
Since both approaches use the test dataset and compute some metric, which one should I use to evaluate the model?

import d3rlpy
from d3rlpy.metrics import (
    TDErrorEvaluator,
    AverageValueEstimationEvaluator,
    InitialStateValueEstimationEvaluator,
)

# build the BCQ algorithm from its config (hyperparameters omitted)
model = d3rlpy.algos.BCQConfig(xxxx).create()
ret = model.fit(
    train_dataset,
    n_steps=N_STEPS,
    n_steps_per_epoch=N_STEPS_PER_EPOCH,
    logger_adapter=logger_adapter,
    save_interval=10,
    evaluators={
        "test_td_error": TDErrorEvaluator(episodes=test_dataset.episodes),
        "test_value_scale": AverageValueEstimationEvaluator(episodes=test_dataset.episodes),
        "test_init_value": InitialStateValueEstimationEvaluator(episodes=test_dataset.episodes),
    },
)
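For reference, this is roughly what the FQE alternative I mentioned would look like. It is only a minimal sketch assuming the d3rlpy v2 OPE API: the trained policy is wrapped by FQE, which fits a separate Q-function on the held-out episodes and logs OPE metrics per epoch. The step counts and the return_threshold value are placeholders, not recommendations.

from d3rlpy.metrics import SoftOPCEvaluator

# wrap the already-trained policy ("model" from the snippet above)
fqe = d3rlpy.ope.FQE(algo=model, config=d3rlpy.ope.FQEConfig())

# train FQE's Q-function on the held-out data and report OPE metrics
fqe.fit(
    test_dataset,
    n_steps=100000,
    n_steps_per_epoch=10000,
    evaluators={
        "init_value": InitialStateValueEstimationEvaluator(episodes=test_dataset.episodes),
        # return_threshold is a placeholder; it should reflect what counts as a "good" return in my logs
        "soft_opc": SoftOPCEvaluator(return_threshold=100, episodes=test_dataset.episodes),
    },
)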

takuseno (Owner) commented Mar 3, 2024

@ericyue Hi, thanks for the issue. I would redirect you to the following papers since this is a general offline RL question.

@LorenzoBottaccioli

@takuseno I have similar doubts to @ericyue's. Could you please elaborate a bit more? The papers are interesting but very theoretical; could you provide a more practical example?
