-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix] action_spec_unbatched whenever necessary #2592
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2592
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ❌ 3 New Failures, 1 Cancelled Job, 15 Unrelated FailuresAs of commit 98a3b39 with merge base a47b32c (): NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but was present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
ghstack-source-id: a6748fa882a41fdd50795b46b261e6e214af2c0e Pull Request resolved: #2592
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
Love that there is a way to get the unbatched specs easily now. This is super helpful.
I guess the prefix single_
has already been decided but I will give my 2 cents on this anyway as I think there are some important points.
I think single_
is a really confusing name. In gymnasium I guess it can work as the library is not multi-agent/multi-task. but in torchrl it sounds super confusing to me:
- it could be meaning single agent
- or single spec (as opposed to full spec). this is why
single_full_spec
sounds oximoric to me - it does not correlate with the concept of a batch_size
To me the prefix unbatched_
(or variations of it) drives the concept home better without any of the confusions above
@@ -91,7 +91,7 @@ def train(cfg: "DictConfig"): # noqa: F821 | |||
("agents", "action_value"), | |||
("agents", "chosen_action_value"), | |||
], | |||
spec=env.unbatched_action_spec, | |||
spec=env.single_action_spec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spec=env.single_action_spec, | |
spec=env.single_full_action_spec, |
Wherever unbatched_action_spec
is used, it should be changed to single_full_action_spec
This is valid for all the places here. As the components that were receiving unbatched_
are expecting a composite
We can name it unbatched but it will clash with VMAS unbatched (which are somewhat different). |
Yeah, the vmas ones were there for the absence of this feature. Now that this exists the vmas ones could be removed. Your are right that the vmas name implied full so there would be a clash there. We could think about how to best deal with this but i think torchrl should not be prevented to use the better name cause of this |
so the difference between full and normal is just present when there is only one agent group. What we could do is that for a period of time we could warn vmas users that request |
That's bc breaking, we need to warn them the behaviour will change and they can make the warning disappear by using the full version |
Intermediate solution: |
Genius! Love it |
ghstack-source-id: 4168c6c8b6b5febd8db4fd43e71e46e9bfeb10cc Pull Request resolved: #2592
Ok I also simplified the mock envs that had unbatched specs, LMK if that makes sense! |
I think that makes sense. My comments from the first review still hold tho. I would just make this a refactoring and no logic change so |
There is not logic change, see my comment above |
ghstack-source-id: f346c47cd2d87a9306059e3ca56affcc68a7ff9c Pull Request resolved: #2592
ghstack-source-id: c235e024bc208155e0c74d08c67a581b6a7cbc79 Pull Request resolved: #2592
@matteobettini I'll wait until the tests pass but it should be good. |
Maybe we can put a warning to use the new feature? |
ghstack-source-id: ec87794dabaf5023dac85cfc898a7c000e93331d Pull Request resolved: #2592
Stack from ghstack (oldest at bottom):
cc @matteobettini we now have a more generic
single_action_spec
that is somewhat similar to unbached_action_spec (name is borrowed from gymnasium)There is something a bit fishy about
unbatched_action_spec
though which is that it's theunbatched_full_action_spec
, and it doesn't (always) match action spec in its sturcture (it will always be composite).LMK if these changes make sense (implemented these changes to fix
tutorials/sphinx-tutorials/multiagent_competitive_ddpg.py
which was broken)