Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect number of arguments from call to env.step(action) #119

Open
JRod-Seed opened this issue Oct 10, 2022 · 5 comments
Open

Incorrect number of arguments from call to env.step(action) #119

JRod-Seed opened this issue Oct 10, 2022 · 5 comments

Comments

@JRod-Seed
Copy link

JRod-Seed commented Oct 10, 2022

I just installed gym-super-mario-bros and I've attempted to run the gym using a number of methods including the provided sample code:

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
if done:
state = env.reset()
state, reward, done, info = env.step(env.action_space.sample())
env.render()

env.close()

... as well as from the command line using:

gym_super_mario_bros -e SuperMarioBrosRandomStages-v0 -m human --stages 1-1

In both cases, the gym attempts to render the first frame of the game and then I receive the following error:

/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/envs/registration.py:555: UserWarning: WARN: The environment SuperMarioBrosRandomStages-v0 is out of date. You should consider upgrading to version v3.
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:195: UserWarning: WARN: The result returned by env.reset() was not a tuple of the form (obs, info), where obs is a observation and info is a dictionary containing additional information. Actual type: <class 'numpy.ndarray'>
logger.warn(
/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:219: DeprecationWarning: WARN: Core environment is written in old step API which returns one bool instead of two. It is recommended to rewrite the environment with new step API.
logger.deprecation(
Traceback (most recent call last):
File "/Users/--/miniforge3/envs/tf/bin/gym_super_mario_bros", line 8, in
sys.exit(main())
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym_super_mario_bros/_app/cli.py", line 71, in main
play_human(env)
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/nes_py/app/play_human.py", line 69, in play_human
next_state, reward, done, _ = env.step(action)
File "/Users/--/miniforge3/envs/tf/lib/python3.9/site-packages/gym/wrappers/time_limit.py", line 50, in step
observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)

My library versions are as follows:

gym 0.26.2
gym-notices 0.0.8
gym-super-mario-bros 7.4.0
nes-py 8.2.1

Running Python 3.9

Any help would be appreciated

@JRod-Seed
Copy link
Author

JRod-Seed commented Oct 10, 2022

It would appear that the issue stems from a mismatch in the expected return from env.step. gym/wrappers/time_limit.py returns 5 parameters (observation, reward, terminated, truncated, info) but the invocation is only expecting 4 (observation, reward, terminated, info)

Likely related to: Kautenja/nes-py#85

@Robwc000
Copy link

I'm having the same issue what is the fix?

@zxdclyz
Copy link

zxdclyz commented Oct 24, 2022

Try to use gym==0.25.1, it works for me

@pseudo-rnd-thoughts
Copy link

pseudo-rnd-thoughts commented Oct 26, 2022

With v0.26 you can use this code

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import gym

env = gym.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
env.reset()
for step in range(5000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

    if done:
       state = env.reset()

env.close()

You can read about the migration guide for the new API https://gymnasium.farama.org/content/migration-guide/

@PowderFan87
Copy link

I've debugged the step functions and it seems that nes-py returns 4 or 5 length tuple.
They have a code snippet that checks for if(Len(result)==4) and for 5.
That seems to be the "way to go" it just is not implemented in the gym/wrappers/time_limit.py

I adapted the snipped and set truncated to False as a default:

def step(self, action):
    """Steps through the environment and if the number of steps elapsed exceeds ``max_episode_steps`` then truncate.

    Args:
        action: The environment step action

    Returns:
        The environment step ``(observation, reward, terminated, truncated, info)`` with `truncated=True`
        if the number of steps elapsed >= max episode steps

    """

    truncated = False
    result = self.env.step(action)

    if (len(result) == 4):
        observation, reward, terminated, info = self.env.step(action)
    elif (len(result) == 5):
        observation, reward, terminated, truncated, info = self.env.step(action)
    else:
        raise Exception('Tupel length of step return was neither 4 nore 5. Stop run.')

    self._elapsed_steps += 1

    if self._elapsed_steps >= self._max_episode_steps:
        truncated = True

    return observation, reward, terminated, truncated, info

so instead of just reading 5 every time I do the same check and this works fine.
But I realise changing the lib is bad mojo so maybe this could be a fix ;) with a proper Exception added...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants