MOMDP representation? #67
Comments
Yes, it is possible. Check out this documentation page. Regarding fully observable state variables, you can achieve that by having an observation model that simply returns the state.
Sorry, I did not formulate my question well. What I meant to ask is how I should formulate my code so that, when converted to .pomdpx, it represents some state variables as fully observable. Right now I am working with a model that has access to the time that has passed since the beginning of the operation and has a maximum number of time steps to act (finite horizon). My approach is to create states that, on top of their ID (either an int or 'term' for the terminal state), also carry the time step as a property:

    class TDState(pomdp_py.State):
        def __init__(self, state_id, time_step):
            self.id = state_id
            self.t = time_step
            self.name = f"s_{state_id}-t_{time_step}"

The methods for the observation model:

    class TDObservationModel(pomdp_py.ObservationModel):
        def __init__(self, conf_matrix):
            # conf_matrix is indexed as [time step][state][observation]
            self.observation_matrix = conf_matrix
            self.n_steps, self.n_states, self.n_obs = self.observation_matrix.shape

        def probability(self, observation, next_state, action):
            obs_idx = observation.id
            state_idx = next_state.id
            state_step = next_state.t
            return self.observation_matrix[state_step][state_idx][obs_idx]

The transition model includes the time parameter as well. I would like the time to be fully observable in the produced .pomdpx file, but given your comment above, I think the way I am handling it would not accomplish the MOMDP representation. How should I do it instead?
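One property worth checking in a time-indexed observation model like this is that every (time step, state) row of the matrix is a probability distribution over observations. A minimal sketch in plain Python, assuming the matrix is laid out as [time step][state][observation] as in `probability` above; the helper name `check_rows` and the toy numbers are made up for illustration and are not part of pomdp_py:

```python
import math

def check_rows(observation_matrix, tol=1e-9):
    """Return the (t, state) index pairs whose row does not sum to 1."""
    bad = []
    for t, per_state in enumerate(observation_matrix):
        for s, row in enumerate(per_state):
            if not math.isclose(sum(row), 1.0, abs_tol=tol):
                bad.append((t, s))
    return bad

# Toy matrix: 2 time steps, 2 states, 2 observations (hypothetical values).
conf_matrix = [
    [[0.9, 0.1], [0.2, 0.8]],  # t = 0
    [[0.7, 0.3], [0.4, 0.5]],  # t = 1; the second row sums to 0.9 on purpose
]

print(check_rows(conf_matrix))
```

A check like this only covers integer state IDs; a terminal state whose ID is the string 'term' cannot be used as a matrix index and would need separate handling.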
Follow-up: I tried to convert to .pomdpx with my current problem definition, and the file contains only one state variable, whose number of values equals the number of possible state IDs times the number of possible values of t. In the case of 5 targets and 8 time steps, I get 41 states in a single state variable (the extra state is the terminal state). I would like to know how to define my model so that it has one state variable with 5 values (ID), which is not fully observable, and another state variable with 8 values (time), which is fully observable.
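The count above can be reproduced with a short enumeration. This is only an illustration of the flat versus factored bookkeeping, not of how the .pomdpx converter works; the state-name format follows the `name` attribute from the earlier comment, and the label "term" for the terminal state is an assumption:

```python
from itertools import product

targets = range(5)     # hidden variable: target ID
time_steps = range(8)  # variable meant to be fully observable: time

# Flat encoding: one state per (target, time) pair plus one terminal state.
flat_states = [f"s_{i}-t_{t}" for i, t in product(targets, time_steps)]
flat_states.append("term")

# Factored encoding: the two variables are listed separately.
factored = {"target": list(targets), "time": list(time_steps)}

print(len(flat_states))
print({name: len(values) for name, values in factored.items()})
```

The flat list has 5 * 8 + 1 = 41 entries, while the factored view keeps a 5-valued and an 8-valued variable apart.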
I will provide a sketch for the idea.

    import pomdp_py

    class State(pomdp_py.SimpleState):
        def __init__(self, target, time_step):
            # Both components are stored in the state's data tuple.
            super().__init__(data=(target, time_step))

    class ObservationModel(pomdp_py.ObservationModel):
        def sample(self, next_state, action):
            # Only the time component is returned as the observation.
            time_step = next_state.data[1]
            return pomdp_py.SimpleObservation(data=time_step)

This makes time_step observable, but not the target.
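To see why this sketch hides the target, note that two states that differ only in their target map to the same observation. A library-independent illustration, using plain tuples in place of `SimpleState` and `SimpleObservation`; the `probability` function is an assumption about what a matching probability method could look like and is not part of the sketch above:

```python
def observe(state):
    """Deterministic observation: return only the time component."""
    target, time_step = state
    return time_step

def probability(observation, next_state):
    """1.0 if the observation equals the state's time step, else 0.0."""
    return 1.0 if observation == observe(next_state) else 0.0

s_a = (0, 3)  # target 0 at time step 3
s_b = (4, 3)  # target 4 at time step 3

# Both states yield the same observation, so the target stays hidden.
print(observe(s_a) == observe(s_b))
print(probability(3, s_a), probability(2, s_a))
```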
That makes it clear, thank you! I imagine the ObservationModel needs to know about the target too. What I mean is that the target needs to be part of the observation as well.
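One way to read this is that the observation becomes a pair: a noisy reading of the target plus the exact time step. A minimal sketch in plain Python, assuming a confusion matrix indexed as [time step][target][reading] like the one earlier in the thread; the function name `joint_probability` and the numbers are hypothetical:

```python
def joint_probability(observation, next_state, conf_matrix):
    """P(observation | next_state) for observation = (reading, time)."""
    reading, obs_time = observation
    target, time_step = next_state
    if obs_time != time_step:
        return 0.0  # time is observed exactly, so a mismatch is impossible
    return conf_matrix[time_step][target][reading]

# Toy matrix: 2 time steps, 2 targets, 2 readings (hypothetical values).
conf_matrix = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]

state = (1, 0)  # target 1 at time step 0
all_obs = [(r, t) for r in range(2) for t in range(2)]
total = sum(joint_probability(o, state, conf_matrix) for o in all_obs)
print(total)
```

Summing over every (reading, time) pair for a fixed state should give 1, which is a quick way to confirm the joint observation is still a valid distribution.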
Hello,
I would like to know if it is currently possible to create a problem with fully observable state variables and solve it with SARSOP using a .pomdpx file.