MaxEnt IRL currently estimates the initial state distribution based on the provided expert trajectories:
for traj in self.expert_trajs:
    mu[traj['states'][0], 0] += 1
mu[:, 0] = mu[:, 0] / len(self.expert_trajs)
Is there a way to instead get the exact initial state distribution from the gym.Env, similar to the way we extract the exact transition dynamics?
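
One possible approach, sketched under assumptions: Gym's tabular toy-text environments have exposed the initial state distribution as an attribute, named `isd` in the older `DiscreteEnv` base class and `initial_state_distrib` in later releases. Other environments may not expose anything comparable. The helper names below (`exact_initial_distribution`, `empirical_initial_distribution`) are hypothetical and not part of this repository; the second one simply restates the current trajectory-based estimate as a fallback.

```python
def exact_initial_distribution(env, n_states):
    """Return the env's initial state distribution as a list of floats.

    Returns None when the environment does not expose one.
    Assumption: the attribute is called `isd` or `initial_state_distrib`.
    """
    # Strip wrappers (e.g. TimeLimit) when the env provides `unwrapped`.
    base = getattr(env, "unwrapped", env)
    for name in ("isd", "initial_state_distrib"):
        isd = getattr(base, name, None)
        if isd is None:
            continue
        probs = [float(p) for p in isd]
        if len(probs) != n_states:
            raise ValueError(
                f"expected {n_states} states, but env.{name} has {len(probs)}"
            )
        # Renormalize to guard against small floating-point drift.
        total = sum(probs)
        return [p / total for p in probs]
    return None


def empirical_initial_distribution(expert_trajs, n_states):
    """Fallback: estimate the distribution from each trajectory's first state."""
    counts = [0.0] * n_states
    for traj in expert_trajs:
        counts[traj["states"][0]] += 1.0
    return [c / len(expert_trajs) for c in counts]
```

With these, `mu[:, 0]` could be filled from `exact_initial_distribution(env, n_states)` and fall back to `empirical_initial_distribution(self.expert_trajs, n_states)` whenever the first call returns `None`, so environments without the attribute keep the current behavior.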