MaxEnt estimates the initial state distribution rather than getting it from the env #38

Open
maxmdaniel opened this issue Oct 12, 2018 · 0 comments


MaxEnt IRL currently estimates the initial state distribution based on the provided expert trajectories:
    for traj in self.expert_trajs:
        mu[traj['states'][0], 0] += 1
    mu[:, 0] = mu[:, 0] / len(self.expert_trajs)

Is there a way to instead get the exact initial state distribution from the gym.Env, similar to the way we extract the exact transition dynamics?
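
One possible direction, sketched under assumptions: the tabular toy-text environments in gym at the time of this issue subclass `DiscreteEnv`, which stores the initial state distribution in an `isd` attribute next to the transition table `P`. The helper below is hypothetical (`initial_state_distribution` and its `n_states` parameter are not part of this repo). It reads `isd` when the env exposes it and otherwise falls back to the empirical estimate from the expert trajectories. Other env types, and other gym versions, may not expose `isd` at all, so the fallback stays in place.

```python
def initial_state_distribution(env, n_states, expert_trajs=None):
    """Return the initial state distribution as a list of n_states floats.

    Assumption: a tabular env exposes `isd` (as gym's DiscreteEnv did).
    If it does not, estimate the distribution from expert trajectories.
    """
    # Look through wrappers such as TimeLimit when present.
    base = getattr(env, "unwrapped", env)
    isd = getattr(base, "isd", None)

    if isd is not None:
        mu0 = [float(p) for p in isd]
        if len(mu0) != n_states:
            raise ValueError("env.isd has %d entries, expected %d"
                             % (len(mu0), n_states))
        total = sum(mu0)
        # Renormalize to guard against small rounding drift.
        return [p / total for p in mu0]

    if not expert_trajs:
        raise ValueError("env has no `isd` and no expert trajectories given")

    # Fallback: the empirical estimate currently used by MaxEnt IRL.
    mu0 = [0.0] * n_states
    for traj in expert_trajs:
        mu0[traj["states"][0]] += 1.0
    return [count / len(expert_trajs) for count in mu0]
```

Inside MaxEnt this could replace the loop above with something like `mu[:, 0] = initial_state_distribution(self.env, n_states, self.expert_trajs)`, assuming the class keeps a reference to the env (the attribute name `self.env` is a guess).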
