assert problem like :"assert len(observations) == self.number_of_predator_observations" #2

Open
kudouxiao opened this issue Mar 1, 2024 · 2 comments

Comments

@kudouxiao

The internal logic of this environment seems to have issues. I want to increase the number of predators, but doing so requires modifying a large number of built-in parameters. After these modifications, training runs for a while and then fails before it can complete a large number of episodes. Even when I try to reproduce the setup from your paper, either with the default parameters or with the number of prey set to 8, I eventually hit the assertion "assert len(observations) == self.number_of_predator_observations". I have tried MADDPG, PPO, and MATD3, and all of them show the same problem. In other words, whenever an episode runs for more steps, the assertion eventually fails, which prevents me from training the agents adequately. Do you have any suggestions for resolving this? Is it possible to reproduce the scenario presented in your paper?

@kudouxiao
Author

I have resolved the issue I was facing, and I am keeping it open in the hope that it will help others. In the environment source code, the parentheses are misplaced in this line: "observations += [0] * self.obs_size * (n_nearest_shark - len(observations))". It should be "observations += [0] * (self.obs_size * n_nearest_shark - len(observations))". Because Python evaluates the multiplications from left to right, the original line pads with self.obs_size * (n_nearest_shark - len(observations)) zeros instead of padding the list up to self.obs_size * n_nearest_shark entries, so the observation length comes out wrong and the assertion fails. Moreover, according to the paper, prey and predators should return observations in the same way, meaning their logic for observing the environment is identical. In the source code, however (which seems to have been written by different authors), the predator observation logic differs from the prey logic, which could lead to errors during prolonged training. I therefore recommend standardizing how both agent types observe the environment.
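For anyone who wants to see the effect of the parenthesis placement in isolation, here is a minimal standalone sketch. It is not the environment's actual code: the function names pad_buggy and pad_fixed, the example values, and the assumption that observations is a flat list holding obs_size values per observed shark are all mine, chosen only to illustrate the two expressions quoted above.

```python
def pad_buggy(observations, obs_size, n_nearest_shark):
    """Padding with the original parenthesis placement (hypothetical helper)."""
    padded = list(observations)
    # Evaluated as ([0] * obs_size) * (n_nearest_shark - len(padded)).
    # A negative multiplier yields an empty list, so nothing is padded
    # once len(padded) exceeds n_nearest_shark.
    padded += [0] * obs_size * (n_nearest_shark - len(padded))
    return padded


def pad_fixed(observations, obs_size, n_nearest_shark):
    """Padding with the corrected parenthesis placement (hypothetical helper)."""
    padded = list(observations)
    # Pads up to the full flat length obs_size * n_nearest_shark.
    padded += [0] * (obs_size * n_nearest_shark - len(padded))
    return padded


# Assumed example values: 3 numbers per shark, up to 4 nearest sharks.
obs_size, n_nearest_shark = 3, 4
expected_len = obs_size * n_nearest_shark

# Two sharks currently visible, so the flat list holds 6 values.
two_sharks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

buggy = pad_buggy(two_sharks, obs_size, n_nearest_shark)
fixed = pad_fixed(two_sharks, obs_size, n_nearest_shark)

print(len(buggy), len(fixed), expected_len)  # 6 12 12
```

Under these assumptions the original expression only produces the right length when no sharks are observed at all, which would explain why the assertion shows up only after training has run for a while. Routing both predator and prey observations through one shared padding helper like pad_fixed would be one way to standardize the two code paths.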

@michaelkoelle
Owner

Thank you for bringing this to our attention. We are actively working on v1.0, where this will be fixed. After the experiments for the paper, we did a major refactor of the whole environment, which unfortunately broke many things.
