Merge pull request #89 from Unity-Technologies/develop-parameters
Obstacle Tower Environment 2.0
awjuliani authored May 14, 2019
2 parents 5699e90 + 4c9e1f6 commit d31de13
Showing 6 changed files with 139 additions and 54 deletions.
15 changes: 12 additions & 3 deletions README.md
@@ -29,6 +29,11 @@ To learn more, please read our AAAI Workshop paper:
* v1.3 Hotfix release.
* Resolves memory leak when running in Docker.
* Fixes issue where environment could freeze on certain higher floors.
* v2.0 Obstacle Tower Challenge Round 2 Release.
* Towers can now be generated with up to 100 floors.
* Additional visual themes, obstacles, enemy types, and floor difficulties added.
* Additional reset parameters added to customize generated towers. Go [here](./reset-parameters.md) for details on the parameters and their values.
* Various bugs fixed and performance improvements.


## Installation
@@ -48,9 +53,9 @@ Python dependencies (also in [setup.py](https://github.com/Unity-Technologies/ob

| *Platform* | *Download Link* |
| --- | --- |
| Linux (x86_64) | https://storage.googleapis.com/obstacle-tower-build/v1.3/obstacletower_v1.3_linux.zip |
| Mac OS X | https://storage.googleapis.com/obstacle-tower-build/v1.3/obstacletower_v1.3_osx.zip |
| Windows | https://storage.googleapis.com/obstacle-tower-build/v1.3/obstacletower_v1.3_windows.zip |
| Linux (x86_64) | https://storage.googleapis.com/obstacle-tower-build/v2.0/obstacletower_v2.0_linux.zip |
| Mac OS X | https://storage.googleapis.com/obstacle-tower-build/v2.0/obstacletower_v2.0_osx.zip |
| Windows | https://storage.googleapis.com/obstacle-tower-build/v2.0/obstacletower_v2.0_windows.zip |

For checksums on these files, see [here](https://storage.googleapis.com/obstacle-tower-build/v1.3/ote-v1.3-checksums.txt).

@@ -68,6 +73,10 @@ $ pip install -e .

To see an example of how to interact with the environment using the gym interface, see our [Basic Usage Jupyter Notebook](examples/basic_usage.ipynb).

### Customizing the environment

Obstacle Tower can be configured in a number of different ways to adjust the difficulty and content of the environment. This is done through the use of reset parameters, which can be set when calling `env.reset()`. See [here](./reset-parameters.md) for a list of the available parameters to adjust.

### Player Control

It is also possible to launch the environment in "Player Mode," and directly control the agent using a keyboard. This can be done by double-clicking on the binary file. The keyboard controls are as follows:
87 changes: 57 additions & 30 deletions examples/basic_usage.ipynb

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions examples/gcp_training.md
@@ -107,8 +107,8 @@ cd ../
The download URLs for Obstacle Tower on each platform are listed at [https://github.com/Unity-Technologies/obstacle-tower-env](https://github.com/Unity-Technologies/obstacle-tower-env). For GCP, run:

```bash
wget https://storage.googleapis.com/obstacle-tower-build/v1.3/obstacletower_v1.3_linux.zip
unzip obstacletower_v1.3_linux.zip
wget https://storage.googleapis.com/obstacle-tower-build/v2.0/obstacletower_v2.0_linux.zip
unzip obstacletower_v2.0_linux.zip
```

### Install Dopamine
@@ -157,7 +157,7 @@ cp dopamine_otc/unity_lib.py dopamine/dopamine/discrete_domains/unity_lib.py
cp dopamine_otc/rainbow_otc.gin dopamine/dopamine/agents/rainbow/configs/rainbow_otc.gin
```

If you didn’t extract the `obstacletower_v1.3_linux.zip` to the home directory, you will need to edit `rainbow_otc.gin`, specifically `create_otc_environment.environment_path` should correspond to the path to your extracted OTC executable file.
If you didn’t extract the `obstacletower_v2.0_linux.zip` to the home directory, you will need to edit `rainbow_otc.gin`, specifically `create_otc_environment.environment_path` should correspond to the path to your extracted OTC executable file.

Furthermore, within this file you will find settings on how long to train for, and how often to evaluate your agent. Each iteration, Dopamine will train for `Runner.training_steps`, evaluate (i.e. run in inference mode) for `Runner.evaluation_steps`, record these results, and checkpoint the agent. It will repeat this process `Runner.num_iterations` number of times before quitting. For instance, you can change `Runner.num_iterations` to 40 to train for 10 million steps. You can also reduce `Runner.evaluation_steps` to reduce the time spent not training. There are other hyperparameters found in this file, which you can modify to improve performance. But for the sake of this exercise, you may leave them as-is.
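
As a rough check of the iteration arithmetic above, here is a minimal sketch. It assumes `Runner.training_steps` is 250,000, a value inferred from "40 iterations gives 10 million steps" rather than quoted from `rainbow_otc.gin`, and it uses a purely hypothetical `Runner.evaluation_steps` of 125,000.

```python
# Back-of-the-envelope step budget for a Dopamine run.
# ASSUMPTION: training_steps=250_000 is inferred from the text
# ("40 iterations -> 10 million steps"); evaluation_steps=125_000 is
# hypothetical. Check rainbow_otc.gin for the values actually in use.

def total_steps(num_iterations, training_steps, evaluation_steps=0):
    """Return (training, evaluation) environment steps over a full run."""
    return num_iterations * training_steps, num_iterations * evaluation_steps


train, evaluate = total_steps(
    num_iterations=40, training_steps=250_000, evaluation_steps=125_000
)
print(train)     # 10000000
print(evaluate)  # 5000000
```

Lowering `evaluation_steps` shrinks only the second number, which is why it reduces wall-clock time without changing how much training is done.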

64 changes: 47 additions & 17 deletions obstacle_tower_env.py
@@ -20,10 +20,10 @@ class UnityGymException(error.Error):


class ObstacleTowerEnv(gym.Env):
ALLOWED_VERSIONS = ['1', '1.1', '1.2', '1.3']
ALLOWED_VERSIONS = ['2.0']

def __init__(self, environment_filename=None, docker_training=False, worker_id=0, retro=True,
timeout_wait=30, realtime_mode=False):
timeout_wait=30, realtime_mode=False, config=None, greyscale=False):
"""
Arguments:
environment_filename: The file path to the Unity executable. Does not require the extension.
@@ -64,11 +64,19 @@ def __init__(self, environment_filename=None, docker_training=False, worker_id=0
self._n_agents = None
self._done_grading = False
self._flattener = None
self._greyscale = greyscale

# Environment reset parameters
self._seed = None
self._floor = None

self.realtime_mode = realtime_mode
self.game_over = False # Hidden flag used by Atari environments to determine if the game is over
self.retro = retro
if config is not None:
self.config = config
else:
self.config = None

flatten_branched = self.retro
uint8_visual = self.retro
@@ -107,7 +115,10 @@ def __init__(self, environment_filename=None, docker_training=False, worker_id=0
high = np.array([np.inf] * brain.vector_observation_space_size)
self.action_meanings = brain.vector_action_descriptions

depth = 3
if self._greyscale:
depth = 1
else:
depth = 3
image_space_max = 1.0
image_space_dtype = np.float32
camera_height = brain.camera_resolutions[0]["height"]
@@ -139,15 +150,20 @@ def done_grading(self):
def is_grading(self):
return os.getenv('OTC_EVALUATION_ENABLED', False)

def reset(self):
def reset(self, config=None):
"""Resets the state of the environment and returns an initial observation.
In the case of multi-agent environments, this is a list.
Returns: observation (object/list): the initial observation of the
space.
"""
reset_params = {}
if config is None:
reset_params = {}
if self.config is not None:
reset_params = self.config
else:
reset_params = config
if self._floor is not None:
reset_params['floor-number'] = self._floor
reset_params['starting-floor'] = self._floor
if self._seed is not None:
reset_params['tower-seed'] = self._seed
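
The precedence that the new `reset()` body appears to implement can be sketched as a standalone function: a config passed to `reset()` wins over the one given at construction, and values set through `floor()` and `seed()` override both. The function name `resolve_reset_params` is hypothetical, and unlike the code in the diff this sketch copies the dictionary instead of mutating the caller's object.

```python
# Sketch of the reset-parameter precedence seen in the hunk above.
# ASSUMPTION: resolve_reset_params is an illustrative helper, not part of
# obstacle_tower_env.py. It copies its inputs; the original mutates them.

def resolve_reset_params(reset_config=None, default_config=None,
                         floor=None, seed=None):
    """Merge reset parameters the way ObstacleTowerEnv.reset() seems to."""
    if reset_config is None:
        # Fall back to the config given at construction time, if any.
        params = dict(default_config) if default_config is not None else {}
    else:
        params = dict(reset_config)
    # Values stored by env.floor() / env.seed() override either config.
    if floor is not None:
        params['starting-floor'] = floor
    if seed is not None:
        params['tower-seed'] = seed
    return params


print(resolve_reset_params(default_config={'total-floors': 10}))
# {'total-floors': 10}
print(resolve_reset_params({'total-floors': 5}, {'total-floors': 10}, seed=7))
# {'total-floors': 5, 'tower-seed': 7}
```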

@@ -197,18 +213,31 @@ def step(self, action):
def _single_step(self, info):
self.visual_obs = self._preprocess_single(info.visual_observations[0][0, :, :, :])

self.visual_obs, keys, time, current_floor = self._prepare_tuple_observation(
self.visual_obs, info.vector_observations[0])

if self.retro:
self.visual_obs = self._resize_observation(self.visual_obs)
self.visual_obs = self._add_stats_to_image(
self.visual_obs, info.vector_observations[0])
default_observation = self.visual_obs
else:
default_observation = self._prepare_tuple_observation(
self.visual_obs, info.vector_observations[0])
default_observation = self.visual_obs, keys, time, current_floor

if self._greyscale:
default_observation = self._greyscale_obs(default_observation)

return default_observation, info.rewards[0], info.local_done[0], {
"text_observation": info.text_observations[0],
"brain_info": info}
"brain_info": info,
"total_keys": keys,
"time_remaining": time,
"current_floor": current_floor
}

def _greyscale_obs(self, obs):
new_obs = np.floor(np.expand_dims(np.mean(obs, axis=2), axis=2)).astype(np.uint8)
return new_obs

def _preprocess_single(self, single_visual_obs):
if self.uint8_visual:
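
To make the effect of `_greyscale_obs` concrete, here is a minimal sketch of the same channel-averaging step applied to a single image array. The 84x84 frame size and the pixel values are arbitrary choices for illustration, and `to_greyscale` is a hypothetical standalone name.

```python
import numpy as np

# Sketch of the channel-averaging greyscale conversion from the diff.
# ASSUMPTION: the input is one (H, W, C) uint8 image; the 84x84 size and
# the pixel values below are arbitrary illustration data.

def to_greyscale(obs):
    """Average the colour channels and keep a trailing channel axis."""
    return np.floor(np.expand_dims(np.mean(obs, axis=2), axis=2)).astype(np.uint8)


frame = np.zeros((84, 84, 3), dtype=np.uint8)
frame[..., 0] = 30   # red
frame[..., 1] = 60   # green
frame[..., 2] = 91   # blue

grey = to_greyscale(frame)
print(grey.shape)      # (84, 84, 1)
print(grey[0, 0, 0])   # 60
```

This sketch handles only a bare image array. With `retro=False` the default observation is a tuple, so the image element would presumably need to be converted on its own before the tuple is rebuilt.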
@@ -244,11 +273,11 @@ def seed(self, seed=None):

seed = int(seed)
if seed < 0 or seed >= 100:
logger.warn(
logger.warning(
"Seed outside of valid range [0, 100). A random seed "
"within the valid range will be used on next reset."
)
logger.warn("New seed " + str(seed) + " will apply on next reset.")
logger.warning("New seed " + str(seed) + " will apply on next reset.")
self._seed = seed

def floor(self, floor=None):
@@ -259,12 +288,12 @@ def floor(self, floor=None):
return

floor = int(floor)
if floor < 0 or floor >= 25:
logger.warn(
"Starting floor outside of valid range [0, 25). Floor 0 will be used"
if floor < 0 or floor >= 99:
logger.warning(
"Starting floor outside of valid range [0, 99). Floor 0 will be used "
"on next reset."
)
logger.warn("New starting floor " + str(floor) + " will apply on next reset.")
logger.warning("New starting floor " + str(floor) + " will apply on next reset.")
self._floor = floor

@staticmethod
@@ -283,8 +312,9 @@ def _prepare_tuple_observation(vis_obs, vector_obs):
"""
key = vector_obs[0:6]
time = vector_obs[6]
floor_number = vector_obs[7]
key_num = np.argmax(key, axis=0)
return vis_obs, key_num, time
return vis_obs, key_num, time, floor_number

@staticmethod
def _add_stats_to_image(vis_obs, vector_obs):
@@ -293,7 +323,7 @@ def _add_stats_to_image(vis_obs, vector_obs):
"""
key = vector_obs[0:6]
time = vector_obs[6]
key_num = np.argmax(key, axis=0)
key_num = int(np.argmax(key, axis=0))
time_num = min(time, 10000) / 10000

vis_obs[0:10, :, :] = 0
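
The vector-observation layout used by `_prepare_tuple_observation` (six key slots, then time, then floor) and the new `info` keys returned from `_single_step` can be sketched without the environment. The helper name `parse_vector_obs` and the sample values are hypothetical, and plain Python stands in for `np.argmax` here.

```python
# Sketch of how the 8-element vector observation is split up in the diff.
# ASSUMPTION: parse_vector_obs and the sample vector are illustrative only;
# the real code uses np.argmax, which also returns the first maximal index.

def parse_vector_obs(vector_obs):
    """Split a vector observation into the stats exposed through `info`."""
    keys = vector_obs[0:6]            # one slot per possible key count
    time_remaining = vector_obs[6]
    current_floor = vector_obs[7]
    total_keys = max(range(len(keys)), key=lambda i: keys[i])
    return {
        "total_keys": total_keys,
        "time_remaining": time_remaining,
        "current_floor": current_floor,
    }


sample = [0, 0, 1, 0, 0, 0, 2500.0, 3.0]
print(parse_vector_obs(sample))
# {'total_keys': 2, 'time_remaining': 2500.0, 'current_floor': 3.0}
```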
19 changes: 19 additions & 0 deletions reset-parameters.md
@@ -0,0 +1,19 @@
# Obstacle Tower Reset Parameters

Obstacle Tower can be configured in a variety of ways, both when launching the environment and on episode reset. Below is a list of parameters, along with their value ranges and effects. Pass these as part of a `config` dictionary when calling `env.reset(config=config)`, or as part of a dictionary when launching the environment with `ObstacleTowerEnv('path_to_binary', config=config)`.

*Note: The config passed on environment launch will be the default used when starting a new episode if there is no config passed during `env.reset()`.*

| *Parameter* | *Value range* | *Effect* |
| --- | --- | --- |
| `tower-seed` | (-1, 9999) | Sets the seed used to generate the tower. -1 corresponds to a random tower on every `reset()` call. |
| `starting-floor` | (0, 99) | Sets the starting floor for the agent on `reset()`. |
| `total-floors` | (1, 100) | Sets the maximum number of possible floors in the tower. |
| `dense-reward` | (0, 1) | Whether to use the sparse (0) or dense (1) reward function. |
| `lighting-type` | (0, 1, 2) | Whether to use no realtime light (0), a single realtime light with minimal color variations (1), or a realtime light with large color variations (2). |
| `visual-theme` | (0, 1, 2) | Whether to use only the `default-theme` (0), the normal ordering of themes (1), or a random theme every floor (2). |
| `agent-perspective` | (0, 1) | Whether to use first-person (0) or third-person (1) perspective for the agent. |
| `allowed-rooms` | (0, 1, 2) | Whether to use only normal rooms (0), normal and key rooms (1), or normal, key, and puzzle rooms (2). |
| `allowed-modules` | (0, 1, 2) | Whether to fill rooms with no modules (0), only easy modules (1), or the full range of modules (2). |
| `allowed-floors` | (0, 1, 2) | Whether to include only straightforward floor layouts (0), layouts that include branching (1), or layouts that include branching and circling (2). |
| `default-theme` | (0, 1, 2, 3, 4) | Whether to set the default theme to `Ancient` (0), `Moorish` (1), `Industrial` (2), `Modern` (3), or `Future` (4). |
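
A config dictionary can be checked against the table above before it is handed to the environment. The following is a minimal sketch: `ALLOWED_VALUES` and `validate_config` are hypothetical names, not part of `obstacle_tower_env`, and the numeric ranges are assumed to be inclusive at both ends.

```python
# Sketch: validate a reset config against the documented value ranges.
# ASSUMPTION: ALLOWED_VALUES and validate_config are illustrative helpers,
# and the ranges in the table are treated as inclusive on both ends.

ALLOWED_VALUES = {
    "tower-seed": range(-1, 10000),
    "starting-floor": range(0, 100),
    "total-floors": range(1, 101),
    "dense-reward": (0, 1),
    "lighting-type": (0, 1, 2),
    "visual-theme": (0, 1, 2),
    "agent-perspective": (0, 1),
    "allowed-rooms": (0, 1, 2),
    "allowed-modules": (0, 1, 2),
    "allowed-floors": (0, 1, 2),
    "default-theme": (0, 1, 2, 3, 4),
}


def validate_config(config):
    """Return config unchanged, or raise if a name or value is not allowed."""
    for name, value in config.items():
        if name not in ALLOWED_VALUES:
            raise KeyError("Unknown reset parameter: " + name)
        if value not in ALLOWED_VALUES[name]:
            raise ValueError("Value %r is out of range for %s" % (value, name))
    return config


config = validate_config({"tower-seed": 42, "total-floors": 10, "dense-reward": 1})
print(config)  # {'tower-seed': 42, 'total-floors': 10, 'dense-reward': 1}
# With a running environment this would then be passed as:
#   obs = env.reset(config=config)
```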
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setup(
name='obstacle_tower_env',
version='1.3',
version='2.0',
author='Unity Technologies',
url='https://github.com/Unity-Technologies/obstacle-tower-env',
py_modules=["obstacle_tower_env"],
