This subdirectory is the home of all human experiment data for the Overcooked game. All data was collected through Mturk and is fully anonymized. While the data collection code is proprietary, it relies heavily on the open source Overcooked-demo project.
Data was collected on behalf of the Center for Human-Compatible AI (CHAI) at UC Berkeley. Do not distribute in any manner without the express consent of CHAI or its affiliates. If you have questions regarding data rights, privacy, or distribution please contact either Nathan Miller at [email protected] or Micah Carroll at [email protected].
This directory is subdivided into three subdirectories as follows:

- `human_data/`
  - `raw/`
    - Contains all unprocessed, unfiltered data in CSV form
    - Data is divided into 2019 experiments, collected for this paper, and 2020 experiments, collected on more complex layouts with updated dynamics
  - `cleaned/`
    - Contains processed, filtered data as pickled pandas DataFrames
    - Data is again divided into 2019 and 2020 experiments
    - Data is further divided into 'all', 'train', and 'test' sets
    - Code for performing this pre-processing is available here, with further info found below
  - `dummy/`
    - A strict subset of the data in the other two subdirectories
    - Useful for making tests more lightweight and reproducible
    - Do NOT use for production purposes
The current raw data schema is as follows:

```python
NEW_SCHEMA = set(['state', 'joint_action', 'reward', 'time_left', 'score', 'time_elapsed', 'cur_gameloop', 'layout',
                  'layout_name', 'trial_id', 'player_0_id', 'player_1_id', 'player_0_is_human', 'player_1_is_human'])
```
Each row in the CSV corresponds to a single, discrete timestep in the underlying MDP.
Note: A 'trial' refers to a singular Overcooked game on a single layout.
- `state` (JSON): A JSON-serialized version of an `OvercookedState` instance. Support for converting the JSON into an `OvercookedState` python object is found in the Overcooked-ai repo
- `joint_action` (JSON): A JSON-serialized version of a joint Overcooked action. The player 0 action is at index 0, and similarly for player 1
- `reward` (int): The sparse reward achieved in this particular transition
- `time_left` (float): The wall-clock time remaining in the trial
- `score` (float): Cumulative sparse reward achieved by both players at this point in the game
- `time_elapsed` (float): Wall-clock time since the beginning of the trial
- `cur_gameloop` (int): Number of discrete MDP timesteps since the beginning of the trial
- `layout` (string): The 'terrain', or all static parts (pots, ingredients, counters, etc.) of the layout, serialized in a string encoding
- `layout_name` (string): Human-readable name given to the specific layout
- `trial_id` (string): Unique identifier given to the trial (again, note that a trial is a single Overcooked game; a single player pair could experience many trials)
- `player_0_id` (string): Anonymized ID given to this particular Psiturk worker. Note that these were independently generated by us on the backend, so there is no relation to the worker's Turk ID. If the player is an AI, the player ID is a hardcoded `AI_ID` constant
- `player_1_id` (string): Symmetric to `player_0_id`
- `player_0_is_human` (bool): Indicates whether `player_0` is controlled by a human or an AI
- `player_1_is_human` (bool): Symmetric to `player_0_is_human`
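For example, a minimal sketch of inspecting a single raw row and de-serializing its `state` column might look like the following. The CSV file name is hypothetical, the `state` column is assumed to hold JSON strings as described above, and `OvercookedState.from_dict` is assumed to be the relevant deserialization entry point; check the Overcooked-ai repo for the exact API.

```python
import json

import pandas as pd

# Assumption: OvercookedState.from_dict is the JSON -> object entry point;
# verify against the Overcooked-ai repo before relying on it.
from overcooked_ai_py.mdp.overcooked_mdp import OvercookedState

# Hypothetical file name; substitute whichever raw CSV you want to inspect
raw_df = pd.read_csv("human_data/raw/some_2019_trials.csv")

row = raw_df.iloc[0]
state = OvercookedState.from_dict(json.loads(row["state"]))
joint_action = json.loads(row["joint_action"])  # [player_0_action, player_1_action]

print(row["layout_name"], row["trial_id"], state.players, joint_action)
```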
In the course of pre-processing, several additional columns are computed from the underlying raw data and added for convenience. They are as follows:

- `cur_gameloop_total` (int): Total number of MDP timesteps in this trial. Note that this is constant across all rows with an equivalent `trial_id`
- `score_total` (int): Final score of the trial
- `button_press` (int): Whether a keyboard stroke was performed by a human at this timestep. Each non-wait action counts as one button press
- `button_press_total` (int): Total number of (human) button presses performed in the entire trial
- `button_presses_per_timestep` (float): `button_press_total / cur_gameloop_total`
- `timesteps_since_interact` (int): Number of MDP timesteps since the last human-input 'INTERACT' action
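To make the relationship between the raw and derived columns concrete, here is a hedged pandas sketch that recomputes the per-trial aggregates by grouping on `trial_id`. This is illustrative only; the canonical computation lives in `format_trials_df` (see below) and may differ in details such as how wait actions are counted.

```python
import pandas as pd

def add_derived_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative recomputation of the per-trial aggregates described above."""
    df = df.copy()
    grouped = df.groupby("trial_id")

    # One row per MDP timestep, so the group size is the trial length
    df["cur_gameloop_total"] = grouped["cur_gameloop"].transform("size")
    # Score is cumulative within a trial, so its max is the final score
    df["score_total"] = grouped["score"].transform("max")

    # Assumes a `button_press` column (0/1 per timestep) has already been computed
    df["button_press_total"] = grouped["button_press"].transform("sum")
    df["button_presses_per_timestep"] = df["button_press_total"] / df["cur_gameloop_total"]
    return df
```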
All data processing utils are found in the human directory.
- `process_dataframes.py`
  - High-level, user-facing methods are found in this file (see the usage sketch after this list)
  - `get_human_human_trajectories` accepts a layout name and returns de-serialized Overcooked trajectories, if data for that layout exists. This is the highest-level function, used by the BC training pipeline
  - `csv_to_df_pickle` loads, processes, and filters raw CSV data and saves it as a pickled DataFrame
  - `format_trials_df` is a helper to `csv_to_df_pickle`; it handles all pre-processing and builds the processed schema mentioned in the previous section
  - `filter_trials` is a helper to `csv_to_df_pickle`; it allows the user to specify a filter function that filters entire trials
  - `filter_transitions` allows the user to specify a filter function that filters at a transition-by-transition level
- `data_processing_utils.py`
  - Lower-level helper functions are found in this file
  - Primarily for converting CSV and DataFrame representations into python Overcooked objects
  - One abstraction level lower than `process_dataframes.py`; recommended for advanced users only
- `data_wrangling.ipynb`
  - Interactive Jupyter notebook exemplifying use of the `process_dataframes` functionality
- `process_human_trials.py`
  - Script for converting legacy dynamics into a form compatible with the current dynamics (see the conceptual sketch after this list)
  - Previously, the Overcooked MDP began automatically cooking a soup once a valid recipe was in the pot; now, an INTERACT action is explicitly required to begin cooking
  - This script imputes dummy INTERACT actions at every timestep where soup cooking begins
  - See overcooked-ai for more details on MDP dynamics and game rules
- `human_data_forward_comp.py`
  - Utility script for converting the deprecated schema to the updated schema listed in the previous section
  - All data currently in the repo uses the updated schema, so this is included only for legacy reasons
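A hedged usage sketch of the `process_dataframes.py` entry points referenced above. The function names come from this section, but the import path, keyword arguments, and file names are assumptions; check the module's actual signatures before relying on them.

```python
# Assumed import path; adjust to wherever process_dataframes.py lives in your checkout
from human_aware_rl.human.process_dataframes import (
    csv_to_df_pickle,
    get_human_human_trajectories,
)

# Convert a raw CSV into a cleaned, pickled DataFrame (paths/arguments are hypothetical)
csv_to_df_pickle(
    csv_path="human_data/raw/some_2019_trials.csv",
    out_dir="human_data/cleaned",
    out_file_prefix="some_2019_trials",
)

# Load de-serialized trajectories for specific layouts, e.g. for BC training
trajectories = get_human_human_trajectories(layouts=["cramped_room"], dataset_type="train")
```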
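Finally, a purely conceptual sketch of the INTERACT imputation performed by `process_human_trials.py`. The real script operates on full `OvercookedState` objects and decides which player to credit with starting the cook; here `cooking_just_started` and `player_at_pot` are hypothetical helpers supplied by the caller, and the details may differ from the actual implementation.

```python
INTERACT = "interact"

def impute_interact_actions(states, joint_actions, cooking_just_started, player_at_pot):
    """Insert dummy INTERACT actions wherever a soup started cooking under the old
    auto-cook dynamics, so the trajectory is valid under the current dynamics."""
    fixed = [list(a) for a in joint_actions]
    for t in range(1, len(states)):
        if cooking_just_started(states[t - 1], states[t]):
            # Overwrite the responsible player's action on the transition that
            # triggered the cook with an explicit INTERACT
            player_idx = player_at_pot(states[t - 1])
            fixed[t - 1][player_idx] = INTERACT
    return fixed
```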