fix: installation instructions
wangziqi000 committed Feb 20, 2024
1 parent 6c5750e commit 6f1d6d4
Showing 5 changed files with 199 additions and 24 deletions.
77 changes: 59 additions & 18 deletions README.md
@@ -1,7 +1,7 @@


# GDTM
Dataset Repository of NeurIPS 2023 Track on Datasets and Benchmarks Paper #207
Dataset Repository of GDTM: An Indoor Geospatial Tracking Dataset with Distributed Multimodal Sensors

## Overview

@@ -11,7 +11,7 @@ One of the critical sensing tasks in indoor environments is geospatial tracking,

### External Links

- **[Data]** We will be hosting the dataset on IEEE Dataport under a CC-BY-4.0 license for public access. The public dataset repository will be ready before cameraready, as it takes a lot of time to upload terabytes of data online. We hereby provide **a Google Drive link to part of the dataset available for the reviewers** before the terabyte full dataset is available online.
- **[Data]** We will be hosting the dataset on IEEE Dataport under a CC-BY-4.0 license for public access in the long term. We hereby provide **a Google Drive link to the dataset** before the long-term-support dataset repository is available online.

https://drive.google.com/drive/folders/1N0b8-o9iipR7m3sq7EnTHrk_fRR5eFug?usp=sharing

@@ -56,9 +56,9 @@ The dataset covers three cases: one car, two cars, and one car in poor illumination conditions.
| | Good Lighting Condition | Good Lighting Condition | Good Lighting Condition | Poor Lighting Condition | Poor Lighting Condition |
|:---------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|:------------------------:|
| | Red Car | Green Car | Two Cars | Red Car | Green Car |
| # of Viewpoints | 16 | 17 | 7 | 9 | 4 |
| Durations (minutes) | 150 | 140 | 80 | 140 | 40 |
| Size (GB) | 220 | 220 | 120 | 195 | 55 |
| # of Viewpoints | 18 | 15 | 6 | 7 | 2 |
| Durations (minutes) | 200 | 165 | 95 | 145 | 35 |
| Size (GB) | 278 | 235 | 134 | 202 | 45 |

### Data Hierarchy

@@ -87,7 +87,7 @@ The dataset is organized first by experiment settings, then by viewpoints.
├── View 3
└── ...
```
Here each _dataN/_ folder indicates one experiment session, which lasts typically 5-15 minutes.
Here each _dataN/_ folder indicates one experiment session, which typically lasts 5-15 minutes. Please refer to ./USING_GDTM/dataset_metadata_final.csv for metadata such as sensor placement viewpoints, lighting conditions, and tracking targets.
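The metadata CSV can also be filtered programmatically, for example to pick out sessions with a particular lighting condition or tracking target. Below is a minimal sketch; the column names come from dataset_metadata_final.csv, but the helper name `select_sessions` is ours:

```python
import csv

def select_sessions(csv_path, good_illumination=True, cars=("r", "g")):
    """Return (exp_id, viewpoint) pairs matching the illumination/target filter.

    Columns assumed from dataset_metadata_final.csv:
    Exp ID, Cars, Duration(Minutes), Good Illumination, Viewpoint #, Notes
    """
    wanted = "1" if good_illumination else "0"
    selected = []
    with open(csv_path, newline="") as fp:
        for row in csv.DictReader(fp):  # DictReader handles the quoted Notes field
            if row["Good Illumination"] == wanted and row["Cars"] in cars:
                selected.append((int(row["Exp ID"]), int(row["Viewpoint #"])))
    return selected
```

For instance, `select_sessions("USING_GDTM/dataset_metadata_final.csv")` would list the good-illumination single-car sessions together with their viewpoint IDs.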

Inside each _dataN/_ folder lies the data of that session; the hierarchy is shown below
```
@@ -192,15 +192,43 @@ To reduce the burden on our users, we aggregate each step into a few one-line scripts.
│ └── respeaker.hdf5
└── mocap.hdf5
```
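Once converted, the per-node HDF5 files can be inspected with h5py. A small sketch (the group and dataset names inside the GDTM files are not specified here, so the printout is simply whatever the file contains):

```python
import h5py

def print_hdf5_tree(path):
    """Recursively print every group and dataset inside one HDF5 file."""
    with h5py.File(path, "r") as f:
        def visitor(name, obj):
            if isinstance(obj, h5py.Dataset):
                print(f"{name}  dataset  shape={obj.shape}  dtype={obj.dtype}")
            else:
                print(f"{name}  group")
        f.visititems(visitor)
```

For example, `print_hdf5_tree("data/processed/node_1/respeaker.hdf5")` would list the contents of node 1's microphone-array file.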
### Installation Instructions
### Installation Instructions (Build the container)

<!-- ros -->
The data-processing requires building our Docker image. To install Docker, please refer to the "Install using the apt repository" section in https://docs.docker.com/engine/install/ubuntu/.

Under the _gdtm_preprocess/_ directory, run `bash build.sh` to build the Docker container. Run `bash run-dp.sh <data_directory>` to drop into a shell in the container with `<data_directory>` mounted at `./data`. Alternatively, run `bash cmd-run-dp.sh <data_directory> <command>` to execute a single command within the container.
Under the _gdtm_preprocess/_ directory, modify `build.sh`, `cmd-run-dp.sh`, and `run-dp.sh`, changing their first three lines to:

```
export MCP_USER=$USER
export MCP_HOME=YOUR_PATH_TO_GDTM
export MCP_ROOT=${MCP_HOME}/gdtm_preprocess
```

Then, still under the _gdtm_preprocess/_ directory, run
```
bash build.sh
```
to build the Docker container.

If a "permission denied" error appears, try the following:

(1) Create the docker group.
```
sudo groupadd docker
```
It is OK if the group 'docker' already exists.

(2) Add your user to the docker group.
```
sudo usermod -aG docker $USER
```
A reboot may be necessary.

**Basic usage of our Docker container**: [Note: for your information only; you do not need to run these commands to pre-process the data.] Run `bash run-dp.sh <data_directory>` to drop into a shell in the container with `<data_directory>` mounted at `./data`. Alternatively, run `bash cmd-run-dp.sh <data_directory> <command>` to execute a single command within the container.

The Docker container is built and tested in Ubuntu 20.04.6 LTS on an x86-64 Intel NUC with an NVIDIA GeForce RTX 2060 graphics card and NVIDIA driver version 470.182.03 and CUDA version 11.4.

You may find [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker) useful if you encounter the error message
You may find the "Installing with Apt" and "Configuring Docker" sections in [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker) useful if you encounter the error message
```
could not select device driver "" with capabilities: [[gpu]].
```
@@ -209,13 +237,17 @@

We will refer to raw data formatted in the hierarchy shown above (a _dataN/_ folder) as an experiment.

To process a single experiment, move contents of the experiment into a directory named _raw/_ and then run `cmd-run-dp.sh <data_directory> 'bash helper/convert.sh'`. For example, to process _dataN/_, we would arrange its contents in the following hierarchy:
To process a single experiment, move the contents of the experiment into a directory named _<data_directory>/raw/_. `<data_directory>` can have any name and be located anywhere, but must be given as a full, absolute path. For example, to process _dataN/_, we would arrange its contents in the following hierarchy:
```
data/
<data_directory>
└── raw/
└── contents of dataN/...
└── contents of dataN/... (folders node1/2/3, metadata.json, optitrack.csv)
```
Then we would run `bash cmd-run-dp.sh data/ 'bash helper/convert.sh'`.
Then we would run
```
bash cmd-run-dp.sh <data_directory> 'bash helper/convert.sh'
```
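When many experiments need converting, the one-line command can be scripted. Below is a hypothetical batch helper; the function name and `dry_run` flag are ours, and only `cmd-run-dp.sh` and `helper/convert.sh` come from the instructions above:

```python
import os
import subprocess

def convert_experiment(data_directory, script="helper/convert.sh", dry_run=False):
    """Run the GDTM preprocessing container on one staged experiment.

    data_directory must already contain a raw/ folder with the dataN/ contents,
    and this must be called from the gdtm_preprocess/ directory.
    """
    cmd = [
        "bash",
        "cmd-run-dp.sh",
        os.path.abspath(data_directory),  # the wrapper requires an absolute path
        f"bash {script}",
    ]
    if dry_run:
        return cmd  # inspect the command without launching Docker
    return subprocess.run(cmd, check=True)
```

Calling `convert_experiment("/path/to/exp1")` then stages the same `bash cmd-run-dp.sh <data_directory> 'bash helper/convert.sh'` invocation shown above.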


The HDF5 files generated will be placed in a newly created _processed/_ directory. For each interval in the `valid_ranges` field of the metadata, a directory _chunk_N/_ will be created, containing HDF5 files clipped to the data within the respective range. We will refer to each clipped set of HDF5 files (_chunk_N/_) as a dataset. An example of the file structure after preprocessing an experiment is shown below:
```
@@ -257,9 +289,15 @@ data/
└── unchanged...
```

You can use `sudo chown -R $USER <data_directory>` to take ownership of the files so you can move or delete the data easily.
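The `valid_ranges` field determines how many _chunk_N/_ datasets an experiment yields. A sketch of reading it back from `metadata.json`, assuming `valid_ranges` is stored as a list of `[start, end]` pairs (the exact schema may differ in the released files):

```python
import json

def chunk_ranges(metadata_path):
    """Map each chunk_N name to its (start, end) interval from valid_ranges."""
    with open(metadata_path) as fp:
        meta = json.load(fp)
    return {f"chunk_{i}": tuple(rng) for i, rng in enumerate(meta["valid_ranges"])}
```

This mirrors the chunk numbering shown in the processed file structure above, so the result can be used to pick which chunk covers a given frame interval.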

### Merging

To merge multiple datasets into a single dataset, first move the datasets we want to merge under the same directory `<merge_directory>` and then run `bash cmd-run-dp.sh <merge_directory> 'bash helper/merge.sh'`. The merged dataset will be at `<merged_directory>/merged`. For example, if we want to merge datasets `data_1_chunk_0` and `data_2_chunk_1`, we would arrange them in the following format:
To merge multiple datasets into a single dataset, first move the datasets we want to merge under the same directory `<merge_directory>` and then run
```
bash cmd-run-dp.sh <merge_directory> 'bash helper/merge.sh $(ls data | grep chunk)'
```
The merged dataset will be at `<merge_directory>/merged`. For example, if we want to merge datasets `data_1_chunk_0` and `data_2_chunk_1`, we would arrange them in the following format:
```
merge_directory/
├── data_1_chunk_0/
@@ -298,7 +336,7 @@

### Rendering and Visualization (Optional)

To visualize all of the data from a single node, run `bash cmd-run-dp.sh <data_directory> 'bash helper/visualize-hdf5.sh [node_id] [start_frame] [duration] [output_mp4]'`. For example, `bash cmd-run-dp.sh <data_directory> 2 500 300 test.mp4` visualizes data from node 2 between frame 500 and 800 and saves it to `<data_directory>/processed/test.mp4`.
To visualize all of the data from a single node, run `bash cmd-run-dp.sh <data_directory> 'bash helper/visualize-hdf5.sh [node_id] [start_frame] [duration] [output_mp4]'`. For example, `bash cmd-run-dp.sh <data_directory> 'bash helper/visualize-hdf5.sh 2 500 300 test.mp4'` visualizes data from node 2 between frame 500 and 800 and saves it to `<data_directory>/processed/test.mp4`.

## How to Use Pre-processed GDTM

@@ -364,6 +402,8 @@ rm -r /dev/shm/cache_*
```
to clear any pre-loaded data in the memory.

**Visualize Viewpoints**: In order to develop models robust to changes in sensor placement and perspective, users may want to select data coming from different sensor viewpoints. Apart from dataset_metadata_final.csv, we also provide a tool, viewpoint_plot.py, to visualize a few selected data entries.

### See Also
For further usage, such as how to train multimodal sensor fusion models, we provide examples at https://github.com/nesl/GDTM-Tracking

@@ -375,7 +415,8 @@ If you find this project useful in your research, please consider citing:
@inproceedings{wang2023gdtm,
title={GDTM: An Indoor Geospatial Tracking Dataset with Distributed Multimodal Sensors},
author={Jeong, Ho Lyun and Wang, Ziqi and Samplawski, Colin and Wu, Jason and Fang, Shiwei and Kaplan, Lance and Ganesan, Deepak and Marlin, Benjamin and Srivastava, Mani},
booktitle={submission to the Thirty-seventh Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2023}
year={2024}
}
```


74 changes: 74 additions & 0 deletions USING_GDTM/dataset_metadata_final.csv
@@ -0,0 +1,74 @@
Exp ID,Cars,Duration(Minutes),Good Illumination,Viewpoint #,Notes
15,r,5,1,0,
18,g,5,1,0,
19,g,5,1,1,
20,g,5,1,1,
21,g,5,1,1,
22,r,5,1,1,
24,r,5,1,2,"(24,25,26): same node 1"
25,g,5,1,3,"(24,25): only node 2 changed"
26,g,5,1,4,"(25,26): only node 3 changed"
27,g,5,1,5,
28,r,5,1,6,
29,r,5,1,7,
32,r,5,1,8,
33,r,5,1,9,
34,r,5,1,9,
41,r,5,1,10,
42,r,10,1,10,
43,r,10,1,10,
44,g,10,1,10,
45,g,10,1,11,
46,r,10,1,11,"(45,46): minimal viewpoint change due to car hit"
47,g,10,1,11,
48,g,5,1,12,
49,g,5,1,12,
50,r,5,1,12,
51,r,5,1,12,
53,g,5,1,13,"(51,53): major change in node1, minor change in node2/3"
54,g,5,1,14,
55,g,5,1,14,
56,g,5,1,14,
57,r,5,1,14,
58,r,5,1,14,
59,r,5,1,14,
61,g,10,1,15,
63,r,10,1,15,
64,r,10,1,15,
65,r,10,1,16,
66,r,10,1,16,
67,g,10,1,16,
70,r,10,1,17,
72,r,10,1,18,
73,g,20,1,18,
74,g,10,1,19,"(74,75): minimal viewpoint change due to car hit"
75,r,10,1,19,
76,r,10,1,19,
77,r,10,1,20,
78,g,20,1,20,
85,rg,10,1,21,
90,r,15,0,21,
91,g,25,0,21,
92,g,10,0,22,
95,r,10,0,22,
96,rg,10,1,23,
100,rg,10,1,24,
102,r,10,0,24,
103,rg,5,1,24,
104,r,10,0,24,
105,r,15,0,24,
106,r,10,0,25,
107,rg,10,1,25,
108,rg,10,1,25,
109,r,10,1,25,
111,r,10,0,26,
112,r,10,0,26,
113,r,10,0,26,
115,rg,10,1,26,
119,r,10,0,27,
123,r,10,0,28,
124,r,10,0,28,
125,r,10,0,28,
126,r,5,0,28,
127,rg,10,1,28,
129,rg,20,1,28,
60 changes: 60 additions & 0 deletions USING_GDTM/viewpoint_plot.py
@@ -0,0 +1,60 @@
import numpy as np
import cv2
import os
import matplotlib.pyplot as plt

# Load the per-experiment metadata
# (columns: Exp ID, Cars, Duration, Good Illumination, Viewpoint #, Notes)
metadata = []
with open("dataset_metadata_final.csv", "r") as fp:
    metarows = fp.readlines()
    for i in range(1, len(metarows)):  # skip the header row
        temp = metarows[i].strip().split(",")
        if temp[3] == '1' and (temp[1] == "r" or temp[1] == "g"):
            # select good illumination + one car
            metadata.append([int(temp[0]), int(temp[4])])
metadata = np.array(metadata)

fig, axs = plt.subplots(nrows=6, ncols=3, figsize=(85, 110))

data_to_look_at = [26, 39, 41, 45]  # indices of the metadata rows you would like to visually inspect
for i in range(len(data_to_look_at)):
    v1 = metadata[data_to_look_at[i], 0]  # experiment ID
    for node1 in range(3):
        curr_node1 = "node" + str(node1 + 1)
        rootdir = "PATH_TO_DATASET"
        # Example folder structure
        # ---
        # ─── PATH_TO_DATASET/
        #     ├── data15
        #     │   ├── node1/
        #     │   ├── node2/
        #     │   ├── node3/
        #     │   ├── metadata.json
        #     │   └── mocap.hdf5
        #     ├── data16
        #     └── ...
        rootfolder = os.path.join(rootdir, "data" + str(v1), curr_node1)
        filelist = os.listdir(rootfolder)
        fname1 = ""
        for each in filelist:
            if "realsense_rgb" in each:  # use realsense_depth under low lighting conditions
                fname1 = os.path.join(rootfolder, each)
                break

        # grab a frame roughly 500 frames into the recording
        vidcap = cv2.VideoCapture(fname1)
        success, image1 = vidcap.read()
        count = 0
        success = True
        while success:
            success, image1 = vidcap.read()
            count += 1
            if count > 500:
                break
        image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2RGB)
        ax = axs[i][node1]
        ax.imshow(image1)
plt.show()

10 changes: 5 additions & 5 deletions gdtm_preprocess/Dockerfile
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ RUN apt-key del "7fa2af80" \



ARG DEBIAN_FRONTEND="noninteractive"
ARG DEBIAN_FRONTEND="noninteractive"
ENV TZ="America/New_York"


@@ -18,7 +18,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends make g++ cmake


#Install opencv library
RUN apt-get install -y libopencv-dev
RUN apt-get install -y libopencv-dev

# WORKDIR /workspace

@@ -68,7 +68,7 @@ ENV MCP_ROOT=$MCP_ROOT
ENV MCP_APP_DIR=$MCP_APP_DIR

#Create the MCP user
RUN groupadd ${MCP_USER} && useradd --create-home --shell /bin/bash -g ${MCP_USER} -G sudo,audio,dip,video,plugdev,dialout ${MCP_USER}
RUN groupadd ${MCP_USER} && useradd --create-home --shell /bin/bash -g ${MCP_USER} -G sudo,audio,dip,video,plugdev,dialout ${MCP_USER}

RUN add-apt-repository universe && \
apt update && apt-get install -y ffmpeg python3 python-dev python3-dev build-essential python3-pip
@@ -80,7 +80,7 @@ RUN pip3 install ipdb
RUN pip3 install h5py
# RUN pip install wheel
# RUN pip install pyaudio

RUN pip3 install psutil==5.9.8
## Ouster LIDAR related
RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install 'ouster-sdk[examples]'
Expand All @@ -93,7 +93,7 @@ RUN cd $MCP_APP_DIR/data_processing/ros_pkgs && \
catkin_make &&\
catkin_make install && \
echo "source $MCP_APP_DIR/data_processing/ros_pkgs/devel/setup.bash" >> ~/.bashrc

RUN echo "source $MCP_APP_DIR/data_processing/ros_pkgs/devel/setup.bash"

RUN mkdir -p $MCP_APP_DIR/data_processing/data
2 changes: 1 addition & 1 deletion gdtm_preprocess/app/helper/convert-optitrack.sh
@@ -1,3 +1,3 @@
#!/bin/bash

cp data/raw/metadata.json data/processed
python3 src/convert_optitrack.py data/processed/aligned.csv data/processed/metadata.json data/processed
