We use pre-extracted features from the excellent paper R$^2$-Tuning, which can be downloaded directly from the HuggingFace Hub. We express our sincere gratitude for their contribution to the community.
Please follow our baseline to prepare the datasets, place the corresponding files in the correct directories, and update the paths in the config files accordingly.
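As a rough illustration of the kind of path update involved, a data section of a config might look like the following. The keys and paths here are placeholders, not the actual config schema of this repository:

```python
# Illustrative only: the key names and paths below are placeholders,
# not the real config schema — adapt them to your local layout.
data = dict(
    anno_path='data/annotations/train.jsonl',  # annotation file prepared via the baseline
    feat_path='data/features/',                # pre-extracted R^2-Tuning features
)

# After editing, the paths should point at existing files/directories.
print(data['anno_path'], data['feat_path'])
```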
Here are the download links for the original video datasets:
```shell
# Single GPU
python tools/launch.py <path-to-config>

# Multiple GPUs on a single node (elastic)
torchrun --nproc_per_node=<num-gpus> tools/launch.py <path-to-config>
```
Arguments of `tools/launch.py`:

| Argument | Description |
| --- | --- |
| `config` | The config file to use |
| `--checkpoint` | The checkpoint file to load from |
| `--resume` | The checkpoint file to resume from |
| `--work_dir` | Working directory |
| `--eval` | Evaluation only |
| `--dump` | Dump inference outputs |
| `--seed` | The random seed to use |
| `--amp` | Whether to use automatic mixed precision training |
| `--debug` | Debug mode (detect `nan` during training) |
| `--launcher` | The job launcher to use |
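For reference, the arguments listed above could be parsed with `argparse` roughly as sketched below. This is only an approximation of the launcher's CLI based on the table; the actual `tools/launch.py` may define types and defaults differently:

```python
import argparse

def build_parser():
    # Sketch of the launcher CLI, assuming the argument names listed above;
    # the real tools/launch.py may differ in types and defaults.
    parser = argparse.ArgumentParser(description='Training/evaluation launcher')
    parser.add_argument('config', help='The config file to use')
    parser.add_argument('--checkpoint', help='The checkpoint file to load from')
    parser.add_argument('--resume', help='The checkpoint file to resume from')
    parser.add_argument('--work_dir', help='Working directory')
    parser.add_argument('--eval', action='store_true', help='Evaluation only')
    parser.add_argument('--dump', action='store_true', help='Dump inference outputs')
    parser.add_argument('--seed', type=int, help='The random seed to use')
    parser.add_argument('--amp', action='store_true',
                        help='Use automatic mixed precision training')
    parser.add_argument('--debug', action='store_true',
                        help='Debug mode (detect nan during training)')
    parser.add_argument('--launcher', help='The job launcher to use')
    return parser

# Example: evaluation-only run with a fixed seed.
args = build_parser().parse_args(['cfg.py', '--eval', '--seed', '42'])
print(args.config, args.eval, args.seed)
```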
```shell
python tools/launch.py <path-to-config> --checkpoint <path-to-checkpoint> --eval
```
If any problems occur when reproducing the results, please feel free to contact us via GitHub issues or email.
You may need to update the paths in the config file.
Some problems may already be addressed by existing issues in the baseline repository.
We would like to express our sincere gratitude to the following authors for their contributions to the community: