CoMix: Comics Dataset Framework for Comics Understanding


The repo is under active development. The codebase is called comix. Please contribute to the project by reporting issues, suggesting improvements, or adding new datasets. We are currently refactoring the code to support:

  • Automatic annotation (needs refactoring)
  • Scraping new datasets (needs refactoring)
  • Multi-task CoMix benchmarks (in progress)

Introduction

The purpose of this project is to replicate, on the validation set, the benchmarks presented in the CoMix paper.

In particular, one of the main limitations when working with Comics/Manga datasets is that the images cannot be redistributed. To overcome this problem, we have created this framework, which lets you use our (validation) annotations and download the images from the original sources without violating their licenses.

The comix framework uses the following datasets:

  • DCM
  • comics
  • eBDtheque
  • PopManga
  • Manga109

Installation

The project is written in Python 3.8. To create a conda environment, run:

conda create --name myenv python=3.8
conda activate myenv

and to install the dependencies, run the following command:

pip install -e .

The above command installs the comix package in editable mode, so you can modify the code and see the changes immediately. For benchmarking the detection and captioning models, we create separate conda environments to avoid dependency conflicts.
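To check that the editable install worked, here is a minimal sanity check (assuming, as the name suggests, that the package exposes a top-level comix module):

import comix

# In editable mode this path points into your cloned repository,
# so local edits take effect without reinstalling.
print(comix.__file__)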

Procedures

In general, this project is divided into the following steps:

  • Manually obtain the images and annotations and place them in the expected folder (e.g. data/)
  • Process the images into a unified naming and folder structure - comix/process (see the sketch after this list)
  • Run pre-trained or custom models on the data - benchmarks
  • Evaluate model predictions against the provided ground truth - comix/evaluators
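As a rough illustration of the processing step, the sketch below copies raw pages into a uniform dataset/book/page.jpg layout. The paths and the naming scheme here are assumptions made for illustration only; the actual convention is implemented in comix/process.

from pathlib import Path
import shutil

# Hypothetical paths and naming scheme: adapt to the layout comix/process expects.
RAW = Path("data/raw/DCM")
UNIFIED = Path("data/unified/DCM")

for book_dir in sorted(p for p in RAW.iterdir() if p.is_dir()):
    # Zero-padded page indices give every dataset the same page-naming scheme.
    for i, page in enumerate(sorted(book_dir.glob("*.jpg"))):
        dest = UNIFIED / book_dir.name / f"{i:03d}.jpg"
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(page, dest)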

Model performance and evaluation

The benchmarks folder contains scripts for benchmarking models on the datasets across several tasks. The detection scripts produce a COCO-format JSON file, which the comix/evaluators/detection.py script uses to evaluate model performance. The captioning scripts produce multiple .txt files, which can be post-processed into captions.csv and objects.csv files used by the comix/evaluators/captioning.py script.
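For reference, a COCO-format detections file is a JSON list of entries of the form {"image_id": ..., "category_id": ..., "bbox": [x, y, w, h], "score": ...}. Below is a minimal sketch of scoring such a file with the standard pycocotools API; this is not necessarily what comix/evaluators/detection.py does internally, and the file names are placeholders:

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder file names: substitute the ground-truth annotations and the
# COCO-format JSON produced by a detection benchmark script.
coco_gt = COCO("ground_truth.json")
coco_dt = coco_gt.loadRes("predictions.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP/AR at the standard COCO IoU thresholds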

Documentation

The documentation is available in the /docs folder.

In particular:

docs/
├── README.md                   # Project overview, installation, quick start
├── datasets/                   # Dataset documentation
│   └── README.md               # Unified dataset info
└── tasks/                      # Task-specific documentation
    ├── detection/              # Detection task
    │   ├── README.md           # Overview of detection pipeline
    │   ├── generation.md       # Detection models
    │   └── evaluation.md       # Metrics and evaluation
    └── captioning/             # Captioning task
        ├── README.md           # Overview of captioning pipeline
        ├── generation.md       # VLM caption generation details
        ├── postprocessing.md   # LLaMA post-processing
        └── evaluation.md       # Metrics and evaluation

