Train Custom Data Tutorial 🌟 #1570
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
There is a small typo in this tutorial:
@notmatthancock thanks for letting us know! This should be fixed now :)
Can clarification be added under local logging for detecting images based on custom trained data? Also, where/how is
@mark375chen detection can be run on any trained model, i.e.
📚 This guide explains how to train your own custom dataset with YOLOv5 🚀. See YOLOv5 Docs for additional details. UPDATED 29 March 2023.
Before You Start
Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7. Models and datasets download automatically from the latest YOLOv5 release.
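As a quick sanity check of the version requirements above, a minimal sketch along these lines could be run before training. The helper names (`version_tuple`, `installed_torch_version`, `check_environment`) are hypothetical and not part of YOLOv5; the only assumption about PyTorch is that it exposes its version as `torch.__version__`.

```python
import re
import sys

MIN_PYTHON = (3, 7, 0)  # from the tutorial: Python>=3.7.0
MIN_TORCH = (1, 7)      # from the tutorial: PyTorch>=1.7


def version_tuple(version):
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1)."""
    numbers = []
    # Drop any local build suffix after '+', then read leading digits of each part.
    for piece in version.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def installed_torch_version():
    """Return the installed PyTorch version string, or None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(sys.version_info[:3]) < MIN_PYTHON:
        problems.append("Python is older than 3.7.0")
    torch_version = installed_torch_version()
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif version_tuple(torch_version) < MIN_TORCH:
        problems.append("PyTorch is older than 1.7")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

This only checks the two versions named in the text; the remaining packages are still best installed from requirements.txt.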
Train On Custom Data
Creating a custom model to detect your objects is an iterative process of collecting and organizing images, labeling your objects of interest, training a model, deploying it into the wild to make predictions, and then using that deployed model to collect examples of edge cases to repeat and improve.
1. Create Dataset
YOLOv5 models must be trained on labelled data in order to learn classes of objects in that data. There are two options for creating your dataset before you start training:
1.1 Create dataset.yaml
COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train/val/test image directories (or *.txt files with image paths) and 2) a class names dictionary:
1.2 Create Labels
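Label files refer to classes by the integer keys of the names dictionary from section 1.1, so it can help to have that config in place first. The sketch below renders a config with the path, train, val, test and names keys described there. The function `render_dataset_yaml`, the dataset location and the two class names are hypothetical, and the check that class ids run from 0 with no gaps is a sanity check added here, not something the tutorial states.

```python
def render_dataset_yaml(root, train, val, names, test=None):
    """Render a dataset config as YAML text.

    root  -- dataset root directory (the 'path' key)
    train -- train images, relative to root
    val   -- val images, relative to root
    names -- dict mapping integer class id to class name
    test  -- optional test images, relative to root
    """
    # Sanity check (an assumption of this sketch): ids are 0..N-1 with no gaps.
    if sorted(names) != list(range(len(names))):
        raise ValueError("class ids should be 0..N-1 with no gaps")
    lines = [
        f"path: {root}",
        f"train: {train}",
        f"val: {val}",
        f"test: {test or ''}".rstrip(),  # left empty when there is no test split
        "names:",
    ]
    lines += [f"  {class_id}: {names[class_id]}" for class_id in sorted(names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-class dataset, for illustration only.
    print(render_dataset_yaml(
        root="../datasets/my_dataset",
        train="images/train",
        val="images/val",
        names={0: "cat", 1: "dog"},
    ))
```

The sketch writes names unquoted, so it is only suitable for simple class names without YAML special characters.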
After using an annotation tool to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:
One row per object.
Each row is in class x_center y_center width height format.
Box coordinates must be normalized to the range 0 to 1. If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
Class numbers are zero-indexed (start from 0).
The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27):
1.3 Organize Directories
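Before laying out the directories, a minimal sketch of the two conventions involved may help: the normalized label row from section 1.2 and the /images/ to /labels/ path substitution used in this section. Both function names are hypothetical, the input box is assumed to be given as pixel corners (x_min, y_min, x_max, y_max), and paths are assumed to use forward slashes. This illustrates the rules as described in the text and is not YOLOv5's own code.

```python
def pixel_box_to_yolo_row(class_id, x_min, y_min, x_max, y_max,
                          image_width, image_height):
    """Convert a pixel corner box into a 'class x_center y_center width height' row."""
    # x_center and width are divided by image width,
    # y_center and height by image height.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def image_path_to_label_path(image_path):
    """Replace the last '/images/' with '/labels/' and the extension with '.txt'."""
    head, separator, tail = image_path.rpartition("/images/")
    if not separator:
        raise ValueError("image path does not contain an /images/ directory")
    stem = tail.rsplit(".", 1)[0]  # drop the image file extension
    return f"{head}/labels/{stem}.txt"


if __name__ == "__main__":
    # A 200x200 pixel box in a 640x480 image, class 0.
    print(pixel_box_to_yolo_row(0, 100, 200, 300, 400, 640, 480))
    print(image_path_to_label_path(
        "../datasets/coco128/images/train2017/000000000009.jpg"))
```

Using `rpartition` means only the last /images/ component is replaced, which matters when a parent directory is also called images.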
Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:
2. Select a Model
Select a pretrained model to start training from. Here we select YOLOv5s, the second-smallest and fastest model available. See our README table for a full comparison of all models.
3. Train
Train a YOLOv5s model on COCO128 by specifying dataset, batch-size, image size and either pretrained --weights yolov5s.pt (recommended), or randomly initialized --weights '' --cfg yolov5s.yaml (not recommended). Pretrained weights are auto-downloaded from the latest YOLOv5 release.
# Train YOLOv5s on COCO128 for 3 epochs
$ python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt
💡 ProTip: Add --cache ram or --cache disk to speed up training (requires significant RAM/disk resources).
💡 ProTip: Always train from a local dataset. Mounted or network drives like Google Drive will be very slow.
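When training is launched from a script rather than typed by hand, the command above can be assembled programmatically. The sketch below is one hedged way to do that: `build_train_command` and `format_command` are hypothetical helpers, and they only use the flags that appear in this tutorial (--img, --batch, --epochs, --data, --weights, --cfg, --cache). Nothing is executed; the functions just build and print the argument list.

```python
import shlex


def build_train_command(data, weights="yolov5s.pt", img=640, batch=16,
                        epochs=3, cfg=None, cache=None):
    """Build the train.py argument list using the flags shown in the tutorial."""
    command = [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,
    ]
    if cfg is not None:
        # Needed when starting from randomly initialized weights (--weights '').
        command += ["--cfg", cfg]
    if cache is not None:
        if cache not in ("ram", "disk"):
            raise ValueError("cache should be 'ram' or 'disk'")
        command += ["--cache", cache]
    return command


def format_command(command):
    """Render the argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(argument) for argument in command)


if __name__ == "__main__":
    print(format_command(build_train_command("coco128.yaml")))
    print(format_command(build_train_command(
        "coco128.yaml", weights="", cfg="yolov5s.yaml", cache="ram")))
```

The returned list can be handed to `subprocess.run(command, check=True)` from inside the cloned repository directory if the run should be started from Python.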
All training results are saved to runs/train/ with incrementing run directories, i.e. runs/train/exp2, runs/train/exp3 etc. For more details see the Training section of our tutorial notebook.
4. Visualize
Comet Logging and Visualization 🌟 NEW
Comet is now fully integrated with YOLOv5. Track and visualize model metrics in real time, save your hyperparameters, datasets, and model checkpoints, and visualize your model predictions with Comet Custom Panels! Comet makes sure you never lose track of your work and makes it easy to share results and collaborate across teams of all sizes!
Getting started is easy:
To learn more about all of the supported Comet features for this integration, check out the Comet Tutorial. If you'd like to learn more about Comet, head over to our documentation. Get started by trying out the Comet Colab Notebook:
ClearML Logging and Automation 🌟 NEW
ClearML is completely integrated into YOLOv5 to track your experimentation, manage dataset versions and even remotely execute training runs. To enable ClearML:
pip install clearml
Run clearml-init to connect to a ClearML server (deploy your own open-source server here, or use our free hosted server here)
You'll get all the great expected features from an experiment manager: live updates, model upload, experiment comparison etc., but ClearML also tracks uncommitted changes and installed packages, for example. Thanks to that, ClearML Tasks (which is what we call experiments) are also reproducible on different machines! With only 1 extra line, we can schedule a YOLOv5 training task on a queue to be executed by any number of ClearML Agents (workers).
You can use ClearML Data to version your dataset and then pass it to YOLOv5 simply using its unique ID. This will help you keep track of your data without adding extra hassle. Explore the ClearML Tutorial for details!
Local Logging
Training results are automatically logged with TensorBoard and CSV loggers to runs/train, with a new experiment directory created for each new training as runs/train/exp2, runs/train/exp3, etc.
This directory contains train and val statistics, mosaics, labels, predictions and augmented mosaics, as well as metrics and charts including precision-recall (PR) curves and confusion matrices.
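Because each training creates a new numbered directory, scripts often need to find the most recent one. The following is a minimal sketch, assuming run directories are named exp, exp2, exp3 and so on under runs/train (the unnumbered first run is an assumption of this sketch), and that the highest number is the newest run. `run_number` and `latest_run` are hypothetical helpers, not YOLOv5 functions.

```python
import re
from pathlib import Path


def run_number(run_dir):
    """Return the number of an 'expN' directory, treating plain 'exp' as 1."""
    match = re.fullmatch(r"exp(\d*)", Path(run_dir).name)
    if match is None:
        return None  # not a run directory under this naming scheme
    return int(match.group(1) or 1)


def latest_run(train_root="runs/train"):
    """Return the highest-numbered run directory, or None if there is none."""
    root = Path(train_root)
    if not root.is_dir():
        return None
    numbered = []
    for candidate in root.iterdir():
        number = run_number(candidate)
        if candidate.is_dir() and number is not None:
            numbered.append((number, candidate))
    if not numbered:
        return None
    # Compare by number so that exp10 sorts after exp9.
    return max(numbered, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    print(latest_run())
```

Sorting by the parsed number rather than by name avoids the usual pitfall of exp10 sorting before exp2 alphabetically.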
Results file results.csv is updated after each epoch, and then plotted as results.png (below) after training completes. You can also plot any results.csv file manually:
Next Steps
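As a first step after training, the results.csv file described above can also be inspected without plotting. The sketch below reads it with the standard csv module and picks the epoch with the highest value of a chosen metric. `read_results` and `best_row` are hypothetical helpers, and the column name used in the example (metrics/mAP_0.5) is an assumption about the file's header; check the header of your own results.csv before relying on it.

```python
import csv


def read_results(csv_path):
    """Read results.csv into a list of {column name: float} dicts, one per epoch."""
    rows = []
    with open(csv_path, newline="") as handle:
        for raw_row in csv.DictReader(handle):
            row = {}
            for key, value in raw_row.items():
                # Headers and values may be padded with spaces, so strip both.
                if key and value and value.strip():
                    row[key.strip()] = float(value)
            rows.append(row)
    return rows


def best_row(rows, metric):
    """Return the row with the highest value of the given metric column."""
    candidates = [row for row in rows if metric in row]
    if not candidates:
        raise KeyError(f"no column named {metric!r}")
    return max(candidates, key=lambda row: row[metric])


if __name__ == "__main__":
    # Path and column name are placeholders for illustration.
    results = read_results("runs/train/exp/results.csv")
    print(best_row(results, "metrics/mAP_0.5"))
```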
Once your model is trained you can use your best checkpoint best.pt to:
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.