Some explanations for training

Tossy0423 · Apr 28, 2020 · 36c73c5
1 parent 6d38218

Showing 3 changed files with 7 additions and 9 deletions.
diff --git a/README.md b/README.md
@@ -389,7 +389,7 @@ Then add to your created project:
 
 2. Then stop and by using partially-trained model `/backup/yolov4_1000.weights` run training with multigpu (up to 4 GPUs): `darknet.exe detector train cfg/coco.data cfg/yolov4.cfg /backup/yolov4_1000.weights -gpus 0,1,2,3`
 
-Only for small datasets sometimes better to decrease learning rate, for 4 GPUs set `learning_rate = 0.00025` (i.e. learning_rate = 0.001 / GPUs). In this case also increase 4x times `burn_in =` and `max_batches =` in your cfg-file. I.e. use `burn_in = 4000` instead of `1000`. Same goes for `steps=` if `policy=steps` is set.
+If you get NaN, then for some datasets it is better to decrease the learning rate: for 4 GPUs set `learning_rate = 0.00065` (i.e. learning_rate = 0.00261 / GPUs). In this case also increase `burn_in =` 4x in your cfg-file, i.e. use `burn_in = 4000` instead of `1000`.
 
 https://groups.google.com/d/msg/darknet/NbJqonJBTSY/Te5PfIpuCAAJ
 

diff --git a/build/darknet/x64/cfg/yolov4.cfg b/build/darknet/x64/cfg/yolov4.cfg
@@ -1,10 +1,9 @@
 [net]
-# Testing
-#batch=1
-#subdivisions=1
-# Training
 batch=64
 subdivisions=8
+# Training
+#width=512
+#height=512
 width=608
 height=608
 channels=3

diff --git a/cfg/yolov4.cfg b/cfg/yolov4.cfg
@@ -1,10 +1,9 @@
 [net]
-# Testing
-#batch=1
-#subdivisions=1
-# Training
 batch=64
 subdivisions=8
+# Training
+#width=512
+#height=512
 width=608
 height=608
 channels=3
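
The README hunk above divides the learning rate by the number of GPUs and multiplies `burn_in` by the same factor. A minimal sketch of that arithmetic follows, assuming the base values quoted in the diff (`0.00261` and `1000`). The helper name `scale_for_multi_gpu` is hypothetical and is not part of darknet.

```python
def scale_for_multi_gpu(base_lr: float, base_burn_in: int, num_gpus: int):
    """Return (learning_rate, burn_in) adjusted for multi-GPU training.

    Follows the rule stated in the README hunk: learning_rate is divided
    by the GPU count and burn_in is multiplied by it.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return base_lr / num_gpus, base_burn_in * num_gpus


# Base values taken from the diff; 4 GPUs as in the README example.
lr, burn_in = scale_for_multi_gpu(0.00261, 1000, 4)

# Print the values in the form used by the [net] section of a cfg-file.
print(f"learning_rate={lr:.5f}")
print(f"burn_in={burn_in}")
```

The printed values still have to be copied into the cfg-file by hand. The script only reproduces the calculation and does not read or write any darknet files.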