
Pytorch 2.5 & torchtune 0.3+ #315

Open · wants to merge 38 commits into master
Conversation

@Delaunay (Collaborator)

No description provided.

@Delaunay changed the title from Staging to Pytorch 2.5 on Nov 22, 2024
@Delaunay changed the title from Pytorch 2.5 to Pytorch 2.5 & torchtune 0.3+ on Nov 22, 2024
@Delaunay marked this pull request as ready for review on January 16, 2025 at 20:02
@Delaunay (Collaborator, Author)

=================
Benchmark results
=================

System
------
cpu:      Intel(R) Xeon(R) Gold 5418Y
n_cpu:    48
product:  NVIDIA L40S
n_gpu:    4
memory:   46068.0 MiB (per GPU)
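
The fields above can be reproduced with standard APIs. A minimal sketch of such a probe, assuming torch is installed; this is not milabench's actual system report code:

```python
# Hypothetical probe reproducing the System fields above (not milabench code).
import os
import torch

print(f"n_cpu:    {os.cpu_count()}")                  # logical CPUs
print(f"n_gpu:    {torch.cuda.device_count()}")
print(f"product:  {torch.cuda.get_device_name(0)}")
props = torch.cuda.get_device_properties(0)
print(f"memory:   {props.total_memory / 2**20:.1f}")  # MiB, per GPU
```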

Breakdown
---------
bench                    | fail |   n | ngpu |           perf |   sem% |   std% | peak_memory |          score | weight
brax                     |    0 |   1 |    4 |     1018714.82 |   0.1% |   0.5% |        1312 |     1018714.82 |   1.00
diffusion-gpus           |    1 |   1 |    4 |            nan |   nan% |   nan% |       28760 |            nan |   1.00
diffusion-single         |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |            nan |   0.00
dimenet                  |    0 |   4 |    1 |         556.91 |   0.8% |  12.1% |        3850 |        2252.87 |   1.00
dinov2-giant-gpus        |    1 |   1 |    4 |            nan |   nan% |   nan% |       22954 |            nan |   1.00
dinov2-giant-single      |    4 |   4 |    1 |            nan |   nan% |   nan% |        7878 |            nan |   0.00
dqn                      |    0 |   4 |    1 | 24104883766.33 |   1.6% |  90.6% |        1322 | 96296597893.65 |   0.00
bf16                     |    0 |   4 |    1 |         280.06 |   0.2% |   4.6% |        1278 |        1124.35 |   0.00
fp16                     |    0 |   4 |    1 |         275.07 |   0.2% |   2.8% |        1278 |        1102.02 |   0.00
fp32                     |    0 |   4 |    1 |          48.63 |   0.1% |   2.5% |        1656 |         194.48 |   0.00
tf32                     |    0 |   4 |    1 |         139.59 |   0.1% |   2.7% |        1656 |         558.79 |   0.00
bert-fp16                |    0 |   4 |    1 |         206.74 |   0.9% |  10.4% |         nan |         840.32 |   0.00
bert-fp32                |    0 |   4 |    1 |          71.81 |   0.4% |   4.9% |       20660 |         289.38 |   0.00
bert-tf32                |    0 |   4 |    1 |         119.53 |   0.6% |   7.1% |       20660 |         483.44 |   0.00
bert-tf32-fp16           |    0 |   4 |    1 |         207.24 |   0.9% |  10.4% |         nan |         842.42 |   1.00
reformer                 |    0 |   4 |    1 |          29.71 |   0.2% |   2.9% |       12940 |         119.24 |   1.00
t5                       |    0 |   4 |    1 |          30.78 |   0.3% |   4.5% |       33876 |         123.74 |   0.00
whisper                  |    0 |   4 |    1 |         425.06 |   0.5% |   8.2% |        8724 |        1715.05 |   0.00
lightning                |    0 |   4 |    1 |         510.27 |   0.4% |   7.5% |       25808 |        2054.43 |   0.00
lightning-gpus           |    0 |   2 |    4 |        2003.68 |   0.4% |   6.2% |       26198 |        2003.68 |   1.00
llava-single             |    4 |   4 |    1 |            nan |   nan% |   nan% |       11064 |            nan |   1.00
llama                    |    0 |   4 |    1 |         295.56 |   6.9% |  87.7% |       27202 |        1119.03 |   1.00
llm-full-mp-gpus         |    0 |   1 |    4 |          33.13 |   3.5% |  18.4% |       30918 |          33.13 |   1.00
llm-lora-ddp-gpus        |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |            nan |   1.00
llm-lora-mp-gpus         |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |            nan |   1.00
llm-lora-single          |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |            nan |   1.00
pna                      |    0 |   4 |    1 |        4350.11 |   0.5% |   7.7% |       39200 |       17412.14 |   1.00
ppo                      |    0 |   4 |    1 |    60873776.74 |   0.7% |  58.0% |         978 |   243494498.61 |   1.00
recursiongfn             |    0 |   4 |    1 |       10054.17 |   2.3% |  35.8% |        6702 |       40477.55 |   1.00
rlhf-gpus                |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |            nan |   0.00
rlhf-single              |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |            nan |   1.00
focalnet                 |    0 |   4 |    1 |         366.37 |   0.7% |  10.6% |       23038 |        1482.12 |   0.00
torchatari               |    0 |   4 |    1 |        7479.27 |   0.7% |  10.3% |        3264 |       29862.23 |   1.00
convnext_large-fp16      |    0 |   4 |    1 |         276.93 |   1.1% |  12.1% |         nan |        1128.72 |   0.00
convnext_large-fp32      |    0 |   4 |    1 |          70.28 |   0.6% |   6.5% |       44910 |         283.89 |   0.00
convnext_large-tf32      |    0 |   4 |    1 |         119.20 |   1.0% |  11.2% |       45502 |         483.95 |   0.00
convnext_large-tf32-fp16 |    0 |   4 |    1 |         276.49 |   1.1% |  12.1% |         nan |        1126.82 |   1.00
regnet_y_128gf           |    0 |   4 |    1 |          91.04 |   0.4% |   6.6% |       28810 |         366.72 |   1.00
resnet152-ddp-gpus       |    0 |   1 |    4 |        2020.81 |   0.0% |   0.3% |       25994 |        2020.81 |   0.00
resnet50                 |    0 |   4 |    1 |         906.37 |   0.7% |  10.2% |       13868 |        3666.19 |   1.00
resnet50-noio            |    0 |   4 |    1 |         861.21 |   0.0% |   1.6% |       26884 |        3446.88 |   0.00
vjepa-gpus               |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |            nan |   1.00
vjepa-single             |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |            nan |   1.00

Scores
------
Failure rate:      20.98% (FAIL)
Score:             204.45

Errors
------
30 errors, details in HTML report.
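
For reference, the reported failure rate follows from the fail and n columns above: they sum to 30 failed runs out of 143 total, and 30 / 143 ≈ 20.98%, matching the score report (and the 30 errors). A minimal sketch, assuming the failure rate is simply total fails over total runs:

```python
# Hypothetical recomputation of the failure rate from (bench, fail, n) rows;
# assumes failure_rate = sum(fail) / sum(n). milabench's scoring may differ.
rows = [
    ("brax", 0, 1),
    ("diffusion-gpus", 1, 1),
    ("diffusion-single", 4, 4),
    # ... remaining (bench, fail, n) rows from the Breakdown table ...
]

fails = sum(fail for _, fail, _ in rows)
runs = sum(n for _, _, n in rows)
print(f"Failure rate: {100 * fails / runs:.2f}%")  # full table: 30/143 -> 20.98%
```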

@Delaunay (Collaborator, Author)

vjepa-gpus               |    4 |   5 |    4 |           8.71 |   0.2% |   1.8% |       23328 |           1.74 |   1.00
vjepa-single             |    4 |   8 |    1 |           3.91 |   0.9% |  13.8% |         nan |           7.91 |   1.00

@Delaunay (Collaborator, Author)

rlhf-gpus                |    5 |   6 |    4 |         140.15 |   1.0% |   8.4% |        8296 |          23.36 |   0.00
rlhf-single              |    4 |   8 |    1 |          46.97 |   0.9% |  14.6% |        8228 |          94.05 |   1.00

@Delaunay (Collaborator, Author)

diffusion-gpus           |    2 |   3 |    4 |           8.41 |   0.1% |   0.9% |       28748 |           2.80 |   1.00
diffusion-single         |    4 |   8 |    1 |           5.19 |   0.6% |   9.9% |       18818 |          10.49 |   0.00

@Delaunay (Collaborator, Author)

dinov2-giant-gpus        |    1 |   2 |    4 |          23.10 |   1.3% |  10.0% |       30258 |          11.55 |   1.00
dinov2-giant-single      |    4 |   8 |    1 |           6.46 |   0.8% |  12.9% |       28604 |          13.04 |   0.00

@Delaunay (Collaborator, Author)

llava out of memory

llava-single.D3
===============
  * no training rate retrieved
  * Error codes = 1
  * 1 exceptions found
    * 1 x torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacity of 44.64 GiB of which 54.25 MiB is free. Including non-PyTorch memory, this process has 44.58 GiB memory in use. Of the allocated memory 43.68 GiB is allocated by PyTorch, and 393.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
        | Traceback (most recent call last):
        |   File "/network/scratch/d/delaunap/shared/milabench/benchmarks/llava/main.py", line 151, in <module>
        |     main()
        |   File "/network/scratch/d/delaunap/shared/milabench/benchmarks/llava/main.py", line 134, in main
        |     optimizer.step()
        |   File "/tmp/workspace/venv/torch/lib/python3.10/site-packages/accelerate/optimizer.py", line 178, in step
        |     self.optimizer.step(closure)
        |   File "/tmp/workspace/venv/torch/lib/python3.10/site-packages/torch/optim/optimizer.py", line 487, in wrapper
        |     out = func(*args, **kwargs)
        |   File "/tmp/workspace/venv/torch/lib/python3.10/site-packages/torch/optim/optimizer.py", line 91, in _use_grad
        |     ret = func(self, *args, **kwargs)
        |   File "/tmp/workspace/venv/torch/lib/python3.10/site-packages/torch/optim/adamw.py", line 209, in step
        |     has_complex = self._init_group(
        |   File "/tmp/workspace/venv/torch/lib/python3.10/site-packages/torch/optim/adamw.py", line 148, in _init_group
        |     state["exp_avg"] = torch.zeros_like(
        | torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB. GPU 0 has a total capacity of 44.64 GiB of which 54.25 MiB is free. Including non-PyTorch memory, this process has 44.58 GiB memory in use. Of the allocated memory 43.68 GiB is allocated by PyTorch, and 393.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
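
As the error message itself suggests, one mitigation to try is the CUDA caching allocator's expandable segments mode, which reduces fragmentation. A minimal sketch; whether this alone is enough for llava-single on the L40S is untested here:

```python
# Enable expandable segments in the CUDA caching allocator, as recommended by
# the OOM message. The setting is read when CUDA is first used, so set it
# before importing torch (or export it in the shell before launching).
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # imported after setting the env var on purpose

x = torch.zeros(1, device="cuda")  # first allocation picks up the new config
```

The same effect can be obtained from the shell: `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python main.py`.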

@Delaunay (Collaborator, Author)

llm-lora-ddp-gpus |    0 |   1 |    4 |     552.09 |   1.5% |   7.8% |       10288 |     552.09 |   1.00

@Delaunay (Collaborator, Author)

llm-lora-mp-gpus out of memory

@Delaunay (Collaborator, Author)

Breakdown
---------
bench           | fail |   n | ngpu |       perf |   sem% |   std% | peak_memory |      score | weight
llm-lora-single |    0 |   1 |    1 |    1083.89 |   2.9% |  15.4% |       17364 |    1083.89 |   1.00

@Delaunay (Collaborator, Author)

llm-lora-mp-gpus |    0 |   1 |    4 |      26.22 |   3.8% |  20.4% |       13842 |      26.22 |   1.00

@Delaunay (Collaborator, Author)

With BS=1 (batch size 1):

Breakdown
---------
bench                    | fail |   n | ngpu |           perf |   sem% |   std% | peak_memory |           score | weight
brax                     |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |             nan |   1.00
diffusion-gpus           |    0 |   1 |    4 |           8.42 |   0.1% |   1.1% |       22076 |            8.42 |   1.00
diffusion-single         |    0 |   4 |    1 |           5.18 |   0.6% |   9.8% |       18818 |           20.95 |   0.00
dimenet                  |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |             nan |   1.00
dinov2-giant-gpus        |    0 |   1 |    4 |          23.42 |   1.2% |   9.3% |       30394 |           23.42 |   1.00
dinov2-giant-single      |    0 |   4 |    1 |           6.54 |   0.6% |  10.1% |       28604 |           26.38 |   0.00
dqn                      |    0 |   4 |    1 | 25110752643.15 |   1.6% |  90.0% |         874 | 100339640366.34 |   0.00
bf16                     |    0 |   4 |    1 |         279.25 |   0.2% |   4.4% |        1278 |         1121.01 |   0.00
fp16                     |    0 |   4 |    1 |         274.61 |   0.1% |   2.5% |        1278 |         1100.14 |   0.00
fp32                     |    0 |   4 |    1 |          48.41 |   0.1% |   1.6% |        1656 |          193.63 |   0.00
tf32                     |    0 |   4 |    1 |         138.95 |   0.1% |   2.5% |        1656 |          556.22 |   0.00
bert-fp16                |    0 |   4 |    1 |          31.34 |   1.3% |  14.6% |         nan |          128.18 |   0.00
bert-fp32                |    0 |   4 |    1 |          30.09 |   1.3% |  13.8% |         nan |          122.98 |   0.00
bert-tf32                |    0 |   4 |    1 |          36.53 |   1.3% |  14.9% |         nan |          149.39 |   0.00
bert-tf32-fp16           |    0 |   4 |    1 |          31.35 |   1.4% |  15.0% |         nan |          128.22 |   1.00
reformer                 |    0 |   4 |    1 |          33.80 |   0.7% |  11.2% |         nan |          136.72 |   1.00
t5                       |    0 |   4 |    1 |          24.01 |   0.7% |  11.2% |         nan |           97.11 |   0.00
whisper                  |    0 |   4 |    1 |         121.26 |   0.9% |  13.4% |         nan |          490.46 |   0.00
lightning                |    0 |   4 |    1 |          27.17 |   0.5% |   9.7% |         nan |          109.42 |   0.00
lightning-gpus           |    0 |   1 |    4 |          78.32 |   0.5% |   4.8% |        2784 |           78.32 |   1.00
llava-single             |    4 |   4 |    1 |            nan |   nan% |   nan% |       10104 |             nan |   1.00
llama                    |    0 |   4 |    1 |         302.48 |   7.0% |  89.6% |       27202 |         1147.42 |   1.00
llm-full-mp-gpus         |    0 |   1 |    4 |          13.11 |   4.0% |  21.2% |       29132 |           13.11 |   1.00
llm-lora-ddp-gpus        |    0 |   1 |    4 |         551.36 |   1.5% |   7.8% |       10288 |          551.36 |   1.00
llm-lora-mp-gpus         |    0 |   1 |    4 |          26.10 |   3.8% |  20.3% |       13842 |           26.10 |   1.00
llm-lora-single          |    0 |   4 |    1 |        1057.08 |   1.5% |  16.8% |       17364 |         4236.05 |   1.00
pna                      |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |             nan |   1.00
ppo                      |    4 |   4 |    1 |    48745723.11 |   0.3% |  58.4% |         978 |            0.00 |   1.00
recursiongfn             |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |             nan |   1.00
rlhf-gpus                |    0 |   1 |    4 |         132.65 |   1.0% |   8.1% |        9444 |          132.65 |   0.00
rlhf-single              |    0 |   4 |    1 |          43.12 |   0.8% |  14.1% |        7704 |          172.71 |   1.00
focalnet                 |    0 |   4 |    1 |          17.30 |   0.8% |  13.0% |         nan |           69.94 |   0.00
torchatari               |    0 |   4 |    1 |         624.83 |   0.6% |   9.2% |        1012 |         2518.79 |   1.00
convnext_large-fp16      |    0 |   4 |    1 |          29.92 |   1.5% |  16.2% |         nan |          122.02 |   0.00
convnext_large-fp32      |    0 |   4 |    1 |          41.17 |   1.4% |  15.5% |         nan |          168.37 |   0.00
convnext_large-tf32      |    0 |   4 |    1 |          43.73 |   1.5% |  16.2% |         nan |          178.93 |   0.00
convnext_large-tf32-fp16 |    0 |   4 |    1 |          29.44 |   1.5% |  16.6% |         nan |          120.14 |   1.00
regnet_y_128gf           |    0 |   4 |    1 |          17.15 |   0.8% |  12.9% |        1176 |           69.15 |   1.00
resnet152-ddp-gpus       |    0 |   1 |    4 |          94.74 |   0.2% |   1.7% |        1946 |           94.74 |   0.00
resnet50                 |    0 |   4 |    1 |          80.56 |   0.8% |  12.4% |         nan |          326.02 |   1.00
resnet50-noio            |    0 |   4 |    1 |          86.47 |   0.1% |   5.5% |        1226 |          346.35 |   0.00
vjepa-gpus               |    0 |   1 |    4 |           8.59 |   0.4% |   3.2% |       23428 |            8.59 |   1.00
vjepa-single             |    0 |   4 |    1 |           3.94 |   0.7% |  11.4% |        5970 |           15.95 |   1.00

Scores
------
Failure rate:      14.79% (FAIL)
Score:              32.65

Errors
------
21 errors, details in HTML report.

@Delaunay (Collaborator, Author)

Breakdown
---------
bench                    | fail |   n | ngpu |           perf |   sem% |   std% | peak_memory |          score | weight
brax                     |    0 |   1 |    4 |     1027148.08 |   0.0% |   0.1% |        1312 |     1027148.08 |   1.00
diffusion-gpus           |    1 |   1 |    4 |            nan |   nan% |   nan% |         nan |            nan |   1.00
diffusion-single         |    4 |   4 |    1 |            nan |   nan% |   nan% |         nan |            nan |   0.00
dimenet                  |    0 |   4 |    1 |         540.40 |   0.8% |  11.8% |        2674 |        2186.39 |   1.00
dinov2-giant-gpus        |    1 |   1 |    4 |            nan |   nan% |   nan% |       24856 |            nan |   1.00
dinov2-giant-single      |    4 |   4 |    1 |            nan |   nan% |   nan% |        7066 |            nan |   0.00
dqn                      |    0 |   4 |    1 | 23531418779.61 |   1.6% |  90.1% |        1354 | 94008285414.56 |   0.00
bf16                     |    0 |   4 |    1 |         279.79 |   0.2% |   4.5% |        1278 |        1123.16 |   0.00
fp16                     |    0 |   4 |    1 |         276.15 |   0.1% |   2.7% |        1278 |        1106.40 |   0.00
fp32                     |    0 |   4 |    1 |          48.62 |   0.1% |   1.8% |        1656 |         194.45 |   0.00
tf32                     |    0 |   4 |    1 |         139.48 |   0.1% |   2.7% |        1656 |         558.38 |   0.00
bert-fp16                |    0 |   4 |    1 |         206.96 |   0.9% |  10.5% |         nan |         841.39 |   0.00
bert-fp32                |    0 |   4 |    1 |          71.61 |   0.5% |   5.1% |       20660 |         288.57 |   0.00
bert-tf32                |    0 |   4 |    1 |         119.40 |   0.7% |   7.2% |       20660 |         482.84 |   0.00
bert-tf32-fp16           |    0 |   4 |    1 |         206.83 |   0.9% |  10.4% |         nan |         840.69 |   1.00
reformer                 |    0 |   4 |    1 |          29.71 |   0.2% |   3.0% |       12940 |         119.23 |   1.00
t5                       |    0 |   4 |    1 |          30.79 |   0.3% |   4.6% |       33876 |         123.79 |   0.00
whisper                  |    0 |   4 |    1 |         425.00 |   0.5% |   8.3% |         nan |        1715.32 |   0.00
lightning                |    0 |   4 |    1 |         510.22 |   0.4% |   7.5% |       25808 |        2054.12 |   0.00
lightning-gpus           |    0 |   1 |    4 |        2014.40 |   0.0% |   0.4% |       26198 |        2014.40 |   1.00
llava-single             |    4 |   4 |    1 |            nan |   nan% |   nan% |       14280 |            nan |   1.00
llama                    |    0 |   4 |    1 |         295.91 |   6.9% |  87.7% |       27202 |        1119.81 |   1.00
llm-full-mp-gpus         |    0 |   1 |    4 |          31.34 |   3.5% |  18.3% |       25208 |          31.34 |   1.00
llm-lora-ddp-gpus        |    0 |   1 |    4 |        5247.18 |   0.4% |   2.1% |       32870 |        5247.18 |   1.00
llm-lora-mp-gpus         |    0 |   1 |    4 |         353.28 |   2.0% |  10.7% |       19166 |         353.28 |   1.00
llm-lora-single          |    0 |   4 |    1 |        2342.46 |   0.1% |   0.9% |       31112 |        9368.48 |   1.00
pna                      |    0 |   4 |    1 |        4386.70 |   0.5% |   7.5% |       39214 |       17557.84 |   1.00
ppo                      |    0 |   4 |    1 |    60533209.59 |   0.7% |  57.9% |         978 |   242133098.76 |   1.00
recursiongfn             |    0 |   4 |    1 |       12721.01 |   1.4% |  21.5% |        8154 |       51218.08 |   1.00
rlhf-gpus                |    0 |   1 |    4 |        6411.79 |   0.3% |   2.3% |       20398 |        6411.79 |   0.00
rlhf-single              |    0 |   4 |    1 |        1828.87 |   0.2% |   3.1% |       19128 |        7323.80 |   1.00
focalnet                 |    0 |   4 |    1 |         366.11 |   0.7% |  10.6% |       23038 |        1481.03 |   0.00
torchatari               |    0 |   4 |    1 |        7487.28 |   0.6% |   9.6% |        3264 |       29872.98 |   1.00
convnext_large-fp16      |    0 |   4 |    1 |         276.36 |   1.1% |  12.2% |         nan |        1126.43 |   0.00
convnext_large-fp32      |    0 |   4 |    1 |          70.17 |   0.6% |   7.0% |       44910 |         283.70 |   0.00
convnext_large-tf32      |    0 |   4 |    1 |         119.13 |   1.0% |  11.2% |       45502 |         483.66 |   0.00
convnext_large-tf32-fp16 |    0 |   4 |    1 |         276.51 |   1.1% |  12.1% |         nan |        1126.96 |   1.00
regnet_y_128gf           |    0 |   4 |    1 |          90.97 |   0.4% |   6.8% |       28810 |         366.50 |   1.00
resnet152-ddp-gpus       |    0 |   1 |    4 |        2018.87 |   0.2% |   1.8% |       25994 |        2018.87 |   0.00
resnet50                 |    0 |   4 |    1 |         904.41 |   0.7% |  10.4% |       13868 |        3658.59 |   1.00
resnet50-noio            |    0 |   4 |    1 |         861.34 |   0.0% |   1.6% |       26884 |        3447.39 |   0.00
vjepa-gpus               |    1 |   1 |    4 |            nan |   nan% |   nan% |       45900 |            nan |   1.00
vjepa-single             |    4 |   4 |    1 |            nan |   nan% |   nan% |        2910 |            nan |   1.00
