submitit INFO (2025-07-31 15:46:39,691) - Starting with JobEnvironment(job_id=530989, hostname=c137, local_rank=0(1), node=0(1), global_rank=0(1))
submitit INFO (2025-07-31 15:46:39,691) - Loading pickle: /data/cat/ws/pwinkler-mnist_tst/omniopt/runs/mnist_gpu_noall/0/single_runs/530989/530989_submitted.pkl
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/train", line 301, in main
).to(args.device)
^^^^^^^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1355, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 942, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1341, in convert
return t.to(
^^^^^
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Parameters: {"epochs": 102, "lr": 0.08048484589278698, "batch_size": 232, "hidden_size": 735, "dropout": 0.04738559573888779, "num_dense_layers": 3, "weight_decay": 0.034132227301597595, "activation": "leaky_relu", "init": "normal"}
Debug-Infos:
========
DEBUG INFOS START:
Program-Code: python3 .tests/mnist/train --epochs 102 --learning_rate 0.08048484589278698254 --batch_size 232 --hidden_size 735 --dropout 0.04738559573888778687 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.03413222730159759521
pwd: /data/cat/ws/pwinkler-mnist_tst/omniopt
File: .tests/mnist/train
UID: 2054851
GID: 200270
SLURM_JOB_ID: 530989
Status-Change-Time: 1753967537.933081
Size: 12760 Bytes
Permissions: -rwxr-xr-x
Owner: pwinkler
Last access: 1753968020.9583523
Last modification: 1753967537.933081
Hostname: c137
========
DEBUG INFOS END
python3 .tests/mnist/train --epochs 102 --learning_rate 0.08048484589278698254 --batch_size 232 --hidden_size 735 --dropout 0.04738559573888778687 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.03413222730159759521
stdout:
Available GPU memory: 0.00 MB reserved
Free GPU memory: 0.00 MB allocated
Max GPU memory allocated: 0.00 MB
Hyperparameters
╭──────────────────┬──────────────────────╮
│ Parameter        │ Value                │
├──────────────────┼──────────────────────┤
│ Device           │ cuda                 │
│ Epochs           │ 102                  │
│ Num Dense Layers │ 3                    │
│ Batch size       │ 232                  │
│ Learning rate    │ 0.08048484589278698  │
│ Hidden size      │ 735                  │
│ Dropout          │ 0.04738559573888779  │
│ Optimizer        │ adam                 │
│ Momentum         │ 0.9                  │
│ Weight Decay     │ 0.034132227301597595 │
│ Activation       │ leaky_relu           │
│ Init Method      │ normal               │
│ Seed             │ None                 │
╰──────────────────┴──────────────────────╯
Using device: cuda
An error occurred: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so
the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
stderr:
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/train", line 301, in main
).to(args.device)
^^^^^^^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1355, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 942, in _apply
param_applied = fn(param)
^^^^^^^^^
File "/data/cat/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1341, in convert
return t.to(
^^^^^
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Result: {'VAL_ACC': None}
Final-results: {'VAL_ACC': None}
EXIT_CODE: 1
submitit INFO (2025-07-31 15:47:20,860) - Job completed successfully
submitit INFO (2025-07-31 15:47:20,863) - Exiting after successful completion