submitit INFO (2025-07-28 13:10:43,435) - Starting with JobEnvironment(job_id=525762, hostname=c137, local_rank=0(1), node=0(1), global_rank=0(1))
submitit INFO (2025-07-28 13:10:43,437) - Loading pickle: /data/horse/ws/pwinkler-mnist_tst/omniopt/runs/mnist_gpu_noall/1/single_runs/525762/525762_submitted.pkl
/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/networkx/utils/backends.py:135: RuntimeWarning: networkx backend defined more than once: nx-loopback
  backends.update(_get_backends("networkx.backends"))
Traceback (most recent call last):
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/train", line 285, in main
    model = SimpleMLP(
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1355, in to
    return self._apply(convert)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 915, in _apply
    module._apply(fn)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 915, in _apply
    module._apply(fn)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 942, in _apply
    param_applied = fn(param)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1341, in convert
    return t.to(
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Parameters: {"epochs": 152, "lr": 0.08946310024261475, "batch_size": 745, "hidden_size": 60, "dropout": 0.3126316964626312, "num_dense_layers": 2, "weight_decay": 0.9295684695243835, "activation": "leaky_relu", "init": "normal"}
Debug-Infos:
========
DEBUG INFOS START:
Program-Code: python3 .tests/mnist/train --epochs 152 --learning_rate 0.08946310024261475147 --batch_size 745 --hidden_size 60 --dropout 0.31263169646263122559 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.92956846952438354492
pwd: /data/horse/ws/pwinkler-mnist_tst/omniopt
File: .tests/mnist/train
UID: 2054851
GID: 200270
SLURM_JOB_ID: 525762
Status-Change-Time: 1753273859.0
Size: 12359 Bytes
Permissions: -rwxr-xr-x
Owner: pwinkler
Last access: 1753701046.0
Last modification: 1753270258.0
Hostname: c137
========
DEBUG INFOS END
python3 .tests/mnist/train --epochs 152 --learning_rate 0.08946310024261475147 --batch_size 745 --hidden_size 60 --dropout 0.31263169646263122559 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.92956846952438354492
stdout:
Hyperparameters
╭──────────────────┬─────────────────────╮
│ Parameter        │ Value               │
├──────────────────┼─────────────────────┤
│ Device           │ cuda                │
│ Epochs           │ 152                 │
│ Num Dense Layers │ 2                   │
│ Batch size       │ 745                 │
│ Learning rate    │ 0.08946310024261475 │
│ Hidden size      │ 60                  │
│ Dropout          │ 0.3126316964626312  │
│ Optimizer        │ adam                │
│ Momentum         │ 0.9                 │
│ Weight Decay     │ 0.9295684695243835  │
│ Activation       │ leaky_relu          │
│ Init Method      │ normal              │
│ Seed             │ None                │
╰──────────────────┴─────────────────────╯
Using device: cuda
An error occurred: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so
the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
stderr:
/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/networkx/utils/backends.py:135: RuntimeWarning: networkx backend defined more than once: nx-loopback
  backends.update(_get_backends("networkx.backends"))
Traceback (most recent call last):
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/train", line 285, in main
    model = SimpleMLP(
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1355, in to
    return self._apply(convert)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 915, in _apply
    module._apply(fn)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 915, in _apply
    module._apply(fn)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 942, in _apply
    param_applied = fn(param)
  File "/data/horse/ws/pwinkler-mnist_tst/omniopt/.tests/mnist/.torch_venv/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1341, in convert
    return t.to(
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Result: {'VAL_ACC': None}
Final-results: {'VAL_ACC': None}
EXIT_CODE: 1
submitit INFO (2025-07-28 13:10:58,619) - Job completed successfully
submitit INFO (2025-07-28 13:10:58,620) - Exiting after successful completion