Experiment overview
| Setting | Value |
|---|---|
| Model for non-random steps | BOTORCH_MODULAR |
| Max. nr. evaluations | 500 |
| Number random steps | 20 |
| Nr. of workers (parameter) | 20 |
| Main process memory (GB) | 8 |
| Worker memory (GB) | 10 |
Job Summary per Generation Node
| Generation Node | Total | COMPLETED | FAILED | RUNNING |
|---|---|---|---|---|
| SOBOL | 20 | 9 | 9 | 2 |
Experiment parameters
| Name | Type | Lower bound | Upper bound | Values | Number type | Log Scale? |
|---|---|---|---|---|---|---|
| epochs | range | 10 | 200 | | int | No |
| lr | range | 1e-05 | 0.1 | | float | No |
| batch_size | range | 8 | 2048 | | int | No |
| hidden_size | range | 8 | 2048 | | int | No |
| dropout | range | 0 | 0.5 | | float | No |
| activation | fixed | | | leaky_relu | | |
| num_dense_layers | range | 1 | 4 | | int | No |
| init | fixed | | | normal | | |
| weight_decay | range | 0 | 1 | | float | No |
Number of evaluations
| Failed | Succeeded | Running | Total |
|---|---|---|---|
| 9 | 9 | 2 | 20 |
Result names and types
Last progressbar status
2025-07-31 15:40:07: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 2∑2 (0%/20), waiting for 3 jobs, finished 1 job
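The failure reason `'VAL_ACC: <FLOAT>' not found` in the status line above suggests that each job's output is searched for a line of the form `VAL_ACC: <number>`. A minimal sketch of such a check follows. It assumes a simple line-anchored regular expression; the pattern OmniOpt2 actually uses is not shown in this log, and `extract_result` is a hypothetical helper.

```python
import re
from typing import Optional


def extract_result(stdout: str, name: str = "VAL_ACC") -> Optional[float]:
    """Return the last '<name>: <float>' value found in stdout, or None."""
    # Assumed format: the result name, a colon, then a decimal number on one line.
    pattern = re.compile(
        rf"^\s*{re.escape(name)}:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$",
        re.MULTILINE,
    )
    matches = pattern.findall(stdout)
    return float(matches[-1]) if matches else None


# A job that prints its result line yields a number ...
print(extract_result("epoch 23 done\nVAL_ACC: 20.13\n"))
# ... while a job that stops before printing it yields None.
print(extract_result("Traceback (most recent call last):\n"))
```

Under this assumption, a training script has to finish by printing something like `print(f"VAL_ACC: {val_acc}")`. The FAILED trials in the results table all have exit code 1, so they most likely stopped before reaching that point.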
Git-Version
Commit: f9547a580b93e0983ebff52a5b7750569294ad57
trial_index,submit_time,queue_time,start_time,end_time,run_time,program_string,exit_code,signal,hostname,OO_Info_SLURM_JOB_ID,arm_name,trial_status,generation_node,VAL_ACC,epochs,lr,batch_size,hidden_size,dropout,num_dense_layers,weight_decay,activation,init
0,1753967672,10,1753967682,1753967944,262,python3 .tests/mnist/train --epochs 127 --learning_rate 0.06689339573502540992 --batch_size 1214 --hidden_size 990 --dropout 0.45562517642974853516 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.59872841835021972656,1,,c143,530836,0_0,FAILED,SOBOL,,127,0.066893395735025409920559980037,1214,990,0.45562517642974853515625,3,0.5987284183502197265625,leaky_relu,normal
1,1753967731,33,1753967764,1753968675,911,python3 .tests/mnist/train --epochs 92 --learning_rate 0.04306331038652919802 --batch_size 658 --hidden_size 1202 --dropout 0.22459426010027527809 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.14708865061402320862,0,,c143,530837,1_0,COMPLETED,SOBOL,11.34999999999999964472863211995,92,0.043063310386529198015015396095,658,1202,0.224594260100275278091430664062,2,0.1470886506140232086181640625,leaky_relu,normal
2,1753967761,23,1753967784,1753967943,159,python3 .tests/mnist/train --epochs 20 --learning_rate 0.07937379036461003623 --batch_size 1636 --hidden_size 513 --dropout 0.03445854969322681427 --activation leaky_relu --num_dense_layers 1 --init normal --weight_decay 0.975613446906208992,1,,c143,530838,2_0,FAILED,SOBOL,,20,0.079373790364610036229819911568,1636,513,0.03445854969322681427001953125,1,0.97561344690620899200439453125,leaky_relu,normal
3,1753967794,11,1753967805,1753967947,142,python3 .tests/mnist/train --epochs 199 --learning_rate 0.00556081472378224147 --batch_size 92 --hidden_size 1678 --dropout 0.26975434739142656326 --activation leaky_relu --num_dense_layers 4 --init normal --weight_decay 0.27002071961760520935,1,,c141,530839,3_0,FAILED,SOBOL,,199,0.005560814723782241467131548518,92,1678,0.269754347391426563262939453125,4,0.2700207196176052093505859375,leaky_relu,normal
4,1753967842,48,1753967890,1753967939,49,python3 .tests/mnist/train --epochs 169 --learning_rate 0.09104195689666085001 --batch_size 844 --hidden_size 1884 --dropout 0.34263794217258691788 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.78356554172933101654,1,,c141,530842,4_0,FAILED,SOBOL,,169,0.091041956896660850007130250106,844,1884,0.342637942172586917877197265625,3,0.78356554172933101654052734375,leaky_relu,normal
5,1753967873,34,1753967907,1753967938,31,python3 .tests/mnist/train --epochs 38 --learning_rate 0.01723163600472733467 --batch_size 1396 --hidden_size 52 --dropout 0.10320703499019145966 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.45492733269929885864,1,,c137,530845,5_0,FAILED,SOBOL,,38,0.01723163600472733467117159023,1396,52,0.10320703499019145965576171875,2,0.454927332699298858642578125,leaky_relu,normal
6,1753967904,30,1753967934,1753967940,6,python3 .tests/mnist/train --epochs 62 --learning_rate 0.05356406282581389899 --batch_size 422 --hidden_size 1404 --dropout 0.17099775979295372963 --activation leaky_relu --num_dense_layers 1 --init normal --weight_decay 0.65649853460490703583,1,,c137,530847,6_0,FAILED,SOBOL,,62,0.053564062825813898993665418402,422,1404,0.170997759792953729629516601562,1,0.65649853460490703582763671875,leaky_relu,normal
7,1753967985,27,1753968012,1753968999,987,python3 .tests/mnist/train --epochs 145 --learning_rate 0.02973663241245784114 --batch_size 1962 --hidden_size 533 --dropout 0.398839601781219244 --activation leaky_relu --num_dense_layers 4 --init normal --weight_decay 0.08181141316890716553,0,,c143,530850,7_0,COMPLETED,SOBOL,10.27999999999999936051153781591,145,0.029736632412457841140307479577,1962,533,0.398839601781219244003295898438,4,0.08181141316890716552734375,leaky_relu,normal
8,1753968066,7,1753968073,1753968969,896,python3 .tests/mnist/train --epochs 136 --learning_rate 0.08517047294191085194 --batch_size 332 --hidden_size 226 --dropout 0.21755825495347380638 --activation leaky_relu --num_dense_layers 4 --init normal --weight_decay 0.82492708694189786911,0,,c141,530855,8_0,COMPLETED,SOBOL,11.34999999999999964472863211995,136,0.085170472941910851938374094061,332,226,0.217558254953473806381225585938,4,0.824927086941897869110107421875,leaky_relu,normal
9,1753968128,47,1753968175,1753968211,36,python3 .tests/mnist/train --epochs 76 --learning_rate 0.012141729941014201 --batch_size 1908 --hidden_size 1966 --dropout 0.47822938347235321999 --activation leaky_relu --num_dense_layers 1 --init normal --weight_decay 0.43676314223557710648,1,,c137,530858,9_0,FAILED,SOBOL,,76,0.012141729941014200999660488378,1908,1966,0.478229383472353219985961914062,1,0.436763142235577106475830078125,leaky_relu,normal
10,1753968178,15,1753968193,1753968229,36,python3 .tests/mnist/train --epochs 52 --learning_rate 0.07266547727916390642 --batch_size 930 --hidden_size 766 --dropout 0.29590566642582416534 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.70181095879524946213,1,,c137,530860,10_0,FAILED,SOBOL,,52,0.072665477279163906421111107647,930,766,0.29590566642582416534423828125,2,0.701810958795249462127685546875,leaky_relu,normal
11,1753968229,45,1753968274,1753968311,37,python3 .tests/mnist/train --epochs 160 --learning_rate 0.04961962326687761188 --batch_size 1454 --hidden_size 1426 --dropout 0.02387435454875230789 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.05969598609954118729,1,,c137,530862,11_0,FAILED,SOBOL,,160,0.04961962326687761187793412887,1454,1426,0.023874354548752307891845703125,3,0.059695986099541187286376953125,leaky_relu,normal
12,,,,,,,,,,,12_0,RUNNING,SOBOL,,179,0.059436874155011039377871639999,1722,1116,0.080644668079912662506103515625,4,0.511717787943780422210693359375,leaky_relu,normal
13,1753968379,33,1753968412,1753968584,172,python3 .tests/mnist/train --epochs 23 --learning_rate 0.03638760253067128969 --batch_size 149 --hidden_size 820 --dropout 0.34951743297278881073 --activation leaky_relu --num_dense_layers 1 --init normal --weight_decay 0.24264822248369455338,0,,c139,530869,13_0,COMPLETED,SOBOL,20.12999999999999900524016993586,23,0.036387602530671289691177605619,149,820,0.34951743297278881072998046875,1,0.242648222483694553375244140625,leaky_relu,normal
14,1753968415,18,1753968433,1753968961,528,python3 .tests/mnist/train --epochs 95 --learning_rate 0.09693937056274154473 --batch_size 1124 --hidden_size 1660 --dropout 0.40949616441503167152 --activation leaky_relu --num_dense_layers 2 --init normal --weight_decay 0.88464964833110570908,0,,c139,530875,14_0,COMPLETED,SOBOL,10.27999999999999936051153781591,95,0.096939370562741544734564058672,1124,1660,0.409496164415031671524047851562,2,0.884649648331105709075927734375,leaky_relu,normal
15,1753968460,34,1753968494,1753969096,602,python3 .tests/mnist/train --epochs 107 --learning_rate 0.02390720845982432716 --batch_size 604 --hidden_size 276 --dropout 0.14465939393267035484 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.36953307781368494034,0,,c139,530877,15_0,COMPLETED,SOBOL,9.74000000000000021316282072803,107,0.02390720845982432715692844738,604,276,0.144659393932670354843139648438,3,0.369533077813684940338134765625,leaky_relu,normal
16,1753968495,37,1753968532,1753969170,638,python3 .tests/mnist/train --epochs 112 --learning_rate 0.09531696763547137241 --batch_size 1810 --hidden_size 1307 --dropout 0.25852300785481929779 --activation leaky_relu --num_dense_layers 1 --init normal --weight_decay 0.17023077234625816345,0,,c138,530880,16_0,COMPLETED,SOBOL,10.91000000000000014210854715202,112,0.095316967635471372410904677963,1810,1307,0.25852300785481929779052734375,1,0.1702307723462581634521484375,leaky_relu,normal
17,1753968531,23,1753968554,1753969131,577,python3 .tests/mnist/train --epochs 100 --learning_rate 0.01915742137833498573 --batch_size 301 --hidden_size 629 --dropout 0.06229067686945199966 --activation leaky_relu --num_dense_layers 4 --init normal --weight_decay 0.59164238162338733673,0,,c138,530882,17_0,COMPLETED,SOBOL,9.74000000000000021316282072803,100,0.019157421378334985734293027804,301,629,0.062290676869451999664306640625,4,0.59164238162338733673095703125,leaky_relu,normal
18,1753968567,17,1753968584,1753968768,184,python3 .tests/mnist/train --epochs 28 --learning_rate 0.05783888303508051554 --batch_size 1483 --hidden_size 1852 --dropout 0.24363784631714224815 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.29326960816979408264,0,,c138,530883,18_0,COMPLETED,SOBOL,10.27999999999999936051153781591,28,0.057838883035080515537806888915,1483,1852,0.243637846317142248153686523438,3,0.2932696081697940826416015625,leaky_relu,normal
19,,,,,,,,,,,19_0,RUNNING,SOBOL,,183,0.031662226884029809337306460293,1026,85,0.451230330858379602432250976562,2,0.96854268945753574371337890625,leaky_relu,normal
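The per-status counts and the best result can be recomputed from the CSV above. The following is a minimal sketch using only the standard library. `SAMPLE` is a trimmed excerpt (four of the columns, four of the rows, `VAL_ACC` rounded), and `summarize` is a hypothetical helper, not part of OmniOpt2.

```python
import csv
import io
from collections import Counter

# Trimmed excerpt of the CSV above: only four columns, VAL_ACC rounded.
SAMPLE = """trial_index,exit_code,trial_status,VAL_ACC
0,1,FAILED,
1,0,COMPLETED,11.35
12,,RUNNING,
13,0,COMPLETED,20.13
"""


def summarize(csv_text: str):
    """Count trials per status and find the best (maximal) VAL_ACC."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    status_counts = Counter(row["trial_status"] for row in rows)
    # Only rows with a non-empty VAL_ACC have a result to compare.
    scored = [
        (float(row["VAL_ACC"]), int(row["trial_index"]))
        for row in rows
        if row["VAL_ACC"]
    ]
    best = max(scored) if scored else None
    return status_counts, best


counts, best = summarize(SAMPLE)
print(dict(counts))  # status -> number of trials
print(best)          # (best VAL_ACC, trial_index)
```

Because the function only relies on the `trial_index`, `trial_status` and `VAL_ACC` columns, it should also work on the full table, where it would be expected to give 9 FAILED, 9 COMPLETED and 2 RUNNING trials.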
To cancel, press CTRL c, then run 'scancel 530833'
⠋ Importing logging...
⠋ Importing warnings...
⠋ Importing argparse...
⠋ Importing datetime...
⠋ Importing dataclass...
⠋ Importing hashlib...
⠋ Importing socket...
⠋ Importing stat...
⠋ Importing pwd...
⠋ Importing signal...
⠋ Importing base64...
⠋ Importing json...
⠋ Importing yaml...
⠋ Importing toml...
⠋ Importing csv...
⠋ Importing ast...
⠋ Importing rich.table...
⠋ Importing rich print...
⠋ Importing rich.pretty...
⠋ Importing rich.prompt...
⠋ Importing types.FunctionType...
⠋ Importing typing...
⠋ Importing ThreadPoolExecutor...
⠋ Importing submitit.LocalExecutor...
⠋ Importing submitit.Job...
⠋ Importing importlib.util...
⠋ Importing inspect...
⠋ Importing platform...
⠋ Importing inspect frame info...
⠋ Importing pathlib.Path...
⠋ Importing uuid...
⠋ Importing traceback...
⠋ Importing cowsay...
⠋ Importing psutil...
⠋ Importing shutil...
⠋ Importing itertools.combinations...
⠋ Importing os.listdir...
⠋ Importing os.path...
⠋ Importing PIL.Image...
⠋ Importing sixel...
⠋ Importing subprocess...
⠋ Importing tqdm...
⠴ Importing beartype...
⠋ Importing statistics...
⠋ Trying to import pyfiglet...
⠧ Importing helpers...
⠋ Parsing arguments...
⠸ Importing torch...
⠋ Importing numpy...
⠋ Importing collections...
⠧ Importing ax...
⠋ Importing ax.core.generator_run...
⠋ Importing Cont_X_trans and Y_trans from ax.modelbridge.registry...
⠋ Importing ax.core.arm...
⠋ Importing ax.core.objective...
⠋ Importing ax.core.Metric...
⠋ Importing ax.exceptions.core...
⠋ Importing ax.exceptions.generation_strategy...
⠋ Importing CORE_DECODER_REGISTRY...
⠋ Trying ax.generation_strategy.generation_node...
⠋ Importing GenerationStep, GenerationStrategy from generation_strategy...
⠋ Importing GenerationNode from generation_node...
⠋ Importing ExternalGenerationNode...
⠋ Importing MaxTrials...
⠋ Importing GeneratorSpec...
⠋ Importing Models from ax.modelbridge.registry...
⠋ Importing get_pending_observation_features...
⠋ Importing load_experiment...
⠋ Importing save_experiment...
⠋ Importing save_experiment_to_db...
⠋ Importing TrialStatus...
⠋ Importing Data...
⠋ Importing Experiment...
⠋ Importing parameter types...
⠋ Importing TParameterization...
⠋ Importing pandas...
⠴ Importing AxClient and ObjectiveProperties...
⠋ Importing RandomForestRegressor...
⠋ Importing botorch...
⠋ Importing submitit...
⠋ Importing ax logger...
⠋ Importing SQL-Storage-Stuff...
Run-UUID: dd6fd54b-612d-4e05-9461-3a0820ca358c
[ASCII-art banner: OmniOpt2]
⠋ Writing worker creation log...
omniopt --partition=alpha --experiment_name=mnist_gpu_noall --mem_gb=10 --time=2880 --worker_timeout=120 --max_eval=500 --num_parallel_jobs=20 --gpus=1 --num_random_steps=20 --follow --live_share --send_anonymized_usage_stats --result_names VAL_ACC=max --run_program='cHl0aG9uMyAudGVzdHMvbW5pc3QvdHJhaW4gLS1lcG9jaHMgJWVwb2NocyAtLWxlYXJuaW5nX3JhdGUgJWxyIC0tYmF0Y2hfc2l6ZSAlYmF0Y2hfc2l6ZSAtLWhpZGRlbl9zaXplICVoaWRkZW5fc2l6ZSAtLWRyb3BvdXQgJWRyb3BvdXQgLS1hY3RpdmF0aW9uICVhY3RpdmF0aW9uIC0tbnVtX2RlbnNlX2xheWVycyAlbnVtX2RlbnNlX2xheWVycyAtLWluaXQgJWluaXQgLS13ZWlnaHRfZGVjYXkgJXdlaWdodF9kZWNheQ==' --cpus_per_task=1 --nodes_per_job=1 --revert_to_random_when_seemingly_exhausted --model=BOTORCH_MODULAR --n_estimators_randomforest=100 --run_mode=local --occ_type=euclid --main_process_gb=8 --max_nr_of_zero_results=50 --slurm_signal_delay_s=0 --max_failed_jobs=0 --max_attempts_for_generation=20 --num_restarts=20 --raw_samples=1024 --max_abandoned_retrial=20 --max_num_of_parallel_sruns=16 --parameter epochs range 10 200 int false --parameter lr range 0.00001 0.1 float false --parameter batch_size range 8 2048 int false --parameter hidden_size range 8 2048 int false --parameter dropout range 0 0.5 float false --parameter activation fixed leaky_relu --parameter num_dense_layers range 1 4 int false --parameter init fixed normal --parameter weight_decay range 0 1 float false --ui_url 
aHR0cHM6Ly9pbWFnZXNlZy5zY2Fkcy5kZS9vbW5pYXgvZ3VpP3BhcnRpdGlvbj1hbHBoYSZleHBlcmltZW50X25hbWU9bW5pc3RfZ3B1X25vYWxsJnJlc2VydmF0aW9uPSZhY2NvdW50PSZtZW1fZ2I9MTAmdGltZT0yODgwJndvcmtlcl90aW1lb3V0PTEyMCZtYXhfZXZhbD01MDAmbnVtX3BhcmFsbGVsX2pvYnM9MjAmZ3B1cz0xJm51bV9yYW5kb21fc3RlcHM9MjAmZm9sbG93PTEmbGl2ZV9zaGFyZT0xJnNlbmRfYW5vbnltaXplZF91c2FnZV9zdGF0cz0xJmNvbnN0cmFpbnRzPSZyZXN1bHRfbmFtZXM9VkFMX0FDQyUzRG1heCZydW5fcHJvZ3JhbT1weXRob24zJTIwLnRlc3RzJTJGbW5pc3QlMkZ0cmFpbiUyMC0tZXBvY2hzJTIwJTI1ZXBvY2hzJTIwLS1sZWFybmluZ19yYXRlJTIwJTI1bHIlMjAtLWJhdGNoX3NpemUlMjAlMjViYXRjaF9zaXplJTIwLS1oaWRkZW5fc2l6ZSUyMCUyNWhpZGRlbl9zaXplJTIwLS1kcm9wb3V0JTIwJTI1ZHJvcG91dCUyMC0tYWN0aXZhdGlvbiUyMCUyNWFjdGl2YXRpb24lMjAtLW51bV9kZW5zZV9sYXllcnMlMjAlMjVudW1fZGVuc2VfbGF5ZXJzJTIwLS1pbml0JTIwJTI1aW5pdCUyMC0td2VpZ2h0X2RlY2F5JTIwJTI1d2VpZ2h0X2RlY2F5JmNwdXNfcGVyX3Rhc2s9MSZub2Rlc19wZXJfam9iPTEmc2VlZD0mZHJ5cnVuPTAmZGVidWc9MCZyZXZlcnRfdG9fcmFuZG9tX3doZW5fc2VlbWluZ2x5X2V4aGF1c3RlZD0xJmdyaWRzZWFyY2g9MCZtb2RlbD1CT1RPUkNIX01PRFVMQVImZXh0ZXJuYWxfZ2VuZXJhdG9yPSZuX2VzdGltYXRvcnNfcmFuZG9tZm9yZXN0PTEwMCZpbnN0YWxsYXRpb25fbWV0aG9kPWNsb25lJnJ1bl9tb2RlPWxvY2FsJmRpc2FibGVfdHFkbT0wJnZlcmJvc2VfdHFkbT0wJmZvcmNlX2xvY2FsX2V4ZWN1dGlvbj0wJmF1dG9fZXhjbHVkZV9kZWZlY3RpdmVfaG9zdHM9MCZzaG93X3NpeGVsX2dlbmVyYWw9MCZzaG93X3NpeGVsX3RyaWFsX2luZGV4X3Jlc3VsdD0wJnNob3dfc2l4ZWxfc2NhdHRlcj0wJnNob3dfd29ya2VyX3BlcmNlbnRhZ2VfdGFibGVfYXRfZW5kPTAmb2NjPTAmb2NjX3R5cGU9ZXVjbGlkJm5vX3NsZWVwPTAmc2x1cm1fdXNlX3NydW49MCZ2ZXJib3NlX2JyZWFrX3J1bl9zZWFyY2hfdGFibGU9MCZhYmJyZXZpYXRlX2pvYl9uYW1lcz0wJm1haW5fcHJvY2Vzc19nYj04Jm1heF9ucl9vZl96ZXJvX3Jlc3VsdHM9NTAmc2x1cm1fc2lnbmFsX2RlbGF5X3M9MCZtYXhfZmFpbGVkX2pvYnM9MCZleGNsdWRlPSZ1c2VybmFtZT0mZ2VuZXJhdGlvbl9zdHJhdGVneT0mcm9vdF92ZW52X2Rpcj0md29ya2Rpcj0mZG9udF9qaXRfY29tcGlsZT0wJmZpdF9vdXRfb2ZfZGVzaWduPTAmcmVmaXRfb25fY3Y9MCZzaG93X2dlbmVyYXRlX3RpbWVfdGFibGU9MCZkb250X3dhcm1fc3RhcnRfcmVmaXR0aW5nPTAmbWF4X2F0dGVtcHRzX2Zvcl9nZW5lcmF0aW9uPTIwJm51bV9yZXN0YXJ0cz0yMCZyYXdfc2FtcGxlcz0xMDI0Jm1heF9hYmFuZG9uZWRfcmV0cmlhbD0yMCZtYXhfbnVtX29mX3BhcmFs
bGVsX3NydW5zPTE2JmZvcmNlX2Nob2ljZV9mb3JfcmFuZ2VzPTAmbm9fdHJhbnNmb3JtX2lucHV0cz0wJmZpdF9hYmFuZG9uZWQ9MCZub19ub3JtYWxpemVfeT0wJnZlcmJvc2U9MCZnZW5lcmF0ZV9hbGxfam9ic19hdF9vbmNlPTAmZmxhbWVfZ3JhcGg9MCZjaGVja291dF90b19sYXRlc3RfdGVzdGVkX3ZlcnNpb249MCZwYXJhbWV0ZXJfMF9uYW1lPWVwb2NocyZwYXJhbWV0ZXJfMF90eXBlPXJhbmdlJnBhcmFtZXRlcl8wX21pbj0xMCZwYXJhbWV0ZXJfMF9tYXg9MjAwJnBhcmFtZXRlcl8wX251bWJlcl90eXBlPWludCZwYXJhbWV0ZXJfMF9sb2dfc2NhbGU9ZmFsc2UmcGFyYW1ldGVyXzFfbmFtZT1sciZwYXJhbWV0ZXJfMV90eXBlPXJhbmdlJnBhcmFtZXRlcl8xX21pbj0wLjAwMDAxJnBhcmFtZXRlcl8xX21heD0wLjEmcGFyYW1ldGVyXzFfbnVtYmVyX3R5cGU9ZmxvYXQmcGFyYW1ldGVyXzFfbG9nX3NjYWxlPWZhbHNlJnBhcmFtZXRlcl8yX25hbWU9YmF0Y2hfc2l6ZSZwYXJhbWV0ZXJfMl90eXBlPXJhbmdlJnBhcmFtZXRlcl8yX21pbj04JnBhcmFtZXRlcl8yX21heD0yMDQ4JnBhcmFtZXRlcl8yX251bWJlcl90eXBlPWludCZwYXJhbWV0ZXJfMl9sb2dfc2NhbGU9ZmFsc2UmcGFyYW1ldGVyXzNfbmFtZT1oaWRkZW5fc2l6ZSZwYXJhbWV0ZXJfM190eXBlPXJhbmdlJnBhcmFtZXRlcl8zX21pbj04JnBhcmFtZXRlcl8zX21heD0yMDQ4JnBhcmFtZXRlcl8zX251bWJlcl90eXBlPWludCZwYXJhbWV0ZXJfM19sb2dfc2NhbGU9ZmFsc2UmcGFyYW1ldGVyXzRfbmFtZT1kcm9wb3V0JnBhcmFtZXRlcl80X3R5cGU9cmFuZ2UmcGFyYW1ldGVyXzRfbWluPTAmcGFyYW1ldGVyXzRfbWF4PTAuNSZwYXJhbWV0ZXJfNF9udW1iZXJfdHlwZT1mbG9hdCZwYXJhbWV0ZXJfNF9sb2dfc2NhbGU9ZmFsc2UmcGFyYW1ldGVyXzVfbmFtZT1hY3RpdmF0aW9uJnBhcmFtZXRlcl81X3R5cGU9Zml4ZWQmcGFyYW1ldGVyXzVfdmFsdWU9bGVha3lfcmVsdSZwYXJhbWV0ZXJfNl9uYW1lPW51bV9kZW5zZV9sYXllcnMmcGFyYW1ldGVyXzZfdHlwZT1yYW5nZSZwYXJhbWV0ZXJfNl9taW49MSZwYXJhbWV0ZXJfNl9tYXg9NCZwYXJhbWV0ZXJfNl9udW1iZXJfdHlwZT1pbnQmcGFyYW1ldGVyXzZfbG9nX3NjYWxlPWZhbHNlJnBhcmFtZXRlcl83X25hbWU9aW5pdCZwYXJhbWV0ZXJfN190eXBlPWZpeGVkJnBhcmFtZXRlcl83X3ZhbHVlPW5vcm1hbCZwYXJhbWV0ZXJfOF9uYW1lPXdlaWdodF9kZWNheSZwYXJhbWV0ZXJfOF90eXBlPXJhbmdlJnBhcmFtZXRlcl84X21pbj0wJnBhcmFtZXRlcl84X21heD0xJnBhcmFtZXRlcl84X251bWJlcl90eXBlPWZsb2F0JnBhcmFtZXRlcl84X2xvZ19zY2FsZT1mYWxzZSZwYXJ0aXRpb249YWxwaGEmbnVtX3BhcmFtZXRlcnM9OQ==
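The `--run_program` value in the command above is Base64-encoded. A short sketch of decoding it with Python's standard `base64` module is shown here; the string is copied from the command and split into pieces only for readability.

```python
import base64

# The --run_program value from the command above, split only for readability.
RUN_PROGRAM_B64 = (
    "cHl0aG9uMyAudGVzdHMvbW5pc3QvdHJhaW4g"
    "LS1lcG9jaHMgJWVwb2NocyAt"
    "LWxlYXJuaW5nX3JhdGUgJWxyIC0t"
    "YmF0Y2hfc2l6ZSAlYmF0Y2hfc2l6ZSAt"
    "LWhpZGRlbl9zaXplICVoaWRkZW5fc2l6ZSAt"
    "LWRyb3BvdXQgJWRyb3BvdXQgLS1h"
    "Y3RpdmF0aW9uICVhY3RpdmF0aW9uIC0t"
    "bnVtX2RlbnNlX2xheWVycyAlbnVtX2RlbnNlX2xheWVycyAt"
    "LWluaXQgJWluaXQgLS13ZWlnaHRfZGVjYXkgJXdlaWdodF9kZWNheQ=="
)

# Decode to bytes, then to text; the result is the command template with %placeholders.
decoded = base64.b64decode(RUN_PROGRAM_B64).decode("utf-8")
print(decoded)
```

The decoded text is the same command template that the log prints later as `Run-Program:`.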
⠋ Disabling logging...
⠋ Setting run folder...
⠋ Creating folder /data/cat/ws/pwinkler-mnist_tst/omniopt_1/runs/mnist_gpu_noall/0...
⠋ Writing revert_to_random_when_seemingly_exhausted file ...
⠋ Writing username state file...
⠋ Writing result names file...
⠋ Writing result min/max file...
⠋ Saving state files...
Run-folder: /data/cat/ws/pwinkler-mnist_tst/omniopt_1/runs/mnist_gpu_noall/0
⠋ Printing run info...
⠋ Initializing NVIDIA-Logs...
⠋ Writing ui_url file if it is present...
⠋ Writing live_share file if it is present...
⠋ Writing job_start_time file...
⠙ Writing git info file...
⠋ Checking max_eval...
⠋ Calculating number of steps...
⠋ Adding excluded nodes...
⠋ Handling random steps...
⠋ Initializing ax_client...
[WARNING 07-31 15:13:59] ax.service.ax_client: Selecting a GenerationStrategy when using BatchTrials is in beta. Double check the recommended strategy matches your expectations.
⠋ Setting orchestrator...
You have 1 CPUs available for the main process. Using CUDA device NVIDIA H100. Generation strategy: SOBOL for 20 steps and then BOTORCH_MODULAR for 480 steps.
Run-Program: python3 .tests/mnist/train --epochs %epochs --learning_rate %lr --batch_size %batch_size --hidden_size %hidden_size --dropout %dropout --activation %activation --num_dense_layers %num_dense_layers --init %init --weight_decay %weight_decay
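Each `%name` placeholder in the Run-Program line is replaced by the trial's parameter value to form the `program_string` recorded in the results CSV. A minimal sketch of that substitution follows. How OmniOpt2 implements it internally is not shown in this log, so `fill_template` is a hypothetical helper, and the values are passed as strings to keep their exact digits.

```python
import re

TEMPLATE = (
    "python3 .tests/mnist/train --epochs %epochs --learning_rate %lr "
    "--batch_size %batch_size --hidden_size %hidden_size --dropout %dropout "
    "--activation %activation --num_dense_layers %num_dense_layers "
    "--init %init --weight_decay %weight_decay"
)


def fill_template(template: str, params: dict) -> str:
    """Replace every %name placeholder with params[name] (KeyError if missing)."""

    def replace(match: re.Match) -> str:
        return str(params[match.group(1)])

    # \w+ consumes the whole identifier, so %lr can never match inside a longer name.
    return re.sub(r"%(\w+)", replace, template)


# Parameter values of trial 13, taken from its program_string in the CSV.
trial_13 = {
    "epochs": "23",
    "lr": "0.03638760253067128969",
    "batch_size": "149",
    "hidden_size": "820",
    "dropout": "0.34951743297278881073",
    "activation": "leaky_relu",
    "num_dense_layers": "1",
    "init": "normal",
    "weight_decay": "0.24264822248369455338",
}

command = fill_template(TEMPLATE, trial_13)
print(command)
```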
Experiment parameters
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Name ┃ Type ┃ Lower bound ┃ Upper bound ┃ Values ┃ Type ┃ Log Scale? ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ epochs │ range │ 10 │ 200 │ │ int │ No │
│ lr │ range │ 1e-05 │ 0.1 │ │ float │ No │
│ batch_size │ range │ 8 │ 2048 │ │ int │ No │
│ hidden_size │ range │ 8 │ 2048 │ │ int │ No │
│ dropout │ range │ 0 │ 0.5 │ │ float │ No │
│ activation │ fixed │ │ │ leaky_relu │ │ │
│ num_dense_layers │ range │ 1 │ 4 │ │ int │ No │
│ init │ fixed │ │ │ normal │ │ │
│ weight_decay │ range │ 0 │ 1 │ │ float │ No │
└──────────────────┴───────┴─────────────┴─────────────┴────────────┴───────┴────────────┘
Result-Names
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Result-Name ┃ Min or max? ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ VAL_ACC │ max │
└─────────────┴─────────────┘
See https://imageseg.scads.de/omniax/share?user_id=pwinkler&experiment_name=mnist_gpu_noall&run_nr=6 for live-results.
[QR code omitted]
Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 2∑2 (0%/20), waiting for 3 jobs, finished 1 job : 2%|░░░░░░░░░░| 9/500 [26:03<10:47:30, 79.13s/it]
2025-07-31 15:14:04: SOBOL, Started OmniOpt2 run...
2025-07-31 15:14:12: Sobol, getting new HP set
2025-07-31 15:14:20: Sobol, requested 1 jobs, got 1, 8.22 s/job
2025-07-31 15:14:24: Sobol, eval #1/1 start
2025-07-31 15:14:28: Sobol, starting new job
2025-07-31 15:14:33: Sobol, unknown 1∑1 (5%/20), started new job
2025-07-31 15:14:38: Sobol, running 1∑1 (5%/20), getting new HP set
2025-07-31 15:14:47: Sobol, running 1∑1 (5%/20), requested 1 jobs, got 1, 9.62 s/job
2025-07-31 15:15:06: Sobol, running 1∑1 (5%/20), eval #1/1 start
2025-07-31 15:15:26: Sobol, running 1∑1 (5%/20), starting new job
2025-07-31 15:15:32: Sobol, running/unknown 1/1∑2 (10%/20), started new job
2025-07-31 15:15:38: Sobol, running 2∑2 (10%/20), getting new HP set
2025-07-31 15:15:47: Sobol, running 2∑2 (10%/20), requested 1 jobs, got 1, 10.16 s/job
2025-07-31 15:15:52: Sobol, running 2∑2 (10%/20), eval #1/1 start
2025-07-31 15:15:57: Sobol, running 2∑2 (10%/20), starting new job
2025-07-31 15:16:03: Sobol, running/unknown 2/1∑3 (15%/20), started new job
2025-07-31 15:16:09: Sobol, running 3∑3 (15%/20), getting new HP set
2025-07-31 15:16:19: Sobol, running 3∑3 (15%/20), requested 1 jobs, got 1, 9.96 s/job
2025-07-31 15:16:24: Sobol, running 3∑3 (15%/20), eval #1/1 start
2025-07-31 15:16:29: Sobol, running 3∑3 (15%/20), starting new job
2025-07-31 15:16:35: Sobol, running/unknown 3/1∑4 (20%/20), started new job
2025-07-31 15:16:54: Sobol, running 4∑4 (20%/20), getting new HP set
2025-07-31 15:17:08: Sobol, running 4∑4 (20%/20), requested 1 jobs, got 1, 14.02 s/job
2025-07-31 15:17:12: Sobol, running 4∑4 (20%/20), eval #1/1 start
2025-07-31 15:17:18: Sobol, running 4∑4 (20%/20), starting new job
2025-07-31 15:17:23: Sobol, running/unknown 4/1∑5 (25%/20), started new job
2025-07-31 15:17:29: Sobol, running/pending 4/1∑5 (25%/20), getting new HP set
2025-07-31 15:17:39: Sobol, running 5∑5 (25%/20), requested 1 jobs, got 1, 9.82 s/job
2025-07-31 15:17:43: Sobol, running 5∑5 (25%/20), eval #1/1 start
2025-07-31 15:17:48: Sobol, running 5∑5 (25%/20), starting new job
2025-07-31 15:17:54: Sobol, running/unknown 5/1∑6 (30%/20), started new job
2025-07-31 15:18:00: Sobol, running/pending 5/1∑6 (30%/20), getting new HP set
2025-07-31 15:18:10: Sobol, running 6∑6 (30%/20), requested 1 jobs, got 1, 10.28 s/job
2025-07-31 15:18:15: Sobol, running 6∑6 (30%/20), eval #1/1 start
2025-07-31 15:18:20: Sobol, running 6∑6 (30%/20), starting new job
2025-07-31 15:18:25: Sobol, running/unknown 6/1∑7 (35%/20), started new job
2025-07-31 15:18:44: Sobol, running 7∑7 (35%/20), getting new HP set
2025-07-31 15:18:55: Sobol, running 7∑7 (35%/20), requested 1 jobs, got 1, 11.06 s/job
2025-07-31 15:18:59: Sobol, running 7∑7 (30%/20), eval #1/1 start
2025-07-31 15:19:36: Sobol, completed/running 6/1∑7 (5%/20), starting new job
2025-07-31 15:19:46: Sobol, completed/running/unknown 6/1/1∑8 (10%/20), started new job
2025-07-31 15:20:02: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:02: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:02: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:03: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:03: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:07: Sobol, completed/running/pending 6/1/1∑8 (10%/20), job_failed
2025-07-31 15:20:27: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 2∑2 (10%/20), finishing jobs (_get_next_trials), finished 6 jobs
2025-07-31 15:20:33: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 2∑2 (10%/20), getting new HP set
2025-07-31 15:20:44: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 2∑2 (10%/20), requested 1 jobs, got 1, 10.25 s/job
2025-07-31 15:20:49: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 2∑2 (10%/20), eval #1/1 start
2025-07-31 15:20:55: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 2∑2 (10%/20), starting new job
2025-07-31 15:21:07: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/unknown 2/1∑3 (15%/20), started new job
2025-07-31 15:21:13: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 3∑3 (15%/20), getting new HP set
2025-07-31 15:21:24: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 3∑3 (15%/20), requested 1 jobs, got 1, 10.21 s/job
2025-07-31 15:21:30: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 3∑3 (15%/20), eval #1/1 start
2025-07-31 15:21:53: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 3∑3 (15%/20), starting new job
2025-07-31 15:22:09: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/unknown 3/1∑4 (20%/20), started new job
2025-07-31 15:22:27: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/pending 3/1∑4 (20%/20), getting new HP set
2025-07-31 15:22:42: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/pending 3/1∑4 (20%/20), requested 1 jobs, got 1, 15.23 s/job
2025-07-31 15:22:48: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), eval #1/1 start
2025-07-31 15:22:53: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), starting new job
2025-07-31 15:23:01: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/unknown 4/1∑5 (25%/20), started new job
2025-07-31 15:23:07: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/pending 4/1∑5 (25%/20), getting new HP set
2025-07-31 15:23:17: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 5∑5 (25%/20), requested 1 jobs, got 1, 10.40 s/job
2025-07-31 15:23:21: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running 5∑5 (25%/20), eval #1/1 start
2025-07-31 15:23:44: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/completed 4/1∑5 (20%/20), starting new job
2025-07-31 15:23:50: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/completed/unknown 4/1/1∑6 (20%/20), started new job
2025-07-31 15:24:02: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/completed/pending 3/2/1∑6 (20%/20), job_failed
2025-07-31 15:24:03: Sobol, failed: 6 ('VAL_ACC: <FLOAT>' not found), running/completed/pending 3/2/1∑6 (20%/20), job_failed
2025-07-31 15:24:30: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), finishing jobs (_get_next_trials), finished 2 jobs
2025-07-31 15:24:35: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), getting new HP set
2025-07-31 15:24:46: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), requested 1 jobs, got 1, 11.21 s/job
2025-07-31 15:24:52: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), eval #1/1 start
2025-07-31 15:24:57: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), starting new job
2025-07-31 15:25:04: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running/unknown 4/1∑5 (25%/20), started new job
2025-07-31 15:25:24: Sobol, failed: 8 ('VAL_ACC: <FLOAT>' not found), running/completed 4/1∑5 (20%/20), job_failed
2025-07-31 15:25:34: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), finishing jobs (_get_next_trials), finished 1 job
2025-07-31 15:25:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), getting new HP set
2025-07-31 15:26:01: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), requested 1 jobs, got 1, 21.98 s/job
2025-07-31 15:26:08: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), eval #1/1 start
2025-07-31 15:26:13: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 4∑4 (20%/20), starting new job
2025-07-31 15:26:20: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 4/1∑5 (25%/20), started new job
2025-07-31 15:26:26: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 4/1∑5 (25%/20), getting new HP set
2025-07-31 15:26:37: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 5∑5 (25%/20), requested 1 jobs, got 1, 11.56 s/job
2025-07-31 15:26:43: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 5∑5 (25%/20), eval #1/1 start
2025-07-31 15:26:50: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 5∑5 (25%/20), starting new job
2025-07-31 15:26:58: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 5/1∑6 (30%/20), started new job
2025-07-31 15:27:05: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 5/1∑6 (30%/20), getting new HP set
2025-07-31 15:27:18: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 6∑6 (30%/20), requested 1 jobs, got 1, 13.26 s/job
2025-07-31 15:27:24: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 6∑6 (30%/20), eval #1/1 start
2025-07-31 15:27:35: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 6∑6 (30%/20), starting new job
2025-07-31 15:27:41: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 6/1∑7 (35%/20), started new job
2025-07-31 15:27:47: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 6/1∑7 (35%/20), getting new HP set
2025-07-31 15:27:58: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 6/1∑7 (35%/20), requested 1 jobs, got 1, 11.33 s/job
2025-07-31 15:28:05: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 6/1∑7 (35%/20), eval #1/1 start
2025-07-31 15:28:10: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 6/1∑7 (35%/20), starting new job
2025-07-31 15:28:17: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 7/1∑8 (40%/20), started new job
2025-07-31 15:28:23: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 8∑8 (40%/20), getting new HP set
2025-07-31 15:28:34: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 8∑8 (40%/20), requested 1 jobs, got 1, 11.30 s/job
2025-07-31 15:28:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 8∑8 (40%/20), eval #1/1 start
2025-07-31 15:28:46: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 8∑8 (40%/20), starting new job
2025-07-31 15:28:52: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 8/1∑9 (45%/20), started new job
2025-07-31 15:28:58: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 8/1∑9 (45%/20), getting new HP set
2025-07-31 15:29:09: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 9∑9 (45%/20), requested 1 jobs, got 1, 11.46 s/job
2025-07-31 15:29:15: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 9∑9 (45%/20), eval #1/1 start
2025-07-31 15:29:21: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running 9∑9 (45%/20), starting new job
2025-07-31 15:29:28: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/unknown 9/1∑10 (50%/20), started new job
2025-07-31 15:29:35: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/pending 9/1∑10 (50%/20), getting new HP set
2025-07-31 15:29:46: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/completed 9/1∑10 (45%/20), requested 1 jobs, got 1, 11.90 s/job
2025-07-31 15:29:53: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), running/completed 9/1∑10 (45%/20), eval #1/1 start
2025-07-31 15:30:04: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/completed 9/1∑10 (45%/20), starting new job
2025-07-31 15:30:12: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/completed/unknown 9/1/1∑11 (50%/20), started new job
2025-07-31 15:30:18: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/completed/pending 9/1/1∑11 (50%/20), new result: 20.13
2025-07-31 15:30:35: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/pending 9/1∑10 (50%/20), finishing jobs, finished 1 job
2025-07-31 15:30:42: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/pending 9/1∑10 (50%/20), waiting for 10 jobs
2025-07-31 15:30:55: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running/pending 9/1∑10 (50%/20), waiting for 10 jobs
2025-07-31 15:31:06: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 10∑10 (50%/20), waiting for 10 jobs
2025-07-31 15:31:17: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 10∑10 (50%/20), new result: 11.35
2025-07-31 15:31:34: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 10 jobs, finished 1 job
2025-07-31 15:31:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:31:51: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:32:03: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:32:14: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:32:29: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:32:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (45%/20), waiting for 9 jobs
2025-07-31 15:32:51: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 9∑9 (40%/20), new result: 10.28
2025-07-31 15:33:10: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 9 jobs, finished 1 job
2025-07-31 15:33:15: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:33:27: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:33:42: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:33:53: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:34:05: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:34:16: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:34:27: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:34:39: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:34:50: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:35:01: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:35:15: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:35:32: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:35:43: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (40%/20), waiting for 8 jobs
2025-07-31 15:36:03: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 8∑8 (35%/20), new result: 10.28
2025-07-31 15:36:22: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 7∑7 (30%/20), waiting for 8 jobs, finished 1 job
2025-07-31 15:36:29: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 7∑7 (30%/20), waiting for 7 jobs
2025-07-31 15:36:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 7∑7 (25%/20), new result: 10.28
2025-07-31 15:36:40: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 7∑7 (25%/20), new result: 11.35
2025-07-31 15:37:05: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 7 jobs, finished 2 jobs
2025-07-31 15:37:12: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 5 jobs
2025-07-31 15:37:24: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 5 jobs
2025-07-31 15:37:35: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 5 jobs
2025-07-31 15:38:04: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 5 jobs
2025-07-31 15:38:15: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (25%/20), waiting for 5 jobs
2025-07-31 15:38:26: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 5∑5 (20%/20), new result: 9.74
2025-07-31 15:38:45: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 4∑4 (20%/20), waiting for 5 jobs, finished 1 job
2025-07-31 15:38:52: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 4∑4 (20%/20), waiting for 4 jobs
2025-07-31 15:39:04: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 4∑4 (15%/20), new result: 9.74
2025-07-31 15:39:23: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 3∑3 (15%/20), waiting for 4 jobs, finished 1 job
2025-07-31 15:39:29: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 3∑3 (15%/20), waiting for 3 jobs
2025-07-31 15:39:52: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 3∑3 (10%/20), new result: 10.91
2025-07-31 15:40:07: Sobol, failed: 9 ('VAL_ACC: <FLOAT>' not found), best VAL_ACC: 20.13, running 2∑2 (0%/20), waiting for 3 jobs, finished 1 job
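The status lines above share one layout: timestamp, generation node, failed count, an optional best value, job-state counts joined by `∑`, a percentage over the worker count, and a free-text message. A minimal sketch of parsing one such line, assuming the layout seen in this log is stable. The regex and the field names (`states`, `pct`, `workers`, `msg`) are illustrative and are not taken from OmniOpt itself.

```python
import re

# Pattern inferred from the status lines in this log (an assumption, not an
# official format). The "best <metric>: <value>" part is optional because it
# is absent until the first result arrives.
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"(?P<node>\w+), failed: (?P<failed>\d+) \([^)]*\), "
    r"(?:best (?P<metric>\w+): (?P<best>[-+0-9.eE]+), )?"
    r"(?P<states>[a-z/]+) (?P<counts>[\d/]+)∑(?P<total>\d+) "
    r"\((?P<pct>\d+)%/(?P<workers>\d+)\), (?P<msg>.*)$"
)


def parse_status(line: str) -> dict:
    """Split one progress-bar status line into typed fields."""
    m = STATUS_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized status line: {line!r}")
    # "running/completed 9/1" becomes {"running": 9, "completed": 1}
    states = dict(zip(m["states"].split("/"), map(int, m["counts"].split("/"))))
    return {
        "timestamp": m["ts"],
        "node": m["node"],
        "failed": int(m["failed"]),
        "best": float(m["best"]) if m["best"] else None,
        "states": states,
        "total": int(m["total"]),
        "pct": int(m["pct"]),
        "workers": int(m["workers"]),
        "msg": m["msg"],
    }
```

Applied to the last line of the log, `parse_status` would report 9 failed jobs, a best value of 20.13, and `{"running": 2}` as the state counts.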
Arguments Overview
Key | Value |
---|
config_yaml | None |
config_toml | None |
config_json | None |
num_random_steps | 20 |
max_eval | 500 |
run_program | [["'cHl0aG9uMyAudGVzdHMvbW5pc3QvdHJhaW4gLS1lcG9jaHMgJWVwb2NocyAtLWxlYXJuaW5nX3JhdGUgJWxyIC0tYmF0Y2hfc2l6ZSAlYmF0Y2hfc2l6ZSAtLWhpZGRlbl9zaXplICVoaWRkZW5… |
experiment_name | mnist_gpu_noall |
mem_gb | 10 |
parameter | [['epochs', 'range', '10', '200', 'int', 'false'], ['lr', 'range', '0.00001', '0.1', 'float', 'false'], ['batch_size', 'range', '8', '2048', 'int', 'false'], ['hidden_size', 'range', '8', '2048', 'int', 'false'], ['dropout', 'range', '0', '0.5', 'float', 'false'], ['activation', 'fixed', 'leaky_relu'], ['num_dense_layers', 'range', '1', '4', 'int', 'false'], ['init', 'fixed', 'normal'], ['weight_decay', 'range', '0', '1', 'float', 'false']] |
continue_previous_job | None |
experiment_constraints | None |
run_dir | runs |
seed | None |
verbose_tqdm | False |
model | BOTORCH_MODULAR |
gridsearch | False |
occ | False |
show_sixel_scatter | False |
show_sixel_general | False |
show_sixel_trial_index_result | False |
follow | True |
send_anonymized_usage_stats | True |
ui_url | aHR0cHM6Ly9pbWFnZXNlZy5zY2Fkcy5kZS9vbW5pYXgvZ3VpP3BhcnRpdGlvbj1hbHBoYSZleHBlcmltZW50X25hbWU9bW5pc3RfZ3B1X25vYWxsJnJlc2VydmF0aW9uPSZhY2NvdW50PSZtZW1fZ2I… |
root_venv_dir | /home/pwinkler |
exclude | None |
main_process_gb | 8 |
max_nr_of_zero_results | 50 |
abbreviate_job_names | False |
orchestrator_file | None |
checkout_to_latest_tested_version | False |
live_share | True |
disable_tqdm | False |
disable_previous_job_constraint | False |
workdir | |
occ_type | euclid |
result_names | ['VAL_ACC=max'] |
minkowski_p | 2 |
signed_weighted_euclidean_weights | |
generation_strategy | None |
generate_all_jobs_at_once | False |
revert_to_random_when_seemingly_exhausted | True |
load_data_from_existing_jobs | [] |
n_estimators_randomforest | 100 |
max_attempts_for_generation | 20 |
external_generator | None |
username | None |
max_failed_jobs | 0 |
num_cpus_main_job | None |
calculate_pareto_front_of_job | [] |
show_generate_time_table | False |
force_choice_for_ranges | False |
max_abandoned_retrial | 20 |
share_password | None |
dryrun | False |
db_url | None |
run_program_once | None |
dont_warm_start_refitting | False |
refit_on_cv | False |
fit_out_of_design | False |
fit_abandoned | False |
dont_jit_compile | False |
num_restarts | 20 |
raw_samples | 1024 |
max_num_of_parallel_sruns | 16 |
no_transform_inputs | False |
no_normalize_y | False |
transforms | [] |
num_parallel_jobs | 20 |
worker_timeout | 120 |
slurm_use_srun | False |
time | 2880 |
partition | alpha |
reservation | None |
force_local_execution | False |
slurm_signal_delay_s | 0 |
nodes_per_job | 1 |
cpus_per_task | 1 |
account | None |
gpus | 1 |
run_mode | local |
verbose | False |
verbose_break_run_search_table | False |
debug | False |
flame_graph | False |
no_sleep | False |
tests | False |
show_worker_percentage_table_at_end | False |
auto_exclude_defective_hosts | False |
run_tests_that_fail_on_taurus | False |
raise_in_eval | False |
show_ram_every_n_seconds | 0 |
show_generation_and_submission_sixel | False |
just_return_defaults | False |
prettyprint | False |
1753967644.2211282,20,0,0
1753967647.7410944,20,0,0
1753967647.8602407,20,0,0
1753967652.7192628,20,0,0
1753967660.720339,20,0,0
1753967664.703013,20,0,0
1753967668.7666395,20,0,0
1753967673.8767617,20,1,5
1753967678.7143652,20,1,5
1753967687.7085037,20,1,5
1753967706.707943,20,1,5
1753967726.1001718,20,1,5
1753967732.893962,20,2,10
1753967737.9840584,20,2,10
1753967747.8443713,20,2,10
1753967752.7789688,20,2,10
1753967757.2792275,20,2,10
1753967763.0882115,20,3,15
1753967769.3448913,20,3,15
1753967778.980912,20,3,15
1753967783.7601418,20,3,15
1753967789.3165607,20,3,15
1753967795.837868,20,4,20
1753967814.7107744,20,4,20
1753967828.1210747,20,4,20
1753967832.7407355,20,4,20
1753967837.8966107,20,4,20
1753967843.8925586,20,5,25
1753967849.712782,20,5,25
1753967858.9250119,20,5,25
1753967863.7964315,20,5,25
1753967868.7311363,20,5,25
1753967874.8378797,20,6,30
1753967880.2820826,20,6,30
1753967890.2467442,20,6,30
1753967895.7160554,20,6,30
1753967900.1189172,20,6,30
1753967905.8346262,20,7,35
1753967924.723261,20,7,35
1753967935.2660596,20,7,35
1753967939.700056,20,6,30
1753967976.2403433,20,1,5
1753967986.4525096,20,2,10
1753968002.7099497,20,2,10
1753968002.7142658,20,2,10
1753968002.727753,20,2,10
1753968003.7215846,20,2,10
1753968003.7475672,20,2,10
1753968007.6981468,20,2,10
1753968027.7284634,20,2,10
1753968033.717795,20,2,10
1753968044.7085254,20,2,10
1753968049.7150424,20,2,10
1753968054.952823,20,2,10
1753968067.7146788,20,3,15
1753968073.7787275,20,3,15
1753968083.2897604,20,3,15
1753968090.7145321,20,3,15
1753968112.9694378,20,3,15
1753968129.9093769,20,4,20
1753968147.714594,20,4,20
1753968162.7009466,20,4,20
1753968168.707309,20,4,20
1753968173.8950186,20,4,20
1753968179.9911752,20,5,25
1753968186.9719458,20,5,25
1753968196.9269242,20,5,25
1753968201.884365,20,5,25
1753968224.723713,20,4,20
1753968230.8836393,20,4,20
1753968242.071311,20,4,20
1753968242.0769813,20,4,20
1753968270.04652,20,4,20
1753968275.715433,20,4,20
1753968286.711705,20,4,20
1753968292.0553877,20,4,20
1753968297.7525384,20,4,20
1753968304.1400354,20,5,25
1753968324.714325,20,4,20
1753968334.9720662,20,4,20
1753968340.7210023,20,4,20
1753968361.8702526,20,4,20
1753968367.982832,20,4,20
1753968373.7136464,20,4,20
1753968380.8855803,20,5,25
1753968386.7296448,20,5,25
1753968397.8846774,20,5,25
1753968403.80329,20,5,25
1753968410.7750459,20,5,25
1753968418.210284,20,6,30
1753968424.72158,20,6,30
1753968437.7180557,20,6,30
1753968444.046268,20,6,30
1753968455.0922658,20,6,30
1753968461.8910725,20,7,35
1753968467.8007088,20,7,35
1753968478.7298079,20,7,35
1753968485.0943537,20,7,35
1753968490.8627732,20,7,35
1753968497.2021344,20,8,40
1753968503.7109485,20,8,40
1753968514.7201848,20,8,40
1753968520.7065802,20,8,40
1753968526.172456,20,8,40
1753968532.7115388,20,9,45
1753968538.770606,20,9,45
1753968549.8357947,20,9,45
1753968555.8002386,20,9,45
1753968561.7138531,20,9,45
1753968568.876604,20,10,50
1753968575.0486646,20,10,50
1753968586.719112,20,9,45
1753968593.1696148,20,9,45
1753968604.1472635,20,9,45
1753968611.9927146,20,10,50
1753968618.0302699,20,10,50
1753968623.7127616,20,10,50
1753968635.7130744,20,10,50
1753968642.707166,20,10,50
1753968642.8162982,20,10,50
1753968642.957332,20,10,50
1753968655.0733826,20,10,50
1753968666.0896714,20,10,50
1753968676.9900937,20,10,50
1753968682.7156262,20,9,45
1753968694.7011416,20,9,45
1753968700.037881,20,9,45
1753968711.2095308,20,9,45
1753968723.1226318,20,9,45
1753968734.3148696,20,9,45
1753968749.0047433,20,9,45
1753968760.7014616,20,9,45
1753968771.7839885,20,8,40
1753968777.1465638,20,8,40
1753968790.053755,20,8,40
1753968795.7107291,20,8,40
1753968806.9521673,20,8,40
1753968822.8110416,20,8,40
1753968833.7146723,20,8,40
1753968845.0940056,20,8,40
1753968856.1754816,20,8,40
1753968866.8530843,20,8,40
1753968878.977937,20,8,40
1753968890.753377,20,8,40
1753968901.8464968,20,8,40
1753968915.0369787,20,8,40
1753968932.9107597,20,8,40
1753968943.7176561,20,8,40
1753968963.2221885,20,7,35
1753968970.2484486,20,6,30
1753968982.8930237,20,6,30
1753968989.712592,20,6,30
1753969000.7142212,20,5,25
1753969000.7258966,20,5,25
1753969010.1100936,20,5,25
1753969010.2369215,20,5,25
1753969025.7639604,20,5,25
1753969032.7335215,20,5,25
1753969044.9624035,20,5,25
1753969055.89407,20,5,25
1753969083.072827,20,5,25
1753969095.0801666,20,5,25
1753969106.7168186,20,4,20
1753969112.7178988,20,4,20
1753969125.3183563,20,4,20
1753969132.1165552,20,4,20
1753969144.0633905,20,3,15
1753969150.7109811,20,3,15
1753969163.7573593,20,3,15
1753969169.3089476,20,3,15
1753969192.2432013,20,2,10
1753969197.7105398,20,1,5
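The rows above carry no header. Reading them as (timestamp, available workers, busy workers, percent busy) is an assumption based on the values: the fourth field always equals the third divided by the second, times 100 (for example 7 of 20 gives 35). A small sketch that checks this relation for such rows; the function name and the column interpretation are hypothetical.

```python
import csv
import io


def check_usage_rows(text: str) -> list[tuple[float, int, int, int]]:
    """Parse headerless usage rows and verify pct == 100 * busy / workers."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # tolerate blank lines
            continue
        ts, workers, busy, pct = row
        workers, busy, pct = int(workers), int(busy), int(pct)
        if round(100 * busy / workers) != pct:
            raise ValueError(f"inconsistent row: {row}")
        rows.append((float(ts), workers, busy, pct))
    return rows
```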
timestamp,ram_usage_mb,cpu_usage_percent
1753967639,711.3984375,5.8
1753967644,711.8984375,5.8
1753967647,711.8984375,5.8
1753967647,711.8984375,12.5
1753967647,711.8984375,7.2
1753967647,711.8984375,7.1
1753967647,711.8984375,9.1
1753968623,737.72265625,10.0
1753968642,735.9921875,6.1
1753968642,735.9921875,15.4
1753968642,735.9921875,7.7
1753968642,735.9921875,9.1
1753968682,738.078125,10.0
1753968777,738.12109375,18.2
1753968970,736.20703125,10.0
1753969010,736.20703125,6.3
1753969010,736.20703125,9.1
1753969010,736.20703125,12.5
1753969112,736.70703125,6.0
1753969112,736.70703125,8.3
1753969150,736.70703125,10.0
Parameter statistics
Parameter | Min | Max | Mean | Std Dev | Count |
---|
run_time | 6 | 987 | 347.3889 | 333.3944 | 18 |
VAL_ACC | 9.74 | 20.13 | 11.5622 | 3.0821 | 9 |
epochs | 20 | 199 | 105.15 | 55.1247 | 20 |
lr | 0.0056 | 0.0969 | 0.0513 | 0.0283 | 20 |
batch_size | 92 | 1962 | 1053.35 | 593.8654 | 20 |
hidden_size | 52 | 1966 | 1019.25 | 606.6682 | 20 |
dropout | 0.0239 | 0.4782 | 0.2508 | 0.1424 | 20 |
num_dense_layers | 1 | 4 | 2.5 | 1.118 | 20 |
weight_decay | 0.0597 | 0.9756 | 0.5012 | 0.2872 | 20 |
activation | No numerical statistics available |
init | No numerical statistics available |
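The VAL_ACC row can be reproduced from the nine "new result" values in the progress log (20.13, 11.35, 10.28, 10.28, 10.28, 11.35, 9.74, 9.74, 10.91). The sketch below assumes the Std Dev column is a population standard deviation (dividing by the count, not count minus one). That is an inference from the numbers matching, not something the report states.

```python
import statistics

# The nine VAL_ACC values reported as "new result" in the progress log.
val_acc = [20.13, 11.35, 10.28, 10.28, 10.28, 11.35, 9.74, 9.74, 10.91]

summary = {
    "min": min(val_acc),
    "max": max(val_acc),
    "mean": round(statistics.fmean(val_acc), 4),
    # pstdev = population standard deviation (assumed convention, see above)
    "std": round(statistics.pstdev(val_acc), 4),
    "count": len(val_acc),
}
print(summary)
```

With these inputs the mean rounds to 11.5622 and the population standard deviation to 3.0821, which agrees with the table. The sample standard deviation (`statistics.stdev`) would be larger and would not match.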
submitit INFO (2025-07-31 15:14:38,335) - Starting with JobEnvironment(job_id=530836, hostname=c143, local_rank=0(1), node=0(1), global_rank=0(1))
submitit INFO (2025-07-31 15:14:38,336) - Loading pickle: /data/cat/ws/pwinkler-mnist_tst/omniopt_1/runs/mnist_gpu_noall/0/single_runs/530836/530836_submitted.pkl
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/include/ATen/ops/_grid_sampler_2d_cpu_fallback_backward_ops.h'
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 46, in ensure_venv_and_rich
import torchvision
ModuleNotFoundError: No module named 'torchvision'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 60, in <module>
ensure_venv_and_rich()
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 49, in ensure_venv_and_rich
create_and_setup_venv()
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 28, in create_and_setup_venv
subprocess.check_call([str(PYTHON_BIN), "-m", "pip", "install", "rich", "torch", "torchvision"])
File "/software/genoa/r24.04/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['.tests/mnist/.torch_venv_1bdd5e1e8b/bin/python', '-m', 'pip', 'install', 'rich', 'torch', 'torchvision']' returned non-zero exit status 1.
Parameters: {"epochs": 127, "lr": 0.06689339573502541, "batch_size": 1214, "hidden_size": 990, "dropout": 0.45562517642974854, "num_dense_layers": 3, "weight_decay": 0.5987284183502197, "activation": "leaky_relu", "init": "normal"}
Debug-Infos:
========
DEBUG INFOS START:
Program-Code: python3 .tests/mnist/train --epochs 127 --learning_rate 0.06689339573502540992 --batch_size 1214 --hidden_size 990 --dropout 0.45562517642974853516 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.59872841835021972656
pwd: /data/cat/ws/pwinkler-mnist_tst/omniopt_1
File: .tests/mnist/train
UID: 2054851
GID: 200270
SLURM_JOB_ID: 530836
Status-Change-Time: 1753967537.933081
Size: 12760 Bytes
Permissions: -rwxr-xr-x
Owner: pwinkler
Last access: 1753967682.8331795
Last modification: 1753967537.933081
Hostname: c143
========
DEBUG INFOS END
python3 .tests/mnist/train --epochs 127 --learning_rate 0.06689339573502540992 --batch_size 1214 --hidden_size 990 --dropout 0.45562517642974853516 --activation leaky_relu --num_dense_layers 3 --init normal --weight_decay 0.59872841835021972656
stdout:
Requirement already satisfied: pip in ./.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages (22.3.1)
Collecting pip
Downloading pip-25.2-py3-none-any.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 9.4 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 22.3.1
Uninstalling pip-22.3.1:
Successfully uninstalled pip-22.3.1
Successfully installed pip-25.2
Collecting rich
Using cached rich-14.1.0-py3-none-any.whl.metadata (18 kB)
Collecting torch
Using cached torch-2.7.1-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (29 kB)
Collecting torchvision
Using cached torchvision-0.22.1-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (6.1 kB)
Collecting markdown-it-py>=2.2.0 (from rich)
Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Collecting pygments<3.0.0,>=2.13.0 (from rich)
Using cached pygments-2.19.2-py3-none-any.whl.metadata (2.5 kB)
Collecting filelock (from torch)
Using cached filelock-3.18.0-py3-none-any.whl.metadata (2.9 kB)
Collecting typing-extensions>=4.10.0 (from torch)
Using cached typing_extensions-4.14.1-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy>=1.13.3 (from torch)
Using cached sympy-1.14.0-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
Using cached networkx-3.5-py3-none-any.whl.metadata (6.3 kB)
Collecting jinja2 (from torch)
Using cached jinja2-3.1.6-py3-none-any.whl.metadata (2.9 kB)
Collecting fsspec (from torch)
Using cached fsspec-2025.7.0-py3-none-any.whl.metadata (12 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.6.77 (from torch)
Using cached nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.6.77 (from torch)
Using cached nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.6.80 (from torch)
Using cached nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.5.1.17 (from torch)
Using cached nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.6.4.1 (from torch)
Using cached nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.3.0.4 (from torch)
Using cached nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.7.77 (from torch)
Using cached nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cusolver-cu12==11.7.1.2 (from torch)
Using cached nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cusparse-cu12==12.5.4.2 (from torch)
Using cached nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cusparselt-cu12==0.6.3 (from torch)
Using cached nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl.metadata (6.8 kB)
Collecting nvidia-nccl-cu12==2.26.2 (from torch)
Using cached nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.0 kB)
Collecting nvidia-nvtx-cu12==12.6.77 (from torch)
Using cached nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-nvjitlink-cu12==12.6.85 (from torch)
Using cached nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufile-cu12==1.11.1.6 (from torch)
Using cached nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.5 kB)
Collecting triton==3.3.1 (from torch)
Using cached triton-3.3.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (1.5 kB)
Requirement already satisfied: setuptools>=40.8.0 in ./.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages (from triton==3.3.1->torch) (65.5.0)
Collecting numpy (from torchvision)
Using cached numpy-2.3.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (62 kB)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
Using cached pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (9.0 kB)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich)
Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch)
Using cached mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
Using cached MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB)
Using cached rich-14.1.0-py3-none-any.whl (243 kB)
Using cached pygments-2.19.2-py3-none-any.whl (1.2 MB)
Using cached torch-2.7.1-cp311-cp311-manylinux_2_28_x86_64.whl (821.2 MB)
Using cached nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB)
Using cached nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB)
Using cached nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB)
Using cached nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB)
Using cached nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB)
Using cached nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB)
Using cached nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB)
Using cached nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB)
Using cached nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB)
Using cached nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB)
Using cached nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB)
Using cached nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB)
Using cached nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB)
Using cached nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB)
Using cached triton-3.3.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.7 MB)
Using cached torchvision-0.22.1-cp311-cp311-manylinux_2_28_x86_64.whl (7.5 MB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Using cached pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
Using cached sympy-1.14.0-py3-none-any.whl (6.3 MB)
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Using cached typing_extensions-4.14.1-py3-none-any.whl (43 kB)
Using cached filelock-3.18.0-py3-none-any.whl (16 kB)
Using cached fsspec-2025.7.0-py3-none-any.whl (199 kB)
Using cached jinja2-3.1.6-py3-none-any.whl (134 kB)
Using cached MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)
Using cached networkx-3.5-py3-none-any.whl (2.0 MB)
Using cached numpy-2.3.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.9 MB)
Installing collected packages: nvidia-cusparselt-cu12, mpmath, typing-extensions, triton, sympy, pygments, pillow, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, mdurl, MarkupSafe, fsspec, filelock, nvidia-cusparse-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, markdown-it-py, jinja2, rich, nvidia-cusolver-cu12, torch, torchvision
stderr:
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/.torch_venv_1bdd5e1e8b/lib/python3.11/site-packages/torch/include/ATen/ops/_grid_sampler_2d_cpu_fallback_backward_ops.h'
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 46, in ensure_venv_and_rich
import torchvision
ModuleNotFoundError: No module named 'torchvision'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 60, in <module>
ensure_venv_and_rich()
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 49, in ensure_venv_and_rich
create_and_setup_venv()
File "/data/cat/ws/pwinkler-mnist_tst/omniopt_1/.tests/mnist/train", line 28, in create_and_setup_venv
subprocess.check_call([str(PYTHON_BIN), "-m", "pip", "install", "rich", "torch", "torchvision"])
File "/software/genoa/r24.04/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['.tests/mnist/.torch_venv_1bdd5e1e8b/bin/python', '-m', 'pip', 'install', 'rich', 'torch', 'torchvision']' returned non-zero exit status 1.
Result: {'VAL_ACC': None}
Final-results: {'VAL_ACC': None}
EXIT_CODE: 1
submitit INFO (2025-07-31 15:19:04,441) - Job completed successfully
submitit INFO (2025-07-31 15:19:04,443) - Exiting after successful completion
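The pip failure in this job occurs while packages are installed into `.tests/mnist/.torch_venv_1bdd5e1e8b`, a directory under the shared working directory. One possible cause, which this log alone does not confirm, is that several jobs set up the same venv at the same time. Under that assumption, a minimal sketch of serializing the setup with an exclusive file lock follows. The function `setup_once` and the file names `.setup.lock` and `.setup.done` are hypothetical and not part of the `train` script.

```python
import fcntl
from pathlib import Path
from typing import Callable


def setup_once(venv_dir: Path, setup: Callable[[], None]) -> bool:
    """Run `setup` at most once per venv_dir, one process at a time.

    Returns True if this call performed the setup, False if an earlier
    call had already completed it.
    """
    venv_dir.mkdir(parents=True, exist_ok=True)
    lock_path = venv_dir / ".setup.lock"  # hypothetical lock file
    marker = venv_dir / ".setup.done"     # hypothetical "finished" marker
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the exclusive lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if marker.exists():
                return False
            setup()          # e.g. create the venv and run pip install
            marker.touch()   # written only after setup succeeded
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A caller would pass the venv creation and pip install step as `setup`; jobs that arrive later wait on the lock and then skip the install. `fcntl.flock` is Unix-only, and whether it is honored across nodes depends on the shared filesystem, so this would need checking on the cluster in question.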