Create a run.sh file & modify your program
Overview: what needs to be done
There are basically three things you need to do to optimize your program with OmniOpt2:
- Your program needs to be able to run on Linux, and especially on the HPC system, i.e. you either use default dependencies or install your program's dependencies into a virtual environment (or similar)
- Your program needs to accept its hyperparameters via the command line, so you can call it like this:
python3 my_experiment.py --epochs=10 --learning_rate=0.05
(or similar)
- Your program needs to print its result (e.g. its loss) in a standardized form. This can be achieved in Python like this:
print(f"RESULT: {loss}")
Script Example
To make your script robust enough for the environment of OmniOpt2 on HPC systems, it is recommended that you do not run your script directly in the objective program string, but instead create a run.sh file from which your program is run.
It may look like this:
#!/bin/bash -l
# ^ Shebang-Line, so that it is known that this is a bash file
# -l means 'load this as login shell', so that /etc/profile gets loaded and you can use 'module load' or 'ml' as usual
# If you run this script not via './run.sh' or 'srun run.sh', but via 'srun bash run.sh', please add the '-l' there too.
# Like this:
# srun bash -l run.sh
# Load modules your program needs, always specify versions!
ml TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4 # Or whatever modules you need
# Load specific virtual environment (if applicable)
source /path/to/environment/bin/activate
# Run your script. "$@" passes on all parameters that are given to this run.sh file (quoted, so that arguments containing spaces survive).
python3 /absolute/path/to_script.py "$@"
exit $? # Exit with exit code of python
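Before using the run.sh file in OmniOpt2, it is a good idea to test it manually. The following is a minimal sketch; the path to run.sh and the parameter names (epochs, learning_rate) are only examples and have to match your own script:
chmod +x /absolute/path/to/run.sh                           # make the script executable once
/absolute/path/to/run.sh --epochs=10 --learning_rate=0.05   # manual test run; the arguments are forwarded to your python script via "$@"
echo $?                                                     # prints the exit code of your python script
In the OmniOpt2 objective program string you then use the same call with %(...) placeholders instead of fixed values, e.g. /absolute/path/to/run.sh --epochs=%(epochs) --learning_rate=%(learning_rate) (if you prepend bash, remember the -l as described above).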
Even though sbatch may inherit shell variables such as loaded modules, it is not recommended to rely on that, because, especially when copying the curl command from this website, you may forget to load the correct modules. Loading everything explicitly in run.sh makes your setup much more robust to changes.
Also, always load specific module versions and never let lmod guess the version you want. Once the default versions change, you will almost certainly run into problems otherwise.
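For example (the module name and version are just taken from the script above; use whatever your program actually needs):
ml TensorFlow                                     # don't: lmod picks a default version that may change at any time
ml TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4   # do: pin the exact version your program was tested with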
Parse Arguments from the Command Line
Using sys.argv
The following Python program demonstrates how to parse command line arguments using sys.argv:
import sys

# Positional arguments: epochs, learning_rate, model_name
if len(sys.argv) != 4:
    print(f"Usage: {sys.argv[0]} <epochs> <learning_rate> <model_name>")
    sys.exit(1)

epochs = int(sys.argv[1])
learning_rate = float(sys.argv[2])
model_name = sys.argv[3]

# Basic sanity checks for the hyperparameters
if epochs <= 0:
    print("Error: Number of epochs must be positive")
    sys.exit(1)

if not 0 < learning_rate < 1:
    print("Error: Learning rate must be between 0 and 1")
    sys.exit(2)
print(f"Running with epochs={epochs}, learning_rate={learning_rate}, model_name={model_name}")
# Your code here
# loss = model.fit(...)
loss = epochs + learning_rate  # Dummy loss so that this example is runnable; replace it with your real loss
print(f"RESULT: {loss}")
Example call:
python3 script.py 10 0.01 MyModel
Example OmniOpt2-call:
python3 script.py %(epochs) %(learning_rate) %(model_name)
Using argparse
The following Python program demonstrates how to parse command line arguments using argparse:
import argparse
import sys
parser = argparse.ArgumentParser(description="Run a training script with specified parameters.")
parser.add_argument("--epochs", type=int, required=True, help="Number of epochs")
parser.add_argument("--learning_rate", type=float, required=True, help="Learning rate")
parser.add_argument("--model_name", type=str, required=True, help="Name of the model")
args = parser.parse_args()
if args.epochs <= 0:
    print("Error: Number of epochs must be positive")
    sys.exit(1)

if not 0 < args.learning_rate < 1:
    print("Error: Learning rate must be between 0 and 1")
    sys.exit(2)
print(f"Running with epochs={args.epochs}, learning_rate={args.learning_rate}, model_name={args.model_name}")
# Your code here
# loss = model.fit(...)
loss = args.epochs + args.learning_rate  # Dummy loss so that this example is runnable; replace it with your real loss
print(f"RESULT: {loss}")
Example call:
python3 script.py --epochs 10 --learning_rate 0.01 --model_name MyModel
Example OmniOpt2-call:
python3 script.py --epochs %(epochs) --learning_rate %(learning_rate) --model_name %(model_name)
Advantages of using argparse
- Order of arguments does not matter; they are matched by name.
- Type checking is automatically handled based on the type specified in add_argument.
- Generates helpful usage messages if the arguments are incorrect or missing (see the example after this list).
- Supports optional arguments and more complex argument parsing needs.
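To illustrate the automatically generated usage messages, you can call the argparse version of the script from above (here assumed to be saved as script.py) like this:
python3 script.py --help         # prints a description of all arguments and exits
python3 script.py --epochs 10    # missing required arguments: argparse prints a usage message and exits with a non-zero exit code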