Right way to provide optional parameters to a script in experiments

Hi! What is the intended way to handle optional parameters for scripts in dvc.yaml?
Let’s suppose we have a script that can be run either as python train.py or as python train.py --resume path-to-model-weights.

I can come up with something like this:

# dvc.yaml
stages:
  train:
    deps:
      - train.py
    cmd: python train.py ${resume}

# params.yaml
resume: ""

Then, to run an experiment that resumes training, I would use dvc exp run -S resume="--resume path-to-model-weights".

But maybe I’ve missed a more elegant solution? Something that would allow dvc exp run -S resume=path-to-model-weights.
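One possible direction, if changing train.py is acceptable, is to let the script itself treat an empty value as "do not resume". The sketch below is only an illustration: it assumes the stage command is changed to python train.py --resume "${resume}", and that DVC renders the empty default from params.yaml as an empty quoted argument, which I have not verified. The parse_args helper is a hypothetical name.

```python
"""train.py sketch: an empty --resume value means "start from scratch"."""
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # default="" so that omitting the flag and passing "" behave the same way
    parser.add_argument("--resume", default="", help="path to model weights")
    args = parser.parse_args(argv)
    # Normalize the empty string to None so the training code can test `is None`
    args.resume = args.resume or None
    return args


if __name__ == "__main__":
    args = parse_args()
    if args.resume is None:
        print("training from scratch")
    else:
        print(f"resuming from {args.resume}")
```

With this in place, the value in params.yaml would be just the path (or an empty string), so dvc exp run -S resume=path-to-model-weights would not need to carry the flag name.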

@agushin
I think that we have not had this use case before, and your approach seems valid. @skshetry might have more information than me on that matter.

Thank you for the answer! I hope @skshetry can provide more information about this case.

Also, I’m not sure what the right way is to handle situations where a script should be called like this (note the arbitrary number of values supplied to --numbers):

python calculator.py --numbers 1 2 3 4 --operation sum
# or
python calculator.py --numbers 1 2 --operation multiply
# or
python calculator.py --operation sum --numbers 1 2 3 4 5 6

If I had a constant number of arguments, I would do something like this, which is already not very beautiful:

# params.yaml
numbers: [1, 2, 3, 4]
operation: sum

# dvc.yaml
stages:
  calculate:
    cmd: python calculator.py --numbers ${numbers[0]} ${numbers[1]} ${numbers[2]} ${numbers[3]} --operation ${operation}

And if for some reason either

  1. I want to have an arbitrary number of values in this list
  2. I want to have an option to skip this argument

then I don’t have a good idea of how to handle this without modifying my Python script, except maybe by treating this parameter as a string again:

# params.yaml
numbers: "--numbers 1 2 3 4"
operation: sum

# dvc.yaml
stages:
  calculate:
    cmd: python calculator.py ${numbers} --operation ${operation}

It would be great to know a more beautiful solution, if one exists!

This seems to me like a case where it would be better to just support reading values directly from params.yaml in your Python script, instead of always passing them via the command line.
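As a rough illustration of that suggestion, here is a minimal sketch of a calculator.py that reads numbers and operation from params.yaml itself. It assumes PyYAML is installed; load_params and calculate are hypothetical helper names, and defaulting operation to sum is my own choice rather than anything from the thread.

```python
"""calculator.py sketch: read settings from params.yaml, not the command line."""
import math
from pathlib import Path


def load_params(path="params.yaml"):
    """Load the params file. Requires PyYAML (an assumed dependency)."""
    import yaml  # imported lazily so calculate() can be used without PyYAML

    with Path(path).open() as f:
        return yaml.safe_load(f) or {}


def calculate(params):
    """Apply the configured operation to a list of any length (or no list)."""
    numbers = params.get("numbers") or []  # a missing or empty list is allowed
    operation = params.get("operation", "sum")  # default is an assumption
    if operation == "sum":
        return sum(numbers)
    if operation == "multiply":
        return math.prod(numbers)
    raise ValueError(f"unknown operation: {operation!r}")


if __name__ == "__main__":
    print(calculate(load_params()))
```

The stage command would then shrink to python calculator.py, with numbers and operation listed under the stage's params section so that DVC still tracks them, and an experiment could be started with, for example, dvc exp run -S operation=multiply.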
