Is there an elegant way of passing the -d and -o arguments to the command run by dvc run?


#1

When using dvc run, the arguments to dvc run and to the actual command being run usually repeat the same paths, e.g.
dvc run -d source.npy -o target.npy python some_process.py source.npy target.npy. The repeated file names might be a “code smell”.

It would be easier if there were a way to automatically pass the -d and -o arguments to the command being run. More importantly, this would prevent bugs like dvc run -d source.npy -o target.npy python some_process2.py source.npy target2.npy, which accidentally uses a different path (target2.npy instead of target.npy).

In summary, the desired API might be: dvc run -d source.npy -o target.npy python proc.py, or dvc run -d source.npy -o target.npy python proc.py $DVCDEP0 $DVCOUT, and either should behave the way dvc run -d source.npy -o target.npy python proc.py source.npy target.npy does today.
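As an illustration of the duplication problem, here is a minimal sketch of a wrapper that writes each path only once and builds the full command from it. The helper name dvc_run_command is hypothetical and not part of DVC; only the dvc run -d/-o form quoted above is taken from this post.

```python
import shlex
from typing import List


def dvc_run_command(dep: str, out: str, script: str) -> List[str]:
    """Build a `dvc run` argv in which each path is written only once.

    Hypothetical helper: the same `dep` and `out` values are used both for
    the -d/-o options and for the arguments of the script itself.
    """
    return ["dvc", "run", "-d", dep, "-o", out, "python", script, dep, out]


cmd = dvc_run_command("source.npy", "target.npy", "some_process.py")
# Print the command line instead of executing it.
print(shlex.join(cmd))
```

To actually run the command, the list could be handed to subprocess.run(cmd, check=True). Because the paths come from a single place, the target.npy / target2.npy kind of mismatch cannot happen.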


#2

Hi. It’s a very good idea! I agree that the parameter duplication is a “code smell”.

$DVCDEP0-like params won’t work, unfortunately, because of shell parameter expansion. Unless they are quoted or escaped, the shell expands these variables (to empty strings, since they are unset) before DVC sees them.
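The expansion issue can be reproduced without DVC. The sketch below assumes a POSIX shell and uses a small child process that just reports the arguments it received; args_seen_by_child is a hypothetical helper written for this illustration.

```python
import json
import os
import shlex
import subprocess
import sys

# Child program that simply reports the arguments it received.
ECHO_ARGS = "import json, sys; print(json.dumps(sys.argv[1:]))"


def args_seen_by_child(shell_args: str) -> list:
    """Pass `shell_args` through the shell and return what the child receives."""
    # Make sure the placeholder variables are unset, as in a normal session.
    env = {k: v for k, v in os.environ.items() if k not in ("DVCDEP0", "DVCOUT")}
    command = f"{shlex.quote(sys.executable)} -c {shlex.quote(ECHO_ARGS)} {shell_args}"
    result = subprocess.run(
        command, shell=True, env=env, capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


# Unquoted: the shell expands the unset variables to nothing.
unquoted = args_seen_by_child("proc.py $DVCDEP0 $DVCOUT")
# Single-quoted: the shell leaves the text alone.
quoted = args_seen_by_child("proc.py '$DVCDEP0' '$DVCOUT'")

print(unquoted)
print(quoted)
```

In the unquoted case the child only receives proc.py, which is what DVC would see as well. Single quotes keep the placeholders literal, but DVC would still need its own substitution step for them to mean anything, and that is an assumption beyond what is discussed here.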

I was thinking about a single option like --pass-params that appends all the dependency and output paths to the command, in the order they were given. It should work like your first desired API example, with one more option:

dvc run -d source.npy -d proc.py -o target.npy --pass-params python proc.py myparam

DVC would then append all of these paths after the original argument (myparam), so the script receives:

  • myparam source.npy proc.py target.npy
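The ordering can be sketched as follows. This is only an illustration of the proposal: --pass-params is a suggested option, not an existing DVC flag, and the function pass_params is a hypothetical name.

```python
from typing import List


def pass_params(command: List[str], deps: List[str], outs: List[str]) -> List[str]:
    """Append dependencies, then outputs, after the original command arguments."""
    return command + deps + outs


full_command = pass_params(
    ["python", "proc.py", "myparam"],
    deps=["source.npy", "proc.py"],  # values of the -d options, in order
    outs=["target.npy"],             # values of the -o options, in order
)

# Arguments that proc.py itself would receive (everything after "python proc.py").
script_args = full_command[2:]
print(script_args)
```

Note that with this ordering the script also receives its own path (proc.py) as an argument, because it was declared as a dependency.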

@xiang0x48 what do you think about this approach?


#3

@dmitry, I think it’s a really good approach. Actually, I think the --pass-params approach is much better than $DVCDEP0-like params, even setting aside shell parameter expansion, since it makes its effect clear and it’s much shorter.


#4

Great! Let’s do that.
I’ve created a feature request: https://github.com/iterative/dvc/issues/995

@xiang0x48 thank you for the great idea!