When using dvc run, the arguments of dvc run and the actual command to be run are almost always similar, e.g. dvc run -d source.npy -o target.npy python some_process.py source.npy target.npy. The repeated file names look like a “code smell”.
It might be easier if there were a way to automatically pass the -d and -o arguments to the command being run and, more importantly, to prevent bugs like dvc run -d source.npy -o target.npy python some_process2.py source.npy target2.npy, which accidentally uses a different output path.
In summary, a desired API might be dvc run -d source.npy -o target.npy python proc.py, or dvc run -d source.npy -o target.npy python proc.py $DVCDEP0 $DVCOUT, and it should behave the same as dvc run -d source.npy -o target.npy python proc.py source.npy target.npy does today.
Hi. It’s a very good idea! I agree that the parameter duplication is a “code smell”.
$DVCDEP0-like params won’t work, unfortunately, because of shell parameter expansion: the shell expands these variables to empty strings before DVC ever sees them.
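To illustrate the expansion problem, here is a small sketch that assumes a POSIX shell and that DVCDEP0 and DVCOUT are not set in the environment. It uses echo as a stand-in for dvc, so it only shows which arguments the program would receive; it does not call DVC itself.

```python
import os
import subprocess

# Start from an environment where the placeholder variables are not set,
# as would be the case in an ordinary interactive shell.
env = {k: v for k, v in os.environ.items() if k not in ("DVCDEP0", "DVCOUT")}


def args_seen(command_line):
    """Return what `echo` receives after the shell has processed the line."""
    result = subprocess.run(
        command_line, shell=True, env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# `echo` stands in for dvc here; this is an illustration, not a DVC call.
base = "echo dvc run -d source.npy -o target.npy python proc.py"

unquoted = args_seen(base + " $DVCDEP0 $DVCOUT")      # shell expands the variables
quoted = args_seen(base + " '$DVCDEP0' '$DVCOUT'")    # single quotes block expansion

print(unquoted)
print(quoted)
```

In the unquoted case the placeholders are gone by the time the program starts. Single-quoting them would let them through literally, but then every user has to remember the quotes, which is easy to get wrong.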
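I was thinking about a single option like --pass-params that appends all the dependency and output paths to the command, in the same order as they were given. It should work like your first desired API example with one more option: dvc run --pass-params -d source.npy -o target.npy python proc.py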
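Roughly, the idea could be sketched like this. The helper name build_command and its signature are made up for illustration and are not part of DVC; the only assumption taken from the proposal is that dependencies come first and outputs second, each in the order given on the command line.

```python
def build_command(command, deps, outs, pass_params=True):
    """Hypothetical helper: append -d paths, then -o paths, to the command.

    `command` is the user command as a list of arguments; `deps` and `outs`
    are the paths given via -d and -o, in command-line order.
    """
    if not pass_params:
        return list(command)
    return list(command) + list(deps) + list(outs)


final = build_command(["python", "proc.py"], ["source.npy"], ["target.npy"])
print(" ".join(final))
```

With one dependency and one output this yields python proc.py source.npy target.npy, i.e. the same command that has to be typed out in full today. Since the paths are written only once, the target.npy / target2.npy kind of mismatch cannot happen.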
@dmitry, I think it’s a really good approach. Actually, I think --pass-params is much better than $DVCDEP0-like params even setting shell parameter expansion aside, since its effect is explicit and the command is much shorter.
I would even say that --pass-params should be the default behavior, with a --no-params option introduced to turn it off. If a command does not need the params and we forget --no-params, the params are passed to it anyway, but in general this shouldn’t break the command.
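As a rough sketch of what I mean, using plain argparse rather than DVC's real argument parser: the program name dvc-run-sketch, the resolve function, and the --no-params flag itself are all hypothetical here.

```python
import argparse

# Toy parser that mimics only the relevant part of `dvc run`.
parser = argparse.ArgumentParser(prog="dvc-run-sketch")
parser.add_argument("-d", dest="deps", action="append", default=[])
parser.add_argument("-o", dest="outs", action="append", default=[])
# Passing params is on by default; --no-params (proposed, not real) disables it.
parser.add_argument("--no-params", dest="pass_params", action="store_false")
# Everything after the options is the user command.
parser.add_argument("command", nargs=argparse.REMAINDER)


def resolve(argv):
    """Return the command that would actually be executed."""
    args = parser.parse_args(argv)
    extra = args.deps + args.outs if args.pass_params else []
    return args.command + extra


default_cmd = resolve(["-d", "source.npy", "-o", "target.npy", "python", "proc.py"])
opt_out_cmd = resolve(["--no-params", "-d", "source.npy", "-o", "target.npy",
                       "python", "proc.py"])
print(default_cmd)
print(opt_out_cmd)
```

By default the paths are appended; with --no-params the command is left as typed. Whether extra trailing arguments are really harmless depends on the command, so the opt-out would still be needed for scripts that reject unexpected arguments.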