Is it possible to use variables in DVC files? This would help me keep track of repeated values. Let’s take as an example the DVC-file from the docs:
But I think this probably defeats the point of what you’re trying to achieve. In fact, I’m not sure I understand the usefulness of having vars in DVC-files, what do you mean by “keep track of repeated values”? You mean for better readability of the DVC-file?
How do you envision providing these values to dvc add or dvc run? Or would it be a trick only for manually edited DVC-files?
Yes, readability is one reason, though not the most important one. In my current workflow, I often edit the files manually, for example to rename the targets or to add more steps to the pipeline. From time to time I forget to change the name of a file in the deps or in the cmd, with all the mess that follows.
The second use case is automatically generating the DVC files from a template. This is useful when you need to train models for different subsets of your data (think, for example, of different user groups). The DVC files might be identical except for the names of the deps/targets.
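To make the template use case concrete, here is a minimal sketch of doing the substitution outside DVC, in plain Python with `string.Template`. It is not a DVC feature. The `cmd`/`deps`/`outs` keys follow the stage-file layout, but the script name `train.py`, the `data/` and `models/` paths, and the group names are all hypothetical.

```python
from string import Template

# Hypothetical stage template. Only the ${data} and ${model} placeholders
# differ between the generated files; everything else is shared.
STAGE_TEMPLATE = Template("""\
cmd: python train.py ${data} ${model}
deps:
- path: train.py
- path: ${data}
outs:
- path: ${model}
""")


def render_stage(group: str) -> str:
    """Return the stage-file text for one user group (names are made up)."""
    return STAGE_TEMPLATE.substitute(
        data=f"data/{group}.csv",
        model=f"models/{group}.pkl",
    )


if __name__ == "__main__":
    # Print one rendered stage per group; writing them to disk is left out.
    for group in ["admins", "guests"]:
        print(f"# train_{group}.dvc")
        print(render_stage(group))
```

Because each file name is defined once and substituted into both `cmd` and `deps`, the two cannot drift apart the way they do with manual edits.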
OK. I think we want DVC-files to be easy to edit in these kinds of situations, as well as for advanced users, so your use cases seem reasonable to me, even if vars were only supported for manually edited/generated stage files. However, for now the answer is no, we don’t support this. Also, the format would probably look a bit different if we did (valid YAML).
P.S. In fact, this is already mentioned as part of an existing issue, https://github.com/iterative/dvc/issues/2437 (see points 2 and 6 in the long description), so you can chime in there instead if you prefer.
@btel Also, I think this thread and discussion is highly relevant in this case: https://github.com/iterative/dvc/issues/1462 . It would be really great if you could leave a comment there describing your use case.