Hello! I read the tutorial and have a question about DVC's capabilities. Using make, one can specify a rule in a Makefile that applies to all files whose names match some template. Is this possible in DVC? I’d like to batch-process a large number of files once and then re-process only new or updated files.
Hi @hombit!
Unfortunately, DVC has no direct equivalent of that feature. Could you elaborate on your scenario, please?
Thank you,
Ruslan
My problem is processing a lot of files, where each file is processed independently. If the process crashes, or if data is updated or added, I’d like to reproduce only the missing data products, not everything. A Makefile makes such tasks easy, because all you need is a single rule for some template, e.g. dir/*.dat. In DVC I don’t know how to do this without N identical “runs”, which makes dvc.json non-human-readable and can cause errors if a developer forgets to add new runs when data is added.
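To make the desired behavior concrete, here is a minimal Python sketch of the make-style rule described above: one rule applied to every file matching a template, re-running only where the output is missing or older than its input. This is not DVC functionality. The names `needs_rebuild` and `process_all`, the `*.dat` pattern, and the `.out` suffix are all hypothetical and chosen only for illustration.

```python
from pathlib import Path


def needs_rebuild(src: Path, dst: Path) -> bool:
    """Return True if dst is missing or older than src (make's timestamp rule)."""
    return not dst.exists() or dst.stat().st_mtime < src.stat().st_mtime


def process_all(in_dir, out_dir, process, pattern="*.dat", suffix=".out"):
    """Apply process(src, dst) to every stale input matching pattern.

    Returns the list of inputs that were (re)processed.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for src in sorted(in_dir.glob(pattern)):
        # Hypothetical naming scheme: dir/x.dat -> out_dir/x.out
        dst = out_dir / (src.stem + suffix)
        if needs_rebuild(src, dst):
            process(src, dst)
            done.append(src)
    return done
```

Calling `process_all("dir", "products", my_step)` twice in a row would do the work once; the second call only picks up inputs that were added or modified in between. Like make, this sketch relies on modification times rather than content checksums.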
It feels that this is at least partially related to this issue: https://github.com/iterative/dvc/issues/331. But I also don’t quite understand the full use case, specifically the part about name templates. Is that separate from the incremental updates?
This issue looks very relevant to me, thank you!
Thanks!! It would be great if you could chime in on the ticket! It’ll help us prioritize this.