I have the following use case:
- Launch a script which loops over multiple folders and does some processing (each folder takes quite a bit of CPU time)
- Each folder leads to the creation of a new folder, containing the output of the processing
- Once the loop is complete, a single .dvc file is created
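
As a rough illustration of the steps above, the script is shaped something like the sketch below. This is not my actual code: `process_folder`, `run_all`, and the folder layout are hypothetical placeholders, and the upper-casing step only stands in for the real CPU-heavy processing.

```python
from pathlib import Path
from typing import List


def process_folder(src: Path, dst: Path) -> None:
    """Hypothetical stand-in for the expensive processing of one folder."""
    dst.mkdir(parents=True, exist_ok=True)
    for f in sorted(src.iterdir()):
        if f.is_file():
            # Placeholder work: copy each file with its text upper-cased.
            (dst / f.name).write_text(f.read_text().upper())


def run_all(input_root: Path, output_root: Path) -> List[Path]:
    """Process every subfolder of input_root, one output folder per input folder."""
    created = []
    for src in sorted(p for p in input_root.iterdir() if p.is_dir()):
        dst = output_root / src.name
        process_folder(src, dst)
        created.append(dst)
    return created
```

The whole of `run_all` is what ends up tracked by the single .dvc file, so a crash in any one folder invalidates the entire run.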
This works fine, but if the processing crashes midway, dvc repro starts again from scratch.
One solution would be to create a .dvc file for each folder, but then repro is a bit tedious to run, as I would need to loop over all the .dvc files.
Is there a better way to go? If not, would it be possible to define multiple layers of .dvc files (e.g. run the subtasks, creating one .dvc file per subtask, and once they are all complete, create a master .dvc file that makes it easy to repro all the subtasks)?
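
To make the per-folder workaround concrete, the loop I would have to run by hand is roughly the sketch below. It assumes that all per-folder .dvc files sit in one directory and that `dvc` is on the PATH. The helper names `repro_commands` and `repro_all` are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def repro_commands(stage_dir: Path) -> List[List[str]]:
    """Build one `dvc repro <file>` command per .dvc file found in stage_dir."""
    return [["dvc", "repro", str(p)] for p in sorted(stage_dir.glob("*.dvc"))]


def repro_all(stage_dir: Path) -> None:
    """Run the commands one by one, stopping at the first failure."""
    for cmd in repro_commands(stage_dir):
        subprocess.run(cmd, check=True)
```

A master .dvc file would ideally replace this loop with a single repro call.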
Thanks in advance!