I have a training script for a self-supervised model, and I would like to evaluate every checkpoint it produces (one per epoch) against various downstream tasks. The whole setup would be configured as a DVC pipeline, with the training and evaluation scripts corresponding to pipeline stages. Since training may stop early if the loss doesn’t improve, the number of produced checkpoints is not known beforehand.
How could this use case be handled with DVC? I was thinking that the checkpoint files produced by the training stage could perhaps be defined as a wildcard output, and the downstream evaluation stages could then loop over the files matching that wildcard (see the sketch below). But this pattern does not seem to be supported by DVC at the moment.
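To make the idea concrete, here is roughly the `dvc.yaml` I had in mind. The stage names, script names, and paths are just placeholders, and the `foreach` over a glob of a previous stage's outputs is exactly the part that doesn't seem to exist in DVC:

```yaml
stages:
  train:
    cmd: python train.py
    deps:
      - train.py
      - data/
    outs:
      - checkpoints/          # one .ckpt file per epoch, count unknown in advance
  evaluate:
    # Not valid DVC syntax as far as I can tell -- just illustrating the idea:
    # spawn one evaluation stage per checkpoint produced by the train stage.
    foreach: checkpoints/*.ckpt
    do:
      cmd: python evaluate.py --checkpoint ${item} --out metrics/${item}.json
      deps:
        - evaluate.py
        - ${item}
      metrics:
        - metrics/${item}.json
```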
Is there any other way to accomplish this?