Multiple pipelines with single metric file

I am trying to define multiple pipelines that share a single metrics file, but this does not seem to be possible currently. I get an error when I try to add the same metrics file to more than one pipeline.

My use case: I am running some ML experiments and would like to work on multiple models in parallel, for example an SVM and a Naive Bayes model (maybe more models in the future), both sharing the same data and most of the pre-processing steps. So I have 2 pipelines, which are very similar and share many stages: model_svm.dvc and model_nb.dvc. They are in the same repository, and I would like them both to use the same metrics file so that when I call “dvc metrics show -T” it shows past results from both models.

Is this currently possible, and if so, how? Or am I trying to use the system in a way that was not intended? If so, could you recommend how I could restructure my project so that it fits better with how DVC is meant to be used?

Hi @tania!

Indeed, dvc doesn’t allow multiple dvcfiles to have the same output, as that creates conflicts on dvc checkout. Are you caching that metrics file with dvc? That is, are you using the -m option (metrics file cached by dvc) or the -M option (metrics file not cached) in the respective dvc run commands for it?

Also, note that dvc metrics show doesn’t require a single metrics file. You can have several of them, for example one per pipeline, and dvc metrics show will show all of them.
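
To illustrate the one-file-per-pipeline idea, here is a minimal Python sketch, assuming each training script writes its own metrics file. The write_metrics helper and the metrics_<model>.json naming scheme are hypothetical and not part of dvc; the accuracy values are placeholders, not real results.

```python
import json
from pathlib import Path


def write_metrics(model_name: str, metrics: dict, out_dir: str = ".") -> Path:
    """Write one model's metrics to its own JSON file and return the path."""
    # Hypothetical naming scheme: metrics_<model>.json, one file per pipeline,
    # so that no two dvcfiles declare the same output.
    path = Path(out_dir) / f"metrics_{model_name}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True) + "\n")
    return path


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(write_metrics("svm", {"accuracy": 0.0}))
    print(write_metrics("nb", {"accuracy": 0.0}))
```

Each pipeline would then register only its own file with -m (or -M) in its dvc run, so model_svm.dvc and model_nb.dvc never claim the same output, and dvc metrics show -T should list both files.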

Thanks @kupruser, I didn’t realise that dvc metrics show displays all the metrics files. This solves my problem.

I am using -m to add the metrics files.

Glad that solves it! :) Be sure to let us know if you have any more questions.