Pipeline template

hey!

I want to create a template of a pipeline, i.e. I want to run the same pipeline with different parameters each time.
I read the templating section of the documentation, and it looks like the only solution is a foreach in each stage.

Is it possible to run multiple stages with one foreach block?
And what if I want to run just one specific set of params?

Any suggestions?

Thanks!

@roeez thank you for your question!

There are a few discussions about this subject in the iterative/dvc repository on GitHub: "Reconfigurable pipelines" (Discussion #5921) and "Reconfigurable modules" (Discussion #5922).

It would be really helpful if you could explain why the template is needed. Is it because you have multiple similar projects, or is it about a specific use case in a single project? What should an ideal API look like in your dvc.yaml?

I’d appreciate it if you can provide more details.

I would also love this feature. Any update on this?

In my project I have multiple pipelines for the different NN architectures I want to have available in my repo. Most of these pipelines are essentially the same, just with different file paths; some have more significant differences, e.g. calling different scripts. We also have an ensemble pipeline, which trains component models using the matrix / foreach functionality. I structured the repo and wrote my dvc.yaml files so that they use a “pipeline_name” variable. The pipelines are then organized in the same directory format, e.g.:

pipelines/
- arch1
  - eval_results/
  - models/
  - dvc.yaml
  - params.yaml
  - summary.yaml
- arch2
  - eval_results/
  - models/
  - dvc.yaml
  - params.yaml
  - summary.yaml

Then I can basically copy-paste the dvc.yaml, change the pipeline name variable at the top, and I’m mostly done.

The drawback is that if I need to make changes to any dvc.yaml, I need to make the same change everywhere.
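One possible way around that drawback, sketched here under assumptions rather than as a DVC feature, is to keep a single template and generate each pipeline's dvc.yaml from it with a small script. Everything specific below is hypothetical: the `@PIPELINE_NAME@` marker, the `train.py` command, and the stage layout are placeholders for whatever the real dvc.yaml contains. The marker deliberately avoids the `${...}` form so that DVC's own templating expressions pass through untouched.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical marker; chosen so it cannot clash with DVC's ${...} syntax.
MARKER = "@PIPELINE_NAME@"

# Hypothetical shared template. Only the marker differs between pipelines;
# ${pipeline_name} is left as-is for DVC to resolve at run time.
TEMPLATE = """\
vars:
  - pipeline_name: @PIPELINE_NAME@

stages:
  train:
    cmd: python train.py --name ${pipeline_name}
    outs:
      - models/
"""


def render_pipelines(
    root: Path, names: Iterable[str], template: str = TEMPLATE
) -> List[Path]:
    """Write <root>/<name>/dvc.yaml for every pipeline name and return the paths."""
    written = []
    for name in names:
        target = root / name / "dvc.yaml"
        # Create the pipeline directory if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        # Plain string replacement: only the marker is substituted.
        target.write_text(template.replace(MARKER, name), encoding="utf-8")
        written.append(target)
    return written
```

Calling `render_pipelines(Path("pipelines"), ["arch1", "arch2"])` would regenerate both files from one source, so a change is made once in the template. The cost is an extra generation step before `dvc repro`, and pipelines that call different scripts would still need their own template or a manual override.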

Sorry, no updates on that. And at least for now, it isn’t in our plans for the near future either.

No updates on pipeline templates, but plugging in different NN architectures is a use case that can often be solved with the Hydra integration. It’s flexible enough to pass in entirely different sets of parameters for different architectures, and even to instantiate model classes from those parameters. See the example in https://github.com/iterative/dvc/pull/8093#issuecomment-1222126884 for an idea of how to do this.
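To make the "instantiate model classes from parameters" idea concrete, here is a minimal plain-Python sketch. It is not Hydra's API (Hydra ships its own helper, `hydra.utils.instantiate`, for this), and the class names, the `architecture` key, and the registry are all hypothetical. It only illustrates the pattern: one code path, with the architecture and its arguments chosen entirely by the config.

```python
from dataclasses import dataclass
from typing import Any, Dict


# Hypothetical stand-ins for real model classes.
@dataclass
class SmallMLP:
    hidden_size: int
    dropout: float = 0.0


@dataclass
class SmallCNN:
    channels: int
    kernel_size: int = 3


# Hypothetical registry mapping a config value to a class.
ARCHITECTURES = {"mlp": SmallMLP, "cnn": SmallCNN}


def build_model(config: Dict[str, Any]):
    """Build a model from a config dict such as one loaded from params.yaml."""
    params = dict(config)  # copy so the caller's dict is not mutated
    name = params.pop("architecture")
    if name not in ARCHITECTURES:
        raise ValueError(f"unknown architecture: {name!r}")
    # The remaining keys are passed straight to the class constructor,
    # so each architecture can take a completely different set of parameters.
    return ARCHITECTURES[name](**params)
```

With this shape, `build_model({"architecture": "cnn", "channels": 16})` and `build_model({"architecture": "mlp", "hidden_size": 64})` go through the same training script, so a single pipeline definition can serve several architectures as long as their stages are otherwise the same.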
