Download a specific out of a stage

How can I download a specific out using dvc pull? Is it even possible?

I have the following pipeline:

  correct_h5:
    foreach: ${datasets}
    do:
      cmd: >-
        python ds_gen/process_ds.py
        --ds-root ${item.path}/h5/
        --out ${item.path}/h5-corrected/
        --config pipelines/01_ds_gen_and_analysis/correction.json
        --num-workers 4
        --buffer-size 4
        --force
      deps:
        - ${item.path}/h5/train/
        - ${item.path}/h5/val/
        - ${item.path}/h5/test/
      params:
        - pipelines/01_ds_gen_and_analysis/correction.json:
      outs:
        - ${item.path}/h5-corrected/train/
        - ${item.path}/h5-corrected/val/
        - ${item.path}/h5-corrected/test/
        - ${item.path}/h5-corrected/log.txt

I’d like to download only ${item.path}/h5-corrected/val/ to my local machine. However, I can only download all artifacts of the stage at once with dvc pull dvc.yaml:correct_h5@1, and I didn’t find a way to select a specific out.

I found out that I can use dvc get instead: dvc get . <path to folder where out is expected to be>, but it’s not very convenient to use.

You can pass any tracked path or subpath as a target to dvc pull, for example dvc pull my_item_path/h5-corrected/val/.
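As a rough sketch of how that could look for the pipeline in the question: the script below builds the target path for the val split only and passes it to dvc pull. ITEM_PATH and DVC_CMD are names made up for this illustration, and my_item_path is a placeholder for one of your real ${item.path} values. By default it only prints the command, so nothing is downloaded until you opt in.

```shell
#!/bin/sh
# Hypothetical placeholder: replace with one of the paths from your `datasets` list.
ITEM_PATH="my_item_path"

# Target only the val out of the correct_h5 stage, not the whole stage.
TARGET="${ITEM_PATH}/h5-corrected/val/"

# DVC_CMD is an illustrative variable, not a DVC feature. It defaults to
# `echo dvc`, so the command is printed rather than executed.
# Run with DVC_CMD=dvc inside the repository to perform the actual pull.
${DVC_CMD:-echo dvc} pull "$TARGET"
```

Running the script with DVC_CMD=dvc from the repository root should fetch just that directory from the remote, assuming the out has been pushed there.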
