Dvc repro --force-downstream not working

Buffy · March 16, 2022, 11:49pm

Hi,

I set up a DVC data pipeline and used the dvc repro command to generate some .pkl files. Later I decided to regenerate the files (because of a code change whose dependency wasn’t captured in dvc.yaml), so I deleted them manually from the directory. This is probably the first mistake I made.

Now when I run dvc repro, DVC refuses to re-run the stage and reports that nothing has changed.

I thought dvc repro --force-downstream [target] would work, but it still reports “everything is up to date” and does not reproduce the stage.

How do I force dvc to re-run the stage to regenerate the .pkl file?

Thanks,

dtrifiro · March 17, 2022, 10:17am

Hi,
would you mind sharing your dvc.yaml? Did you add the output pkl files as stage outputs?

Buffy · March 17, 2022, 11:36pm

Hi,

The short answer is yes. Please see the training and test stages from my dvc.yaml below.

training:
  cmd: python3 code/train.py
  params:
    - train
  deps:
    - ./data/SWT_transform/Hisar_test_data_firstOrder.npy
    - ./data/SWT_transform/Hisar_test_data_secondOrder.npy
    - ./data/SWT_transform/Hisar_test_data_total.npy
    - ./data/SWT_transform/Hisar_train_data_firstOrder.npy
    - ./data/SWT_transform/Hisar_train_data_secondOrder.npy
    - ./data/SWT_transform/Hisar_train_data_total.npy
  outs:
    - ./model/rbf_svm_3_.pkl  # The model ID needs to be updated for each new model generated
    - ./model/scaler3.bin

test:
  cmd: python3 code/test.py
  params:
    - test
  deps:
    - ./model/rbf_svm_3_.pkl
    - ./model/scaler3.bin

dtrifiro · March 18, 2022, 9:01am

Thanks! The pipeline itself looks fine, but you might want to add code/train.py to the training stage's dependencies. That way, when you change train.py and run dvc repro, the training stage will be re-run because one of its dependencies changed.
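
As a rough sketch of that suggestion, the training stage could look something like the excerpt below. It assumes the script lives at code/train.py (the path used in cmd) and that the stage sits under the usual top-level stages: key; the remaining .npy dependencies are left out for brevity and would stay as they are.

```yaml
# Excerpt of dvc.yaml: only the training stage is shown.
training:
  cmd: python3 code/train.py
  params:
    - train
  deps:
    - code/train.py  # added: editing the script now invalidates this stage
    - ./data/SWT_transform/Hisar_train_data_total.npy
    # ... the other .npy dependencies, unchanged
  outs:
    - ./model/rbf_svm_3_.pkl
    - ./model/scaler3.bin
```

With the script listed under deps, a plain dvc repro should pick up code changes without needing any force flag.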

To force reproduction without adding train.py, you could use dvc repro -f, which re-runs the stages even when no changes are detected.
