Best Practice for CI with Run Cache?

rabefabi · July 6, 2020, 9:54am

I’m having trouble understanding how to use the new run cache in a continuous integration context.

From what I gathered it’s supposed to work as follows:

1. git commit & git push some changes to my code
2. The CI pipeline is started; it executes a dvc repro and trains the new model
3. After model training, the CI pipeline executes a dvc push --run-cache
   - The newly trained model and all other outputs tracked by dvc are pushed to the dvc remote
4. On my local machine, I execute dvc pull --run-cache (without any changes to my local dvc.lock file)
   - I’m expecting the newly trained model and all other dvc outputs to appear in my working directory?
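
For reference, the steps above map roughly onto the commands below. This is only a sketch: the function names ci_job and local_sync are made up for illustration, and it assumes a dvc remote is already configured on both the runner and the local machine.

```shell
#!/bin/sh
# Sketch of the workflow described above; function names are hypothetical.

# Runs inside the CI pipeline after a git push triggers it.
ci_job() {
    dvc repro || return 1     # retrain the model and rebuild other stage outputs
    dvc push --run-cache      # upload tracked outputs plus the run cache to the remote
}

# Runs on the local machine, with the old dvc.lock still in place.
local_sync() {
    dvc pull --run-cache      # download the run cache and the outputs dvc.lock refers to
}
```

Nothing here touches dvc.lock on the local side, which would be consistent with the old outputs showing up after the pull.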

But instead of the files from the run cache appearing in my working dir, I still get the outputs corresponding to my (old, local) dvc.lock file.

Am I missing something here, or am I using the commands wrong?
In case my expectations are correct I can try to provide a minimal example or more debug information for my current setup.

When I manually copy the (new) dvc.lock from the runner and paste it into my local machine, I’m able to dvc pull from the dvc remote and get the expected outputs.
But from what I understood, it shouldn’t be necessary anymore to git add dvc.lock && git commit from inside the runner for every experiment?

Thanks in advance,
Rabefabi

skshetry · July 6, 2020, 10:19am

@rabefabi, right now, you have to run dvc repro to change the lock-file and checkout from run-cache. But, before that, try running dvc status to see if anything’s changed.
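
A rough sketch of that sequence on the local machine, assuming the CI job has already run dvc push --run-cache and that the local code and dependencies match what CI ran (for example after a git pull). The function name restore_from_run_cache is hypothetical.

```shell
#!/bin/sh
# Sketch only; the function name is made up for illustration.
restore_from_run_cache() {
    dvc pull --run-cache || return 1   # fetch the run cache pushed by CI
    dvc status                         # check whether dvc sees any changed stages
    dvc repro                          # update dvc.lock; matching stages come from the run cache
}
```

If dvc status reports no changes, dvc repro has nothing to do and dvc.lock stays as it is.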
