Best Practice for CI with Run Cache?

I’m having trouble understanding how to use the new run cache in a continuous integration context.

From what I gathered it’s supposed to work as follows:

  • git commit & git push some changes to my code
  • The CI-Pipeline starts, executes a dvc repro, and trains the new model
  • After model training, the CI-Pipeline executes a dvc push --run-cache
    • The newly trained model, and all other outputs tracked by dvc are pushed to the dvc remote
  • On my local machine, I execute dvc pull --run-cache (without any changes to my local dvc.lock file)
    • I’m expecting the newly trained model and all other dvc outputs to appear in my working directory?
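
For concreteness, the sequence described above can be sketched roughly as follows. This is only an illustration of the steps as listed, not a recommended setup: the `run` wrapper, the `DRY_RUN` variable, and the function names `ci_job` and `local_pull` are hypothetical helpers, added so the steps can be printed without actually executing them. The dvc commands themselves are the ones named in the list.

```shell
# Hypothetical helper: print each command; execute it only when DRY_RUN is unset.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

# Steps run by the CI pipeline after a git push.
ci_job() {
  run dvc repro                # retrain the model
  run dvc push --run-cache     # upload outputs and the run cache to the dvc remote
}

# Step run on the local machine, with the old dvc.lock still in place.
local_pull() {
  run dvc pull --run-cache     # fetch from the dvc remote, including the run cache
}
```

Calling `DRY_RUN=1 ci_job` or `DRY_RUN=1 local_pull` only prints the commands; without `DRY_RUN` they are executed inside the current dvc project.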

But instead of the files from the run cache appearing in my working dir, I still get the outputs corresponding to my (old, local) dvc.lock file.

Am I missing something here, or am I using the commands wrong?
If my expectations are correct, I can provide a minimal example or more debug information for my current setup.

When I manually copy the (new) dvc.lock from the runner and paste it into my local machine, I’m able to dvc pull from the dvc remote and get the expected outputs.
But from what I understood, it shouldn’t be necessary anymore to git add dvc.lock && git commit from inside the runner for every experiment?

Thanks in advance,
Rabefabi

@rabefabi, right now you have to run dvc repro to update the lock file and check out the outputs from the run cache. Before that, try running dvc status to see whether anything has changed.
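
As a rough sketch, the local sequence would look something like this. It assumes the run cache was pushed from CI with dvc push --run-cache and that your local code and dependencies match what the runner used, so that dvc repro finds a matching run-cache entry. The `run` wrapper, the `DRY_RUN` variable, and the function name `sync_from_ci` are hypothetical and exist only so the steps can be printed without being executed.

```shell
# Hypothetical helper: print each command; execute it only when DRY_RUN is unset.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

# Local steps after the CI pipeline has pushed its run cache.
sync_from_ci() {
  run dvc pull --run-cache   # fetch from the dvc remote, including the run cache
  run dvc status             # see whether any stage is reported as changed
  run dvc repro              # update dvc.lock and check out outputs from the run cache
}
```

With `DRY_RUN=1 sync_from_ci` the three commands are only printed; without it they run in the current dvc project.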