Is it possible to change branches while running repro?

In our project we sometimes have long repros (e.g., with hyperparameter optimization) that can take hours or days to finish. While such a repro is running, I can’t change branches because:

  • The repro reads files as the stages are being executed, so changing branches mid-repro could make subsequent stages use file versions from a different branch;
  • We have commit hooks that use dvc, which complains about the lockfile if I try to change branches during a repro.

Right now I see two solutions to be able to continue working while a repro is running:

  1. Run the repro on another machine (e.g., cloud)
  2. Create an entire copy of my repo in another folder, and work there.

Solution 1 is bad because I have to waste resources to keep the cloud instance running, and because running it remotely is cumbersome. For instance, I don’t currently have a system to notify me if the repro fails, so I have to keep checking, and I don’t have the same tools on the remote machine to investigate outputs, monitor resource usage, etc.

Solution 2 is bad because I need twice as much data stored on my computer (which I usually don’t have enough space for, since it’s a big project), and because more than once I ended up working in the wrong copy of the repo, messing up the repro and having to move my changes to the other copy.

Are there any recommended approaches here? This seems like a common problem that most users with long-running repros would face.

One solution would be to do 2 and set up a shared cache locally so the storage isn’t duplicated. See the DVC guide “How to Share a Cache Among Projects”.
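As a rough sketch of the shared-cache setup, something like the following could be run once inside each copy of the repo. The cache location and the function name are made up for illustration, and the commands are wrapped in a function so that pasting the snippet does not run anything by itself.

```shell
# Hypothetical location for the shared cache; adjust to your machine.
SHARED_CACHE="$HOME/dvc-shared-cache"

# Point the DVC project in the current directory at the shared cache.
# Assumption: you call this from the root of each copy of the repo.
use_shared_cache() {
    mkdir -p "$SHARED_CACHE"
    # Store the setting in .dvc/config.local so it is not committed.
    dvc cache dir --local "$SHARED_CACHE"
    # Prefer links over copies so workspace files do not duplicate the cache.
    dvc config --local cache.type "reflink,hardlink,symlink,copy"
    # Re-create the workspace files using the configured link type.
    dvc checkout --relink
}
```

Note that data already sitting in a copy’s own .dvc/cache is, as far as I know, not moved by these commands, so it may need to be moved into the shared directory by hand. With hardlinks or symlinks, tracked files in the workspace become read-only links into the cache.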

Another option would be to treat the pipeline as an experiment and use dvc exp run --temp or dvc exp run --queue to run it in a temporary directory.
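For the --queue variant, a minimal sketch could look like this. It assumes a DVC version that ships the dvc queue subcommands; the function name is only for illustration, and nothing runs until you call it.

```shell
# Queue the pipeline as an experiment and let a background worker run it.
# Assumption: called from the repo root, on the branch the run should be based on.
queue_long_run() {
    dvc exp run --queue   # record the experiment in the queue without running it
    dvc queue start       # start a background worker that runs queued experiments
    dvc queue status      # list queued, running, and finished tasks
}

# While the task is running, its output can be followed with:
#   dvc queue logs -f <task>
# where <task> is the name or ID shown by `dvc queue status`.
```

The queued run executes outside the workspace, so the terminal is free again right after the worker has started.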

Using dvc experiments sounds like a really nice solution, thank you for the input! Would you mind elaborating a bit more on what the workflow would look like? I’m guessing something like:

  1. dvc exp run --temp, leave the long-running repro/exp running
  2. git checkout other_branch, work in another branch
  3. ?

Maybe step 3, when the repro is done, is running dvc exp apply? Maybe git checkout the previous branch first?
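One way the whole workflow could look, as an untested sketch: the branch name, experiment name, and function names below are placeholders, and the idea of going back to the starting branch before applying is an assumption based on experiments being attached to the commit they were started from.

```shell
# Hypothetical names; replace with your own.
START_BRANCH="main"      # branch the long run is based on
EXP_NAME="long-hpo"      # name given to the experiment

# Step 1: start the run in a temporary directory.
# This stays in the foreground, so use a separate terminal (or tmux/screen).
start_long_run() {
    git checkout "$START_BRANCH"
    dvc exp run --temp --name "$EXP_NAME"
}

# Step 2 happens in another terminal: git checkout other_branch, work as usual.

# Step 3: once the run has finished, bring the results into the workspace.
collect_long_run() {
    git checkout "$START_BRANCH"   # return to the commit the experiment was based on
    dvc exp show                   # check that the experiment is listed
    dvc exp apply "$EXP_NAME"      # apply its results to the workspace
}

# Alternative to `dvc exp apply`: turn the experiment into a Git branch with
#   dvc exp branch "$EXP_NAME" <new-branch>
```

If the experiment does not show up from the branch you are currently on, dvc exp show -A lists experiments across all commits.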