Cannot apply the first exp after new commit

Hello,

I've been trying to use DVC to manage my experiments and like it very much so far, but I have encountered some rather unexpected behavior.

Anytime I commit the code, the very first experiment that I perform afterwards cannot be applied to the workspace. The command “dvc exp run” proceeds without errors and the environment is modified accordingly. But when I change the environment and then try to “dvc exp apply” it back, no changes to the environment are made (again, no errors or warnings). Applying any other experiment works as expected.

The minimal example is the following:

dvc.yaml:
stages:
  main:
    cmd: Rscript src/main.R
    params:
    - par.yaml:
    metrics:
    - out.yaml:
        cache: false

par.yaml:
a: 1
b: 1
c: 1
d: 1

main.R:
library(yaml)
par <- yaml.load_file("par.yaml")
out <- list(
    perf = par$a * 1 + par$b * 0.1 + par$c * 0.01 + par$d * 0.001
)
print(out)
write_yaml(out, "out.yaml")

However, I believe that the issue may lie not with the dvc.yaml but rather with my workflow, as the same happens when creating the environment with “dvc exp init” and when cloning the official example repo. Am I supposed to call some other command after a new commit to prime the environment for new experiments? I have not found anything in the reference manual.

Right now I’m sidestepping the issue by running the first experiment twice and disregarding the first run, but this is cumbersome for computationally costly experiments.

Thank you very much for your help.

But when I change the environment and then try to “dvc exp apply” it back

Could you please clarify what you are modifying when you “change the environment” before using dvc exp apply?

Of course. By that I meant modifying par.yaml, either manually or via the -S option, and rerunning the model to regenerate the outputs. In particular, for the following (slightly modified) setup

dvc.yaml:

stages:
  main:
    cmd: Rscript src/main.R
    deps:
    - src
    params:
    - par.yaml:
    metrics:
    - out.yaml:
        cache: false

main.R:

library(yaml)
par <- yaml.load_file("par.yaml")
b <- 9
out <- list(
    perf = par$a * 1 + b * 0.1
)
print(out)
write_yaml(out, "out.yaml")

par.yaml:
a: 9

I do the following:

  • change values of a and b to 1 to simulate the process of working on both the code and parameters
  • commit all
  • run dvc exp run -n exp_a and observe that perf=1.1 is created as expected
  • run dvc exp run -n exp_b -S "par.yaml:a=2" and observe that perf=2.1 is created as expected
  • run dvc exp run -n exp_c -S "par.yaml:a=3" and observe that perf=3.1 is created as expected
  • Now I want to return to exp_a. I call dvc exp apply exp_a. However, a in par.yaml stays at its current value, in this case 3. Interestingly though, out.yaml gets restored to the correct value of 1.1.
  • If I try to apply any other experiment, say exp_b or exp_c, everything works as expected and both par.yaml and out.yaml are restored.
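
For anyone who wants to replay the steps above, here is a small Python dry-run sketch of the command sequence. It assumes that "commit all" means `git commit -am` (the commit message is made up) and that the edits to a and b have already been made by hand. By default it only prints the commands; nothing is executed unless `execute=True` is passed.

```python
import shlex
import subprocess

# Command sequence from the steps above. The git command is an assumed
# stand-in for "commit all"; the dvc commands are the ones quoted in the post.
REPRO_STEPS = [
    ["git", "commit", "-am", "set a and b to 1"],
    ["dvc", "exp", "run", "-n", "exp_a"],
    ["dvc", "exp", "run", "-n", "exp_b", "-S", "par.yaml:a=2"],
    ["dvc", "exp", "run", "-n", "exp_c", "-S", "par.yaml:a=3"],
    ["dvc", "exp", "apply", "exp_a"],
]


def format_steps(steps):
    """Render each argv list as a single shell-style command line."""
    return [shlex.join(argv) for argv in steps]


def run_steps(steps, execute=False):
    """Print every command; run it only when execute is True."""
    for argv, line in zip(steps, format_steps(steps)):
        print("$", line)
        if execute:
            subprocess.run(argv, check=True)


run_steps(REPRO_STEPS)  # dry run: prints the commands, executes nothing
```

After the last command, par.yaml and out.yaml can be inspected by hand to see which of them was actually restored.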

I also tried performing another experiment (say exp_0) right before the commit, so that the commit is published with the correct value of out.yaml. In this case, neither par.yaml nor out.yaml is restored when calling dvc exp apply exp_a.

Am I using it right? Thank you.

I’ve confirmed that I can reproduce the issue; this is a bug in DVC. To clarify, the changes from the first exp are applied except for files where the file in the experiment result is unchanged from the file in the initial git commit. So in your case, the changes for the output out.yaml are applied, but since par.yaml is unchanged (both the original commit and the experiment result contain par.yaml:a=1), exp apply essentially ignores the params file and does not modify it.
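
To make the described behavior concrete, here is a toy Python model of it. This is not DVC's actual code: commits, experiments and the workspace are plain dicts mapping a file name to its content, `buggy_apply` and `expected_apply` are hypothetical names, and the stale `perf: 9.9` in the first commit is an assumed value (the output of the original a=9, b=9 setup).

```python
def buggy_apply(workspace, head, exp):
    """Copy only the files whose experiment content differs from HEAD."""
    result = dict(workspace)
    for name, content in exp.items():
        if head.get(name) != content:
            result[name] = content
    return result


def expected_apply(workspace, exp):
    """Reset every file recorded in the experiment, changed or not."""
    result = dict(workspace)
    result.update(exp)
    return result


# Commit made after setting a and b to 1; out.yaml is assumed to be stale.
head = {"par.yaml": "a: 1", "out.yaml": "perf: 9.9"}
exp_a = {"par.yaml": "a: 1", "out.yaml": "perf: 1.1"}
exp_b = {"par.yaml": "a: 2", "out.yaml": "perf: 2.1"}
exp_c = {"par.yaml": "a: 3", "out.yaml": "perf: 3.1"}

workspace = dict(exp_c)  # state left behind by the last run

# par.yaml in exp_a equals HEAD, so it is skipped and a stays at 3.
after_a = buggy_apply(workspace, head, exp_a)
# Both files in exp_b differ from HEAD, so both are restored.
after_b = buggy_apply(workspace, head, exp_b)
# Second scenario: out.yaml was already correct in the commit, so
# applying exp_a changes nothing at all.
head_with_output = {"par.yaml": "a: 1", "out.yaml": "perf: 1.1"}
after_a_second = buggy_apply(workspace, head_with_output, exp_a)
# Resetting every file in the experiment gives the intended result.
after_a_fixed = expected_apply(workspace, exp_a)

print(after_a)
print(after_a_second)
print(after_a_fixed)
```

In this model, comparing against HEAD instead of against the current workspace is what makes the first experiment after a commit special: it is the only one whose params file is identical to the committed one.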

I’ve opened a GitHub issue for this bug; you can subscribe to it for further updates: "exp apply: git-tracked files which are unchanged from HEAD are not reset on apply" (Issue #8764, iterative/dvc).

Great, thank you for opening the issue.
Keep up the great work, it is a pleasure to use DVC.