I'm trying to better understand how params.yaml works and, in particular, how changes to it are tracked. @jorgeorpinel suggested here to track params.yaml using git. So far so good. Now, when inspecting a .dvc file generated by dvc run -p ..., I see that the parameters from params.yaml are not associated with an MD5 hash. This made me wonder: if I change a parameter in the YAML (and commit the change to git), how will dvc know about it? How will dvc repro ... know that a parameter changed?
@drorata If you inspect the .dvc file, you will notice that the chosen params are stored there too. So the detection is based on the value stored in the .dvc file versus the value stored in params.yaml. That's how DVC knows to rerun the stage when params.yaml has changed.
Take a look at this script:
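#!/bin/bash
# Start from a clean scratch repository
rm -rf repo
mkdir repo
set -ex
pushd repo
git init --quiet
dvc init --quiet
# Create a data file and a params file with a single parameter
echo data >> data
echo "lr: 0.1" >> params.yaml
dvc add data -q
# Define a stage that depends on data and on the lr parameter
dvc run -d data -o output -p lr "cat data >> output"
git add -A
git commit -am "init"
# Show the parameter value recorded in the stage file
cat output.dvc | grep -A1 params
# Change lr in params.yaml (GNU sed syntax for in-place editing)
sed -i "s/0.1/0.2/g" params.yaml
# Reproduce: dvc detects the changed parameter and reruns the stage
dvc repro output.dvc
# The stage file now records the new value
cat output.dvc | grep -A1 params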
If you run it, you will see that dvc notices the change in lr and reruns the command. After the run, it writes the new value of lr to output.dvc.
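To make the idea of "recorded value versus current value" a bit more concrete, here is a rough sketch of that comparison in plain bash. It is only an illustration of the principle, not DVC's actual implementation: the helper names param_value and param_changed are made up for this example, and the parsing only handles flat "key: value" lines with simple key names.

```shell
#!/bin/bash
# Conceptual sketch only: mimics the kind of comparison described above.
# This is NOT how DVC is implemented; both helper names are hypothetical.

# Print the value of the first "key: value" line found in a file.
# Leading indentation is ignored; nested or quoted YAML is not handled.
param_value() {
    local file="$1" key="$2"
    sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" "$file" | head -n 1
}

# Succeed (exit status 0) if the value recorded in the stage file differs
# from the current value in the params file, i.e. a rerun would be needed.
param_changed() {
    local stage_file="$1" params_file="$2" key="$3"
    [ "$(param_value "$stage_file" "$key")" != "$(param_value "$params_file" "$key")" ]
}
```

With the repo from the script above, something like `param_changed output.dvc params.yaml lr && echo "lr changed"` should print the message after the sed step and stay silent once dvc repro has written the new value back to output.dvc.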
Thanks for the great explanation!