Tracking changes in params.yaml

I’m trying to better understand how params.yaml works and, in particular, how changes in it are tracked. @jorgeorpinel suggested here to track params.yaml using git. So far so good. Now, when inspecting a .dvc file generated by dvc run -p ... I see that the parameters from params.yaml are not associated with an MD5. This made me wonder: if I change a parameter in the YAML (and commit the change to git), how will dvc know about it? How will a dvc repro ... know that a parameter changed?

@drorata If you inspect the .dvc file, you will see that the chosen params are stored there too.
So the detection is based on comparing the value stored in the .dvc file with the value stored in the params.yaml file.
That's how DVC knows to rerun the stage when params.yaml changes.

Take a look at this script:

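#!/bin/bash

# Start from a clean scratch repository
rm -rf repo
mkdir repo

set -ex

pushd repo

git init --quiet
dvc init --quiet

# Create a data file and a params file with a single parameter
echo data >> data
echo "lr: 0.1" >> params.yaml

# Track the data and define a stage that depends on the lr parameter
dvc add data -q
dvc run -d data -o output -p lr "cat data >> output"

git add -A
git commit -am "init"

# Show the parameter value recorded in the stage file
cat output.dvc | grep -A1 params

# Change lr from 0.1 to 0.2
sed -i "s/0.1/0.2/g" params.yaml

# dvc compares the recorded value with params.yaml and reruns the stage
dvc repro output.dvc

# The stage file now records the new value
cat output.dvc | grep -A1 params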

If you run it, you will see that dvc notices the change in lr and reruns the command. After the run, it writes the new value of lr to output.dvc.
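
To make the idea concrete without needing dvc installed, here is a rough sketch of the same kind of comparison in plain bash. It is only an illustration of "recorded value versus current value", not DVC's actual code: stage.txt is a hypothetical, simplified stand-in for the params section of a .dvc file, and param_value and param_status are helper names made up for this example.

```shell
#!/bin/bash
# Conceptual sketch only: imitates the comparison, it is not how DVC is implemented.

workdir=$(mktemp -d)

# Value currently in params.yaml (already edited from 0.1 to 0.2)
echo "lr: 0.2" > "$workdir/params.yaml"

# Hypothetical stand-in for the value recorded in the stage file at the last run
echo "lr: 0.1" > "$workdir/stage.txt"

# Print the value of a simple "key: value" line in a file ($1 = file, $2 = key)
param_value() {
    sed -n "s/^[[:space:]]*$2:[[:space:]]*//p" "$1"
}

# Report whether the recorded and current values of a parameter differ ($1 = key)
param_status() {
    recorded=$(param_value "$workdir/stage.txt" "$1")
    current=$(param_value "$workdir/params.yaml" "$1")
    if [ "$recorded" = "$current" ]; then
        echo "unchanged"
    else
        echo "changed"
    fi
}

echo "lr is $(param_status lr)"   # prints: lr is changed
```

The point is that no checksum of params.yaml is needed for this: only the values of the parameters a stage declares are compared, so editing some other key in the same file would not count as a change for that stage in this sketch.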

Thanks for the great explanation!
