In this case, run-cache would track params and data. These are the ones that change when DVC runs in the Airflow retraining pipeline.
Therefore:
Day 0: Production and Airflow both have data1 + params1 + code1
Day 1: Airflow retrains the model and saves the new combination to the run-cache: data2 + params2 + code1
Then the production model periodically checks the run-cache for updates and updates itself, yielding a production environment with data2, params2, code1.
This is how I imagine it would work with run-cache and no commits.
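To make the Day 0 / Day 1 flow concrete, here is a rough toy model in Python. It is not DVC's actual run-cache implementation and calls no DVC API: `ToyRunCache`, `Production`, `run_key`, and the `latest` pointer are all hypothetical names made up for this sketch. In particular, it assumes production has some way to discover the newest entry, which the description above does not specify.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional


def run_key(data: str, params: str, code: str) -> str:
    """Hash one (data, params, code) combination into a single cache key."""
    payload = json.dumps({"data": data, "params": params, "code": code}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class ToyRunCache:
    """Hypothetical stand-in for a shared run-cache (not DVC's real format)."""
    entries: dict = field(default_factory=dict)
    latest: Optional[str] = None  # assumption: something records the newest run

    def save(self, data: str, params: str, code: str) -> str:
        key = run_key(data, params, code)
        self.entries[key] = {"data": data, "params": params, "code": code}
        self.latest = key
        return key


@dataclass
class Production:
    """Hypothetical production environment that polls the cache."""
    state: dict

    def poll(self, cache: ToyRunCache) -> bool:
        """Adopt data and params from the newest run; code is left untouched."""
        if cache.latest is None:
            return False
        entry = cache.entries[cache.latest]
        changed = (entry["data"] != self.state["data"]
                   or entry["params"] != self.state["params"])
        if changed:
            self.state = {**self.state, "data": entry["data"], "params": entry["params"]}
        return changed


# Day 0: Airflow and production agree on data1 + params1 + code1.
cache = ToyRunCache()
cache.save("data1", "params1", "code1")
prod = Production(state={"data": "data1", "params": "params1", "code": "code1"})
first_poll = prod.poll(cache)   # nothing new to pick up yet

# Day 1: Airflow retrains and saves data2 + params2 + code1, with no commit.
cache.save("data2", "params2", "code1")
second_poll = prod.poll(cache)  # production picks up the new combination
```

The key covers all three inputs, so a change in code would also produce a new entry, but in this sketch production only ever adopts data and params, matching the scenario where code1 stays fixed.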