Hi! I am trying to integrate DVC experiments into my workflow, but I have run into some issues.
I have the following pipeline:
stages:
  train:
    cmd: python train.py data/dataset/ --config config.yaml --out data/model.pt
    deps:
      - data/dataset/
      - train.py
    outs:
      - data/model.pt
I run an experiment and push it:
dvc exp run -n v1.0 -S config.yaml:model.dim=64
dvc exp push origin v1.0
Then, on another server, I pull it and try to run the same experiment:
dvc exp pull origin v1.0
dvc exp run -n v1.1 -S config.yaml:model.dim=64 -v --dry
>>>
train stage modified
git status
>>>
changes in config.yaml
i.e. DVC thinks that config.yaml has changed and wants to run the pipeline, even though the experiment was pulled and its results are already in the cache.
But if I apply the experiment first:
dvc exp apply v1.0
dvc checkout
dvc exp run -n v1.1 -S config.yaml:model.dim=64 -v --dry
>>>
no changes, get from cache
i.e. DVC recognizes the experiment as a duplicate only if I apply it first.
I expected DVC to skip an experiment if an identical one has already been run. So, my questions are:
- What am I doing wrong?
- If my setup is correct, why can’t DVC detect that the new experiment was already run? Is this feature going to be implemented?
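
To make the expectation concrete, here is a toy sketch of the behavior I had in mind. It is not how DVC works internally (I don't know those details). `cache_key` and `needs_run` are hypothetical helpers I made up for illustration, and I am assuming `sha256sum` from GNU coreutils is available.

```shell
#!/bin/sh
# Toy illustration only: NOT DVC's real run-cache logic.

# cache_key (hypothetical helper): hash the stage command together with
# the contents of the config file, so identical inputs give identical keys.
# $1 = stage command, $2 = path to the config file
cache_key() {
  { printf '%s\n' "$1"; cat "$2"; } | sha256sum | cut -d' ' -f1
}

# needs_run (hypothetical helper): succeeds only when the key of the
# requested run differs from the key of an earlier run.
# $1 = key of the earlier run, $2 = key of the requested run
needs_run() {
  [ "$1" != "$2" ]
}
```

With something like this, two runs that use the same command and a byte-identical config.yaml get the same key, so I would expect the second run to be served from the cache whether or not `dvc exp apply` was called first.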