I plan to define my DVC pipelines with the following file structure:
.
└── src/
    └── pipelines/
        ├── model_a/
        │   ├── dvc.yaml
        │   ├── train.py
        │   └── params.py
        ├── model_b/
        │   ├── dvc.yaml
        │   ├── train.py
        │   └── params.py
        └── ...
dvc.yaml example:
vars:
  - params.py
stages:
  train:
    cmd: python train.py
    params:
      - params.py:
          - DataConfig
          - TrainConfig
    outs:
      - ${TrainConfig.output_path}
params:
  - dvclive/params.yaml
metrics:
  - dvclive/metrics.json
plots:
  - dvclive/plots/metrics:
      x: step
params.py example:
class DataConfig:
    n_samples = 400
    n_features = 4
    seed = 42

class TrainConfig:
    tol = 1e-4
    output_path = 'logreg.pth'
It all works fine when I run dvc exp run src/pipelines/model_a/dvc.yaml as long as the repo contains no DVC-tracked data. But once it does, creating a Live logger in train.py raises an error:
Running stage 'src/pipelines/logreg/dvc.yaml:train':
> python train.py
WARNING:dvclive:Ignoring `save_dvc_exp` because `dvc repro` is running.
Use `dvc exp run` to save experiment.
Traceback (most recent call last):
File "/Users/user/Main/Repos/dvc-pipe/src/pipelines/logreg/train.py", line 33, in <module>
main()
File "/Users/user/Main/Repos/dvc-pipe/src/pipelines/logreg/train.py", line 27, in main
with Live() as logger:
^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvclive/live.py", line 163, in __init__
self._init_dvc()
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvclive/utils.py", line 182, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvclive/live.py", line 250, in _init_dvc
stage := find_overlapping_stage(self._dvc_repo, self.dvc_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvclive/dvc.py", line 162, in find_overlapping_stage
for stage in dvc_repo.index.stages:
^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/repo/__init__.py", line 282, in index
return Index.from_repo(self)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/repo/index.py", line 330, in from_repo
for _, idx in collect_files(repo, onerror=onerror):
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/repo/index.py", line 90, in collect_files
index = Index.from_file(repo, file_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/repo/index.py", line 356, in from_file
stages=list(dvcfile.stages.values()),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen _collections_abc>", line 880, in __iter__
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/stage/loader.py", line 192, in __getitem__
return self.load_stage(self.dvcfile, deepcopy(self.stage_data), self.stage_text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/stage/loader.py", line 206, in load_stage
stage.deps = dependency.loadd_from(stage, d.get(Stage.PARAM_DEPS) or [])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/dependency/__init__.py", line 52, in loadd_from
_get(stage, p, d, files=files, hash_name=hash_name, fs_config=fs_config)
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/dependency/__init__.py", line 40, in _get
return RepoDependency(repo, stage, p, info)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/dependency/repo.py", line 37, in __init__
self.fs = self._make_fs()
^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/dependency/repo.py", line 124, in _make_fs
config = Config.load_file(conf)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc/config.py", line 220, in load_file
with fs.open(path) as fobj:
^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc_objects/fs/base.py", line 324, in open
return self.fs.open(path, mode=mode, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/Main/Repos/dvc-pipe/.venv/lib/python3.11/site-packages/dvc_objects/fs/local.py", line 131, in open
return open(path, mode=mode, encoding=encoding) # noqa: SIM115
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '.dvc/config.local'
ERROR: failed to reproduce 'src/pipelines/logreg/dvc.yaml:train': failed to run: python train.py, exited with 1
It seems that when a Live instance is created, DVC loads every .dvc file in the repo, whether or not its dependency is actually referenced in the dvc.yaml being run, and then tries to open the config file each one points to, resolving that path against the script's current working directory instead of the repo root.
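To illustrate what I think is going on, here is a minimal sketch using only the standard library. It is not DVC's actual code; it only shows how the same relative path '.dvc/config.local' is found when resolved from the repo root and missing when resolved from a stage directory. The temporary layout and the helper names (resolve_from, demo) are made up for the illustration.

```python
import os
import tempfile
from pathlib import Path


def resolve_from(cwd: str, relative: str) -> str:
    """Return the absolute path that `relative` maps to for a process running in `cwd`."""
    return os.path.normpath(os.path.join(cwd, relative))


def demo() -> tuple[bool, bool]:
    # Hypothetical layout mirroring the post: a repo root containing
    # .dvc/config.local, plus a stage directory under src/pipelines/model_a.
    with tempfile.TemporaryDirectory() as root:
        root_path = Path(root)
        (root_path / ".dvc").mkdir()
        (root_path / ".dvc" / "config.local").write_text("")
        stage_dir = root_path / "src" / "pipelines" / "model_a"
        stage_dir.mkdir(parents=True)

        rel = ".dvc/config.local"
        # Resolved from the repo root, the file exists.
        found_from_root = os.path.exists(resolve_from(str(root_path), rel))
        # Resolved from the stage directory (the script's cwd), it does not.
        found_from_stage = os.path.exists(resolve_from(str(stage_dir), rel))
        return found_from_root, found_from_stage


if __name__ == "__main__":
    print(demo())
```

If this is indeed the mechanism, it would explain why the stage only fails when it runs from src/pipelines/<model>/ rather than from the repo root.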
Example: the error occurs if the following .dvc file is in the repo root:
md5: 428cb92c23c3a423a3a6573f0f7786fd
frozen: true
deps:
- path: datasets
  repo:
    url: git@github.com:ciars-voc/voc-data
    rev_lock: 566d116023fc85c2ec44e92a7d9a893ddd45f4c7
    config: .dvc/config.local
outs:
- md5: b5e1e98793b486acc7ade359b9a76c92.dir
  size: 137226200
  nfiles: 8
  hash: md5
  path: datasets
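For anyone trying to check whether their repo has the same kind of entry, a rough diagnostic sketch follows. It assumes the trigger is a relative `config:` value in a .dvc file like the one above, and it uses a naive line-based regex instead of a real YAML parser, so it can miss or misreport unusual files. The function names (relative_config_paths, scan) are hypothetical and not part of DVC.

```python
import re
from pathlib import Path

# Naive pattern: a `config:` key whose value is a single plain path token.
CONFIG_LINE = re.compile(r"^\s*config:\s*(\S+)\s*$")


def relative_config_paths(text: str) -> list[str]:
    """Return the `config:` values in a .dvc file's text that are relative paths."""
    paths = []
    for line in text.splitlines():
        match = CONFIG_LINE.match(line)
        if match and not Path(match.group(1)).is_absolute():
            paths.append(match.group(1))
    return paths


def scan(repo_root: str) -> dict[str, list[str]]:
    """Map each .dvc file under `repo_root` to the relative `config:` paths it contains."""
    hits = {}
    for dvc_file in Path(repo_root).rglob("*.dvc"):
        # Skip directories such as the `.dvc` folder itself.
        if dvc_file.is_file():
            found = relative_config_paths(dvc_file.read_text())
            if found:
                hits[str(dvc_file)] = found
    return hits


if __name__ == "__main__":
    for path, configs in scan(".").items():
        print(path, configs)
```

Run from the repo root, this lists the .dvc files whose `config:` path would be resolved against whatever directory the stage command runs in.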
Is it possible to avoid this error without changing wdir in dvc.yaml to the repo root?