I am puzzled by the behavior of dvc exp run. In particular, I read:
When called with no arguments, this is equivalent to dvc repro followed by dvc exp save. (source)
But I find the behaviour of dvc exp run to differ widely from that of dvc repro, and I do not understand the underlying logic of the former.
When running dvc exp run --force, dvc reproduces multiple experiments in a row, not just one.
- The first experiment uses the workspace (configuration, code).
- The subsequent experiments seem to check out the workspace at some commits and run the dvc.yaml found there.
So the output looks something like this:
> dvc exp run --force
Reproducing experiment 'yucky-huia'
Building workspace index |8.00 [00:00, 114entry/s]
Comparing indexes |26.0 [00:00, 449entry/s]
Applying changes |13.0 [00:00, 1.04kfile/s]
[some code being executed]
Updating lock file 'dvc.lock'
[some code being executed]
Reproducing experiment 'obese-mana'
[The reproduction]
[etc.]
On the other hand, dvc repro runs a single experiment using the current workspace.
Could someone help me understand what dvc exp run is trying to do? Where is that behavior documented?
The output of dvc doctor, for reference:
❯ dvc doctor
DVC version: 3.57.0 (pip)
-------------------------
Platform: Python 3.10.14 on Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.16.7
dvc_objects = 5.1.0
dvc_render = 1.0.2
dvc_task = 0.40.2
scmrepo = 3.3.8
Supports:
http (aiohttp = 3.11.3, aiohttp-retry = 2.9.1),
https (aiohttp = 3.11.3, aiohttp-retry = 2.9.1),
s3 (s3fs = 2024.10.0, boto3 = 1.35.36)
Config:
Global: /home/xavier/.config/dvc
System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sdc
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/sdc
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/76d5de47ff46d22ee75c2a81e6d464a4