I have a few TB of data in a local store. When I run dvc pull, DVC creates a crap load¹ of jobs and saturates the ZFS pool, making the copies take a long time. I set jobs = 10 in .dvc/config, but it seems to be ignored. Am I missing something?
¹ A crap load is a common unit of measure equivalent to a whole bunch.
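For reference, the remote jobs setting lives in .dvc/config under the remote's own section. A minimal sketch (the remote name mystorage and the url are hypothetical):

```ini
# .dvc/config -- remote name "mystorage" and url are hypothetical
[core]
    remote = mystorage
['remote "mystorage"']
    url = /mnt/store/dvc-cache
    # limits parallel transfer jobs for fetch/push to this remote
    jobs = 10
```

Note this only throttles transfers to/from the remote; it does not affect the checkout stage described in the reply below.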
The screenshot seems to show the checkout part of dvc pull, which is not affected by the remote's jobs setting: at that stage DVC is not dealing with the remote at all, only with the local cache. We don't currently have an option to regulate checkout parallelism. A possible workaround is to run dvc fetch instead and then dvc checkout individual files/directories as needed.
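The workaround above can be sketched as follows (the target paths are hypothetical):

```shell
# Download everything into the local cache without linking it
# into the workspace; this is the part governed by the remote's jobs setting:
dvc fetch

# Then materialize only what you need, when you need it:
dvc checkout data/images.dvc
dvc checkout models/
```

Spacing out the checkout calls this way keeps the copy load on the ZFS pool under your control, since each invocation only links a subset of the cache into the workspace.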
Adding support for something like a cache.checkout_jobs option is possible and quite straightforward for dvc pull, so feel free to create an issue and/or even consider contributing.