I have a few TB of data in a local store. When I run dvc pull, DVC creates a crap load¹ of jobs and saturates the ZFS pool, making the copies take a long time. I set jobs = 10 in .dvc/config, but it seems to be ignored. Am I missing something?
¹ A crap load is a common unit of measure equivalent to a whole bunch.
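For reference, the remote jobs setting lives in .dvc/config under the remote's own section. A minimal sketch (the remote name mystorage and the url are hypothetical):

```ini
# .dvc/config -- remote name "mystorage" and url are hypothetical
[core]
    remote = mystorage
['remote "mystorage"']
    url = /mnt/store/dvc-cache
    # limits parallel transfer jobs for fetch/push to this remote
    jobs = 10
```

Note this only throttles transfers to/from the remote; it does not affect the checkout stage described in the reply below.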
The screenshot seems to show the checkout part of dvc pull, which is not affected by the remote's jobs setting: at that stage DVC is not dealing with the remote at all, only with the local cache. We don't currently have an option to regulate checkout parallelism. A possible workaround is to run dvc fetch instead and then dvc checkout individual files/directories as needed.
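The workaround above can be sketched as follows (the target paths are hypothetical):

```shell
# Download everything into the local cache without linking it
# into the workspace; this is the part governed by the remote's jobs setting:
dvc fetch

# Then materialize only what you need, when you need it:
dvc checkout data/images.dvc
dvc checkout models/
```

Spacing out the checkout calls this way keeps the copy load on the ZFS pool under your control, since each invocation only links a subset of the cache into the workspace.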
Adding support for something like a cache.checkout_jobs option is possible and quite straightforward for dvc pull, so feel free to create an issue and/or even consider contributing.