I have several remotes, and one of them stores data from DVC < 3.0. I tried to pull some data from it:
(venv) ermolaev@df783b0a927d:~/projects/.../mae$ dvc pull -r mae-gdrive -R data/full_datasets/.../datasets/train.dvc
Collecting |3.00 [00:00, 33.9entry/s]
Computing md5 for a large file '/data/projects/radml/DVC_CACHE//24/47e90cc3c859c1327679bdd4bd8f8f'. This is only done once.
Fetching
Building workspace index |10.0 [00:00, 47.1entry/s]
Comparing indexes |10.0 [00:00, 637entry/s]
Applying changes |1.00 [00:00, 519file/s]
M data/full_datasets/.../train/
1 file modified and 1 file fetched
The data was downloaded, but one of the files has disappeared.
I use a symlink cache, and the file isn't in the local cache, although it is present in the remote cache.
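To list which files are affected, here is a minimal sketch. It assumes the missing files show up in the workspace as symlinks whose targets no longer exist in the cache, which may not be the case here. The helper name `find_broken_symlinks` is hypothetical and not part of DVC; it uses only the Python standard library.

```python
import os
from pathlib import Path


def find_broken_symlinks(root):
    """Return sorted paths under root that are symlinks with a missing target."""
    broken = []
    # followlinks=False: do not descend into symlinked directories.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # is_symlink() inspects the link itself; exists() follows it.
            if path.is_symlink() and not path.exists():
                broken.append(str(path))
    return sorted(broken)


if __name__ == "__main__":
    import sys

    # Assumption: the directory to scan is passed as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for link in find_broken_symlinks(root):
        print(f"{link} -> {os.readlink(link)}")
```

Run it against the pulled dataset directory. Each printed line shows a workspace path and the cache path it points to, which can then be compared with what the remote contains.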
$ dvc doctor
DVC version: 3.44.0 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-5.15.0-71-generic-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.11.0
dvc_objects = 5.0.0
dvc_render = 1.0.1
dvc_task = 0.3.0
scmrepo = 3.1.0
Supports:
gdrive (pydrive2 = 1.19.0),
http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3)
Config:
Global: /home/ermolaev/.config/dvc
System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: gdrive, gdrive, gdrive
Workspace directory: ext4 on /dev/sda1
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/7205a6ce3131e59a2db7211a94dd5faa
Why wasn't my file downloaded? Why didn't DVC throw an error? There are actually several files that didn't download correctly, and I don't know why.