My scenario is that I’m on a cluster with really limited home directory quota (40GB) but with TB of “scratch” space on a separate filesystem. I’d like to keep my Git repo on the home filesystem (say at
~/myrepo/), but the DVC cache on the scratch filesystem (say at
It's easy enough to put the cache on the scratch filesystem with, e.g.,
dvc cache dir /scratch/dvc and then use the
symlink cache type, so that files in the repo are just symlinks into the cache.
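For concreteness, the setup I mean is roughly the sketch below. The function name setup_scratch_cache is just a hypothetical wrapper of mine, and it assumes it is run from inside the repo (~/myrepo/ here):

```shell
# Hypothetical helper (not part of DVC); run it from inside the Git/DVC repo.
setup_scratch_cache() {
    cache_dir="$1"   # cache location on the scratch filesystem, e.g. /scratch/dvc

    # Point the DVC cache at the scratch filesystem instead of .dvc/cache.
    dvc cache dir "$cache_dir" || return 1

    # Have DVC link workspace files into the cache rather than copying them.
    dvc config cache.type symlink
}
```

I'd call it as setup_scratch_cache /scratch/dvc, after which the settings end up in .dvc/config.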
But now, suppose I’ve generated a large file at
/scratch/newfile.dat. How can I add it to the repo at a chosen path without it ever touching the home filesystem?
I would have thought:
dvc add -o ~/myrepo/newfile.dat /scratch/newfile.dat
would do it, but the data still seems to pass through the home filesystem.
If I add
--to-remote, it works as expected, but it also (obviously) pushes to the remote, which I don’t want.
Thanks for any suggestions.