I have multiple users who can each launch their own AWS instance for data analysis and shut it down when they are done. Rather than rebuilding the DVC cache on each of these instances when it starts up, we want a persistent shared cache.
So I’ve created a shared DVC cache on EFS (AWS’s version of NFS), with symlinks in the workspace. The problem is that ‘dvc fetch’ and ‘dvc checkout’ are now quite slow for certain branches that have 1000s of files tracked by DVC. First, for ‘dvc fetch’, even if nothing needs to be downloaded from the remote, the “Querying cache…” step alone can take up to 8 minutes. Then ‘dvc checkout’ can take 6 more minutes. Presumably this comes from the NFS communication overhead of listing files in the cache (I see the progress bar saying things like “110 files/sec”).
Is there any way to speed up a shared cache stored on NFS, especially when you have 1000s of tracked files?
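To check whether the “110 files/sec” figure above reflects raw NFS metadata latency rather than something DVC-specific, one option might be to time a plain directory walk over the cache. The sketch below is only a rough diagnostic and not part of DVC: `measure_stat_rate` is a hypothetical helper name, and the cache path you pass in is whatever your EFS mount happens to be.

```python
import os
import time


def measure_stat_rate(root):
    """Walk `root`, lstat every file, and return (count, elapsed_seconds, files_per_sec).

    Hypothetical helper: it only measures how fast the filesystem answers
    per-file metadata requests, which is roughly what a cache query has to do.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat does not follow symlinks, so broken links do not raise.
            os.lstat(os.path.join(dirpath, name))
            count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very small trees.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate
```

Calling `measure_stat_rate` on the shared cache directory and again on a copy of the same files on local disk should show whether the per-file rate on EFS is in the same ballpark as what the DVC progress bar reports. If it is, the slowdown is probably in the mount itself rather than in DVC.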