We have developers working with data shared on a local network, and I’d like to understand whether/how DVC could integrate with this pipeline. I think I’m asking whether it’s possible (or even makes sense) to have a single, shared cache, kind of like the DVC cloud workflow you describe but without push/pull. The code just reads/writes data outside the git repo.
Another reason I’d like to leave data outside the repo is that many projects depend on the same (large) dataset.
Is the answer hardlinks? I’m worried because there’s already a lot of linking going on … (Duh. Hardlinks don’t work across filesystems!)
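For what it’s worth, here is a rough sketch of how I could check whether hardlinks are even an option between two locations. It only uses Python’s standard `os` module; the function names `same_filesystem` and `try_hardlink` are my own, not anything from DVC. Comparing device IDs is only a first-pass hint (I believe bind mounts, for instance, can share a device ID and still refuse links), so actually attempting the link is the more reliable test.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both existing paths report the same device ID.

    This is a necessary condition for hardlinking, not a guarantee.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def try_hardlink(src: str, dst: str) -> bool:
    """Attempt to hardlink src to dst; return True on success.

    Any OSError (cross-device link, missing source, unsupported
    filesystem, permissions) is reported as False.
    """
    try:
        os.link(src, dst)
        return True
    except OSError:
        return False
```

I’d call `same_filesystem` on the shared data directory and the project directory, and if that looks promising, run `try_hardlink` on a throwaway file to confirm.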
To be clear, this looks like an awesome tool, and I’d like to adapt our workflow to it if possible.