I stumbled upon DVC over the weekend after discussing with my colleagues our need for just such a tool. I got it working in some test repos with data in our private S3 buckets. Very cool! I really appreciate the documentation; it’s very well put together.
One thing I’m not sure of is:
If I have a dataset on a remote server and several different repos/projects that use it, what is the appropriate way to point them all to that dataset and/or a single DVC remote representation of it? (Do I need to download it in each repo/project, then dvc add it, and assign the dataset to the same remote?)
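
To make the question concrete, here is a rough sketch of the per-repo workflow I have in mind, written as a small Python dry-run helper. The bucket URLs, the remote name "shared", and the local path are hypothetical placeholders, and I'm assuming `dvc get-url` is a reasonable way to do the download step. This is just the approach I'm asking about, not necessarily the recommended one.

```python
import shlex
import subprocess

# Hypothetical placeholders; substitute your own bucket, remote name, and path.
DATASET_URL = "s3://example-bucket/datasets/my-dataset"
REMOTE_URL = "s3://example-bucket/dvc-store"
REMOTE_NAME = "shared"
LOCAL_PATH = "data/my-dataset"


def per_repo_commands(dataset_url, local_path, remote_name, remote_url):
    """Return the DVC commands I would repeat in every repo that uses the dataset."""
    return [
        # 1. Download the dataset into the repo's working tree.
        ["dvc", "get-url", dataset_url, local_path],
        # 2. Start tracking the downloaded copy with DVC.
        ["dvc", "add", local_path],
        # 3. Register the same remote as this repo's default remote.
        ["dvc", "remote", "add", "-d", remote_name, remote_url],
        # 4. Upload the tracked data to that remote.
        ["dvc", "push"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: prints the commands without touching any repo or bucket.
    run(per_repo_commands(DATASET_URL, LOCAL_PATH, REMOTE_NAME, REMOTE_URL))
```

Repeating these four steps in every project is the part that feels redundant to me, hence the question.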