Dvc_api.get_url is not working with external data

Sure. Let’s try to set up an MWE together:

# Mock up the setup
# In reality user1 would be something like `/home/user1`, and the cache is better
# located on the same volume as `/home` and the data (which is `NAS/temp_registy/Dogs`)
% mkdir example-shared-cache
% cd example-shared-cache
% mkdir data
% mkdir user1
% mkdir user2
% echo "dog1" > data/dog1.txt
% echo "dog2" > data/dog2.txt
% mkdir cache
% cd user1
% mkdir project
% cd project
% git init

# Initialize a DVC repo with a shared external cache and enable all link types to avoid copies
% dvc init
% dvc config cache.type "reflink,symlink,hardlink,copy"
% dvc cache dir /Users/ivan/Projects/example-shared-cache/cache
% git add .dvc/config

# Now we finally add the data. See https://dvc.org/doc/command-reference/add#example-transfer-to-an-external-cache
# and https://dvc.org/doc/command-reference/add#-o
% dvc add ../../data -o data
% ls
data     data.dvc

% git add .gitignore data.dvc
% git commit -a -m "add data"
% # git push should go here to GH/GitLab/etc

# Now the second user comes ...
% cd ../../user2
% git clone ../user1/project # in reality it would be a clone from GH/GitLab or something
% cd project
% ls
data.dvc
% dvc checkout
% ls
data     data.dvc

Now, let’s say we’d like to add one more dog into the initial dataset:

# Still in user2/project, so the original data directory is two levels up
% echo "dog3" > ../../data/dog3.txt

To update the data in the repository, do this:

% cd ../../user1/project
# Having to remove these first is a bug, I'll create a ticket
% rm -rf data data.dvc
% dvc add ../../data -o data
% git add data.dvc ...
% git commit -m "update data"
% git push

There is an important caveat to keep in mind: check the cache.shared option and, if it’s needed, configure it as part of the initial setup, so that every user of the shared cache can read and write the files in it.
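
For the cache.shared caveat, here is a minimal sketch of what that extra setup step could look like. It assumes all users of the cache belong to one Unix group and that you run it inside user1/project right after `dvc cache dir`; the commit message is only illustrative.

```shell
# Run inside user1/project, the DVC repo from the walkthrough above.
# Ask DVC to create cache files and directories with group write permissions.
dvc config cache.shared group

# The setting is stored in .dvc/config, so commit it for the other users.
git add .dvc/config
git commit -m "make the shared cache group-writable"
```

Files that were already in the cache before this setting was applied may still need their permissions adjusted by hand.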

An interesting alternative to using dvc add ... -o is to use dvc import-url like this:

% dvc import-url /Users/ivan/Projects/example-shared-cache/data data

It’s similar to dvc add ... -o, but it also saves the source /Users/ivan/Projects/example-shared-cache/data in the .dvc file, so later you can use dvc update data.dvc or something like this to update your data.
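
To make the import-url plus update cycle concrete, here is a rough sketch in a throwaway directory. The `work` variable and the directory names are hypothetical stand-ins for the real paths, the shared cache setup is left out to keep it short, and the DVC steps are skipped if dvc or git is not installed.

```shell
# Mock up an external dataset and an empty project in a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/project"
echo "dog1" > "$work/data/dog1.txt"

if command -v dvc >/dev/null 2>&1 && command -v git >/dev/null 2>&1; then
  cd "$work/project"
  git init -q
  dvc init -q
  # Import the external directory; data.dvc records where it came from.
  dvc import-url "$work/data" data
  # Later, the source dataset gets a new file ...
  echo "dog3" > "$work/data/dog3.txt"
  # ... and the import is refreshed from the recorded source.
  dvc update data.dvc
  ls data
else
  echo "dvc or git not found, skipping the DVC steps"
fi
```

Compared to the dvc add ... -o flow above, there is no need to remove data and data.dvc by hand before picking up the new file.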

Another alternative is to set up an extra data registry repo as @pmrowla mentioned.