Question about import-url

Hi, I just want to make sure I understand import-url correctly. Say I do:

dvc import-url ssh://<host>/myfile.dat

This creates a myfile.dat.dvc file in my repo, which records the remote path ssh://<host>/myfile.dat, and downloads myfile.dat from the remote server. If I later run dvc push, myfile.dat does not get pushed to the remote cache.

If someone else clones my repo and does a dvc pull, they will download myfile.dat from ssh://<host>/myfile.dat, as recorded in the .dvc file, rather than from the remote cache (where it doesn’t exist). I suppose the hash will be used to check that the file on the remote server is still the same.

I can also do dvc update --to-remote myfile.dat.dvc, which does push the file to the remote cache, if I want to safeguard against it being deleted/changed on ssh://<host>/myfile.dat.

Is this all right?

If I do run dvc update --to-remote, will future dvc pulls get the file from the remote cache, or from ssh://<host>/myfile.dat?

Thanks!

dvc push will push myfile.dat to the remote cache, since there is no guarantee that ssh://<host>/myfile.dat won’t be deleted or changed. You might be mixing up import-url and import.
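
For illustration, the sequence discussed here could look roughly like the sketch below. It assumes a default DVC remote is already configured and that <host> is replaced with a real SSH host; the function name import_and_push and the commit message are made up for the example.

```shell
# Sketch only: assumes a default DVC remote is configured and that
# '<host>' is replaced with a real SSH host. The function name is hypothetical.
import_and_push() {
    # Download the file and create myfile.dat.dvc, which records the source URL.
    # The URL is quoted so the shell does not treat < and > as redirections.
    dvc import-url 'ssh://<host>/myfile.dat'

    # Track the .dvc file in Git; DVC lists myfile.dat itself in .gitignore.
    git add myfile.dat.dvc .gitignore
    git commit -m "Import myfile.dat"

    # Upload the cached copy of myfile.dat to the default DVC remote.
    dvc push
}
```

Nothing runs until import_and_push is called from inside an initialized DVC repository.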