First steps with DVC, a few questions

Hi folks, first-time DVC user here! I’m following the quickstart guide “Get Started with DVC” with some data of mine and I have a few questions.

  • The docs say “dvc add moved the data to the project’s cache, and linked it back to the workspace.”, but I see my data files are still there and I don’t see any symbolic links. I had no expectations but now after reading this I’m confused.
  • I did dvc push to a local MinIO instance :+1: Does the data now live in my filesystem and the remote then? Is there a reason for this?
  • After that, I then cloned my own repo to some other folder, the data files weren’t there (okay) but then dvc pull failed with “No remote provided and no default remote set…ERROR: failed to pull data from the cloud” :boom: Is there a way to “save” the remote properties so that it gets propagated to clones?
  • The docs say “dvc add moved the data to the project’s cache, and linked it back to the workspace.”, but I see my data files are still there and I don’t see any symbolic links. I had no expectations but now after reading this I’m confused.

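What OS are you using? On macOS and on some other filesystems, DVC might be using reflinks (copy-on-write) instead of symlinks. A reflinked file looks like an ordinary file in the workspace, which would explain why you don’t see any symbolic links. Reflinks are the better option compared to symlinks.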
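If you want to check this yourself, here is a minimal sketch. The helper name `link_kind` and the path `data/data.xml` are made up for illustration; substitute a file you tracked with `dvc add`. Note that it can only rule symlinks in or out: a reflink and a plain copy both show up as regular files.

```shell
# Print whether a path is a symlink or a regular file.
# A reflink cannot be told apart from a plain copy this way: both are regular files.
link_kind() {
  if [ -L "$1" ]; then
    echo "symlink"
  elif [ -f "$1" ]; then
    echo "regular file"
  else
    echo "missing"
  fi
}

# Usage (hypothetical path; substitute a file you tracked with `dvc add`):
#   link_kind data/data.xml
# To see which link types DVC is configured to try, if set explicitly:
#   dvc config cache.type
```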

  • I did dvc push to a local MinIO instance :+1: Does the data now live in my filesystem and the remote then? Is there a reason for this?

Yes, after dvc push the data lives both on your local disk and on the remote. You can drop the local copy if you’d like; DVC doesn’t do this by default.

After that, I then cloned my own repo to some other folder, the data files weren’t there (okay) but then dvc pull failed with “No remote provided and no default remote set…

That probably means you forgot to git commit the .dvc/config file, which holds the remote URL and related settings.
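Roughly, the setup could look like the sketch below. The remote name `minio`, the bucket path `s3://my-bucket/dvc-store`, the endpoint `http://localhost:9000`, and the two environment variables are all assumptions; use whatever matches your MinIO instance. If you already configured the remote before pushing, only the last two commands should be needed.

```shell
# Assumed names: remote "minio", bucket "my-bucket", MinIO listening on localhost:9000.
# -d marks this remote as the default, so a bare `dvc pull` knows where to look.
dvc remote add -d minio s3://my-bucket/dvc-store
dvc remote modify minio endpointurl http://localhost:9000

# Keep credentials out of Git: --local writes to .dvc/config.local,
# which is not committed, so each clone has to set these again.
dvc remote modify --local minio access_key_id "$MINIO_ACCESS_KEY"
dvc remote modify --local minio secret_access_key "$MINIO_SECRET_KEY"

# Commit the shared remote settings so they propagate to clones.
git add .dvc/config
git commit -m "Configure default DVC remote"
```

After that, a fresh clone should pick up the default remote from .dvc/config, and dvc pull only needs the credentials to be set locally.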


Thanks for the prompt responses!

Indeed macOS, TIL about reflinks. Thanks!

got it, will keep it in mind.

you’re exactly right :smile:
