@Andrej api functions are designed to make it easier to work with remote repos, and especially with the remote cache of those repos. Say you have your repo at https://github.com/amesar/dbws2 and you added a remote and pushed some files to it:
# Add a remote named "s3" pointing to some amazon s3 bucket and path,
# set it as default, mind `-d`
dvc remote add -d s3 s3://bucket/path
# Push any files added via `dvc add` or `dvc run` to default remote
dvc push
# On a fresh clone or on an outdated working copy,
# this will pull data from a dvc remote.
dvc pull
After that, your file.txt will be stored at s3://bucket/path/b4/1abaf44fdd5f41b0d7c57669c9109a and you can get it with:
aws s3 cp s3://bucket/path/b4/1abaf44fdd5f41b0d7c57669c9109a file.txt.copy
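To make the path above less mysterious, here is a minimal sketch of how that object URL relates to the file's md5 checksum: the first two hex characters become a directory and the remaining thirty become the file name. The helper name remote_object_url is hypothetical (it is not part of dvc), and the layout is assumed from the example URL in this post; newer dvc versions may lay out the remote differently.

```python
def remote_object_url(remote_base: str, md5: str) -> str:
    """Build the remote object URL for a file with the given md5 checksum.

    Hypothetical helper, not part of dvc. Assumes the layout seen in the
    example above: <remote>/<first 2 hex chars>/<remaining 30 hex chars>.
    """
    if len(md5) != 32:
        raise ValueError("expected a 32-character md5 hex digest")
    # Drop a trailing slash so we never produce "path//b4/..."
    return f"{remote_base.rstrip('/')}/{md5[:2]}/{md5[2:]}"


# The checksum is the one recorded in the .dvc file for file.txt
print(remote_object_url("s3://bucket/path", "b41abaf44fdd5f41b0d7c57669c9109a"))
```

In practice you should not need to build this by hand, since dvc can resolve the URL for you.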
To get that s3 url you can use dvc.api.get_url("file.txt", repo="https://github.com/amesar/dbws2"). You can also use dvc.api.open() or dvc.api.read() to get the file contents directly.
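A rough sketch of how the three calls fit together, assuming dvc is installed and the repo has a properly configured remote. The wrapper names locate_and_read and first_line and the optional api parameter are hypothetical, added only so the sketch can be tried with a stand-in object; the dvc.api functions themselves are get_url, read and open.

```python
def locate_and_read(path, repo, rev=None, api=None):
    """Return (storage_url, contents) for a dvc-tracked file.

    Hypothetical wrapper: `api` defaults to the real dvc.api module.
    """
    if api is None:
        import dvc.api as api  # needs `pip install dvc` (plus s3 support for s3 remotes)
    # Where the file lives in the dvc remote (e.g. an s3:// url)
    url = api.get_url(path, repo=repo, rev=rev)
    # Whole file contents as a string
    text = api.read(path, repo=repo, rev=rev)
    return url, text


def first_line(path, repo, rev=None, api=None):
    """Read only the first line via open(), which returns a file-like object."""
    if api is None:
        import dvc.api as api
    with api.open(path, repo=repo, rev=rev) as f:
        return f.readline().rstrip("\n")


# Example call (needs network access and a working remote):
# url, text = locate_and_read("file.txt", "https://github.com/amesar/dbws2")
```

Use open() when the file is large and you want to stream it, and read() when you simply want everything in memory.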
The reason none of these work for you is that your remote is not set up properly. Looking at your .dvc/config file, I can see that you are trying to use a git url as your remote, which won't work. The idea behind dvc remotes is that you store your big files separately from the things you version with git, i.e. in some cloud storage, on some server via SSH, or simply in a separate local dir (bigger drive, network share or whatever).
I showed the basic usage above, i.e. dvc remote add -d ..., dvc push and dvc pull, but you can read more about it here:
The last one covers all the remote types we support.
BTW, we are still refining our api, so you may want to keep your dvc version up to date. Docstrings are also being discussed, so any input is welcome. I hope this helps you make it work; don't hesitate to get back to us otherwise.
P.S. The rev argument in all api functions is a git revision, i.e. a branch name, a tag or a commit sha.