Public read-only S3 remote

Hey folks,

The DVC getting started example has a read-only HTTP remote.

Is it possible to do the same with S3 so the public can have read-only permission? I believe I’ve set my bucket permissions correctly* (I can download w/o AWS creds), but when I try dvc pull w/o creds I get an error:

$ dvc pull
ERROR: unexpected error - Unable to locate credentials 

It looks like dvc pull is checking for credentials regardless.

* allowing s3:GetObject and s3:GetObject

If this isn’t possible, does anyone have a suggestion on the easiest approach to allow read-only access to DVC data analogous to a public GitHub repo?

Thanks!


Hi @dlite !

Indeed, going through the s3 protocol requires you to have at least some creds, even for public buckets. But you could do it another way: by accessing the bucket through HTTP. For example, in our core repo we use a public S3 bucket to store some images, but we use it as an HTTP remote for dvc pull. See https://github.com/iterative/dvc/blob/master/.dvc/config#L2 (we do route it through our own domain, but that doesn't matter; you could do the same with a regular https://<bucket>.s3.amazonaws.com/.. URL for your bucket).
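To make the URL mapping concrete, here is a minimal shell sketch that turns an s3:// URL into the matching HTTPS URL. The helper name s3_to_https is hypothetical (it is not part of DVC or the AWS CLI), and the sketch assumes the bucket is reachable at the default s3.amazonaws.com endpoint and that its objects are publicly readable.

```shell
# Hypothetical helper: map an s3:// URL to its HTTPS equivalent.
# Assumes the default s3.amazonaws.com endpoint; adjust for your setup.
s3_to_https() (
  rest="${1#s3://}"         # strip the scheme, leaving bucket/path
  bucket="${rest%%/*}"      # everything before the first slash
  path="${rest#"$bucket"}"  # remainder, with its leading slash (may be empty)
  printf 'https://%s.s3.amazonaws.com%s\n' "$bucket" "$path"
)

s3_to_https s3://bucket/path
```

The printed URL is what you would then register as an HTTP remote with dvc remote add.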


@dlite So just to make it clear, you could configure two remotes in your dvc project: one s3 remote to push to, and one http remote that anyone can download from. For example:

# s3 remote, used for pushing (needs AWS creds)
dvc remote add mys3 s3://bucket/path
# http remote for public downloads; note -d, which makes it the default one
dvc remote add -d mys3http https://bucket.s3.amazonaws.com/path
# now you could push through the s3 remote explicitly
dvc push -r mys3
# or you could set it as the default one for your machine using the --local flag,
# which writes to .dvc/config.local; that file is .gitignored and will only stay on your machine
dvc remote default mys3 --local
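
For reference, here is a rough sketch of how the two config files might look after running the commands above. This is an illustration rather than captured output: the exact whitespace and section order DVC writes may differ, and the bucket name and path are the placeholders from the example.

```ini
# .dvc/config (committed to Git, so everyone pulls over HTTP by default)
[core]
    remote = mys3http
['remote "mys3"']
    url = s3://bucket/path
['remote "mys3http"']
    url = https://bucket.s3.amazonaws.com/path

# .dvc/config.local (not committed; overrides the default on your machine only)
[core]
    remote = mys3
```

With this split, a plain dvc pull in a fresh clone goes through the HTTP remote, while dvc push on the maintainer's machine goes through the s3 remote.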

Interesting! I’ll give this a shot tomorrow.

Thanks for the quick reply @kupruser!


Thanks again @kupruser - the process you outlined above is working well for me.

dvc remote default mys3 --local

I like this as it makes the flow very similar to working on a public GitHub repo (I can push by default and others can read by default).
