Can my workspace be an s3 location?

I would like to run dvc pull directly into an s3 bucket. Is that possible?

Long story short: I would like end users to version control their csv files using dvc. I would then like a recurring process to do a dvc pull of the main branch directly into an s3 bucket. I understand that dvc can use s3 for storage, but I would like to re-create the data in s3 with the same structure as the dvc git repository.

Hi @zzztimbo!
AFAIK S3 is object storage, not a machine where you can open a terminal and run dvc pull. It seems to me that what you want is for the recurring process to check out the main branch of your repo, run dvc pull locally, and then use aws s3 cp {your_local_csv_path} s3://{path_on_s3} to upload the files.
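
A rough sketch of what that recurring job could look like, in case it helps. Everything named here is an assumption for illustration: the function names, the bucket and prefix arguments, and an existing local clone with an `origin` remote and a configured DVC remote. It also swaps `aws s3 cp` for `aws s3 sync` so that the whole directory tree is mirrored while Git and DVC internals are skipped.

```python
import subprocess
from pathlib import Path


def build_commands(repo_dir, bucket, prefix="", branch="main"):
    """Return (argv, cwd) pairs for one refresh of the S3 mirror.

    All arguments are hypothetical placeholders; adjust them to your setup.
    """
    repo = str(Path(repo_dir))
    # Build s3://bucket/prefix without a trailing slash.
    dest = f"s3://{bucket}/{prefix.strip('/')}".rstrip("/")
    return [
        # Bring the local clone up to date with the main branch.
        (["git", "checkout", branch], repo),
        (["git", "pull", "--ff-only", "origin", branch], repo),
        # Materialize the DVC-tracked files in the local workspace.
        (["dvc", "pull"], repo),
        # Mirror the workspace to S3, skipping Git and DVC internals
        # (including the local DVC cache under .dvc/).
        (["aws", "s3", "sync", ".", dest,
          "--exclude", ".git/*", "--exclude", ".dvc/*"], repo),
    ]


def refresh_mirror(repo_dir, bucket, prefix="", branch="main", dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = build_commands(repo_dir, bucket, prefix, branch)
    for argv, cwd in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the job at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)
    return commands
```

Calling `refresh_mirror("/path/to/clone", "my-bucket", "csv", dry_run=False)` from cron or a CI schedule would run the four steps in order. Leaving `dry_run=True` only prints them, which is handy for checking the destination path first.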

Hi @zzztimbo! You might want to take a look at https://dvc.org/doc/user-guide/managing-external-data and https://dvc.org/doc/user-guide/external-dependencies . Those features are advanced, but I think they should give you exactly what you are looking for.

Keep in mind that this works fine only as long as you don't share the project and multiple users don't work on it simultaneously. Otherwise, you would need to put some extra measures in place to make sure that everyone uses their own unique location in S3 when they do dvc pull, etc.