Hello DVC people!
On my team we are experimenting with DVC as an interface for working with trained models and datasets. So far it has been working nicely, but we have hit a major roadblock when it comes to permissions management.
In our ideal scenario, when a teammate gets access to one of our repositories on GitHub, that person would have access only to the data for that particular repository. This permission should follow automatically from the fact that the person has access to the repository, with no need to modify roles in AWS.
However, this seems challenging to achieve with S3 + DVC. The solution we are currently considering is a single S3 bucket that holds the DVC remotes for all our repositories. If we grant every teammate access to the whole bucket, they will all have access to all of the company’s data, which is less than ideal. So we were considering a “Junior” role, to which we would manually grant access to whatever specific S3 folders each person needs, and a “Senior” role with access to everything.
This solution is suboptimal because we would need to manage permissions separately in GitHub and in S3, which adds overhead.
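To make the “Junior” idea concrete, here is a rough sketch of the kind of per-folder IAM policy we would have to create by hand for each repository. The bucket name `company-dvc-store`, the folder `repo-a`, and the helper `repo_scoped_policy` are made up for illustration, and I am assuming each repository's DVC remote lives under its own folder in the shared bucket. It only builds the policy document; attaching it to a role would still be a manual step.

```python
import json


def repo_scoped_policy(bucket: str, repo_prefix: str) -> dict:
    """Build an IAM policy document limited to one folder of a shared bucket.

    Hypothetical helper: the bucket and prefix names are placeholders.
    """
    prefix = repo_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing only the keys under this repository's folder.
                "Sid": "ListOnlyThisRepoPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {
                    "StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}
                },
            },
            {
                # Allow reading and writing objects only under this folder.
                "Sid": "ReadWriteOnlyThisRepoPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


policy = repo_scoped_policy("company-dvc-store", "repo-a")
print(json.dumps(policy, indent=2))
```

We would need one such policy per repository, and we would have to keep them in sync with GitHub access by hand, which is exactly the overhead we would like to avoid.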
Is there any other pattern we are missing here? Is there an easy way for a teammate to be automatically given access to ONLY the DVC remote of the GitHub repository they have access to?
Thanks for the help