We have a scenario that we hope someone can shed some light on.
We have successfully implemented the Shared Development Server and Data Registries patterns, with a shared cache and user groups, in our deep learning development. It all works perfectly (shared cache, remotes, etc.) except for one issue that is very specific to our use case.
Our current approach for data registries is that for any particular dataset we have a master copy of raw original data in that data registry repo on the server. Any data injections are done only from within that repo. So we always have a copy of raw datasets (not hashed) along with caches and a copy on the remote, which are hashed.
The issue with this scenario is that we need several users to be able to add, commit and push from within a particular repo. Even when they are all in the same group, we have to manually set permissions to 2775 (directories) and 0664 (files) for everything in the .dvc folder (the cache is not there, as per the shared cache scenario) so that DVC works. Git commands work without any permission issues. From time to time permissions also break on the DVC remote, and we have to reset them manually to 2775 and 0444.
Is there some other way we should be setting up DVC? Or is this simply not supported at the moment, but could be supported with a contribution?
And the .dvc/ permissions issue you have is because all of your users are using git add/commit and dvc push from within a single shared working tree on your server?
Thanks for the prompt reply. Yes, correct, dvc config cache.shared group is set. And we need to work from within a single shared working tree on the server.
After the initial setup and the first dvc add, git add/commit/push and dvc push by one user, any DVC command run by another user (for example dvc status) fails with sqlite3.OperationalError: attempt to write a readonly database. As a quick debugging step we looked for database files in the repo. They are .dvc/tmp/md5s/cache.db and .dvc/tmp/links/cache.db, and they have 0644 permissions by default. Setting everything in the .dvc folder to 2775 and 0664 does the trick, and other users can then run DVC commands from within that repo.
But then dvc push occasionally fails due to folder permissions on the remote: some folders end up with 2755 when, as far as we understand, they need 2775. At least, setting them back to 2775 solves that.
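For reference, the manual reset described above can be sketched roughly as follows. The function name reset_perms and the remote path are hypothetical (they are not part of DVC), and the modes are simply the ones quoted in this thread.

```shell
# Hypothetical helper (not part of DVC): reset modes under a directory tree.
# Usage: reset_perms DIR DIR_MODE FILE_MODE
reset_perms() {
    # Directories, e.g. 2775: group-writable, setgid so new entries keep the group.
    find "$1" -type d -exec chmod "$2" {} +
    # Regular files, e.g. 0664 for .dvc or 0444 for the remote.
    find "$1" -type f -exec chmod "$3" {} +
}

# Example calls; the remote path is a placeholder:
# reset_perms .dvc 2775 0664
# reset_perms /path/to/dvc-remote 2775 0444
```

This only repairs what already exists; files created afterwards will still get whatever mode the creating process asks for.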
The issue is that regular files created by DVC (such as the state databases that give you the errors) will always have the default permissions for new files set by your OS/shell.
The cache.shared group option forces DVC to explicitly chmod DVC cache directory files to 0664, but it’s not really practical for DVC to do that operation to every possible file that DVC touches (including tmp files). And the normal use case in DVC is for users to have their own working trees (with a single explicitly shared cache directory).
I think what you are really looking for here is setting the appropriate umask in your users’ shell. By default your umask is probably 022 which makes the newly created file permissions 0644. If you run
umask 002
it will make the default permissions for newly created files 0664. Note that this will only be applied for the duration of your current shell session.
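If you want to see the effect for yourself, here is a small sketch that creates files under both umask values in a throwaway directory. The variable name demo_dir and the file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing in the repo is touched.
demo_dir="$(mktemp -d)"

umask 022
touch "$demo_dir/default_umask.txt"   # created as rw-r--r-- (0644)

umask 002
touch "$demo_dir/group_umask.txt"     # created as rw-rw-r-- (0664)
mkdir "$demo_dir/group_dir"           # created as rwxrwxr-x (0775)

# Compare the mode columns of the three entries.
ls -l "$demo_dir"
```

Note that the umask only removes bits from the mode a program requests, so it cannot add group write to a file that a program explicitly creates with a narrower mode.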
You probably aren’t looking to override the default umask system wide on your server, so one possible solution I can think of would be to install DVC in a virtualenv on your server, and add the line for setting your desired umask value (002) in the activate script for that virtualenv.
So when any of your users activate that venv to use DVC, their shell will have the appropriate settings for running DVC in your shared working tree.
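As a rough sketch of the venv idea, something like the following would append the umask line to the activate script. The helper name add_umask_to_venv and the venv location are assumptions for illustration; adjust them to wherever your shared venv actually lives.

```shell
# Hypothetical helper: make a virtualenv's activate script set umask 002.
# Usage: add_umask_to_venv /path/to/venv
add_umask_to_venv() {
    activate="$1/bin/activate"
    # Append only once so repeated runs do not duplicate the line.
    grep -qx 'umask 002' "$activate" || printf '\numask 002\n' >> "$activate"
}

# Example call; the path is a placeholder:
# add_umask_to_venv /opt/dvc-venv
```

One caveat: a line appended this way is not undone by deactivate, so the umask stays at 002 for the rest of that shell session.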
You can also accomplish the same thing in any shell script that is sourced before running DVC, for example in a .bashrc (but I wouldn’t recommend using bashrc for this since your users probably only want this specifically for the DVC directory). Depending on your server OS there may be ways to set this type of permission per directory as well via commands like setfacl, but that’s platform specific.
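For the per-directory route, a minimal sketch on Linux might look like this. It assumes the acl tools (setfacl/getfacl) are installed and the filesystem supports ACLs; the helper name share_dir_with_group and the group name in the example are placeholders, and I have not tested this against a DVC repo.

```shell
# Hypothetical helper, assuming Linux with the acl tools installed and a
# filesystem that supports ACLs. Usage: share_dir_with_group DIR GROUP
share_dir_with_group() {
    # Access ACL: give the group rw on existing files, plus x on directories.
    setfacl -R -m "g:$2:rwX" "$1" &&
    # Default ACL: entries created under DIR later inherit the group entry.
    setfacl -R -d -m "g:$2:rwX" "$1"
}

# Example call; the group name is a placeholder:
# share_dir_with_group .dvc dvc-users
```

You can inspect the result with getfacl on the directory. Whether the inherited entry is effective for a given new file still depends on the mode the creating program requests.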
Based on some googling, it looks like this is actually an old sqlite3 bug which has never been resolved. sqlite3 apparently hard-codes the creation mode for database files to 0644 rather than 0666, and the umask is applied on top of that, so setting your umask to 0002 will still result in a sqlite3 database file with 0644 permissions.
The workaround for your case would be to explicitly chmod 0664 .dvc/tmp/**/*.db after calling dvc init (in addition to setting your umask to 0002). This should ensure that the permissions are correct for your entire .dvc directory.
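If your shell does not expand ** recursively (bash only does so with globstar enabled), a find-based version of the same workaround could look like the sketch below. Both function names are hypothetical, and the .dvc/tmp location is taken from the paths reported earlier in this thread.

```shell
# Hypothetical helpers; REPO defaults to the current directory.

# Print any state database under .dvc/tmp that lacks group write permission.
list_non_group_writable_dbs() {
    find "${1:-.}/.dvc/tmp" -type f -name '*.db' ! -perm -020 -print
}

# Make every state database under .dvc/tmp group-writable (0664).
fix_dvc_db_perms() {
    find "${1:-.}/.dvc/tmp" -type f -name '*.db' -exec chmod 0664 {} +
}

# Example: fix the current repo, then confirm nothing is left to fix.
# fix_dvc_db_perms . && list_non_group_writable_dbs .
```

The listing function prints nothing once all database files are group-writable, which makes it easy to use as a quick check after dvc init.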
We could potentially consider checking umask and overriding the permissions for those .db files in DVC, but it is not going to be high priority for us given that it’s a native sqlite3 issue that also affects Python in general.