Refactor existing project from single root .dvc to subprojects (dvc init --subdir)

I have an existing project that was initialized at the git root with a single .dvc directory.

Are there going to be any gotchas or unexpected behavior if I want to refactor this by running “dvc init --subdir” in a couple of subdirectories to create subprojects with their own cache directories and remotes? If I understand things correctly, I can:

  1. Update my local cache and make sure no one pushes changes to the remote while the steps below are in progress
  2. Run dvc init --subdir in a subdirectory that I want to turn into a subproject
  3. Set the subproject’s cache dir to the root project’s cache dir, and set its remote to the new remote for this subproject
  4. Run dvc push to copy the current versions of all files in the subproject to the new remote
  5. Change the subproject’s cache dir to its new path
  6. git commit and push
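
For reference, the steps above could be sketched as a small shell script. This is only a sketch under assumptions, not a tested migration: the function names (run_in, migrate_subproject), the remote name, the remote URL and both cache paths are hypothetical placeholders, and "update my cache" is read here as dvc fetch. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the migration steps listed above. All arguments are placeholders.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, unless in dry-run mode, run it inside the given directory.
run_in() {
    dir="$1"; shift
    echo "(cd $dir && $*)"
    if [ "$DRY_RUN" != "1" ]; then
        ( cd "$dir" && "$@" ) || return 1
    fi
}

# Usage: migrate_subproject SUBDIR REMOTE_NAME REMOTE_URL ROOT_CACHE NEW_CACHE
# Run from the git root and pass both cache paths as absolute paths.
migrate_subproject() {
    subdir="$1"; remote_name="$2"; remote_url="$3"
    root_cache="$4"; new_cache="$5"

    # 1. Bring the local cache up to date (assumed to mean `dvc fetch`).
    #    Asking the team not to push meanwhile has to happen outside this script.
    run_in . dvc fetch || return 1
    # 2. Turn the subdirectory into a subproject.
    run_in "$subdir" dvc init --subdir || return 1
    # 3. Temporarily share the root cache and register the new default remote.
    run_in "$subdir" dvc cache dir "$root_cache" || return 1
    run_in "$subdir" dvc remote add -d "$remote_name" "$remote_url" || return 1
    # 4. Copy the subproject's current data to the new remote.
    run_in "$subdir" dvc push || return 1
    # 5. Point the subproject at its own cache directory.
    run_in "$subdir" dvc cache dir "$new_cache" || return 1
    # 6. Record the new subproject configuration in git.
    run_in . git add "$subdir/.dvc" || return 1
    run_in . git commit -m "Split $subdir into a DVC subproject" || return 1
    run_in . git push || return 1
}
```

A dry run such as migrate_subproject sub1 subremote s3://example-bucket/sub1 /repo/.dvc/cache /repo/sub1/.dvc/cache prints the nine commands in order without touching anything. One thing to keep in mind after step 5: the new cache directory starts out empty, so a dvc fetch inside the subproject would probably be needed to repopulate it from the new remote.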

Anything I’m missing?

One confusing issue will be that dvc commands will have a different scope depending on whether a developer has checked out a commit from before or after this change.
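
To make the scope difference concrete: as far as I understand, DVC picks the project by looking for the nearest parent directory that contains a .dvc directory, and dvc root reports the result. The helper below is a hypothetical stand-in (nearest_dvc_root is not a DVC command) that imitates that lookup with plain shell, so the assumption about how the lookup works is mine.

```shell
# Hypothetical helper: print the nearest directory, starting at $1 (default:
# the current directory) and walking upwards, that contains a .dvc directory.
nearest_dvc_root() {
    dir="$(cd "${1:-.}" && pwd)" || return 1
    while :; do
        if [ -d "$dir/.dvc" ]; then
            echo "$dir"
            return 0
        fi
        # Reached the filesystem root without finding a .dvc directory.
        [ "$dir" = "/" ] && return 1
        dir="$(dirname "$dir")"
    done
}
```

On a commit from before the change, calling this inside the subdirectory would print the git root; on a commit from after the change, it would print the subdirectory itself, which is why the same dvc command can end up acting on a different set of files.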

Hi, @jmiller! Unfortunately, I haven’t seen the use case you are proposing before, so I can’t promise that there won’t be any issues.

However, the steps you are proposing sound reasonable. Please don’t hesitate to follow up with any problems you encounter.

The only gotcha might be the one you have already pointed out: you should clearly document the change and notify the users of the repo, using whatever mechanism you normally use in these cases, such as a git tag or a release.