I just tested DVC with some dummy data and have run into a situation which would be less than ideal for production. Is there an elegant way out?
- The workspace has only one DVC-tracked directory named “data/”
- There are two git branches. In each of them, we added and removed different files in “data/”
- Now, we want to merge both branches
- For files tracked directly by git, a simple git merge would yield the union of the file operations in both branches, unless there are conflicts
- But the merge produces a git conflict in data.dvc. I can’t really merge, only pick the data version from one of the branches
Given that the command “dvc diff” shows some very useful output, is there a way to merge both data versions semi-automatically? I have read the page https://dvc.org/doc/user-guide/how-to/merge-conflicts, but it only covers the “append-only” strategy and does not mention dvc diff at all :(.
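To make concrete what I mean by a semi-automatic merge, here is a minimal sketch of the three-way merge I have in mind. It assumes each version of “data/” can be reduced to a mapping from file path to content hash. The function name `merge_manifests` and the sample hashes are hypothetical and not part of DVC.

```python
def merge_manifests(base, ours, theirs):
    """Three-way merge of {path: hash} mappings (hypothetical helper, not DVC API).

    Returns (merged, conflicts), where conflicts lists the paths that
    both branches changed in different ways.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both branches agree (including "both deleted")
            winner = o
        elif o == b:      # only "theirs" touched this path
            winner = t
        elif t == b:      # only "ours" touched this path
            winner = o
        else:             # both branches changed it differently
            conflicts.append(path)
            continue
        if winner is not None:  # None means the file was deleted
            merged[path] = winner
    return merged, conflicts


# Made-up example: branch 1 adds c.csv, branch 2 removes b.csv.
base = {"a.csv": "h1", "b.csv": "h2"}
ours = {"a.csv": "h1", "b.csv": "h2", "c.csv": "h3"}
theirs = {"a.csv": "h1"}
merged, conflicts = merge_manifests(base, ours, theirs)
print(merged, conflicts)
```

In this example the result is the union of both file operations, and only paths that both branches changed differently would need a manual decision.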
P.S.: A side question: can “dvc diff” detect and highlight renamed files (since they have the same hash value)?
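The kind of rename detection I am thinking of could be sketched like this, again assuming both versions are available as path-to-hash mappings. `detect_renames` and the sample data are hypothetical, and I am not claiming this is how dvc diff works internally.

```python
def detect_renames(old, new):
    """Pair removed and added paths that share a content hash (hypothetical helper)."""
    removed = {p: h for p, h in old.items() if p not in new}
    added = {p: h for p, h in new.items() if p not in old}

    # Index the removed paths by hash so each added path can look up a match.
    by_hash = {}
    for path, digest in sorted(removed.items()):
        by_hash.setdefault(digest, []).append(path)

    renames = []
    for path, digest in sorted(added.items()):
        candidates = by_hash.get(digest)
        if candidates:  # same content disappeared elsewhere: treat as a rename
            renames.append((candidates.pop(0), path))
    return renames


# Made-up example: b.csv was renamed, a.csv is untouched.
old = {"data/a.csv": "h1", "data/b.csv": "h2"}
new = {"data/a.csv": "h1", "data/b_renamed.csv": "h2"}
renames = detect_renames(old, new)
print(renames)
```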