Improved parallel execution documentation


It seems like proper parallel stage execution is in the works, so this request might eventually be overtaken by events. At the moment, the way to run stages in parallel (when they have common dependencies) is to run them as experiments and then merge them back together. This process is a bit of a pain, so I am looking forward to proper parallel execution. In the meantime, the workaround could be improved with better documentation.

I have used this technique twice to train 10 components of an ensemble model. The first time, all the dvc.lock files got jumbled/reordered, so the merge was pretty involved. I ended up manually extracting the sections of the dvc.lock files for the specific stages I knew had run in each experiment branch and merging them into a common lock file. This ultimately worked but was an arduous process.

The second time, I did an octopus merge and everything combined seamlessly and easily. I'm not sure exactly what I did differently the second time to achieve this result, so I think some good documentation on how to go through this process would be great.
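For anyone stuck doing the manual merge I described, here is a minimal Python sketch of it. It is not a DVC feature or API: `merge_lock_stages`, the branch names, the stage names, and the hash values are all made up for illustration. It assumes each dvc.lock has already been parsed into a dict with a top-level `stages` mapping (for example with a YAML library such as PyYAML), and that you know which stages actually ran in each experiment branch.

```python
from copy import deepcopy


def merge_lock_stages(base_lock, branch_results):
    """Combine stage entries from several parsed dvc.lock dicts.

    base_lock: parsed dvc.lock from the common starting commit.
    branch_results: {branch_name: (parsed_lock, [stages that ran there])}.
    This is a hypothetical helper, not part of DVC.
    """
    merged = deepcopy(base_lock)
    stages = merged.setdefault("stages", {})
    claimed_by = {}
    for branch, (lock, stage_names) in branch_results.items():
        branch_stages = lock.get("stages", {})
        for name in stage_names:
            # Two branches claiming the same stage is a real conflict.
            if name in claimed_by:
                raise ValueError(
                    f"stage {name!r} claimed by {claimed_by[name]!r} and {branch!r}"
                )
            if name not in branch_stages:
                raise KeyError(f"stage {name!r} missing from lock of {branch!r}")
            stages[name] = deepcopy(branch_stages[name])
            claimed_by[name] = branch
    # Sort stage names so the result has one stable order (my choice here,
    # to avoid the reordering noise I hit during the first merge).
    merged["stages"] = dict(sorted(stages.items()))
    return merged


def fake_stage(out_md5):
    # Placeholder stage entry; real entries also list deps, params, sizes.
    return {"cmd": "python train.py", "outs": [{"path": "model.pkl", "md5": out_md5}]}


base = {
    "schema": "2.0",
    "stages": {
        "train_1": fake_stage("stale-1"),
        "prepare": fake_stage("prep"),
        "train_0": fake_stage("stale-0"),
    },
}
# Each experiment branch re-ran only one training stage.
exp0 = deepcopy(base)
exp0["stages"]["train_0"] = fake_stage("hash-from-exp-0")
exp1 = deepcopy(base)
exp1["stages"]["train_1"] = fake_stage("hash-from-exp-1")

merged = merge_lock_stages(
    base,
    {"exp-0": (exp0, ["train_0"]), "exp-1": (exp1, ["train_1"])},
)
```

The merged dict would then be written back out as dvc.lock. Raising an error when two branches claim the same stage is deliberate: that case needs a human decision rather than a silent overwrite.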

Great points. Could you please create an issue in the DVC website and documentation repository on GitHub? Contributions to the docs are also welcome, if you are keen.

I would contribute, but first I need to understand what made things go smoothly the second time. Was it because I merged all the experiment branches together at once with an octopus merge, rather than one at a time?

Oh, I see. Unfortunately, I can't explain that at the moment either.