Background
I am using dvc with tensorflow. I have two versions of my model which are different enough to merit their own git branches, i.e. they have different architecture definitions and different data generators.
To run experiments, I git checkout each branch, make my changes and queue up an experiment. I have two branches, so I end up queuing up 2 experiments. Then I call dvc exp run --run-all -j 2 to run them both in parallel.
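For concreteness, the workflow described above corresponds roughly to the sketch below. It assumes the experiments are queued with dvc exp run --queue and takes the branch names from the table further down. The DRY_RUN switch and the run and queue_all helpers are hypothetical names added for illustration: by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the queue-then-run workflow; not a definitive setup.
# DRY_RUN is a hypothetical switch: "1" (default) prints commands only,
# anything else executes them for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

queue_all() {
    for branch in branch_1 branch_2; do
        run git checkout "$branch"
        # Make the code or parameter changes for this branch here.
        run dvc exp run --queue
    done
    # Run every queued experiment, two at a time.
    run dvc exp run --run-all -j 2
}

queue_all
```

Running it with DRY_RUN=0 would execute the commands instead of printing them.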
This is basically what dvc exp show -a looks like (don’t pay attention to the metrics and params):
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Experiment ┃ Created ┃ auc.model_params ┃ auc.model_t_params ┃ auc.model_nt_params ┃ train_model.walltime ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ workspace │ - │ 10787481 │ 10782921 │ 4560 │ 3:00:00:00 │
│ branch_1 │ 12:48 PM │ 10787481 │ 10782921 │ 4560 │ 1:00:00:00 │
│ └── 508af88 [exp_1] │ 06:49 PM │ - │ - │ - │ 3:00:00:00 │
│ ├── 6d926b6 [exp_2] │ 06:47 PM │ - │ - │ - │ 3:00:00:00 │
│ branch_2 │ 06:22 PM │ 697998 │ 696878 │ 1120 │ 1:00:00:00 │
│ master │ Nov 17, 2021 │ - │ - │ - │ - │
└────────────────────────────┴──────────────┴──────────────────┴────────────────────┴─────────────────────┴──────────────────────┘
Question 1
How do I interpret the tree lines next to the experiment names? It looks like the exp_1 experiment correctly connects to branch_1, but why does the exp_2 experiment look like it connects to both branch_1 and branch_2? Did I set up these experiments incorrectly, or is this a display bug?
Question 2
My other more general question is: is this the intended workflow for managing experiments on multiple branches? Any comments on this question from a dvc expert would be very much appreciated.