Best practice for queuing experiments on code changes

pafonta · April 1, 2021, 1:26pm

Dear community,

Thank you very much for DVC and the new features in version 2!

What would be your recommendation on how to queue experiments which depend on code changes?

The use case is to queue, as an individual experiment, each change to the content of a Python file declared as a dependency (deps) of a DVC stage.

For example:

one is on a branch,
makes a change to a Python file,
runs dvc exp run --queue for a DVC stage that depends on this Python file,
then makes another change to the same Python file,
then runs dvc exp run --queue again for the same DVC stage,
and then runs dvc exp run --run-all.
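
The steps above can be sketched as a small script. This is only an illustration, not an official DVC recipe: the helper queue_code_variants, the file train.py, its contents, and the stage name train are all hypothetical. It defaults to a dry run that only lists the commands, so nothing is written or executed until dry_run=False is passed.

```python
import subprocess
from pathlib import Path


def queue_code_variants(source_file, variants, stage, dry_run=True):
    """Queue one DVC experiment per version of source_file, then run them all.

    Returns the commands in order. With dry_run=True nothing is written
    or executed, so the plan can be inspected first.
    """
    commands = []

    def run(cmd):
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)

    for content in variants:
        if not dry_run:
            # The uncommitted code change that the queued experiment should capture.
            Path(source_file).write_text(content)
        run(["dvc", "exp", "run", "--queue", stage])

    # Execute every queued experiment.
    run(["dvc", "exp", "run", "--run-all"])
    return commands


plan = queue_code_variants(
    "train.py",  # hypothetical Python file listed in the stage's deps
    ["LEARNING_RATE = 0.1\n", "LEARNING_RATE = 0.01\n"],  # hypothetical contents
    stage="train",  # hypothetical stage name
)
for cmd in plan:
    print(" ".join(cmd))
```

No git commit appears between the two queued runs, which is the point of the question.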

It seems that committing each change is an anti-pattern and not very usable.

But if the changes are not committed, git shows the changed file hanging around as modified, and the changes corresponding to an experiment then do not seem to be tracked.

The idea would be to replicate what is below but when the changes are not on a params.yaml file:

Thank you.

pmrowla · April 1, 2021, 3:23pm

Hi @pafonta, dvc exp run (with or without --queue) should work out of the box to generate experiments with your code changes. The feature is designed to work with any changes in your DVC/git repo; it is not specific to parameters (experimenting with modified parameters is just the most common use case).

But if the changes are not committed, git shows the changed file hanging around as modified, and the changes corresponding to an experiment then do not seem to be tracked.

Once you have run the experiments, you don’t need to keep the code changes in your workspace (and you do not need to commit them anywhere yourself). They will still be tracked as a part of the DVC experiment.

If/when you decide you want to keep one of your experiments, you can use dvc exp apply to re-apply the code changes back to your workspace.

The code changes for an experiment can also be retrieved directly in git - under the hood, experiments are essentially custom git branches. If you use dvc exp show --sha, you can see the git SHA for your experiments, and that SHA can be used in commands like git diff.
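
As a rough sketch of the commands mentioned in this reply, the snippet below builds the three invocations for listing, diffing, and applying an experiment. The helper experiment_commands and the values exp-1a2b3 and abc1234 are hypothetical placeholders; the real name and SHA would come from the output of dvc exp show --sha. The script only prints the commands and does not run them.

```python
def experiment_commands(exp_name, exp_sha, base_rev="HEAD"):
    """Build the commands used to inspect and keep a finished experiment."""
    return {
        # List experiments together with their git SHAs.
        "show": ["dvc", "exp", "show", "--sha"],
        # Compare the experiment's code against a base revision.
        "diff": ["git", "diff", base_rev, exp_sha],
        # Re-apply the experiment's changes to the workspace.
        "apply": ["dvc", "exp", "apply", exp_name],
    }


# Hypothetical experiment name and SHA, for illustration only.
cmds = experiment_commands("exp-1a2b3", "abc1234")
for step, cmd in cmds.items():
    print(f"{step}: {' '.join(cmd)}")
```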

pafonta · April 6, 2021, 9:26am

Hi @pmrowla,

Thank you very much for your prompt and detailed reply!

This helps me and solves my issue.
