Best practice for hyperparameter sweeps


I wonder what the best practice is for hyperparameter optimization, e.g. running a parameter sweep while capturing both the models and the metrics. I’d like to try 10 different values for the number of clusters (and a few other parameters as well), and then plot figures visualizing each clustering.

Do you just execute dvc run in a loop from a script? Then that script essentially becomes the “Makefile”?
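To make that concrete, here is a rough sketch of the kind of loop I mean (Python; the script name, dependencies, paths, and outputs are all made up for illustration):

```python
def sweep_commands(num_clusters_values):
    """Build one `dvc run` invocation per hyperparameter value.

    The stage script (cluster.py), input data, and output layout
    are hypothetical placeholders.
    """
    cmds = []
    for k in num_clusters_values:
        out_dir = f"results/clusters{k}"
        cmds.append(
            f"dvc run -d cluster.py -d data/input.csv "
            f"-o {out_dir} -f cluster_k{k}.dvc "
            f"'python cluster.py --num-clusters {k} --out {out_dir}'"
        )
    return cmds

if __name__ == "__main__":
    for cmd in sweep_commands(range(2, 12)):
        print(cmd)  # or execute it, e.g. via subprocess.run(cmd, shell=True)
```

In that case the Python script is effectively the “Makefile” driving the whole sweep.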

Also, do you create a new branch for each experiment, or just a new folder with a unique name?



Hi @alex,
We have some GitHub issues related to this subject. Here is one related to your question.
Also here:

Would you mind sharing your use case and describing how you would expect such a workflow to behave? The more feedback we gather, the faster we can develop an initial draft of how to proceed with this subject.


Thanks for the links, they address exactly the questions I had! Unfortunately, no solutions yet :slight_smile:

I’ve used a number of ad-hoc solutions, and I was not satisfied with any of them.

For every output file I would like to be able to find exactly how the file was produced, i.e. the version of the code (git commit), the hyperparameters (e.g. command-line parameters to the executables), and the data (I have several “datasets” that go through identical pipelines). I experimented with long file and directory names (called “experiment-as-a-folder” in issue #2532), including the values of all parameters in the name (e.g. summaryFigures/dataA_numclusters5_epochs10/). Much of this information is already in the DVC file, but the workflow is not “clicking” for me yet.
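Concretely, the naming scheme I experimented with looks roughly like this (a sketch; the helper name is my own):

```python
def experiment_dir(dataset, **params):
    """Encode a dataset name and all hyperparameter values into a
    directory name, so every output folder documents how it was made.

    experiment_dir("dataA", numclusters=5, epochs=10)
    -> "summaryFigures/dataA_numclusters5_epochs10"
    """
    suffix = "_".join(f"{name}{value}" for name, value in params.items())
    return f"summaryFigures/{dataset}_{suffix}"
```

This keeps the provenance in the path, but it duplicates information the DVC-file already records, and the names get unwieldy as the number of parameters grows.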

I think I’d like to separate a generic workflow/pipeline/DAG definition from each of its “instances” (with the hashes and all).

I will watch those two issues on GitHub.

Thanks again!


Hi Alex!

This is good insight! It would be great if you could share it in the issue tracker (under either of the issues @Paffciu mentioned).


@alex could you please elaborate on the suggested split (generic workflow vs instances)? Did you mean taking the md5 checksum values out of the DVC-files and saving them somewhere? What benefits do you see in doing this?

@shcheklein I don’t have a full picture yet, but I’d say that the DVC-files would still store the checksums. Maybe some generic pipeline would instantiate “runs”, save the run command, and track the command’s result in a DVC-file. I guess I’m looking for some tool that would auto-generate and manage a huge number of DVC-files.
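Something in the spirit of this sketch, say (the template string, file layout, and helper name are all hypothetical):

```python
def instantiate_run(template, dvc_dir, **params):
    """Render a generic pipeline template into one concrete run:
    the command to execute, plus the path of the auto-generated
    DVC-file that should track its result.
    """
    cmd = template.format(**params)
    tag = "_".join(f"{name}{value}" for name, value in params.items())
    return cmd, f"{dvc_dir}/{tag}.dvc"

# e.g. one instance of a generic clustering stage:
cmd, dvc_file = instantiate_run(
    "python cluster.py --data {data} --num-clusters {k}",
    "runs", data="dataA", k=5,
)
# cmd == "python cluster.py --data dataA --num-clusters 5"
# dvc_file == "runs/datadataA_k5.dvc"
```

The generic DAG definition lives in the template; each instance gets its own DVC-file with the concrete hashes.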


Thanks, @alex! So, do I understand correctly that this would be some DSL or Bash script or whatnot to actually execute the runs (instantiate the DAG)? Could you take a look at this ticket? What do you think?

Yes, ticket #1018 is very relevant. I found no definitive solution or recommendation there, but there are very useful references to snakemake and makepp.

@alex we’ve prioritized the issue and moved the discussion here. Please, feel free to chime in. We would really appreciate your involvement.