Best practice for hyperparameters sweep

alex · October 7, 2019, 12:33pm

Thanks for the links, they address exactly the questions I had! Unfortunately, there are no solutions yet.

I’ve tried a number of ad-hoc solutions, and I was not satisfied with any of them.

For every output file, I would like to be able to find out exactly how it was produced: the version of the code (git commit), the hyperparameters (e.g. command-line parameters to the executables), and the data (I have several “datasets” that go through identical pipelines). I experimented with long file and directory names (called “experiment-as-a-folder” in issue #2532) that include the values of all parameters (e.g. summaryFigures/dataA_numclusters5_epochs10/). Much of this information is already in the DVC file, but the workflow is not “clicking” for me yet.
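To make the folder-naming idea concrete, here is a rough Python sketch of one of those ad-hoc approaches, not a DVC feature. The function names and the provenance.json file are made up for illustration; only the folder name pattern comes from the example above. It assumes the script runs inside a git checkout and falls back to "unknown" otherwise.

```python
import json
import subprocess
from pathlib import Path


def current_commit() -> str:
    """Return the current git commit hash, or "unknown" outside a repository."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def run_dir_name(dataset: str, params: dict) -> str:
    """Build a name like dataA_numclusters5_epochs10 (keys in the order given)."""
    return "_".join([dataset] + [f"{key}{value}" for key, value in params.items()])


def write_provenance(root: Path, dataset: str, params: dict) -> Path:
    """Create the output folder and record how its contents were produced."""
    out_dir = root / run_dir_name(dataset, params)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical manifest: code version, data, and hyperparameters in one place.
    record = {"commit": current_commit(), "dataset": dataset, "params": params}
    (out_dir / "provenance.json").write_text(json.dumps(record, indent=2))
    return out_dir
```

Calling write_provenance(Path("summaryFigures"), "dataA", {"numclusters": 5, "epochs": 10}) would create summaryFigures/dataA_numclusters5_epochs10/ with the manifest inside. The names get unwieldy quickly as parameters are added, which is part of why I was not happy with it.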

I think I’d like to separate a generic workflow/pipeline/DAG definition from each of its “instances” (with the hashes and all).
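Roughly what I have in mind, as a Python sketch rather than anything DVC provides today: a template holds the stage commands with placeholders, and each instance fills them in and gets its own hash. The class names, stage names, and script names (cluster.py, plot.py) are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineTemplate:
    """Generic pipeline definition: stage name -> command with {placeholders}."""
    stages: dict


@dataclass(frozen=True)
class PipelineInstance:
    """One concrete run of a template, identified by a hash of its inputs."""
    instance_id: str
    commands: dict


def instantiate(template: PipelineTemplate, dataset: str, params: dict) -> PipelineInstance:
    """Fill in the placeholders and derive a stable id from dataset and parameters."""
    values = {"dataset": dataset, **params}
    commands = {name: cmd.format(**values) for name, cmd in template.stages.items()}
    # sort_keys makes the id independent of the order the parameters were given in.
    payload = json.dumps(values, sort_keys=True).encode()
    return PipelineInstance(hashlib.sha256(payload).hexdigest()[:12], commands)


# Hypothetical two-stage pipeline shared by all datasets.
template = PipelineTemplate(stages={
    "cluster": "python cluster.py --data {dataset} --numclusters {numclusters}",
    "plot": "python plot.py --data {dataset} --epochs {epochs}",
})
run_a = instantiate(template, "dataA", {"numclusters": 5, "epochs": 10})
```

The template would be written once and versioned with the code, while each instance (run_a here) would carry the hashes that tie its outputs to one dataset and one set of parameters.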

I will watch those two issues on GitHub.

Thanks again!
Alex
