Programmatic access to experiments

gregstarr · January 18, 2023, 12:38am

Hello,

What is the best way to access experiments programmatically? I want to write a script that goes through all of the experiments on a particular ref, e.g. HEAD, and collects data from the files there. The files could be Git- or DVC-tracked. My pipeline produces a lot of output, more than I would want to put in a metrics JSON file. It seems like DVCFileSystem might do this? Would I just need to get a list of all the experiment revs, then loop over them and create a DVCFileSystem for each one?

Thanks,
Greg

kupruser · January 18, 2023, 5:06pm

Hey Greg,

Currently there is no experiment-specific API like that, but if all you need is an fs-like interface to experiments, then yes, creating a DVCFileSystem for each rev would do the trick. Please let us know how it goes or if you run into any problems.

gregstarr · February 1, 2023, 7:31pm

Thanks for the advice.

This is what I ended up doing:

from dvc.api import DVCFileSystem
from dvc.repo import Repo

repo = Repo(".")
# determine the current branch
branch = repo.scm.active_branch()
# get the experiments based on this branch's commit
experiments = repo.experiments.ls()[branch]
for exp_name in experiments:
    print(f"collecting data for {exp_name}")
    # make a DVC file system for this experiment
    fs = DVCFileSystem(rev=exp_name)

Then I access files through the DVC file system, e.g. with pandas.read_csv.
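
For anyone landing here later, a minimal sketch of that last step, reading one CSV file per experiment through the file system. The helper name `collect_rows`, the `fs_factory` parameter, and the file path are hypothetical and not part of DVC; the only DVC calls assumed are `DVCFileSystem(rev=...)` and `fs.open(...)`. It uses the standard-library `csv` module so it has no extra dependencies, but the same open file object can be passed to pandas.read_csv instead.

```python
import csv
import io


def collect_rows(revs, path, fs_factory=None):
    """Return {rev: list of row dicts} for the CSV file at `path` in each rev."""
    if fs_factory is None:
        # Imported lazily so the helper can be tried without DVC installed.
        from dvc.api import DVCFileSystem

        def fs_factory(rev):
            return DVCFileSystem(rev=rev)

    rows_by_rev = {}
    for rev in revs:
        fs = fs_factory(rev)
        # Open in binary mode and decode explicitly, so the file system's
        # default open mode does not matter.
        with fs.open(path, "rb") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            rows_by_rev[rev] = list(csv.DictReader(text))
    return rows_by_rev
```

With the loop above, this would be called as something like `collect_rows(experiments, "results.csv")`, where `results.csv` stands in for whatever file the pipeline writes.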
