Programmatic access to experiments


What is the best way to programmatically access experiments? I want to write a script that goes through all of the experiments on a particular ref, e.g. HEAD, and collects some data from the files there. The files could be Git-tracked or DVC-tracked. My pipeline produces a lot of output, more than I would want to put in a metrics JSON. It seems like DVCFileSystem might do this? Would I just need to get a list of all the experiment revs, then loop over them and create a DVCFileSystem for each one?


Hey Greg,

Currently there is no exp-specific API like that, but if all you need is an fs-like interface to experiments then yes, creating a DVCFileSystem for each rev would do the trick. Please let us know how it goes for you or if you run into any problems.

Thanks for the advice.

This is what I ended up doing:

from dvc.api import DVCFileSystem
from dvc.repo import Repo

repo = Repo(".")
# determine current branch
branch = repo.scm.active_branch()
# revs to inspect (here just the current branch)
experiments = [branch]
for exp_name in experiments:
    print(f"collecting data for {exp_name}")
    # make a DVC file system for this rev
    fs = DVCFileSystem(rev=exp_name)

Then I access the files through the DVC file system, e.g. with pandas.read_csv.
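
For anyone finding this later, here is a rough sketch of how the file-reading step might look. It is not an official DVC recipe, just one way to wrap the loop above. The names collect_rows and default_make_fs, and the path results.csv in the usage note, are made up for illustration. The sketch uses the standard-library csv module rather than pandas to stay dependency-free, and it assumes the file is a CSV with a header row.

```python
import csv


def default_make_fs(rev):
    # Hypothetical helper: DVC is imported lazily so that collect_rows
    # can also be used with any other object offering exists() and open().
    from dvc.api import DVCFileSystem

    # Same call as in the loop above: current repo, given rev.
    return DVCFileSystem(rev=rev)


def collect_rows(revs, path, make_fs=default_make_fs):
    """Return {rev: list of CSV rows as dicts} for `path` in each rev.

    Revs in which `path` does not exist are skipped.
    """
    results = {}
    for rev in revs:
        fs = make_fs(rev)
        if not fs.exists(path):
            continue
        # Text mode, so csv.DictReader receives str lines.
        with fs.open(path, "r") as f:
            results[rev] = list(csv.DictReader(f))
    return results
```

Calling collect_rows(experiments, "results.csv") inside a DVC repo should then give one list of rows per rev. With pandas, the csv.DictReader line could be swapped for pandas.read_csv(f).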