I have a couple of DVC projects for ML model development.
I have been using the experiment queue quite a lot, and all of a sudden queued experiments have stopped working.
I can run an experiment to completion with dvc exp run.
But the following sequence fails:
dvc exp run --queue
dvc queue start
dvc queue logs {exp_id}
> ERROR: No output logs found for experiment {exp_id}
The queued experiment fails a few seconds after starting. This happens across all my DVC projects, which run in different Python environments.
I have access to two Macs: a 2021 MacBook Pro and a 2024 M4 Mac mini.
The MacBook Pro works.
The Mac mini shows the same issue you describe.
I tried re-creating the pyenv environment on the Mac mini by piping pip list into a requirements.txt and then running pip install -r requirements.txt on the mini. I’m using python==3.12.5 on both machines, so the packages and the Python version are the same.
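One way to double-check that the two environments really match is to diff the package listings programmatically. The following is only a minimal sketch, assuming each machine's listing was saved in name==version form (for example from pip freeze); the function names parse_freeze and diff_envs are illustrative and not part of DVC or pip.

```python
def parse_freeze(text):
    """Map package name -> version from `name==version` lines."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything not pinned with "=="
        # (editable installs, URLs, table headers from plain `pip list`).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Normalize case and underscores so "Foo_Bar" and "foo-bar" compare equal.
        packages[name.strip().lower().replace("_", "-")] = version.strip()
    return packages


def diff_envs(left_text, right_text):
    """Return {package: (left_version, right_version)} for every mismatch.

    A version of None means the package is missing on that side.
    """
    left, right = parse_freeze(left_text), parse_freeze(right_text)
    diffs = {}
    for name in sorted(set(left) | set(right)):
        if left.get(name) != right.get(name):
            diffs[name] = (left.get(name), right.get(name))
    return diffs
```

To use it, read the two saved listings (for example with pathlib.Path(...).read_text(), with whatever file names you chose) and pass both strings to diff_envs. An empty result means the pinned Python packages agree, which would point the search toward something outside the Python environment.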
This was attempted on the same repo on both machines.
I run dvc exp run --queue --set-param "train.epochs=7,8" and then dvc queue start. When I check dvc queue status, both jobs have failed immediately, with no logs. Running the same set of commands on my MacBook works, and the jobs run.
I have also tried starting the jobs with dvc queue start --jobs 1 --verbose, but that gave no new information.
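To make the comparison between the two machines repeatable, the sequence above can be wrapped in a small script that captures each command's exit code and output. This is a rough sketch rather than anything DVC provides: repro_commands and run_repro are hypothetical helper names, the 10-second pause before checking status is an arbitrary assumption, and the script only uses the commands and flags already quoted in this thread.

```python
import subprocess
import time


def repro_commands(param_spec="train.epochs=7,8", jobs=1):
    """Build the command sequence from this thread as argv lists."""
    return [
        ["dvc", "exp", "run", "--queue", "--set-param", param_spec],
        ["dvc", "queue", "start", "--jobs", str(jobs), "--verbose"],
        ["dvc", "queue", "status"],
    ]


def run_repro(repo_dir=".", wait_seconds=10):
    """Run the sequence inside repo_dir and collect (command, exit code, output)."""
    outputs = []
    commands = repro_commands()
    for index, cmd in enumerate(commands):
        if index == len(commands) - 1:
            # Assumption: give the queue worker a moment before asking for status.
            time.sleep(wait_seconds)
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, text=True)
        outputs.append((" ".join(cmd), result.returncode, result.stdout + result.stderr))
    return outputs
```

Calling run_repro() from the root of the same repo on each machine and saving the returned tuples gives two transcripts that can be compared line by line.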
The dvc doctor output differs slightly between the two machines. The packages are the same, but the configs are slightly different: