Hello everyone, I have a git repo ({AI_Models_Repo}) in which I use dvc to track AI models. I used to have a single remote, linked to storagebox (ssh). I very recently added an S3 remote called s3-storage.
I have a second repo, ({B}), from which I want to pull some of these models when needed. After reading the documentation, it seemed that “dvc get” is what I needed. I also want to exclusively use the S3 remote from that repo to get the models.
The following command works fine:
dvc get {AI_Models_Repo} {model_path} --rev {REVISION} -o {output_path}
But it pulls the model I need from storagebox, so I added the --remote s3-storage flag:
dvc get {AI_Models_Repo} {model_path} --rev {REVISION} -o {output_path} --remote s3-storage
This fails with the following error:
ERROR: unexpected error - [Errno 2] No storage files available: 'models/parts/yolov11/runs/segment/train/weights/best.pt'
If I remove the --rev {REVISION} flag, it works and gets the latest model from the S3 remote.
I’ve manually checked my S3 bucket (aws s3 ls), and I found the md5sum of the model that corresponds to the version (revision) I am trying to pull there.
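A check like this can be sketched as below. It assumes the DVC 3.x remote layout, where an object is stored under files/md5/ followed by the first two characters of the hash and then the rest (older 2.x remotes are assumed to omit the files/md5/ prefix). The function name dvc_remote_key and the bucket and prefix placeholders are hypothetical, not part of DVC.

```shell
#!/bin/sh
# Map an md5 hash to the object key DVC is assumed to use on the remote.
# Assumption: DVC 3.x layout "files/md5/<first 2 chars>/<rest>".
# DVC 2.x remotes are assumed to use "<first 2 chars>/<rest>" instead.
dvc_remote_key() {
    md5=$1
    prefix=$(printf '%s' "$md5" | cut -c1-2)   # first two hex characters
    rest=$(printf '%s' "$md5" | cut -c3-)      # remainder of the hash
    printf 'files/md5/%s/%s\n' "$prefix" "$rest"
}

# Hypothetical usage, run from {AI_Models_Repo} (placeholders left as-is):
#   md5=$(git show {REVISION}:{model_path}.dvc | awk '/md5:/ {print $NF; exit}')
#   aws s3 ls "s3://<bucket>/<prefix>/$(dvc_remote_key "$md5")"
```

If the listing returns the object, the file for that revision is present under the key DVC would look up, which is stronger than just seeing the hash somewhere in the bucket.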
I am not sure how to fix this. I think I am confusing a few things about dvc get:
- Is it using the .dvc/config file of the {AI_Models_Repo} repository, instead of the .dvc/config file of the repo I am running “dvc get” from ({B})?
- If yes, is there a way to override this behavior? I want to always use ({B})/.dvc/config when I run dvc get from within ({B}).
- Do I need to run any additional commands from the {AI_Models_Repo} repo to make sure all previous versions of my models are correctly uploaded to my new S3 remote and synced?
Thanks for your help and time!