I am new to DVC and am evaluating it in a proof-of-concept implementation for our ML projects, where it seems to fit perfectly! However, I have run into a problem with dvc.api.open when using Azure Blob Storage.
I am able to use dvc push and dvc pull, but when I call dvc.api.read I get “AttributeError: ‘NoneType’ object has no attribute ‘account_key’” (see attached screenshots). If the file has already been downloaded with dvc pull and is available in the cache folder, everything works.
Can anyone point out the problem or my misunderstanding? I want to use the streaming functionality, as we have very large files and do not want to store them on the disk of a virtual machine.
It looks like the problem here is related to a known limitation in DVC, where local config settings (such as those set via dvc remote modify --local ...) are not used by certain DVC commands, including api.open() and api.read().
The good news is that this limitation has been addressed in an upcoming release; however, we are not currently planning to backport these changes to DVC 1.11.x.
Would you mind trying the pre-release version from pip (see: https://dvc.org/blog/dvc-2-0-pre-release#install) and checking whether that resolves your issue? When you install the pre-release version, don’t forget to also install the Azure dependency.
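Once the pre-release is installed, a quick check could look something like the sketch below. It reads a tracked file in chunks through dvc.api.open and hashes it, so nothing is written to local disk. The helper name stream_sha256 and its opener parameter are made up for illustration and are not part of DVC. The sketch also assumes the Azure dependency is present (for example via the dvc[azure] extra).

```python
import hashlib


def stream_sha256(path, repo=None, remote=None, chunk_size=1024 * 1024, opener=None):
    """Hash a DVC-tracked file chunk by chunk without saving it locally.

    `opener` is a hypothetical hook (not a DVC feature) that lets the sketch
    be exercised without a real remote; by default it is dvc.api.open.
    """
    if opener is None:
        # Imported lazily so the helper can be tried out without DVC installed.
        import dvc.api

        opener = dvc.api.open

    digest = hashlib.sha256()
    # mode="rb" yields bytes, so large binary files are never decoded as text.
    with opener(path, repo=repo, remote=remote, mode="rb") as fd:
        # Read fixed-size chunks until the stream returns an empty bytes object.
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A call might then look like stream_sha256("data/large_file.bin", remote="azure_remote"), where both the path and the remote name are placeholders for your own values. If the fix works for your setup, this should return a hash instead of raising the AttributeError.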