I’m trying to trigger a pipeline that runs DVC and downloads data from GCP Storage, but the GitHub Actions log returns the following error:
ERROR: unexpected error - Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401
I think this usually happens when the Service Account lacks the right permissions, but the one I’m using has the Storage Object Viewer role, which grants the permission I need.
Here is part of my pipeline file:
- name: Setup Cloud SDK
  uses: google-github-actions/setup-gcloud@v0.2.0
  with:
    project_id: ${{ secrets.GCP_PROJECT }}
    service_account_key: ${{ secrets.GCP_KEY }}
    export_default_credentials: true
- name: CML Run
  shell: bash
  env:
    repo_token: ${{ secrets.GITHUB_TOKEN }}
    GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GCP_KEY }}
  run: |
    # run-cache and reproduce pipeline
    dvc remote add -d -f myremote gs://myproject/
    dvc pull mypath/data.csv.zip.dvc
    dvc repro -m
    # Report metrics
    echo "## Metrics" >> report.md
    git fetch --prune
    dvc metrics diff main --show-md >> report.md
    # Publish confusion matrix diff
    echo -e "## Plots\n### Confusion Matrix" >> report.md
    cml-publish $PWD/mypath/reports/confusion-matrix.png --md >> report.md
    cml-send-comment report.md
Can you please share the verbose traceback from the command that raises that error? It is also possible that the Google Storage backend we use doesn’t recognize the login method and is falling back to anonymous authentication, so setting credentialpath to the token file might work (via dvc remote modify --local).
Sure! Here is the traceback
Run # run-cache and reproduce pipeline
Setting 'mlops-talks' as a default remote.
_request non-retriable exception: Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401
Traceback (most recent call last):
  File "/home/runner/.local/lib/python3.8/site-packages/gcsfs/retry.py", line 110, in retry_request
    return await func(*args, **kwargs)
  File "/home/runner/.local/lib/python3.8/site-packages/gcsfs/core.py", line 332, in _request
    validate_response(status, contents, path)
  File "/home/runner/.local/lib/python3.8/site-packages/gcsfs/retry.py", line 97, in validate_response
    raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401
ERROR: unexpected error - Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
Error: Process completed with exit code 255.
It seems possible that gcsfs is not recognizing your credentials and is falling back to the anonymous login. Can you try the same command after setting credentialpath to your service account file, and test it again?
Sure! I did this but here is another error:
ERROR: unrecognized arguments: *** *** *** *** *** *** *** *** *** *** ***
usage: dvc remote modify [-h] [--global | --system | --project | --local]
[-q | -v] [-u]
name option [value]
Here is the command I ran with credentialpath:
dvc remote modify --local myremote credentialpath $GOOGLE_APPLICATION_CREDENTIALS
credentialpath takes the path where the credentials file is stored. From what I understand, $GOOGLE_APPLICATION_CREDENTIALS holds the contents of that file, so you should probably write it out to a temporary file (e.g. /tmp/creds.json) and then pass that file as the argument.
Well, I’m doing this in GitHub Actions, which is why I’m using $GOOGLE_APPLICATION_CREDENTIALS: it is a secret holding the contents of that JSON file. Can I configure the credentials using GitHub secrets?
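One way this might be done with a GitHub secret is sketched below. It assumes the secret is exposed to the step under a hypothetical environment variable GCP_KEY_JSON (for example via env: GCP_KEY_JSON: ${{ secrets.GCP_KEY }}); write_creds_file is an illustrative helper, not part of DVC or CML. The key contents are written to a file, and the path of that file, rather than the JSON itself, is passed to credentialpath. Quoting the variables matters here: the "unrecognized arguments" error above is consistent with an unquoted variable holding JSON being split into many separate words.

```shell
# Hypothetical helper: write the service account JSON passed as $1 to a file
# ($2, defaulting to /tmp/creds.json) and print the path of that file.
# The body runs in a subshell so the umask change does not leak to the caller.
write_creds_file() (
  creds_json="$1"
  creds_path="${2:-/tmp/creds.json}"
  umask 077                                   # new files readable by the owner only
  printf '%s' "$creds_json" > "$creds_path"
  printf '%s\n' "$creds_path"
)

# GCP_KEY_JSON is an assumed variable name holding the secret's contents.
# The block is skipped when dvc is not installed or the variable is empty.
if command -v dvc >/dev/null 2>&1 && [ -n "${GCP_KEY_JSON:-}" ]; then
  creds_path="$(write_creds_file "$GCP_KEY_JSON")"
  # Point the remote at the file path, not at the JSON contents.
  dvc remote modify --local myremote credentialpath "$creds_path"
  # Libraries that read this variable also expect a path.
  export GOOGLE_APPLICATION_CREDENTIALS="$creds_path"
fi
```

These lines would go in the run: block after dvc remote add and before dvc pull, since the remote has to exist before it can be modified.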