ERROR: failed to pull data from the cloud - CI-CD pipeline

Hi,
I am getting the error “ERROR: failed to pull data from the cloud - Checkout failed for following targets:” for some files. The details are below.

I am creating a CI-CD pipeline for an ML model. The commands in the workflow YAML file are:

    - name: Dvc pull updated files
      run: |
         dvc pull -v
    - name: Run pipeline
      run: |
         dvc repro -f

The run fails with:

2023-02-21 17:58:37,096 ERROR: failed to pull data from the cloud - Checkout failed for following targets:
/home/runner/work/MLOTest/MLOTest/data/processed/churn_train.csv
/home/runner/work/MLOTest/MLOTest/data/processed/churn_test.csv
/home/runner/work/MLOTest/MLOTest/models/model.joblib
/home/runner/work/MLOTest/MLOTest/data/external/train.csv
/home/runner/work/MLOTest/MLOTest/data/raw/train.csv

dvc version: 2.45.1

Please advise on how to resolve this issue.

Thanks,
Veerendra

Hi @Veerendra, if this is the first time you are running the pipeline, the error is expected because there are no files in the remote yet.

There is a feature request for a flag to ignore those errors: "dvc checkout --ignore-missing; build workspace even if some data is missing in local cache" (Issue #4746, iterative/dvc on GitHub).

As a workaround, you can update your workflow to ignore the error and still run the pipeline:

    - name: Dvc pull updated files
      continue-on-error: true
      run: | 
         dvc pull -v 
    - name: Run pipeline
      run: |
         dvc repro -f 

Or instead, run dvc pull -v ||
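
The command above is cut off in the post. As a rough sketch only, assuming it was meant to end with a shell fallback such as `|| true` (my assumption, not something the thread confirms), the step might look like this:

```yaml
    - name: Dvc pull updated files
      run: |
         # "|| true" is an assumed completion of the truncated command above:
         # it makes the step exit with status 0 even if dvc pull fails
         dvc pull -v || true
    - name: Run pipeline
      run: |
         dvc repro -f
```

With `continue-on-error: true` the pull step is still recorded as failed while the job keeps going. With a shell fallback the step itself reports success, so a real pull failure is only visible in the log output.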


The problem is fixed. Thank you Daavoo.

Thank you. The issue is resolved.