DVC external output: "dvc add" fails after changing the file data in the remote

I know that after changing data tracked by DVC, we can run “dvc add” and then “dvc push” to upload the new version to remote storage, with the updated .dvc file committed to Git. I can then use the different versions of the .dvc files to get the data back with “dvc pull”.

When I try to do the same with a DVC external output, it does not work.
I’m using remote storage and a remote cache, and I add the remote data with the “add --external” command.

  • dvc add --external remote://s3remote/wine-quality.csv # tracks data, creates a cache folder in remote
  • git push

Now I change the data in the remote location and want to track the new changes:

  • dvc add --external remote://s3remote/wine-quality.csv
    [I’m using a self-hosted MinIO S3 bucket. To make changes, I delete the data file and upload a new one with the same name and the changed data.]
    This fails with the following error:

FYI,

  • config file
    [cache]
    s3 = s3cache
    ['remote "s3remote"']
    url = s3://datasource-bucket/
    endpointurl = http://localhostminio:10009/
    access_key_id = user
    secret_access_key = password
    use_ssl = false
    ['remote "s3cache"']
    url = s3://datasource-bucket/cache/
    endpointurl = http://localhostminio:10009/
    access_key_id = user
    secret_access_key = password
    use_ssl = false

It seems you initially added the data from a folder named remoteTrack and are now trying to add it again from a different folder. DVC then tries to create a new stage and fails because a stage for this output already exists. You can either go to the remoteTrack\ folder and run the same command there, or point DVC at the stage file that belongs to the previous stage via --file remoteTrack\wine-quality.csv.dvc.
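
As a rough sketch, the two options could look like the script below. It assumes a POSIX shell (so the path uses a forward slash instead of the Windows backslash), that the stage file sits directly in remoteTrack, and a DVC version whose "dvc add" accepts --file. DRY_RUN, OPTION and run_in are hypothetical helpers added only for illustration; with the default DRY_RUN=1 the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only. DRY_RUN, OPTION and run_in() are hypothetical helpers, not part of DVC.
# With DRY_RUN=1 (the default here) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
OPTION="${OPTION:-2}"

# run_in DIR CMD...: run CMD from inside DIR without changing the caller's directory.
run_in() {
    dir="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Names taken from the thread; the forward slash is an assumption for POSIX shells.
TARGET="remote://s3remote/wine-quality.csv"
STAGE_DIR="remoteTrack"
STAGE_FILE="$STAGE_DIR/wine-quality.csv.dvc"

case "$OPTION" in
    1)
        # Option 1: re-run the same command from the folder that holds the stage file.
        run_in "$STAGE_DIR" dvc add --external "$TARGET"
        ;;
    2)
        # Option 2: stay in the current folder and name the existing stage file.
        run_in . dvc add --external --file "$STAGE_FILE" "$TARGET"
        ;;
esac
```

Either option should update the existing wine-quality.csv.dvc instead of creating a second stage for the same output. Set DRY_RUN=0 to actually run the chosen command, then commit the changed .dvc file to Git.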


Thanks @isidentical 🙂