stages:
  gather_pool_list:
    cmd: python pipeline_scripts/gather_pool_list.py
      --data_dir ${general.data_dir}
      ${gather_pool_list}
    deps:
      - pipeline_scripts/gather_pool_list.py
      - code_data_collecting/data_fetcher.py
    outs:
      - pipeline_data/gather_pool_list/pools.csv
      - s3://general-info.pool.csv:
          cache: false
Running dvc repro -s gather_pool_list -v raises:
2025-02-12 12:16:32,455 ERROR: failed to reproduce 'gather_pool_list': 'x-amz-bucket-region'
…
  File "C:\Users\grshn.conda\envs\torch_python39\lib\site-packages\s3fs\core.py", line 359, in get_s3
    return await self._s3creator.get_bucket_client(bucket)
  File "C:\Users\grshn.conda\envs\torch_python39\lib\site-packages\s3fs\utils.py", line 53, in get_bucket_client
    region = response["ResponseMetadata"]["HTTPHeaders"]["x-amz-bucket-region"]
KeyError: 'x-amz-bucket-region'
I’m using Yandex S3 (not Amazon).
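The traceback suggests that the lookup on line 53 of s3fs/utils.py fails because the response from a non-AWS endpoint does not carry the x-amz-bucket-region header. A minimal sketch of that failure mode follows. It is not s3fs code: the bucket_region helper and both response dictionaries are hypothetical and only mirror the dictionary access shown in the traceback.

```python
def bucket_region(response):
    """Mirror the dictionary lookup shown in the traceback (hypothetical helper)."""
    return response["ResponseMetadata"]["HTTPHeaders"]["x-amz-bucket-region"]


def try_region(response):
    """Return the region, or a short description of the KeyError if the header is absent."""
    try:
        return bucket_region(response)
    except KeyError as exc:
        return f"KeyError: {exc}"


# Hypothetical responses for illustration; not captured from a real service.
aws_like = {"ResponseMetadata": {"HTTPHeaders": {"x-amz-bucket-region": "us-east-1"}}}
no_region_header = {"ResponseMetadata": {"HTTPHeaders": {"server": "nginx"}}}

print(try_region(aws_like))
print(try_region(no_region_header))
```

If this reading is right, the credentials themselves are probably fine, and the problem is the bucket-region detection step that runs when the bucket is addressed through DVC.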
dvc status fails the same way:
2025-02-12 12:17:34,663 ERROR: unexpected error - 'x-amz-bucket-region'
  File "C:\Users\grshn.conda\envs\torch_python39\lib\site-packages\s3fs\core.py", line 359, in get_s3
    return await self._s3creator.get_bucket_client(bucket)
  File "C:\Users\grshn.conda\envs\torch_python39\lib\site-packages\s3fs\utils.py", line 53, in get_bucket_client
    region = response["ResponseMetadata"]["HTTPHeaders"]["x-amz-bucket-region"]
KeyError: 'x-amz-bucket-region'
From the same environment I can access the files through s3fs directly:
import s3fs
s3 = s3fs.S3FileSystem(anon=False)
s3.ls('general-info')
returns:
['general-info/pools.csv']
The AWS CLI also works.
How can I use an external output? I also tried using S3 dependencies and got the same error.