Recovering pushed files after losing .dvc reference

Let’s say I’m using DVC and I:

  1. dvc add my_folder/
  2. dvc push
  3. Delete my_folder.dvc
  4. Delete all DVC cache contents

Without the reference provided by my_folder.dvc, and with no backup in Git, is it still possible for me to see whether my_folder has been uploaded and to list its contents? Once I have listed this folder and some of its details, can I also download its contents?

Hi @Seanny123, yes, it’s possible, though it might not be easy (depending on the size of the remote storage). I wrote an answer on StackOverflow describing how to recover directory content from the DVC cache, and since the remote structure is the same, you can apply it in more or less the same way.

Please read it here, check the comments, and let us know if you have any other questions.

That question seems to cover the scenario where you still have a cache. However, if I’ve deleted the cache, is there any way to recover the files from AWS S3?

Hi @Seanny123, the cache directory itself is what gets pushed to remote storage. I.e. what you see on the S3 remote is a copy of the local cache. (It may be a partial copy though, depending on how dvc push was used.)
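To make that concrete, here is a rough sketch of how the remote could be searched for a lost directory. It assumes the usual cache layout, where a tracked directory is stored as an object whose key ends in .dir and whose body is a JSON list of entries with "md5" and "relpath" fields, and where each file object sits under a two-character prefix taken from its hash. The helper names (find_dir_manifests, parse_manifest, object_key) are made up for illustration and are not part of DVC. Check the layout against your own bucket, since it can differ between DVC versions.

```python
import json
from pathlib import PurePosixPath


def find_dir_manifests(keys):
    """Return the object keys that look like DVC directory manifests (*.dir)."""
    return [key for key in keys if key.endswith(".dir")]


def parse_manifest(text):
    """Map each relative path listed in a .dir manifest to its content hash.

    Assumes the manifest body is a JSON list of {"md5": ..., "relpath": ...}.
    """
    return {entry["relpath"]: entry["md5"] for entry in json.loads(text)}


def object_key(md5, manifest_key):
    """Build the key of a file object, reusing the manifest's own base prefix.

    Assumes keys look like "<base>/<first 2 hash chars>/<remaining chars>".
    """
    base = PurePosixPath(manifest_key).parent.parent
    return str(base / md5[:2] / md5[2:])
```

To use it against S3, you could list the bucket's keys (for example with boto3's list_objects_v2), pass them to find_dir_manifests, download each manifest, and inspect the relative paths from parse_manifest until you recognise my_folder. object_key then gives the key to download for each file. On a large remote there may be many .dir objects to go through, which is the "not easy" part.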
