Hi,
I am trying out dvc for one of my ML pipeline poc. I see dvc add command to keep track of changes in data files. How do i revert back to an older version of data files using dvc cli?
Thanks.
After changing the file content, you can return to any version of your data file using a combination of two commands:
git checkout COMMIT
which reverts all code and all dvc metafiles (*.dvc), and
dvc checkout
which gets all the corresponding data files from your cache.
Example of getting the previous commit:
$ git checkout HEAD~1
$ dvc checkout
Checking out an old commit directly can leave Git in the “detached HEAD” state. To avoid this, create a branch when you jump to an old commit:
$ git checkout HEAD~1 -b original_dataset
$ dvc checkout
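The two steps above can be wrapped in a small shell function. This is only a sketch: the name restore_data_version and its arguments are made up for illustration and are not part of dvc or git. It assumes you run it from inside a repository where the data was added with dvc add and the .dvc files are committed.

```shell
# Hypothetical helper (not part of dvc): restore code, .dvc files and data
# as of a given Git revision.
# Usage: restore_data_version <git-revision> [new-branch-name]
restore_data_version() {
    rev="$1"
    branch="${2:-}"
    if [ -n "$branch" ]; then
        # Create a branch at the old revision to avoid a detached HEAD.
        git checkout -b "$branch" "$rev" || return 1
    else
        # No branch name given: plain checkout (HEAD becomes detached).
        git checkout "$rev" || return 1
    fi
    # Sync the data files in the workspace with the .dvc files
    # that are now checked out.
    dvc checkout
}
```

For example, restore_data_version HEAD~1 original_dataset does the same as the two commands above.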
PS: I’d recommend not modifying any data file that was added with dvc add file.txt. Instead, remove the file with dvc remove file.txt.dvc and then add it again.
Hm, this is not working. I did
git checkout HEAD~1
and
dvc checkout
But the local files still contain the last edits.
Hey @rmbzmb,
This thread is quite old, but the suggestion is still valid.
Keep in mind that to restore a previous version, you need to check out the git revision that contains the file version you’re interested in. You can find the revisions in which the .dvc files changed by running git log <yourfilename.dvc>.
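As a rough sketch of that lookup, the function below lists only the commits that touched one .dvc file. The name list_dvc_revisions is hypothetical; the only real command used is git log.

```shell
# Hypothetical helper: list the commits that changed a given .dvc file,
# newest first, one line per commit (abbreviated hash + subject).
# Usage: list_dvc_revisions <path-to-.dvc-file>
list_dvc_revisions() {
    # "--" separates the path from revision arguments, so a file name
    # cannot be mistaken for a branch or tag.
    git log --oneline -- "$1"
}
```

Any hash it prints can then be used as the revision for git checkout, followed by dvc checkout.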
Please create a new issue with more details about your setup if you cannot get this to work.
Just in case someone is still looking for this: passing the .dvc file to the checkout works for me:
git checkout <old_commit> <file_name>.dvc
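A possible way to script this per-file variant is sketched below. The function name restore_single_file is made up for illustration. It assumes that, after the .dvc file is restored, dvc checkout is run on that same file so the data in the workspace matches it; dvc checkout accepts such targets.

```shell
# Hypothetical helper: restore one data file to the version recorded
# in an older commit, leaving the rest of the workspace untouched.
# Usage: restore_single_file <git-revision> <path-to-.dvc-file>
restore_single_file() {
    rev="$1"
    dvc_file="$2"
    # Take only this .dvc file from the old commit; HEAD does not move.
    git checkout "$rev" -- "$dvc_file" || return 1
    # Update the data file so it matches the restored .dvc file.
    dvc checkout "$dvc_file"
}
```

Note that the restored .dvc file then shows up as a change in git status, so commit it if you want to keep the old data version.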