Hi,
Say I have a DVC repository where I store a large data file (call it datafile.dat). I have some code files tracked in the corresponding Git repository that allow me to generate the file datafile.dat.
I modify my code often, so I generate numerous versions of datafile.dat, and my DVC repository is becoming very large (in terms of storage space).
I do not want to keep very old versions of datafile.dat, only recent ones.
Would it be possible to remove, both locally and remotely, all versions of datafile.dat prior to a given Git commit (call it a9f6454)? This would intentionally break dvc checkout if I do a git checkout to a commit prior to a9f6454.
I saw this earlier post mentioning dvc gc for removing old versions of data files from the local or remote cache.
However, when I try to use dvc gc (maybe not with the correct options), I am not able to remove prior versions of datafile.dat. The file is still tracked, and I do not want to remove it and re-add it because I want to keep the recent history of modifications.
Thanks in advance