Versioning not working with a version-aware Azure Blob remote

I have a folder containing 7000 images that I track with DVC, and I am using an Azure Blob container as my remote.

The remote is version-aware, so when I browse the blob I can see all the images in human-readable form.

I add 1000 new images to the folder and run:

dvc add folder

I open my folder.dvc file and can see the names of all my files.

I then run:

dvc push

The output says it is sending those 1000 images to the blob. But once that is done and I look at my folder.dvc file, the old file names have been removed and only the names of the 1000 new images remain. When I go to the blob, I can now see only the 1000 new images; the old images are deleted. If I run dvc add folder and dvc push again, it sends the old 7000 images to the remote, removes the 1000 new images from the remote, and removes them from folder.dvc as well.

Am I doing something wrong or misunderstanding something about version-aware remotes, or is this a bug?

If I reinitialize DVC and change the remote, the same thing happens.

Here is my dvc doctor output:

DVC version: 3.42.0 (pip)

Platform: Python 3.8.13 on Linux-6.5.0-44-generic-x86_64-with-glibc2.10
Subprojects:
    dvc_data = 3.8.0
    dvc_objects = 3.0.6
    dvc_render = 1.0.1
    dvc_task = 0.3.0
    scmrepo = 2.0.4
Supports:
    azure (adlfs = 2024.2.0, knack = 0.11.0, azure-identity = 1.15.0),
    gdrive (pydrive2 = 1.20.0),
    gs (gcsfs = 2024.2.0),
    hdfs (fsspec = 2024.2.0, pyarrow = 7.0.0),
    http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
    oss (ossfs = 2023.12.0),
    s3 (s3fs = 2024.2.0, boto3 = 1.34.131),
    ssh (sshfs = 2024.6.0),
    webdav (webdav4 = 0.10.0),
    webdavs (webdav4 = 0.10.0),
    webhdfs (fsspec = 2024.2.0)
Config:
    Global: /root/.config/dvc
    System: /etc/xdg/dvc
Cache types: symlink
Cache directory: ext4 on /dev/nvme0n1p2
Caches: local
Remotes: azure
Workspace directory: ext4 on /dev/nvme0n1p2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/668d5746c2a68091019bea1a109aaea0

Hi @joachim. Support for version-aware remotes is actually being deprecated, so I would suggest migrating to a traditional remote.

Version-aware remotes can be confusing, as you can see. If you have the blob configured correctly, DVC should be soft-deleting the old files, so they should still be available for recovery later (see "Blob versioning - Azure Storage" on Microsoft Learn). Regardless, we aren't continuing to support this use case, so I would not recommend relying on it.
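
Before migrating, it may help to confirm exactly which files folder.dvc lists compared with what is actually in the workspace. Below is a rough sketch in Python, not an official DVC tool. It assumes that folder.dvc lists each tracked file on a line of the form "relpath: <name>" (which is what a version-aware setup appears to write), and it scans for those lines with a regular expression rather than parsing the YAML properly. The function names (listed_files, workspace_files, compare) are made up for this example.

```python
import re
import sys
from pathlib import Path

# Assumption: each tracked file appears in the .dvc file on a line such as
# "  - relpath: img_0001.jpg". This is a plain text scan, not a YAML parse.
RELPATH_RE = re.compile(r"^\s*-?\s*relpath:\s*(.+?)\s*$")


def listed_files(dvc_text):
    """Return the set of relpath values found in the text of a .dvc file."""
    names = set()
    for line in dvc_text.splitlines():
        match = RELPATH_RE.match(line)
        if match:
            # Drop surrounding quotes that YAML may add around unusual names.
            names.add(match.group(1).strip("'\""))
    return names


def workspace_files(folder):
    """Return the relative POSIX paths of all regular files under folder."""
    root = Path(folder)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare(dvc_text, on_disk):
    """Report names present on only one side (on disk vs. listed in the .dvc file)."""
    listed = listed_files(dvc_text)
    return {
        "missing_from_dvc": sorted(on_disk - listed),
        "missing_on_disk": sorted(listed - on_disk),
    }


if __name__ == "__main__":
    # Hypothetical usage: python check_dvc.py folder
    folder = sys.argv[1] if len(sys.argv) > 1 else "folder"
    report = compare(Path(folder + ".dvc").read_text(), workspace_files(folder))
    for key, names in report.items():
        print(f"{key}: {len(names)}")
```

Running it right after dvc add folder and again after dvc push should show whether the old 7000 names really drop out of folder.dvc at the push step. If your folder.dvc does not contain relpath lines at all, the assumption above does not hold for your setup and the script will simply report every file as missing from the .dvc file.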

Thank you, I will do just that.