“dvc status” is crashing with the error “ERROR: unexpected error - [Errno 5] Input/output error”. I’ve tried adding “--verbose” to the command, but that doesn’t print any extra information.
“dvc fetch” works, but “dvc diff” fails with the same error as above. I assume something is corrupted in the cache or workspace, but I don’t know how to figure out what. I tried running dvc under pdb, but that didn’t trap the error; I guess it must be happening in a worker thread. This is a big repo (497 GB), so it’s hard to check manually. Do you have any tips for debugging?
Thanks for the tip, that popped up a PDB shell at the right place:
> /usr/local/lib/python3.6/dist-packages/dvc/utils/__init__.py(30)_fobj_md5()
-> data = fobj.read(LOCAL_CHUNK_SIZE)
From fobj I could find the file that was causing trouble, and when I ran “md5sum” on it from the command line I also got “md5sum: ${file}: Input/output error”.
So the problem is definitely some corruption on disk.
Previously I’d run DVC from within PDB, so the “--pdb” flag was what I needed. Thanks!
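In case it helps anyone else with a repo too large to check by hand: the failing call was just a chunked read, so the same check can be done across a whole directory without DVC. This is only a minimal sketch, not anything DVC provides. The function name find_unreadable_files and the 1 MiB chunk size are my own choices for illustration (the chunk size is not taken from DVC's LOCAL_CHUNK_SIZE).

```python
import os

# Arbitrary chunk size for this sketch; not DVC's LOCAL_CHUNK_SIZE value.
CHUNK_SIZE = 1024 * 1024


def find_unreadable_files(root, chunk_size=CHUNK_SIZE):
    """Read every file under root to the end and collect the ones that fail.

    Returns a list of (path, exception) pairs. Any OSError counts as a
    failure, so besides "[Errno 5] Input/output error" this also reports
    things like permission problems and broken symlinks.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fobj:
                    # Read and discard the whole file, one chunk at a time.
                    while fobj.read(chunk_size):
                        pass
            except OSError as exc:
                bad.append((path, exc))
    return bad
```

Calling find_unreadable_files(".dvc/cache") (assuming the cache is in its default location) and printing the returned pairs should list the files that md5sum would also choke on. Note that os.walk silently skips directories it cannot list, so an unreadable directory would not show up in the result.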