ERROR: unexpected error - [Errno 5] Input/output error

Hi,

“dvc status” is crashing with the error “ERROR: unexpected error - [Errno 5] Input/output error”. I’ve tried adding “--verbose” to the command but that doesn’t print any extra information.

“dvc fetch” works, but “dvc diff” fails with the same error as above. I assume something is corrupted in the cache or workspace, but I don’t know how to figure out what. I tried running dvc under pdb, but that didn’t trap the error, so I guess it must be happening in a worker thread. This is a big repo (497GB), so it’s hard to check manually. Do you have any tips for debugging?

Thanks
Peter

I should have added: the DVC version is 2.8.1. I can’t go any higher because this platform is running Python 3.6.

@peter_ga can you post the output of “dvc doctor” here, and then also try running:

dvc status -v --pdb

Thanks for the tip, that popped up a PDB shell at the right place

> /usr/local/lib/python3.6/dist-packages/dvc/utils/__init__.py(30)_fobj_md5()
-> data = fobj.read(LOCAL_CHUNK_SIZE)

From fobj I could find the file that was causing trouble, and when I try to run “md5sum” on the command line I also get “md5sum: ${file}: Input/output error”.

So the problem is definitely some corruption on disk.
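
For anyone hitting the same error without a debugger at hand, here is a rough sketch of a scan that reads every file under a directory and reports the ones the OS refuses to read. It is not part of DVC: the function name `find_unreadable_files` and the 1 MiB chunk size are my own choices for illustration, and it uses only the Python standard library (it should also run on Python 3.6).

```python
import errno
import os

CHUNK_SIZE = 1024 * 1024  # bytes per read; an arbitrary choice for this sketch


def find_unreadable_files(root):
    """Return a list of (path, errno) pairs for files under root that fail to read.

    Hypothetical helper, not a DVC API. Each file is read to the end so that
    every block is touched, which is what surfaces an Input/output error.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fobj:
                    while fobj.read(CHUNK_SIZE):
                        pass
            except OSError as exc:
                # errno.EIO (5) is the "Input/output error" seen in this thread;
                # other errors (e.g. permissions, broken links) are reported too.
                bad.append((path, exc.errno))
    return bad
```

Calling `find_unreadable_files(".dvc/cache")` or pointing it at the workspace, then filtering the result for `errno.EIO`, should narrow things down to the damaged files. On a repo of this size a full read will take a long time, so starting with the directories you suspect is probably wiser.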

Previously I’d run DVC from within PDB, so the “--pdb” flag was what I needed – thanks!
