Are external dependencies not recommended? They seem important for large datasets, but not being able to ignore files makes them significantly less convenient to use. DVC sees that the mtime of the .DS_Store files is recent and so wants to re-hash the whole folder. This seems to contradict the post "Maximum data size - #4 by skshetry", which implies that we won't have to re-hash large datasets often.
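
For reference, here is a minimal sketch of a stopgap, assuming the dataset is a directory reachable as a local path: it lists (and optionally deletes) the .DS_Store files before DVC is run. The helper names `find_ds_store` and `remove_ds_store` are hypothetical and not part of DVC, and this does not change how DVC decides when to re-hash; it only clears out the files mentioned above.

```python
from pathlib import Path


def find_ds_store(root):
    """Return every .DS_Store file under root, sorted by path."""
    return sorted(p for p in Path(root).rglob(".DS_Store") if p.is_file())


def remove_ds_store(root, dry_run=True):
    """Report .DS_Store files under root; delete them only if dry_run is False."""
    found = find_ds_store(root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling `remove_ds_store("/path/to/dataset")` only reports what it would delete; passing `dry_run=False` actually removes the files. Note that macOS Finder may recreate them the next time the folder is browsed.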