New personal best for number of times a file is rehashed "only once"

I think I must be doing something wrong, because when I add new remote files they always seem to get rehashed “only once” several times, which really hurts because the files are big.

Here is one example with dvc 3.18: you can see the file was hashed twice, then scp'ed, then hashed again, and each of those steps seemed to read or transfer the data once, so the whole thing was essentially 4x slower than needed. After that there was one final local (fast) hash. The stage from my dvc.yaml is below.

Any way to improve this? Thanks.

stages:
  download_mockobs:
    cmd: scp scott.grid.uchicago.edu:/sptlocal/analysis/eete+lensing_19-20/resources/sims/planck2018/mockobs/3.3.1.3.1/output/flatsky/seed2/Coadd_allfields_090ghz_flatsky.g3 Coadd_allfields_090ghz_flatsky.g3
    deps:
    - ssh://scott.grid.uchicago.edu/sptlocal/analysis/eete+lensing_19-20/resources/sims/planck2018/mockobs/3.3.1.3.1/output/flatsky/seed2/Coadd_allfields_090ghz_flatsky.g3
    outs:
    - Coadd_allfields_090ghz_flatsky.g3
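
For reference, here is a rough single-pass sketch of what I would expect “only once” to mean: stream the file over ssh and update the MD5 digest while writing it to disk, so the data is read exactly one time. This is just an illustration using plain ssh + hashlib, not how dvc actually transfers or hashes data; the host and paths are the ones from the stage above.

import hashlib
import subprocess

HOST = "scott.grid.uchicago.edu"
REMOTE = ("/sptlocal/analysis/eete+lensing_19-20/resources/sims/planck2018/"
          "mockobs/3.3.1.3.1/output/flatsky/seed2/Coadd_allfields_090ghz_flatsky.g3")
LOCAL = "Coadd_allfields_090ghz_flatsky.g3"

md5 = hashlib.md5()
# `ssh host cat path` streams the remote file to stdout, so no extra read pass
# is needed on the remote side.
proc = subprocess.Popen(["ssh", HOST, "cat", REMOTE], stdout=subprocess.PIPE)
with open(LOCAL, "wb") as out:
    # Hash and write each chunk as it arrives: one transfer, one hash.
    while chunk := proc.stdout.read(1 << 20):  # 1 MiB chunks
        md5.update(chunk)
        out.write(chunk)
proc.wait()
print(md5.hexdigest())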

Hi @marius, unfortunately I don’t have a good solution for you, but this is on our radar as one of the highest-priority performance improvements we need to make. We first need to finish some architectural changes that will make those improvements possible, though. Could you clarify whether that file is inside or outside your repo, and whether it is a dependency, an output, or both?