Hi, I have a script that takes hours to complete. I already ran it outside DVC and collected its output. Is there a way to add a stage for that script to the DAG without running it again, while still pointing it at the right inputs, outputs and dependencies? In other words, since the computation takes too long, I would like to skip it and let DVC compute the hashes and complete the run without actually executing the script. Is that possible?
You could try the --no-exec option of dvc run (it does not execute the stage, it only creates or modifies dvc.yaml) and then use dvc commit to lock your files in.
That looks like the solution, but I am using the multiple-stage generation feature and writing the dvc.yaml file directly. Is it possible to skip execution with dvc repro?
@mauro, you can just run dvc commit, which will generate the dvc.lock file and store the outputs in DVC's cache. That way the stage should not run again on dvc repro. Do verify it with dvc status first, though.