Once again, thanks for open-sourcing the tool!
Remarks on the tutorial
- The dataset used in the tutorial is a bit large: the featurization step requires more than 8 GB of RAM, which is unwieldy on basic laptops. You might want to consider using a smaller dataset.
- In the `Running in a bulk` section, and possibly some others, the output of the command shows `Reproducing`. However, when run for the first time, dvc simply outputs `Running command`. For consistency, you may want to fix this.
Other questions and suggestions
- When there is nothing to reproduce and we run `dvc repro`, nothing happens. It would be nice to display a message stating that there is indeed nothing to be done.
- When a .dvc file already exists, dvc asks `'data/XXX.dvc' already exists. Do you wish to run the command and overwrite it? (y/n)`. If one replies `no`, it would be better to change the message to something like `Not overwriting: 'data/XXX.dvc'` instead of `Failed to run command: 'data/XXX.dvc' already exists`.
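
To make the two suggestions above concrete, here is a rough Python sketch of the messages I have in mind. The function names are hypothetical and are not taken from dvc's actual code, and the exact wording (including `Nothing to reproduce.` and `Overwriting: ...`) is only an assumption.

```python
def repro_message(changed_stages):
    """Status line for a repro run (hypothetical helper, not dvc's API)."""
    if not changed_stages:
        # Say explicitly that there is nothing to do instead of staying silent.
        return "Nothing to reproduce."
    return "Reproducing: " + ", ".join(changed_stages)


def overwrite_message(path, answer):
    """Line shown after the overwrite prompt (hypothetical helper, not dvc's API)."""
    if answer.strip().lower() in ("y", "yes"):
        return f"Overwriting: '{path}'"
    # Declining the prompt is a normal outcome, so it is not reported as a failure.
    return f"Not overwriting: '{path}'"


print(repro_message([]))
print(overwrite_message("data/XXX.dvc", "no"))
```

The point is only that both cases end with an explicit, non-error message.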