The dataset used in the tutorial is rather large: the featurization step requires more than 8 GB of RAM, which is unwieldy on basic laptops. You might want to consider using a smaller one.
In the Running in a bulk section, and possibly some others, the output of the command shows Reproducing. However, when run for the first time, dvc simply outputs Running command. For consistency, you may want to fix this.
Other questions and suggestions
When there is nothing to reproduce and we run dvc repro, nothing happens. It would be nice to display a message stating that there is indeed nothing to be done.
When a .dvc file already exists, dvc asks: 'data/XXX.dvc' already exists. Do you wish to run the command and overwrite it? (y/n). If one replies no, it would be better to show a message such as Not overwriting: 'data/XXX.dvc' rather than Failed to run command: 'data/XXX.dvc' already exists.
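To make the two suggestions above concrete, here is a minimal Python sketch of the wording I have in mind. It is not dvc's actual code: the function names `repro_message` and `overwrite_message`, the "Nothing to reproduce." text, and the "Overwriting:" and "Reproduced:" texts are all hypothetical and only illustrate the idea.

```python
def repro_message(changed_stages):
    """Return the text to print after a (hypothetical) repro run.

    `changed_stages` is an assumed list of stage files that had to be re-run.
    """
    if not changed_stages:
        # Say explicitly that there was nothing to do instead of staying silent.
        return "Nothing to reproduce."
    return "Reproduced: " + ", ".join(changed_stages)


def overwrite_message(path, answer):
    """Return the text to print once the overwrite prompt has been answered."""
    if answer.strip().lower() in ("y", "yes"):
        return f"Overwriting: '{path}'"
    # Declining is the user's choice, not an error, so avoid the word "Failed".
    return f"Not overwriting: '{path}'"


if __name__ == "__main__":
    print(repro_message([]))
    print(overwrite_message("data/XXX.dvc", "n"))
```

The point is only that an empty run and a declined prompt are both normal outcomes, so neither should be silent or reported as a failure.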
The dataset used in the tutorial is rather large: the featurization step requires more than 8 GB of RAM, which is unwieldy on basic laptops. You might want to consider using a smaller one.
Agreed, we are currently working on simplifying the tutorial.
In the Running in a bulk section, and possibly some others, the output of the command shows Reproducing. However, when run for the first time, dvc simply outputs Running command. For consistency, you may want to fix this.
This is actually on purpose: "reproduce" implies that the command has already been run once, which is why we print ‘Running command’ the first time and ‘Reproducing’ after that.
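As a rough illustration of that rule (a sketch only, not the real dvc implementation; `make_reporter` and the exact output format are made up for this example), the choice of verb depends on whether the stage has been seen before:

```python
def make_reporter():
    """Build a reporter that remembers which stage files have already run."""
    seen = set()

    def report(stage_file):
        # First execution is a plain run; any later execution is a reproduction.
        verb = "Reproducing" if stage_file in seen else "Running command"
        seen.add(stage_file)
        return f"{verb}: {stage_file}"

    return report


if __name__ == "__main__":
    report = make_reporter()
    print(report("data/XXX.dvc"))  # first time
    print(report("data/XXX.dvc"))  # every time after that
```

Under this rule, someone following the tutorial from scratch would see the first form for every stage.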
When there is nothing to reproduce and we run dvc repro, nothing happens. It would be nice to display a message stating that there is indeed nothing to be done.
When a .dvc file already exists, dvc asks: 'data/XXX.dvc' already exists. Do you wish to run the command and overwrite it? (y/n). If one replies no, it would be better to show a message such as Not overwriting: 'data/XXX.dvc' rather than Failed to run command: 'data/XXX.dvc' already exists.
I agree! I was merely remarking that when one follows the tutorial, one is presumably running the code for the first time, so the output should show Running command. However, the tutorial displays Reproducing.
Ah, sorry, I didn’t notice that. Great catch! Thank you for the feedback! We are actually preparing an update for the tutorial and will be sure to change the messages there as we go.