We will be talking about DVC at PyCon 2019 in Cleveland on May 4th! Please stop by our booth on Saturday, May 4th. We will be happy to chat!
We’ve launched the DVC Patreon campaign - it’s one of the ways to support the project if you like it.
Now, let’s highlight the changes (not including bug fixes and minor improvements) we have made in the last few months:
We received a lot of feedback that using Git branches is not always an optimal way to manage experiments. We have added an option to support Git tags (Git commits are coming). The new option
--all-tags is supported by all DVC commands that support
The Get Started guide has been simplified (e.g. to use tags instead of branches) and extended. We have also prepared a GitHub DVC project that reflects the sequence of steps in the “Get Started” guide. You can now download the whole project and reproduce all the models.
The dvc diff command has been introduced. It shows summary statistics for a directory or file under DVC control: how many files were added, deleted, or modified, and how the total size changed:
    (HEAD)$ tree image        (HEAD^)$ tree image
    images                    images
    ├── color.png             └── grey.png
    └── grey.png

    $ dvc diff -t images HEAD^1
    diff for 'images'
    -images with md5 ad0a6adcd409cae3263b28487064e1f2.dir
    +images with md5 283215dface0d41291482330324632fc.dir

    1 file not changed, 0 files modified, 1 file added, 0 files deleted, size was increased by 15.3 MB
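To make the summary line above more concrete, here is a minimal Python sketch of how such statistics could be derived from two directory listings. It is an illustration only, not DVC's actual implementation: the function name `summarize_dir_diff`, the listing format (path mapped to a checksum and a size), and the sample checksums and sizes are all assumptions made for the example.

```python
def summarize_dir_diff(old, new):
    """Compare two directory listings.

    Both arguments are dicts mapping a relative file path to a
    (checksum, size_in_bytes) tuple. This format is hypothetical,
    chosen only for the illustration.
    """
    old_paths, new_paths = set(old), set(new)
    common = old_paths & new_paths
    # A file counts as modified when its checksum differs between listings.
    modified = {p for p in common if old[p][0] != new[p][0]}
    old_size = sum(size for _, size in old.values())
    new_size = sum(size for _, size in new.values())
    return {
        "unchanged": len(common - modified),
        "modified": len(modified),
        "added": len(new_paths - old_paths),
        "deleted": len(old_paths - new_paths),
        "size_delta": new_size - old_size,
    }


# Made-up listings mirroring the example: grey.png is kept, color.png is added.
previous = {"grey.png": ("aaa111", 1000)}
current = {"grey.png": ("aaa111", 1000), "color.png": ("bbb222", 2000)}
summary = summarize_dir_diff(previous, current)
```

With these made-up listings, `summary` reports one unchanged file, one added file, and a positive size delta, which matches the shape of the dvc diff summary line.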
We’ve introduced the dvc commit command and the
dvc run/repro/add --no-commit flag to give a way to avoid uncontrolled cache growth and as a way to save some
dvc repro runs. In the future we plan to have “do-not-cache-my-data” as a default mode for
SSH remotes (data storage) support: config options to set port, key files, timeouts, password, etc., plus improved stability and Windows support! We have also introduced HTTP remotes, which can be used for external dependencies and as a read-only cache.
Control over where DVC files are located in your project - place them wherever you want with the
-f option supported by all relevant commands -
dvc run, and
A lot of UI improvements, ranging from the finally fixed nasty issue with the Windows terminal printing a lot of garbage symbols to progress bars for checkouts, better metrics output, and lots of smaller things:
Performance optimizations. The most notable one is the migration from a plain JSON file to the embedded SQLite engine for caching file and directory checksums. Another is improved performance, stability, and general user experience for the commands that navigate tags or branches (all the commands that include
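As a rough illustration of why an embedded database suits a checksum cache, the sketch below stores MD5 checksums in SQLite and reuses them while a file's modification time and size are unchanged. This is not DVC's actual schema or code: the `ChecksumCache` class, the table layout, and the invalidation rule are assumptions made for the example.

```python
import hashlib
import os
import sqlite3


class ChecksumCache:
    """Hypothetical checksum cache backed by SQLite (not DVC's real schema)."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checksums ("
            "path TEXT PRIMARY KEY, mtime_ns INTEGER, size INTEGER, md5 TEXT)"
        )
        self.hash_count = 0  # how many times a file was actually re-hashed

    def md5(self, path):
        st = os.stat(path)
        row = self.conn.execute(
            "SELECT mtime_ns, size, md5 FROM checksums WHERE path = ?", (path,)
        ).fetchone()
        # Reuse the stored checksum only if mtime and size still match.
        if row is not None and row[0] == st.st_mtime_ns and row[1] == st.st_size:
            return row[2]
        digest = self._hash_file(path)
        # Updating one row does not require rewriting the whole cache,
        # unlike dumping a single large JSON file.
        self.conn.execute(
            "INSERT OR REPLACE INTO checksums VALUES (?, ?, ?, ?)",
            (path, st.st_mtime_ns, st.st_size, digest),
        )
        self.conn.commit()
        return digest

    def _hash_file(self, path):
        self.hash_count += 1
        h = hashlib.md5()
        with open(path, "rb") as f:
            # Read in 1 MiB chunks so large files are not loaded into memory.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()
```

Calling `ChecksumCache().md5(path)` twice on an unchanged file hashes it only once; the second call is answered by an indexed lookup. Passing a file name instead of `":memory:"` would persist the cache between runs.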
There are new DVC integrations and plugins available:
- Finally there is an official Bash and Zsh completion for DVC!
- David Příhoda contributed and is developing the JetBrains IDEs plugin (PyCharm, IntelliJ, etc).
Don’t hesitate to like/star the DVC repository if you haven’t yet. We are waiting for your feedback!