Restructuring docs

It is obvious that the structure of the docs needs some revising, but it is not so clear what the best structure would be. I would try these things:

  1. Create a top-level section for “Contributor Guide” or “Developer Guide” and move these sections there:

    • /user-guide/development
    • /user-guide/contributing
    • /user-guide/contributing-documentation

    However, the section “How to report a problem” should stay in the User Guide.
    Later, more relevant sections may be added to the Developer Guide, for example to explain the structure/architecture of the code, or to explain design decisions and the reasons behind them (pointing to the relevant discussion threads, etc.)
    I am not sure yet whether DVC is extensible (whether one can develop plugins or extensions for it), but if yes, then this would be the right place to explain how this can be done.

  2. Move “Use Cases” inside the “User Guide”. I think that these “use cases” are actually useful for guiding the user on how to use DVC properly, so they should be a part of the User Guide. If the term “use case” sounds a bit technical (and it is actually a bit overloaded with other meanings), they may be called “Configuration Examples” or “Usage Examples”. But the term “Use Cases” is fine too.

    I think that use cases are very helpful for users because they reduce the number of possibilities and options that they have to consider in order to set up their workspace properly. So, we should consider adding a few more use cases, which hopefully should cover all the possible ways that DVC can be set up and used. It is OK if there is some small overlap among the use cases (for example, a particular situation can be covered by more than one use case).

    If we have a comprehensive set of use cases, then for each question from a user about how best to set up DVC, we should be able to refer them to a use case that matches their situation. If we have to explain the details of what they should do, then we should consider revising our use cases or adding a new one.

  3. Create a separate section for the tutorials and move the examples at the end of “Get Started” there. Move the “Tutorial” into it as well. Actually there is some overlap among the tutorials, and some parts of them need to be updated (because the code does not work as expected). So, they need to be cleaned up and updated. Other tutorials may be added by adopting blog posts about DVC or adapting blog posts about DS/ML in general.

    Tutorials describe the solution of a particular problem (from the point of view of how to use DVC). They are practical (hands-on) and if the user follows the instructions correctly, everything should work. This is a bit different from use cases and user guides, which may have detailed explanations, but not actionable examples (they may have example commands, but you cannot copy-paste them and expect them to work).

    The User Guide and the Command Reference should not try to include actionable examples inside their pages. Rather, they should refer to one or more tutorials that make use of that command or that use the concepts being explained. This may also be an indicator of how good the set of tutorials is. If all the command pages and user guide pages can refer to some tutorial that uses the command or the concept, then everything is fine. If there is some command or concept that is not covered by the existing tutorials, then we should consider extending some of the tutorials or adding some new ones.

  4. “Installation” should be a separate top-level section, placed before the “Tutorials”.

    By the way, “Tutorials” should be placed before the “User Guide”, “Developer Guide” and “Command Reference”.

    Also, “Understanding DVC” would be better renamed to “About DVC” (and it is already connected to the front page, I think).

  5. The “Get Started” part should be simplified further by removing some detailed/advanced explanations and the links/references to more advanced sections. Actually, it can also be replaced by a set of Katacoda interactive tutorials, which should be very basic, hands-on and learn-by-doing style (https://github.com/iterative/dvc.org/issues/546).

    Their aim should be just to give a new user a feeling for what DVC is and what it does, its main features, etc., without going into too much detail. If the user gets interested/excited, the next step would be to install DVC and to start with the tutorials.

  6. The sections inside the “User Guide” need to be restructured, rearranged or expanded. For example see these discussions:

    There seems to be a lot of work that can be done here, but one thing that seems obvious to me right now is that the sections “external-dependencies” and “external-outputs” should probably be merged into one section (called for example “external-data”), since they are closely related. This may help to make the explanations a bit more consistent. This page may also list some of the different scenarios/use-cases where external data is used, and link to the relevant use-case pages that have more detailed explanations for each use-case.

One thing that I would like to note is that the Waterfall process does not seem to work well for the docs either (same as with software development and DS/ML development). So, we should not aim for a perfect plan and execution. Let’s start by doing and experimenting, and then correct, fix and revert (if necessary) until the docs seem to be in good shape.

During my GSoD period I may not be able to finish all of these, but I will try to do as much as possible. I would like to try them in this order:

  1. Build a set of basic interactive tutorials with Katacoda.
  2. Separate the installation page and simplify the Get Started tutorial.
  3. Fix the general structure of the docs (merge Use Cases with the User Guide, separate Contributor Guide from User Guide, make a section for the Tutorials, etc.)
  4. Work on cleaning up and improving the “external-data” section of the User Guide and its related use-cases.
  5. Work on deduplicating and updating the existing tutorials.
  6. Add more use-cases and tutorials.
  7. Update the pages of the commands to refer to the appropriate tutorials (or interactive tutorials in Katacoda).
  8. Try to describe and integrate “DVC packages” into the docs.
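
As a rough illustration of item 7, here is a minimal Python sketch of how we could check which command pages do not yet link to any tutorial. The /doc/tutorials/ link prefix, the function name and the sample pages are all assumptions made up for this example; they are not the actual layout of dvc.org.

```python
import re
from typing import Dict, List

# Assumption: tutorial pages are linked with Markdown links whose target
# starts with /doc/tutorials/ (a hypothetical path, not the real site layout).
TUTORIAL_LINK = re.compile(r"\]\((/doc/tutorials/[^)\s]*)\)")


def pages_without_tutorial(pages: Dict[str, str]) -> List[str]:
    """Return the names of pages whose text contains no link to a tutorial."""
    return sorted(
        name for name, text in pages.items()
        if not TUTORIAL_LINK.search(text)
    )


if __name__ == "__main__":
    # Hypothetical page contents, keyed by page name.
    sample = {
        "dvc add": "See the [versioning tutorial](/doc/tutorials/versioning).",
        "dvc gc": "Removes unused cache files.",
    }
    print(pages_without_tutorial(sample))  # prints ['dvc gc']
```

Any page that shows up in the result would be a hint that a tutorial needs to be extended or a new one added.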

Please let me know if you have any comments or suggestions.


Hi, @dashohoxha!

Great summary, thank you!

I would suggest that we start with the User Guide and Understanding DVC refactoring as we discussed - there are tickets for this I believe.

Re the top level sections. For each of them, I would start by creating a second level section in the User Guide first. Then we will decide which of them it makes sense to move to the top level.

Please use GitHub and split the plan into manageable issues/tickets. It’s very hard to discuss and give feedback on 6 large items at once. It’s not good to have two places for this kind of discussion.


I have already started with Katacoda, but it is still WIP:

If you clone this repo to https://github.com/iterative/katacoda-scenarios (or https://github.com/iterative/katacoda-tutorials or something like this), and give me access to it, I can continue working there. Please don’t review it until I am done, because I may change and restructure it several times before I think it is in good shape.

Please let me finish the Katacoda tutorials first, because this would also help me to get a more intimate knowledge of the details of DVC. Besides this, at the end of November I will have to submit my work to GSoD, and it seems better to submit something that is clearly my work, rather than a bunch of commits that are intermixed with the work of other people.

I think that by the end of this month the Katacoda scenarios will be done, and next month I can start with the User Guide. However, it seems that there are several people involved with the docs, and it is not so clear to me what each person’s responsibilities are and what mine are. I am afraid that we may step on each other’s toes if we work on the same files at the same time.

I agree on this.

I will do it. Right now I am working on this one: https://github.com/iterative/dvc.org/issues/546


If you clone this repo to https://github.com/iterative/katacoda-scenarios

Done!

Please let me finish the Katacoda tutorials first, because this would also help me to get a more intimate knowledge of the details of DVC

Sure! Np.

I think this is the case with any more or less popular/large project. In the end, it’s my job to make sure that work is split sanely. Also, it’s absolutely normal to work on a single file in certain cases - that’s what merge is for after all.

Besides this, at the end of November I will have to submit my work to GSoD, and it seems better to submit something that is clearly my work, rather than a bunch of commits that are intermixed with the work of other people.

Any open-source project is by definition a collaborative work where people build on top of previous work. So, I would not be afraid to submit something that also has someone else’s name on it. Usually, it’s not a problem to see the value of individual contributions in this “mix”.


I also need to fix a webhook in the settings. Or I can show you how to do it.

That’s right. I don’t think that GSoD can look at the details of each person’s work; maybe they just want to make sure that people are not abusing the stipend and the program. But as long as the mentors say that they have worked properly, everything should be OK.


I’ve given you admin rights for this repo! Feel free to configure it properly.
