I’m setting up a Machine Learning group with some colleagues. The local area network infrastructure will be:
- Git server (one machine)
- Computing-data server (another machine)
- Local machines for users
The way of working will be:
- Each user will have their own Git repos on their local machine (with their authentication keys stored on that machine).
- Data and scripts have to be transferred to the computing server (data can stay on the server temporarily, only while it is needed by the ML scripts).
- Users launch the ML operations from their local machines on the computing server via SSH.
We are running some tests and, for now, we perform synchronization operations around the DVC chain: first, code and data are synced between the local machine and the computing server via SSH. Then the training is launched via SSH. Next, the outputs are transferred back from the server to the local machine so that DVC can track the changes, and so on…
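To make the current round trip concrete, here is a minimal Python sketch of one cycle. It only builds the commands; it does not run them. Everything specific in it is an assumption for illustration: the use of `rsync` over SSH for the transfers, the host name `ml-server`, the paths, the training command, and the `outputs` directory. The final `dvc add` step is just one possible way to let DVC track the returned outputs, and it would need to run from inside the local repo.

```python
import shlex


def build_sync_cycle(host, local_dir, remote_dir, train_cmd, outputs):
    """Return the commands for one sync/train/fetch cycle, in order.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    # Trailing slashes make rsync copy directory contents, not the directory itself.
    local = local_dir.rstrip("/") + "/"
    remote = remote_dir.rstrip("/") + "/"
    return [
        # 1. Push code and data from the local machine to the computing server.
        ["rsync", "-az", "-e", "ssh", local, f"{host}:{remote}"],
        # 2. Launch the training on the server via SSH.
        ["ssh", host, f"cd {shlex.quote(remote)} && {train_cmd}"],
        # 3. Pull the outputs back to the local machine.
        ["rsync", "-az", "-e", "ssh",
         f"{host}:{remote}{outputs}/", f"{local}{outputs}/"],
        # 4. Let DVC track the returned outputs (assumed step, run inside the local repo).
        ["dvc", "add", outputs],
    ]


if __name__ == "__main__":
    cycle = build_sync_cycle(
        host="ml-server",              # hypothetical computing server
        local_dir="project",           # hypothetical local Git/DVC repo
        remote_dir="/srv/ml/project",  # hypothetical working copy on the server
        train_cmd="python train.py",   # hypothetical training script
        outputs="outputs",             # hypothetical output directory
    )
    for cmd in cycle:
        print(shlex.join(cmd))
```

Each command could then be executed with `subprocess.run(cmd, check=True)`, with the `dvc add` step given `cwd` set to the local repo.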
Questions and issues (among others):
- Can the path for the *.dvc file be specified? (one path for each model)
- Can DVC run on the computing server while users keep their Git repos on their local machines?
Thank you very much for your effort, we really appreciate your work.