We are evaluating DVC to standardize our MLOps workflows, specifically for data versioning with replication in Azure.
Technical Context:
- Proposed Architecture: Exclusive use of native DVC commands (`init`, `add`, `push`, `pull`) without integration with DVC Cloud.
- Infrastructure: Storage in Azure Blob Storage (variable costs based on capacity and transfer).
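To make the proposed setup concrete, here is a minimal sketch of the command sequence we have in mind, written as a small Python helper. The container name, prefix, data path, and remote name are hypothetical placeholders, and Azure authentication (for example a connection string set with `dvc remote modify --local`) is deliberately left out. It assumes the working directory is already a Git repository.

```python
import subprocess


def dvc_workflow_commands(container, prefix, data_path, remote_name="azure-remote"):
    """Build the DVC CLI calls for the proposed workflow.

    All arguments are illustrative placeholders, not values from our project.
    """
    remote_url = f"azure://{container}/{prefix}"
    return [
        ["dvc", "init"],                                         # one-time setup inside a Git repo
        ["dvc", "remote", "add", "-d", remote_name, remote_url],  # register Azure Blob Storage as default remote
        ["dvc", "add", data_path],                               # start tracking the dataset
        ["dvc", "push"],                                         # upload tracked data to the remote
        ["dvc", "pull"],                                         # download tracked data (e.g. on another machine)
    ]


def run_workflow(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in dvc_workflow_commands("mlops-data", "dvcstore", "data/train.csv"):
        print(" ".join(cmd))
```

Nothing in this sequence calls a service other than Git and the Azure storage account, which is the basis for our question below.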
Core Question:
Are there additional costs associated with using DVC in this configuration? Our research indicates that DVC's core is open source and works alongside Git, but we would like confirmation on two points:
- Licensing: Are there licensing costs when using only the native commands listed above?
- Dependencies: Are there mandatory external services (e.g., GitHub Enterprise) that incur expenses?
Preliminary Analysis:
Based on technical documentation, DVC imposes no direct costs beyond external storage. However, we aim to validate this interpretation with the community.
Key Elements Incorporated:
- Cost Classification:
  - Variable Costs: Azure storage (usage-dependent).
  - Fixed Costs: None identified for DVC in this configuration.
- Infrastructure Reference:
  - Use of Azure Blob Storage as the DVC remote, in line with documented implementation cases.
- Technical Structure:
  - Separation between native DVC components and external services (e.g., DVC Cloud).
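For reference, this is roughly how we are modelling the cost under the interpretation above: storage and transfer are the only variable terms, and the DVC term is zero. It is a sketch of our assumption, not a confirmed answer. The `BlobPricing` class, the function name, and the rates in the usage example are hypothetical placeholders and should be replaced with figures from the actual Azure price sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobPricing:
    """Per-unit rates; the values are supplied by the caller (placeholders here)."""
    storage_per_gb_month: float  # cost of keeping 1 GB stored for a month
    egress_per_gb: float         # cost of transferring 1 GB out of Azure


def estimate_monthly_cost(stored_gb, egress_gb, pricing, dvc_license_fee=0.0):
    """Split a monthly estimate into variable (Azure) and fixed (DVC) parts.

    dvc_license_fee defaults to 0.0, which encodes our preliminary analysis
    and is exactly the assumption we are asking the community to confirm.
    """
    storage = stored_gb * pricing.storage_per_gb_month
    transfer = egress_gb * pricing.egress_per_gb
    return {
        "storage": storage,
        "transfer": transfer,
        "dvc_license": dvc_license_fee,
        "total": storage + transfer + dvc_license_fee,
    }


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    pricing = BlobPricing(storage_per_gb_month=0.02, egress_per_gb=0.08)
    print(estimate_monthly_cost(stored_gb=500, egress_gb=100, pricing=pricing))
```

If a licensing fee or a mandatory paid service does apply, it would enter through `dvc_license_fee` as the only fixed term.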