I have to keep track of several models that share the same feature engineering, the same features, and the same algorithm, but are each trained on a different subset of the data. For example, the raw data is the same, but it is filtered down to a certain group before a model is trained on it.
My repo structure looks like this:
    root
    |- models
       |- model_group1
          |- model.xgb
          |- params.json
       |- model_group2
          |- ...
       |- ...
    |- metrics
       |- model_group1
          |- model_eval.json
       |- model_group2
          |- ...
       |- ...
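To make the layout concrete, here is a minimal sketch of how the per-group output paths could be derived from a group name. The helper `artifact_paths` is hypothetical and only mirrors the folder names shown above; it is not part of DVC and says nothing about how the pipeline itself should be defined.

```python
from pathlib import Path


def artifact_paths(group: str, root: Path = Path(".")) -> dict[str, Path]:
    """Return the files expected for one data subset (hypothetical helper).

    The folder and file names follow the repo layout in the question;
    `group` is a directory name such as "model_group1".
    """
    return {
        "model": root / "models" / group / "model.xgb",
        "params": root / "models" / group / "params.json",
        "metrics": root / "metrics" / group / "model_eval.json",
    }


if __name__ == "__main__":
    # Group names are placeholders taken from the layout above.
    for group in ["model_group1", "model_group2"]:
        for kind, path in artifact_paths(group).items():
            print(f"{group} {kind}: {path.as_posix()}")
```

Every group produces the same three files, just under a different directory, so the only thing that varies between models is the group name.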
AFAIK DVC assumes the pipeline produces one model, which makes total sense.
Building a single model that takes the group as a variable is not possible in my case.
Any suggestions? Do I need multiple DVC pipelines in the same repo?