Nice. It would be good to mention that the data file itself is not checked in to Git but is tracked by DVC instead, which is why it appears in .gitignore.
(post deleted by author)
Hi [deleted user], have you imported the dataset to your local machine using dvc get? As in:
$ dvc get https://github.com/iterative/aita_dataset aita_clean.csv
I can’t tell if you’ve done this step yet. If you haven’t, the file won’t be in your local workspace and pandas won’t be able to read it. After running dvc get, you should be able to open Python and run
import pandas as pd
df = pd.read_csv("aita_clean.csv")
Cool, please get in touch with any results. I always like hearing about them.
Also, you might be interested in the DVC Python API: you can use dvc.api.open to load the file from DVC storage directly into your Python environment. https://dvc.org/doc/api-reference/open#:~:text=Description,by%20DVC%20or%20by%20Git.
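For what it's worth, here is a rough sketch of how the dvc.api.open route might look. The file name and repo URL are taken from the dvc get command earlier in the thread. The helper name load_from_dvc and its opener parameter are made up for illustration and are not part of DVC. It assumes dvc and pandas are installed and that the remote storage is reachable from your machine.

```python
import pandas as pd


def load_from_dvc(path, repo, opener=None):
    """Read a DVC-tracked CSV into a DataFrame without running `dvc get` first.

    `load_from_dvc` and `opener` are hypothetical names used for this sketch.
    """
    if opener is None:
        # Imported here so the helper can be tried with a stand-in opener
        # on a machine that does not have DVC installed.
        import dvc.api
        opener = dvc.api.open
    # dvc.api.open returns a file-like object usable as a context manager.
    with opener(path, repo=repo) as f:
        return pd.read_csv(f)
```

You would then call it with something like df = load_from_dvc("aita_clean.csv", "https://github.com/iterative/aita_dataset"). Unlike dvc get, this should not leave a copy of the file in your workspace, so it may suit one-off exploration better than repeated runs.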