This package provides a simple wrapper around Data Version Control (DVC).
The package is not on CRAN yet. Once it is released, you will be able to install it with:
# install.packages("dvc")
Until then, install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("andrewcstewart/dvc")
The following example installs DVC, sets it up in the current project, tracks a data file, and syncs it with remote storage:
library(dvc)
# install dvc
dvc::install_dvc()
# setup dvc in your current project
dvc::use_dvc()
# write a sample data file, then tell dvc to track it
write.csv2(x = mtcars, file = "mtcars.csv")
dvc::add(path = "mtcars.csv")
# setup remote storage
dvc::remote_add(name = "myremote", url = "s3://my-bucket/dvc-storage")
# upload tracked data to the remote
dvc::push()
# download tracked data from the remote
dvc::pull()
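When the DVC command-line tool adds a file, it conventionally writes a small pointer file next to the data (here `mtcars.csv.dvc`), and that pointer, rather than the data itself, is what gets committed to Git. The sketch below checks for that pointer from R using only base functions. It assumes this package's `add()` behaves like the underlying `dvc add` command; the helper `dvc_pointer_path()` is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the dvc package): returns the path of the
# pointer file that the DVC CLI conventionally creates for a tracked file.
dvc_pointer_path <- function(path) {
  paste0(path, ".dvc")
}

pointer <- dvc_pointer_path("mtcars.csv")

if (file.exists(pointer)) {
  # The pointer file is a small text file; print it to inspect what DVC recorded.
  cat(readLines(pointer), sep = "\n")
} else {
  message("No pointer file found at ", pointer, "; has the file been added to DVC?")
}
```

If the pointer file is missing after calling `dvc::add()`, checking that DVC was initialized in the project is a reasonable first step.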
Data Version Control, or DVC, is a data and ML experiment management tool that takes advantage of the existing engineering toolset that you’re already familiar with (Git, CI/CD, etc.). See the official documentation for a full overview of DVC’s functionality.
The purpose of this package is to make DVC easier to use from within R. For example, you may want to run DVC commands from an R Markdown file as part of an analysis. The package currently focuses on DVC’s data tracking functionality.
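For comparison, here is a rough sketch of what calling the DVC command-line tool from R looks like without a wrapper, using base R's `system2()`. This only illustrates the general pattern and makes no claim about how this package is implemented internally; the helper `run_dvc()` is hypothetical.

```r
# Hypothetical helper (not part of the dvc package): runs a DVC CLI command
# via base R's system2() and returns its output as a character vector.
run_dvc <- function(command, args = character()) {
  exe <- Sys.which("dvc")
  if (!nzchar(exe)) {
    # DVC is not installed or not on the PATH; skip the call rather than fail.
    message("dvc executable not found on PATH; skipping 'dvc ", command, "'")
    return(invisible(NULL))
  }
  # Quote extra arguments so that paths containing spaces survive the shell.
  system2(exe, args = c(command, shQuote(args)), stdout = TRUE, stderr = TRUE)
}

# Equivalent to running `dvc version` in a terminal.
out <- run_dvc("version")
if (!is.null(out)) cat(out, sep = "\n")
```

A call such as `run_dvc("add", "mtcars.csv")` could sit in an R Markdown chunk in the same way. A wrapper package saves you from writing the executable lookup, quoting, and error handling by hand in every analysis.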