reproML

Built with Python, Copier, uv, Ruff, ty, and mkdocs-material.

A toolset for collaborative development and reproducible results in data science and machine learning projects.

Prerequisites

Make sure you have uv installed, for example via pip:

pip install uv
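
If you want to verify this prerequisite from a script, a minimal Python sketch could look like the following. The function name `uv_available` is illustrative and not part of reproML; it only checks whether an executable named `uv` can be found on the `PATH`, not which version is installed.

```python
import shutil


def uv_available(executable: str = "uv") -> bool:
    """Return True if an executable with the given name is found on PATH."""
    # shutil.which returns the full path to the executable, or None.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    print("uv found" if uv_available() else "uv not found, install it first")
```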

Usage

You can initialize a project from the command line. Replace my_new_project with the name of the folder to create for the project.

uvx copier copy --trust gh:Excidion/reproML my_new_project

You will then be guided through a short questionnaire. Depending on your choices, it generates a structure that looks something like this:

├── data               <- All data files belong in one of this folder's subfolders
│   ├── raw            <- The original, unedited data dump
│   ├── interim        <- Intermediate data that has been or is being transformed
│   └── processed      <- The data sets used for modeling
│
├── docs               <- Project documentation
│   ├── index.md       <- Landing page, describe the project and team.
│   ├── context.md     <- Document context and goals.
│   ├── model.md       <- Document modeling from data to ML.
│   ├── notebooks/     <- Your most polished notebooks, integrated into the docs
│   ├── code/          <- Automatically generated code documentation
│   └── structure.md   <- Document tools and technical organization.
│
├── models             <- Trained and serialized models and other artifacts
│   └── logs           <- Log files from training and prediction
│
├── notebooks          <- Jupyter notebooks
│
├── references         <- Data dictionaries, manuals, and helper materials.
│
├── reports            <- Generated analysis as HTML, PDF, etc.
│   └── figures        <- Generated graphics and figures to be used in reports
│
├── src                <- Source code for use in this project.
│   ├── data           <- Scripts to download, process or generate data
│   ├── features       <- Functions to turn data into features
│   ├── model          <- Scripts for training and prediction
│   └── visualization  <- Scripts to create visualizations
│
├── pyproject.toml     <- Project configuration and dependencies.
│
└── README.md          <- The top-level README for developers using this project.
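
To check that a freshly generated project matches this layout, a small script along these lines may help. It is a sketch, not part of reproML: the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical, and the directory list is copied from the tree above, so it may not match your project if you answered the questionnaire differently.

```python
import sys
from pathlib import Path

# Directories taken from the tree above. Assumption: the template may
# generate a different set depending on the questionnaire answers.
EXPECTED_DIRS = [
    "data/raw",
    "data/interim",
    "data/processed",
    "docs",
    "models/logs",
    "notebooks",
    "references",
    "reports/figures",
    "src/data",
    "src/features",
    "src/model",
    "src/visualization",
]


def missing_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    # Usage: python check_layout.py my_new_project
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in missing_dirs(root):
        print(f"missing: {name}")
```

An empty result means every listed directory exists. Files such as pyproject.toml and README.md are not checked here.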