This repository contains code and notebooks for local ML experiments.
It includes:

- `.gitignore` to ignore common Python, notebook, and environment files.
- `requirements.txt` (core runtime packages) and `requirements-dev.txt` (dev tools).
- `run_local.ps1`, a PowerShell helper to set up a local venv and install dependencies on Windows.
- `tests/test_smoke.py`, a minimal smoke test to verify the repo layout.
- `LICENSE` (MIT) and `CONTRIBUTING.md`.
- A GitHub Actions workflow at `.github/workflows/python-app.yml` that runs lint & tests.
These commands assume you are using PowerShell (pwsh.exe) on Windows and that Python 3.10+ is installed and on PATH.
- Create and activate a virtual environment:

```powershell
python -m venv .venv
# PowerShell activation
.\.venv\Scripts\Activate.ps1
# If you get an execution policy error, allow local scripts for your user
# (the CurrentUser scope does not require administrator rights):
# Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned
```

- Install core requirements:

```powershell
pip install --upgrade pip
pip install -r requirements.txt

# Dev tools (optional)
pip install -r requirements-dev.txt
```

- Run the example script or notebook:
```powershell
# Run the main script (if present)
python .\phi-ai.py

# Start Jupyter (recommended for notebooks)
jupyter notebook phi-ai.ipynb
# or
jupyter lab
```

PyTorch is not pinned in `requirements.txt` because the correct wheel depends on your CUDA toolkit. Follow the official instructions at https://pytorch.org/get-started/locally/ and pick the appropriate command.
CPU-only example (simple and safe):

```powershell
pip install --index-url https://download.pytorch.org/whl/cpu torch torchvision --extra-index-url https://pypi.org/simple
```

GPU example (CUDA 11.8); replace cu118 with the version for your GPU/driver:
```powershell
pip install --index-url https://download.pytorch.org/whl/cu118 torch torchvision --extra-index-url https://pypi.org/simple
```

After installing, test GPU detection in Python:
python -c "import torch; print('torch', torch.__version__); print('cuda available:', torch.cuda.is_available())"ONNX (https://github.com/onnx) and ONNX Runtime provide a portable, often faster way to run ML models locally on Windows (CPU and GPU). The steps below cover installing ONNX runtime, exporting a PyTorch model to ONNX, and running inference with onnxruntime from PowerShell/Python.
ONNX (https://github.com/onnx) and ONNX Runtime provide a portable, often faster way to run ML models locally on Windows (CPU and GPU). The steps below cover installing ONNX Runtime, exporting a PyTorch model to ONNX, and running inference with onnxruntime from PowerShell/Python.

- Install ONNX Runtime and utilities (CPU example):

```powershell
pip install onnx onnxruntime onnxruntime-tools
```

GPU example (install the GPU-enabled ONNX Runtime that matches your CUDA version); consult the ONNX Runtime docs at https://onnxruntime.ai:
```powershell
# Example (replace with the correct package / version for your CUDA toolchain):
pip install onnx onnxruntime-gpu
```

- Export a PyTorch model to ONNX (simple example). Make sure a `models/` directory exists first:
```powershell
mkdir models -ErrorAction Ignore

# bash-style heredocs do not work in PowerShell; pipe a here-string to python instead
@'
import torch

# weights="DEFAULT" replaces the deprecated pretrained=True argument
model = torch.hub.load('pytorch/vision:v0.14.0', 'resnet18', weights='DEFAULT').eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, 'models/resnet18.onnx', opset_version=13,
                  input_names=['input'], output_names=['output'])
print('Exported models/resnet18.onnx')
'@ | python -
```
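Before running inference, you can optionally sanity-check the exported graph with the `onnx` package installed above:

```python
import onnx

# Load the exported graph and validate node types, shapes, and opset consistency
model = onnx.load("models/resnet18.onnx")
onnx.checker.check_model(model)
print("models/resnet18.onnx is structurally valid")
```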
python - <<'PY'
import numpy as np
import onnxruntime as ort
img = np.random.rand(1,3,224,224).astype(np.float32)
sess = ort.InferenceSession('models/resnet18.onnx', providers=['CPUExecutionProvider'])
out = sess.run(None, {'input': img})
print('Output shapes:', [o.shape for o in out])
PYNotes and tips:
- Match the preprocessing (normalization, resizing) used when the model was trained; a preprocessing sketch follows this list.
- Use `onnxruntime-tools` or `onnxruntime.transformers` to optimize, quantize, or benchmark models for better performance.
- For transformer / Hugging Face models, check the `transformers` or `optimum` exporters, which can produce ONNX models ready for `onnxruntime`.
- Keep exported ONNX files in `models/` and avoid committing large model files to Git; add them to `.gitignore` (already present).
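To illustrate the first tip: the torchvision ResNet-18 weights expect 224x224 RGB input normalized with the ImageNet mean and standard deviation. A simplified sketch with Pillow and NumPy (`example.jpg` is a hypothetical filename; the reference pipeline resizes to 256 and center-crops to 224):

```python
import numpy as np
from PIL import Image

# ImageNet normalization constants used when resnet18 was trained
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

img = Image.open("example.jpg").convert("RGB").resize((224, 224))
x = np.asarray(img, dtype=np.float32) / 255.0  # HWC, values in [0, 1]
x = (x - MEAN) / STD                           # per-channel normalization
x = x.transpose(2, 0, 1)[np.newaxis, ...]      # NCHW, shape (1, 3, 224, 224)
```

Feed `x` to the `sess.run` call above in place of the random array.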
ONNX Runtime includes GenAI tooling that can run generative models (the "Generate" API) locally and is used in ONNX Runtime GenAI tutorials such as the Phi models guide: https://onnxruntime.ai/docs/genai/tutorials/phi3-python.html
Quick guidance for Windows users:
- Review the official tutorial (link above) for the exact packages, supported model names, and up-to-date installation steps. The GenAI features sometimes require separate runtime packages or wheels that match your CUDA / GPU drivers.
- In general you'll need:
  - `onnx` and `onnxruntime` (or `onnxruntime-gpu` for GPU),
  - the ONNX Runtime GenAI extension / toolkit (see the tutorial for the exact package name/version), and
  - an exported ONNX model (the tutorial shows how to obtain or export Phi-family ONNX models).
Example (high-level and illustrative; follow the tutorial for the exact API and package names):

```powershell
# Install (at the time of writing the GenAI package is published as onnxruntime-genai;
# check the tutorial for the current name/version)
pip install onnx onnxruntime onnxruntime-genai
# or for GPU: pip install onnxruntime-genai-cuda
```

```python
# ILLUSTRATIVE SKETCH: adapt from the ONNX Runtime GenAI tutorial; the API surface
# changes across releases, so treat the names below as a guide.
import onnxruntime_genai as og

model = og.Model("models/phi-3-mini-4k-instruct-onnx")  # hypothetical local model folder
tokenizer = og.Tokenizer(model)
params = og.GeneratorParams(model)
params.set_search_options(max_length=80)
generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Write a short poem about autumn"))
while not generator.is_done():
    generator.generate_next_token()
print(tokenizer.decode(generator.get_sequence(0)))
```

Notes:
- Always use the exact install and import commands shown in the ONNX Runtime GenAI tutorial linked above — the GenAI API surface and package names can change across releases.
- Some models require credentials or specific licensing to download; follow the model provider's rules.
- For best performance, run ONNX Runtime with the appropriate execution provider (CPU, CUDA, DirectML) and consider the quantization/optimization tools described in the ONNX Runtime docs; a provider-selection sketch follows below.
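For example, you can list the providers available in your installed onnxruntime build and prefer a GPU provider when present. A minimal sketch:

```python
import onnxruntime as ort

available = ort.get_available_providers()
print("available providers:", available)

# Prefer CUDA or DirectML when the installed build exposes them; keep CPU as a fallback
preferred = [p for p in ("CUDAExecutionProvider", "DmlExecutionProvider") if p in available]
sess = ort.InferenceSession("models/resnet18.onnx", providers=preferred + ["CPUExecutionProvider"])
print("session is using:", sess.get_providers())
```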
- Keep heavy packages (like `torch` with GPU support) installed manually so you can match CUDA versions and drivers.
- Keep data and model artifacts out of Git; use the `data/` and `models/` folders (already in `.gitignore`).
- If your repository's main Python file uses a dash in its name (`phi-ai.py`), run it as a script (`python "phi-ai.py"`) instead of importing it as a module; see the note below.
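The reason is that a dash is not valid in a Python identifier, so `import phi-ai` is a syntax error. If another script genuinely needs to execute the file, the standard-library runpy module can run it by path:

```python
import runpy

# Executes phi-ai.py as if invoked with `python phi-ai.py` and
# returns the resulting module globals as a dict
globals_dict = runpy.run_path("phi-ai.py", run_name="__main__")
```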
After activating the venv, install the dev requirements and run the tests:

```powershell
pip install -r requirements-dev.txt
pytest -q
```

A GitHub Actions workflow is provided to lint with flake8 and run pytest on push/PR to main.
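For reference, a layout check of the kind `tests/test_smoke.py` performs might look like this (illustrative; the actual test file may differ):

```python
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]

def test_repo_layout():
    # Core files described in this README should exist at the repo root
    for name in ("requirements.txt", "requirements-dev.txt", "LICENSE", ".gitignore"):
        assert (REPO_ROOT / name).is_file(), f"missing {name}"
```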
This project is licensed under the MIT License. See LICENSE.