We present a grand equivalence between three non-iterative training paradigms that make model fitting instantaneous on a broad class of architectures: (1) Data–Model Duality encodes the dataset directly as parameters and decodes for inference; (2) Closed-Form Optima solve for the optimal weights in one shot; (3) Holistic Collapse jumps to the same fixed point via a non-local, dataset-wide update. We prove and implement numerical identities showing that, on the training support, these three views produce identical predictions, and that the collapse fixed point equals the closed-form solution. Our system validates these equivalences with executable assertions at near machine precision across linear/ridge, kernel ridge (RBF), deep linear factorization, and an ELM (random hidden layer) architecture. Practical engineering completes the bridge from theory to runtime: SPD systems are solved via Cholesky, inputs are whitened to reduce condition numbers, kernels are fully vectorized, and Chebyshev–Lobatto nodes shrink 1D interpolation error to numerical noise. A one-liner API exposes drop-in training, and a timing harness shows the instantaneous methods winning on wall-clock time against iterative gradient descent even at small step counts, while matching or exceeding its accuracy. These results support a unifying view: training can be treated as an explicit, reversible encoding or a one-shot fixed-point computation rather than a trajectory of gradient updates.
For all training inputs $x$:

- Data–Model Duality: $D(E(D), x)$
- Closed-Form Optimum: $G(F(D, A), x)$
- Holistic Collapse: $G(\mathrm{Fix}_\theta[\theta + f(D, A, \theta)], x)$

We verify numerically that:

- $D(E(D), x_i) = y_i$ for all training pairs $(x_i, y_i)$
- $G(F(D, A), x) = G(\mathrm{Fix}_\theta[\theta + f(D, A, \theta)], x)$
- Idempotence: reapplying any branch is a no-op, and $\theta^\star + f(D, A, \theta^\star) = \theta^\star$
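For the linear/ridge branch, these identities can be checked directly in NumPy. The snippet below is a standalone sketch, not the repository's code: it takes the collapse update f to be a preconditioned full-batch residual step (an assumption made here for illustration), whose unique fixed point is the closed-form ridge solution.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
Y = X @ rng.standard_normal((5, 2))
lam = 1e-2

# F(D, A): closed-form ridge optimum, theta* = (X^T X + lam I)^{-1} X^T Y
A = X.T @ X + lam * np.eye(X.shape[1])
theta_star = np.linalg.solve(A, X.T @ Y)

# f(D, A, theta): a dataset-wide update whose unique fixed point is theta*
P = np.linalg.inv(A)                 # preconditioner (illustrative choice)
def f(theta):
    return P @ (X.T @ Y - A @ theta) # vanishes exactly at theta*

theta = np.zeros_like(theta_star)
theta = theta + f(theta)             # Holistic Collapse: one jump to the fixed point

np.testing.assert_allclose(theta, theta_star, atol=1e-10)                       # collapse == closed form
np.testing.assert_allclose(theta_star + f(theta_star), theta_star, atol=1e-10)  # idempotence
```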
- Data–Model Duality
  - Exact dictionary encoding with nearest-neighbor fallback off-support
  - 1D barycentric interpolation; Chebyshev–Lobatto nodes for near–machine-precision equality
- Closed-Form Maps (F)
  - Linear and Ridge regression
  - Kernel Ridge Regression (RBF) with vectorized Gram/cross-kernel
  - Extreme Learning Machine (ELM) with random hidden layer and closed-form output
  - Deep ELM (stacked random features + closed-form head)
- Holistic Collapse (f) and Fixed-Point
  - Linear/ridge collapse equals closed-form optimum
  - Deep linear networks via balanced SVD factorization across L layers
  - Linear autoencoders via PCA; collapse equals SVD baseline reconstruction
- Numerics & Stability
  - SPD solves via Cholesky (SciPy cho_factor/cho_solve or NumPy cholesky with triangular solves); see the sketch after this list
  - Input whitening with configurable epsilon (whiten_eps) to reduce condition numbers
  - Float64 everywhere; executable assertions with tight tolerances
- Performance
  - Timing harness shows instantaneous methods vs. 10/50-step GD baselines; prints relative MSE gap vs. closed-form
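As a concrete illustration of the numerics bullets above, here is a minimal, self-contained sketch (a hypothetical helper, not the repository's utils): per-feature whitening with an epsilon floor, followed by a Cholesky solve of the SPD ridge normal equations via SciPy's cho_factor/cho_solve.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def whiten(X, whiten_eps=1e-12):
    """Zero-mean, unit-variance columns; whiten_eps floors tiny standard deviations."""
    mu = X.mean(axis=0)
    sd = np.maximum(X.std(axis=0), whiten_eps)
    return (X - mu) / sd

rng = np.random.default_rng(1)
scales = np.array([1.0, 50.0, 1e-3, 1.0, 1.0, 1.0, 1.0, 1.0])
X = rng.standard_normal((300, 8)) * scales          # badly scaled features
y = X @ rng.standard_normal(8)

Xw = whiten(X)                                      # condition number of Xw^T Xw drops sharply
A = Xw.T @ Xw + 1e-6 * np.eye(Xw.shape[1])          # SPD by construction (ridge term)
c, low = cho_factor(A)                              # Cholesky factorization
w = cho_solve((c, low), Xw.T @ y)                   # two triangular solves, no explicit inverse

print(np.linalg.cond(X.T @ X), np.linalg.cond(A))   # raw vs. whitened condition numbers
```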
```bash
# From repo root
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip numpy scipy

# Run the demo suite (saves logs to results.txt)
python instant_train.py
cat results.txt
```

You should see “[✔] All checks passed” and an absolute path to the generated results.txt.
Run the dataset demo for the Transformer scaffold (local import via PYTHONPATH):
```bash
PYTHONPATH=$(pwd) python examples/train_on_datasets.py \
  --copy_num 64 --copy_len 128 --copy_vocab 64 \
  --br_num 64 --br_len 128 --br_depth 32 --seed 0
```

This will print copy-task accuracy (~1.0 with ridge head) and bracket-depth MSE.
```bash
pip install -e .
# Version
exactstep --version
# Copy dataset demo
ti copy --n 64 --L 128 --V 64
# Bracket-depth demo
ti bracket --n 64 --L 128 --depth 32
# Deep ELM sweep (plots saved under transformer_instant/examples/figures)
exactstep deep-elm-bench \
  --depths 2,3,4 \
  --hidden 128,256,512 \
  --lambdas 1e-4,1e-3,1e-2 \
  --n-train 400 --n-test 300 --d 8 --seed 7
```

```bash
docker build -t exactstep:cpu .
docker run --rm exactstep:cpu
# Save generated figures to host
CID=$(docker create exactstep:cpu); docker start -a "$CID"; \
mkdir -p docker_figures; \
docker cp "$CID":/app/transformer_instant/examples/figures ./docker_figures; \
docker rm "$CID"
# Or bind-mount a host directory for figures
mkdir -p docker_figures
docker run --rm -v "$(pwd)"/docker_figures:/app/transformer_instant/examples/figures exactstep:cpu
```

```bash
# Docs (Sphinx)
pip install -e .[docs]
sphinx-build -b html docs docs/build/html
# Tests & coverage
pytest -q
coverage run -m pytest && coverage report -m
```

```python
from instant_train import instant, ipredict
import numpy as np
X = np.random.randn(200, 5).astype(np.float64)
y = (np.sin(X[:,0]) + X[:,1]**2).astype(np.float64)
Xtest = np.random.randn(20, 5).astype(np.float64)
# Closed-form linear
Y_lin = instant(X, X @ np.random.randn(5,2), Xtest, arch="linear", fit_intercept=False)
# Ridge
Y_ridge = instant(X, X @ np.random.randn(5,2), Xtest, arch="ridge", lambda_reg=1e-2)
# Kernel ridge (RBF)
Y_krr = instant(X, y, Xtest, arch="kernel_ridge", length_scale=0.8, variance=1.0, lambda_reg=1e-3, whiten_eps=1e-12)
# ELM (closed-form) — set random_seed for parity with collapse
Y_elm = instant(X, y, Xtest, arch="elm", hidden_units=256, activation="tanh", lambda_reg=1e-3, random_seed=7)
# Duality dictionary (exact memory + NN fallback off-support)
Y_dual = instant(X, X @ np.random.randn(5,2), Xtest, arch="dict")
# Collapse variants
Y_coll_lin = ipredict(X, X @ np.random.randn(5,2), Xtest, arch="collapse:linear")
Y_coll_ridge = ipredict(X, X @ np.random.randn(5,2), Xtest, arch="collapse:ridge", lambda_reg=1e-2)
Y_coll_krr = ipredict(X, y, Xtest, arch="collapse:kernel_ridge", length_scale=0.8, variance=1.0, lambda_reg=1e-3, whiten_eps=1e-12)
Y_coll_elm = ipredict(X, y, Xtest, arch="collapse:elm", hidden_units=256, activation="tanh", lambda_reg=1e-3, random_seed=7)
```

instant_train.py executes a battery of checks with numpy.testing.assert_allclose:
- Duality: D(E(D), x_i) = y_i on train support
- Linear/ridge: closed-form = collapse on train and random test matrices
- Kernel Ridge (RBF): closed-form = collapse (same dual α)
- ELM: closed-form = collapse given identical random_seed/activation/scale
- Deep Linear: SVD-based L-layer factorization equals optimal end-to-end map
- Autoencoder: collapse reconstruction equals PCA (SVD) baseline
- Noisy data sanity: closed-form ≡ collapse; duality returns NN fallback off-support
- Timing harness: prints runtime and relative MSE gap of GD-10/GD-50 vs. closed-form
At completion, the script prints “[✔] All checks passed” and the absolute path to results.txt.
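Of these checks, the deep linear one is the least standard. A minimal standalone sketch of the idea (not the repository's holistic_update code): factor the optimal end-to-end map W* with an SVD, give each of the L layers an equal s**(1/L) share of the singular values, and confirm that the layer product reproduces W*.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_out, L = 6, 4, 3

W_star = rng.standard_normal((d_out, d_in))   # stand-in for the optimal end-to-end map

# Balanced factorization: each layer carries an equal s**(1/L) share of the singular values
U, s, Vt = np.linalg.svd(W_star, full_matrices=False)
S_root = np.diag(s ** (1.0 / L))

layers = [S_root @ Vt]                        # W_1: (r, d_in)
layers += [S_root] * (L - 2)                  # W_2 .. W_{L-1}: (r, r)
layers += [U @ S_root]                        # W_L: (d_out, r)

W_prod = layers[-1]
for W in reversed(layers[:-1]):               # end-to-end map W_L @ ... @ W_1
    W_prod = W_prod @ W

np.testing.assert_allclose(W_prod, W_star, atol=1e-10)
```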
Supported arch values:

- Duality: dict, barycentric_1d
- Closed-form: linear, ridge, kernel_ridge, elm, deep_elm
- Collapse: collapse:linear, collapse:ridge, collapse:kernel_ridge, collapse:elm, collapse:deep_elm, collapse:autoencoder, autoencoder
Key kwargs:
- lambda_reg: small L2 stabilizer (e.g., 1e-6 to 1e-3)
- length_scale, variance: RBF kernel params
- hidden_units, activation (tanh|relu), weight_scale, random_seed: ELM params
- whiten_eps: epsilon floor for per-feature std in whitening
- SPD Cholesky solves for normal equations (linear/ridge), KRR dual, and ELM output
- Vectorized RBF Gram and cross-kernel; no Python loops
- Whitening reduces condition numbers; all computations in float64
- Barycentric 1D uses Chebyshev–Lobatto nodes in demos; uniform-node tests use a looser tolerance
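For illustration, here is a vectorized RBF Gram/cross-kernel computation in the spirit of the notes above (a standalone sketch, not the repository's implementation; the parameter names length_scale, variance, and the lambda_reg term mirror the kwargs listed earlier): all pairwise squared distances come from one broadcasted expression, and the dual coefficients come from an SPD Cholesky solve.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(A, B, length_scale=0.8, variance=1.0):
    """Vectorized RBF kernel: K[i, j] = variance * exp(-||a_i - b_j||^2 / (2 * length_scale^2))."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-np.maximum(sq, 0.0) / (2.0 * length_scale**2))

rng = np.random.default_rng(3)
X, y = rng.standard_normal((150, 4)), rng.standard_normal(150)
Xtest = rng.standard_normal((10, 4))

K = rbf_kernel(X, X)                               # Gram matrix, no Python loops
c, low = cho_factor(K + 1e-3 * np.eye(len(X)))     # SPD after adding the ridge term
alpha = cho_solve((c, low), y)                     # dual coefficients
y_pred = rbf_kernel(Xtest, X) @ alpha              # cross-kernel prediction
```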
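Likewise, a standalone sketch of 1D barycentric interpolation on Chebyshev–Lobatto nodes (not the repository's data_model_duality code): second-kind Chebyshev points admit the closed-form weights (-1)^j, halved at the endpoints, and interpolate smooth functions to near machine precision.

```python
import numpy as np

def cheb_lobatto_nodes(n):
    """Chebyshev–Lobatto (second-kind) nodes x_j = cos(pi * j / n) on [-1, 1]."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def barycentric_eval(x_nodes, f_nodes, x_query):
    """Barycentric interpolation with the closed-form Chebyshev–Lobatto weights."""
    n = len(x_nodes) - 1
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    out = np.empty_like(x_query, dtype=np.float64)
    for k, x in enumerate(x_query):                # loop over query points only
        diff = x - x_nodes
        exact = diff == 0.0
        if exact.any():
            out[k] = f_nodes[exact][0]             # query hits a node exactly
        else:
            t = w / diff
            out[k] = (t @ f_nodes) / t.sum()
    return out

nodes = cheb_lobatto_nodes(32)
xq = np.linspace(-1.0, 1.0, 201)
err = np.abs(barycentric_eval(nodes, np.sin(3.0 * nodes), xq) - np.sin(3.0 * xq)).max()
print(f"max interpolation error: {err:.1e}")       # near machine precision for this smooth target
```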
- instant_train.py: Facade, one-liner API, demo assertions, timing harness
- closed_form_training.py: F(D, A) closed-form solvers (linear/ridge, KRR, ELM)
- holistic_update.py: non-local collapse (linear/ridge; deep linear SVD; PCA autoencoder)
- data_model_duality.py: exact dictionary model; 1D barycentric with robust errors
- exactstep/: thin wrapper package re-exporting the public API and CLI entrypoint
- transformer_instant/: hook-based, model-agnostic Transformer scaffold (CPU)
- utils.py: whitening, SPD solves, ridge/KRR/ELM, SVD helpers
- closed_form_head.py: frozen features → closed-form head (ridge/KRR/ELM)
- attention_solver.py: explicit Q/K/V, A = softmax(QK^T/√d), Z = AV, solve W_O
- lora_svd.py: one-shot LoRA via truncated SVD (rank-r update)
- pipeline.py: trainers for head, block collapse, and LoRA; attention wrapper
- hooks.py: optional PyTorch forward-hook utilities
- datasets/: long-range synthetic datasets (copy, bracket depth, reverse, running sum)
- examples/train_on_datasets.py: CPU demo training on the datasets
- tests/test_transformer_instant.py: unit tests for the Transformer scaffold
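As background for the lora_svd.py idea, here is a minimal one-shot LoRA sketch (illustrative shapes and names, not the package's interface): the truncated SVD of a desired weight change ΔW gives its best rank-r factorization in the Frobenius norm, which supplies the low-rank factors in a single step.

```python
import numpy as np

def one_shot_lora(delta_W, rank):
    """Best rank-r factorization of a desired weight update via truncated SVD."""
    U, s, Vt = np.linalg.svd(delta_W, full_matrices=False)
    B = U[:, :rank] * np.sqrt(s[:rank])            # (d_out, r)
    A = np.sqrt(s[:rank])[:, None] * Vt[:rank]     # (r, d_in)
    return B, A

rng = np.random.default_rng(4)
d_out, d_in, r = 64, 32, 4
# Target update chosen to be exactly rank r, so the truncated SVD recovers it exactly
delta_W = rng.standard_normal((d_out, r)) @ rng.standard_normal((r, d_in))

B, A = one_shot_lora(delta_W, rank=r)
np.testing.assert_allclose(B @ A, delta_W, atol=1e-10)   # W + B @ A applies the update in one shot
```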
- Exploring broader kernels and structured features without losing vectorization
- Integrating cross-validation for regularization and kernel parameters in one shot
This project is released under the PolyForm Noncommercial 1.0.0 license. You may use, modify, and distribute the code for noncommercial purposes only. Commercial use requires a separate license from the authors.
See LICENSE for details and definitions: https://polyformproject.org/licenses/noncommercial/1.0.0/
For common questions, see LICENSE-FAQ.md. For commercial licensing, contact itsparedezadrian@outlook.com.