This is the Python codebase for the MiCo framework.
MiCo is an end-to-end framework that helps you train, explore, and deploy mixed-precision quantized models.
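At the heart of mixed precision quantization, each layer's weights are quantized to a layer-specific bitwidth. The sketch below illustrates the idea with plain symmetric uniform ("fake") quantization; it is a minimal illustration of the concept, not MiCo's actual MiCoQLayers implementation:

```python
def quantize(weights, bits):
    """Symmetric uniform ("fake") quantization of a list of floats to `bits` bits.

    Rounds each weight to one of the 2**(bits-1) - 1 integer levels per sign,
    then rescales back to float so the error of a given bitwidth is visible.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0:
        return list(weights)
    # round-to-nearest integer level, clamp to the representable range
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

w = [0.9, -0.45, 0.3, -0.05]
w8 = quantize(w, 8)  # near-lossless
w2 = quantize(w, 2)  # coarse: only the levels {-scale, 0, +scale} remain
```

Searching for a good per-layer bitwidth assignment under an accuracy/cost trade-off is exactly the MPQ search problem MiCo addresses.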
```shell
conda create -n mico_env python=3.10
conda activate mico_env
conda install pygmo
# You can check the file to select which packages to install
pip install -r requirements.txt
```

If you encounter a ModuleNotFoundError when importing local packages, add the repository root to PYTHONPATH before you run the code:

```shell
export PYTHONPATH=$PYTHONPATH:.
```

To run Mixed Precision Search:
Check the examples: run the training code first, then use the trained model for MPQ search. For example:

```shell
python examples/lenet_mnist.py        # Train LeNet on MNIST
python examples/lenet_mnist_search.py # MPQ Search on the trained LeNet
```

General Scripts:
```shell
# For General Script Usage
python examples/mpq_train.py -h
python examples/mpq_search.py -h
```

To use the CodeGen, check the code to change the models/datasets/precisions:

```shell
python MiCoCodeGen.py
```

To compile the inference code after generating the model header with the CodeGen:
```shell
git submodule update --init
cd project
make clean
make MAIN=main TARGET=<host, vexii, rocket, spike> OPT=<unroll, simd>
```

To run the inference on your host machine after compilation:

```shell
make run-host
```

To run the inference simulation on the VexiiRiscv after compilation:
Check the VexiiRiscv documentation and load the ELF from project into the simulator.
Currently MiCo supports the following models:

| Model | Layers | MPQ Search | MPQ Deploy (C) | MPQ Deploy (DNNWeaver) |
|---|---|---|---|---|
| MLP | Linear | Supported | Supported | Supported |
| HARMLP | Linear | Supported | Supported | Supported |
| LeNet | Linear, Conv2D | Supported | Supported | Supported |
| AlexNet | Linear, Conv2D | Supported | Supported | Supported |
| CmsisCNN | Linear, Conv2D | Supported | Supported | Supported |
| VGG | Linear, Conv2D | Supported | Supported | Supported |
| ResNet | Linear, BottleNeck (Conv2D) | Supported | Supported | Supported |
| MobileNetV2 | Linear, BottleNeck (Conv2D) | Supported | Supported | Supported |
| SqueezeNet | Linear, Conv2D | Supported | Supported | Supported |
| ShuffleNet | Linear, Conv2D | Supported | Supported | Not Yet |
| LLaMa | Transformers (Linear) | Supported | Supported | Not Yet |
| ViT | Transformers (Linear) | Supported | Not Yet | Not Yet |
| HuggingFace Models | Transformers (Linear) | Supported | Not Yet | Not Yet |
| M5 | Linear, Conv1D | Supported | Supported | Not Yet |
| KWSConv1d | Linear, Conv1D | Supported | Supported | Not Yet |
| DS CNN | Linear, Conv2D | Supported | Supported | Supported |
Currently MiCo includes the following datasets:
- MNIST
- Fashion MNIST
- CIFAR-10
- CIFAR-100
- ImageNet (requires external download)
- TinyStories
- UCI HAR (wearable sensors)
- SpeechCommands (keyword spotting)
- WikiText / WikiText-2 / WikiText-103
- HuggingFace Text Datasets (W.I.P.)
(SpeechCommands requires a few additional packages and libraries to be installed.)
Here are the main components/modules of MiCo.
Basics
- MiCoUtils: Utilities for the MiCo framework, including layer replacement, exporting, etc.
- MiCoModel: Base model class of MiCo, offering unified training/testing methods and the layer-wise bitwidth assignment method.
- MiCoQLayers: Fundamental quantized layer classes for MiCo models and quantization functions.
- MiCoEval: Model evaluation for MiCo models, covering accuracy, BOPs, MACs, and end-to-end latency results.
- MiCoAnalysis: Various statistics for quantized models.
- MiCoDatasets: Dataset loaders for all supported datasets.
- MiCoLLMEval: Evaluation utilities for LLM models, including token agreement, generation comparison, and perplexity analysis.
- DimTransform: Dimension transformation utilities for reducing the search space in large models.
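Among MiCoEval's metrics, BOPs (bit operations) extend MAC counts to mixed precision. The sketch below uses the common BOPs proxy, where one MAC with w-bit weights and a-bit activations costs roughly w * a bit operations; it is an illustration of the metric, not necessarily MiCoEval's exact formula:

```python
def linear_macs(in_features, out_features):
    """MAC count for a fully connected layer (bias ignored)."""
    return in_features * out_features

def layer_bops(macs, w_bits, a_bits):
    """BOPs under the usual proxy: each MAC costs w_bits * a_bits bit-ops."""
    return macs * w_bits * a_bits

# Toy 2-layer MLP with per-layer (weight bits, activation bits):
layers = [(784, 128), (128, 10)]
bits = [(4, 8), (8, 8)]
total_bops = sum(layer_bops(linear_macs(i, o), wb, ab)
                 for (i, o), (wb, ab) in zip(layers, bits))
```

Lowering a large layer's weight bitwidth shrinks its BOPs proportionally, which is why per-layer assignments beat a uniform bitwidth on this metric.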
Codegen
- MiCoCodeGen: C code generator for MiCo models, with automatic memory pool optimization.
- MiCoGraphGen: DNNWeaver op graph generator for MiCo models.
- MiCoLLaMaGen: C code generator for MiCo TinyLLaMa models.
- MiCoRegistry: Registry pattern for extensible PyTorch operation handlers in code generation.
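To illustrate what generating a model header involves, the sketch below serializes a quantized weight array into a C declaration. The function name and output format here are hypothetical; MiCoCodeGen's actual layout may differ:

```python
def weights_to_c_header(name, weights, bits):
    """Emit a C declaration for a quantized integer weight array.

    Hypothetical format for illustration only; MiCoCodeGen's real
    output (types, alignment, memory pools) may differ.
    """
    ctype = "int8_t" if bits <= 8 else "int16_t"  # pick a wide-enough C type
    vals = ", ".join(str(int(w)) for w in weights)
    return (
        f"// {name}: {len(weights)} weights @ {bits}-bit\n"
        f"static const {ctype} {name}[{len(weights)}] = {{{vals}}};\n"
    )

header = weights_to_c_header("fc1_w", [12, -7, 3, 0], 4)
```

The generated header can then be compiled into the C inference project alongside the kernel library.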
Searchers
- searchers.MiCoSearcher: Main MPQ searcher of the MiCo framework (Random Forest / XGBoost / Bayesian regressors).
- searchers.BayesSearcher: Bayesian Optimization searcher with a Gaussian Process.
- searchers.HAQSearcher: Hardware-Aware Quantization searcher.
- searchers.NLPSearcher: NLP-inspired searcher.
- searchers.RegressionSearcher: Generic regression-based searcher.
- searchers.DDPG: DDPG reinforcement-learning-based searcher.
- searchers.Ensemble: Ensemble Bayesian optimization models.
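For contrast with the guided searchers above, the sketch below is a blind random-search baseline over the same kind of search space (a per-layer bitwidth assignment) with a made-up toy objective. MiCo's searchers replace the blind sampling with regressor-, Bayesian-, or RL-guided proposals:

```python
import random

def random_mpq_search(n_layers, bitwidths, score_fn, n_trials=200, seed=0):
    """Blind random search over per-layer bitwidth assignments.

    Shows only the shape of the search space; MiCo's searchers guide
    the sampling with learned models instead.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = [rng.choice(bitwidths) for _ in range(n_layers)]
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective: an accuracy proxy that saturates at 6 bits, minus a cost term.
toy = lambda cfg: sum(min(b, 6) for b in cfg) - 0.5 * sum(cfg)
cfg, score = random_mpq_search(n_layers=4, bitwidths=[2, 4, 8], score_fn=toy)
```

Even this toy objective shows the trade-off: past the saturation point, extra bits only add cost, so the search avoids them.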
Hardware-Aware
- SimUtils: Invoke simulations for BitFusion or VexiiRiscv hardware.
- MiCoProxy: Hardware latency proxy models for MiCo, BitFusion, and host targets.
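A latency proxy maps a layer's workload and bitwidths to predicted cycles so the searcher can avoid invoking a slow simulator on every candidate. The sketch below is a hypothetical linear proxy; MiCoProxy's real models are fit to the profiled kernel datasets in benchmark_results:

```python
def latency_proxy(macs, w_bits, a_bits, cycles_per_bitop=1e-3, overhead=100.0):
    """Hypothetical linear latency proxy.

    Predicted cycles grow with the bit-level work (macs * w_bits * a_bits)
    plus a fixed per-layer overhead; both coefficients would normally be
    fit to profiled hardware data, not chosen by hand as here.
    """
    return overhead + cycles_per_bitop * macs * w_bits * a_bits

# Lower bitwidths should predict lower latency for the same layer:
fast = latency_proxy(100_352, 4, 8)
slow = latency_proxy(100_352, 8, 8)
```

During search, such a proxy stands in for full BitFusion or VexiiRiscv simulation, which is reserved for validating the final configurations.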
- examples: Example scripts for MPQ training/searching.
- deploy: Example scripts for the hardware-aware end-to-end MPQ search-deploy flow.
- searchers: Implementations of MPQ search algorithms.
- models: MPQ models.
- profiler: Scripts for hardware profiling and adaptive sampling (requires hardware submodules).
- project: C project templates for MPQ inference on CPUs (requires the MiCo Library submodule).
- benchmark_results: Profiled hardware kernel datasets for hardware-aware proxy models.
- hw: Hardware-specific implementations (VexiiMico, BitFusion).
- tests: Unit and integration tests.
- predict: Experiment scripts for MPQ prediction and ablation studies.
- doc: Documentation for advanced features and integrations.
- TinyStories: TinyStories dataset utilities and tokenizer.
- Chipyard Integration
- Registry Pattern Usage
- Memory Optimization
- MiCo Searcher Algorithm
- LLM Evaluation Methods
Please refer to the paper for details.
```bibtex
@inproceedings{jiang2025mico,
  title={MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI},
  author={Jiang, Zijun and Lyu, Yangdi},
  booktitle={Proceedings of the 2025 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)},
  pages={1--9},
  year={2025},
  organization={IEEE}
}
```

ucb-bar/Baremetal-NN (for CodeGen with the Torch FX Interpreter):
https://github.com/ucb-bar/Baremetal-NN
mit-han-lab/haq (for the HAQ searcher implementation):
https://github.com/mit-han-lab/haq
weiaicunzai/pytorch-cifar100 (for torch models):
https://github.com/weiaicunzai/pytorch-cifar100
karpathy/llama2.c (for LLaMa2 and TinyStories scripts and C code):
https://github.com/karpathy/llama2.c
Check our Roadmap to see what's planned!
Generated with Gemini-2.5 Flash.