ViStream: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network

Official PyTorch implementation of ViStream (CVPR 2025)

ViStream: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He
Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025

Abstract

Visual streaming perception (VSP) involves online intelligent processing of sequential frames captured by vision sensors, enabling real-time decision-making in applications such as autonomous driving, UAVs, and AR/VR. However, the computational efficiency of VSP on edge devices remains a challenge due to power constraints and the underutilization of temporal dependencies between frames. While spiking neural networks (SNNs) offer biologically inspired event-driven processing with potential energy benefits, their practical advantage over artificial neural networks (ANNs) for VSP tasks remains unproven.

In this work, we introduce a novel framework, ViStream, which leverages the Law of Charge Conservation (LoCC) property in ST-BIF neurons and a differential encoding (DiffEncode) scheme to optimize SNN inference for VSP. By encoding temporal differences between neighboring frames and eliminating frequent membrane resets, ViStream achieves significant computational reduction while maintaining accuracy equivalent to its ANN counterpart. We provide theoretical proofs of equivalence and validate ViStream across diverse VSP tasks, including object detection, tracking, and segmentation, demonstrating substantial energy savings without compromising performance.
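The idea behind differential encoding can be illustrated with a toy example. The sketch below is not the ViStream model code: it assumes a single linear layer (the names `matvec`, `run_direct`, and `run_differential` are made up for illustration) and shows that feeding only the change between neighboring frames, while keeping the accumulated output instead of resetting it, reproduces the per-frame result with fewer operations when consecutive frames are similar. The spiking formulation with ST-BIF neurons described in the paper is not modeled here.

```python
# Illustrative sketch only: a toy linear layer, not the ViStream model code.

def matvec(weight, vec):
    """Matrix-vector product that skips zero inputs, as event-driven processing would."""
    out = [0.0] * len(weight)
    ops = 0
    for j, x in enumerate(vec):
        if x == 0:
            continue  # a zero input triggers no work
        for i, row in enumerate(weight):
            out[i] += row[j] * x
            ops += 1
    return out, ops


def run_direct(weight, frames):
    """Process every frame from scratch."""
    outputs, total_ops = [], 0
    for frame in frames:
        y, ops = matvec(weight, frame)
        outputs.append(y)
        total_ops += ops
    return outputs, total_ops


def run_differential(weight, frames):
    """Process the first frame fully, then only the change between neighboring frames."""
    outputs, total_ops = [], 0
    prev_frame = [0.0] * len(frames[0])
    state = [0.0] * len(weight)  # accumulated output, never reset between frames
    for frame in frames:
        delta = [a - b for a, b in zip(frame, prev_frame)]
        dy, ops = matvec(weight, delta)
        state = [s + d for s, d in zip(state, dy)]
        outputs.append(list(state))
        total_ops += ops
        prev_frame = frame
    return outputs, total_ops


WEIGHT = [[1.0, 2.0, 0.0, -1.0],
          [0.5, 0.0, 3.0, 1.0]]
FRAMES = [[1.0, 2.0, 3.0, 4.0],
          [1.0, 2.0, 3.0, 5.0],   # only the last value changes
          [1.0, 2.5, 3.0, 5.0]]   # only the second value changes

direct_out, direct_ops = run_direct(WEIGHT, FRAMES)
diff_out, diff_ops = run_differential(WEIGHT, FRAMES)
print(direct_out == diff_out, direct_ops, diff_ops)  # prints: True 24 12
```

In this toy setting the equality is exact because the layer is linear. The savings depend entirely on how sparse the frame-to-frame differences are.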

Demo videos showcasing ViStream's tracking performance in various scenarios are available in the demo_videos/ directory.

The core ViStream implementation can be found in model/spike_quan_layer.py and model/spike_quan_wrapper.py.

Model Checkpoint

The model checkpoint file is hosted on Hugging Face because of its size (292 MB).

Download Instructions

You can download the checkpoint file using one of the following methods:

Method 1: Using wget/curl

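# Download the checkpoint file with wget
wget https://huggingface.co/AndyBlocker/ViStream/resolve/main/checkpoint-90.pth
# Or with curl (-L follows redirects, -O keeps the remote file name)
curl -L -O https://huggingface.co/AndyBlocker/ViStream/resolve/main/checkpoint-90.pth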

Method 2: Using Hugging Face Hub

# Install huggingface_hub if not already installed
pip install huggingface_hub

# Download using Python
python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='AndyBlocker/ViStream', filename='checkpoint-90.pth', local_dir='.')"

Method 3: Using Git LFS (cloning the HF repo)

# Enable Git LFS so the clone fetches the actual checkpoint, not a pointer file
git lfs install
# Clone the Hugging Face repository
git clone https://huggingface.co/AndyBlocker/ViStream
# Copy the checkpoint to your project directory
cp ViStream/checkpoint-90.pth ./

After downloading, make sure the checkpoint file is placed in the root directory of this project.
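A quick sanity check can catch a missing or truncated download before running experiments. The helper below is a minimal sketch, not part of the repository: the function name `check_checkpoint` and the size tolerance are assumptions, and the expected size is simply the approximate figure quoted above.

```python
from pathlib import Path

CHECKPOINT_NAME = "checkpoint-90.pth"  # file name used in the download commands
EXPECTED_SIZE_MB = 292                 # approximate size stated in this README


def check_checkpoint(project_root=".", tolerance_mb=20):
    """Return (ok, message) describing whether the checkpoint looks usable.

    tolerance_mb is an arbitrary allowance (assumption) for rounding and
    MB/MiB differences in the stated size.
    """
    path = Path(project_root) / CHECKPOINT_NAME
    if not path.is_file():
        return False, f"{path} not found; download it into the project root"
    size_mb = path.stat().st_size / 1e6
    if abs(size_mb - EXPECTED_SIZE_MB) > tolerance_mb:
        # A very small file is often a Git LFS pointer or a saved error page.
        return False, f"{path} is {size_mb:.1f} MB, expected about {EXPECTED_SIZE_MB} MB"
    return True, f"{path} found ({size_mb:.1f} MB)"
```

Calling `check_checkpoint()` from the project root returns a flag and a short message that can be printed or logged.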

Running Experiments

To run inference experiments, use the eval.sh script, which contains commented-out test commands for each tracking task:

VOS (Video Object Segmentation): Uncomment the test_vos.py lines
MOT (Multiple Object Tracking): Uncomment the test_mot.py lines
MOTS (Multiple Object Tracking and Segmentation): Uncomment the test_mots.py lines
Pose Tracking: Uncomment the test_posetrack.py lines

Usage: Uncomment the desired experiment lines in eval.sh, then run:

bash eval.sh
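Before launching a run, it can help to confirm which experiments are currently enabled. The following is a small sketch under the assumption that disabled commands in eval.sh are lines starting with `#`; the helper `enabled_tasks` and the sample script text (including the `<args>` placeholder) are hypothetical and do not reflect the actual contents of eval.sh.

```python
# Test scripts named in the list above, keyed by task.
TASK_SCRIPTS = {
    "VOS": "test_vos.py",
    "MOT": "test_mot.py",
    "MOTS": "test_mots.py",
    "Pose Tracking": "test_posetrack.py",
}


def enabled_tasks(script_text):
    """Return tasks whose test script appears on at least one uncommented line."""
    enabled = []
    for task, script in TASK_SCRIPTS.items():
        for line in script_text.splitlines():
            stripped = line.strip()
            if script in stripped and not stripped.startswith("#"):
                enabled.append(task)
                break
    return enabled


# Hypothetical script text; <args> stands in for the real command-line arguments.
SAMPLE = """\
# python test_vos.py <args>
python test_mot.py <args>
# python test_mots.py <args>
"""

print(enabled_tasks(SAMPLE))  # prints: ['MOT']
```

To inspect the real script, pass its contents instead, for example `enabled_tasks(open("eval.sh").read())`.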

Citation

If you find this work useful for your research, please cite:

@inproceedings{you2025vistream,
  title={VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network},
  author={You, Kang and Wei, Ziling and Yan, Jing and Zhang, Boning and Guo, Qinghai and Zhang, Yaoyu and He, Zhezhi},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={8796--8805},
  year={2025}
}

Acknowledgments

This project is based on UniTrack, with improvements for energy efficiency. The code for evaluating energy consumption and synaptic operations (SOPs) is adapted from syops-counter.

